CAQA_Memory Flashcards

(65 cards)

1
Q

This ensures that nearly all references can be found in smaller memories.

A

Temporal and spatial locality

2
Q

This gives the illusion of a large, fast memory being presented to the processor

A

Temporal and Spatial Locality

3
Q

In a memory hierarchy diagram, the _________ are put at the top while the ______ are put at the bottom

A

faster and smaller, slower and larger

4
Q

Why is the memory hierarchy design crucial in recent multi-core processors?

A

CPU speed increases much faster than memory speed, leading to the memory wall; caches are needed to close this gap.

5
Q

What do CPUs need to cope with their huge bandwidth demand?

A

multi-port caches
per-core L1/L2 caches
shared L3 cache

6
Q

High-end microprocessors have _____ of on-chip cache.

A

> 10 MB

7
Q

When a word/data is not found in the cache, a __________ occurs.

A

miss

8
Q

What does the cache do when a miss occurs?

A

Fetch the block from a lower level of the hierarchy, which may be another cache or main memory, and fetch the other words within the block as well, taking advantage of spatial locality.

9
Q

n-way = __________

A

n blocks per set

10
Q

If one block per set, the cache is

A

direct-mapped (1-way)

11
Q

If the cache has a single set containing all the blocks, its associativity is

A

fully associative

12
Q

Immediately update lower levels of hierarchy

A

Write-through

13
Q

Only update lower levels of hierarchy when an updated block is replaced

A

Write-back

14
Q

Writing strategies use ___________ to make writes asynchronous.

A

write buffer

15
Q

Fraction of cache accesses that result in a miss

A

Miss rate

16
Q

cause of a miss: first reference to a block

A

Compulsory

17
Q

cause of a miss: blocks are discarded and later retrieved

A

capacity

18
Q

cause of miss: program makes repeated references to multiple addresses from different blocks that map to the same location in the cache

A

conflict

19
Q

Average memory access time =

A

Hit time + Miss rate × Miss penalty

20
Q

misses per instruction =

A

Miss rate × Memory accesses per instruction

21
Q

To reduce performance impact of misses

A

Speculative and multithreaded processors may execute other instructions during a miss

22
Q

Reduces compulsory misses; increases capacity and conflict misses and increases the miss penalty

A

Larger block size

23
Q

Increases hit time, increases power consumption

A

Larger total cache capacity to reduce miss rate

24
Q

Reduces conflict misses and increases hit time, power consumption

A

Higher associativity

25
Q

Reduces overall memory access time

A

Higher number of cache levels

26
Q

Reduces miss penalty

A

Giving priority to read misses over writes

27
Q

Reduces hit time

A

Avoiding address translation in cache indexing

28
Q

Time between read request and when desired word arrives

A

Access time

29
Q

Minimum time between unrelated requests to memory

A

Cycle time

30
Q

Has low latency; used for caches

A

SRAM

31
Q

Organized as chips with many banks for high bandwidth; used for main memory

A

DRAM

32
Q

Requires only low power to retain its bits and uses 6 transistors per bit

A

SRAM

33
Q

Must be re-written after being read and must also be periodically refreshed

A

DRAM

34
Q

In DRAM, the upper half of the address is the

A

row access strobe (RAS)

35
Q

In DRAM, the lower half of the address is the

A

column access strobe (CAS)

36
Q

Some optimizations in memory capacity and speed to keep up with processors

A

Multiple accesses to the same row
Synchronous DRAM
Wider interfaces
Double data rate (DDR)
Multiple banks on each DRAM device

37
Q

DDR with lower power (2.5 V -> 1.8 V) and higher clock rates (266 -> 400 MHz)

A

DDR2

38
Q

DDR with 1.5 V power and 800 MHz clock rate

A

DDR3

39
Q

DDR with 1-1.2 V power and 1333 MHz clock rate

A

DDR4

40
Q

Graphics memory based on DDR3

A

GDDR5

41
Q

Have lower voltage and a low-power mode (ignores the clock, continues refresh)

A

SDRAMs

42
Q

Achieves 2-5X the bandwidth per DRAM vs. DDR3, with wider interfaces and a higher clock rate

A

Graphics memory

43
Q

DRAM stacked vertically with very high bandwidth; used in GPUs and HPC

A

High Bandwidth Memory (HBM)

44
Q

Non-volatile memory that is slower than DRAM but faster than disk; comes in two types (NAND and NOR)

A

Flash Memory

45
Q

Must be erased in blocks before being overwritten and can use as little as zero power

A

NAND Flash Memory

NAND Flash Memory is limited to _____________- number of cycles
100,00
47
Q

Memory is susceptible to __________.

A

cosmic rays / soft errors

48
Q

Soft errors are fixed by _____

A

ECC

49
Q

Hard errors are permanent errors that are fixed by

A

replacing the faulty rows with spare rows

50
Q

A RAID-like error recovery technique

A

Chipkill

51
Q

To improve hit time, predict the _____ to pre-set the mux

A

way

52
Q

To improve bandwidth

A

Pipelined caches

53
Q

To support simultaneous accesses

A

Multibanked caches

54
Q

Allow hits while misses are outstanding

A

Nonblocking caches

55
Q

Request the missed word from memory first and send it to the processor as soon as it arrives

A

Critical word first

56
Q

Request words in normal order and send the missed word to the processor as soon as it arrives

A

Early restart

57
Q

Swaps nested loops to access memory in sequential order

A

Loop interchange

58
Q

Improves locality of accesses

A

Blocking

59
Q

Fetch two blocks on a miss (including the next sequential block)

A

Hardware prefetching

60
Q

Insert prefetch instructions before the data is needed

A

Compiler prefetching

61
Q

HBM used as a cache where each SDRAM row is a block index and contains a set of tags and 29 data segments; a hit requires a CAS

A

L-H cache

62
Q

HBM used as a cache that molds the tag and data together and uses direct mapping

A

Alloy cache

63
Q

Keeps processes in their own memory space

A

Virtual Memory

64
Q

Provide user mode and supervisor mode
Protect certain aspects of CPU state
Provide mechanisms for switching between user mode and supervisor mode
Provide mechanisms to limit memory accesses
Provide a TLB to translate addresses

A

Role of architecture

65
Q

Supports isolation and security, sharing a computer among many unrelated users. Enabled by the raw speed of processors, which makes the overhead more acceptable

A

Virtual Machine