16-GPU-Memory-Types Flashcards

Question 1

Q

What are the four GPU memory types?

Answer

A

Global (slow, accessible everywhere), Local (fast, work-group shared), Private (fastest, per-thread), Constant (fast, read-only)

Question 2

Q

Why do GPUs need multiple memory types?

Answer

A

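GPUs are designed for throughput, and no single memory can be both large and fast, so each memory type trades capacity and visibility against latency and bandwidth to suit a different access pattern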

Question 3

Q

What is global memory?

Answer

A

Accessible by all work items, slower but largest capacity, not always cached

Question 4

Q

What is local memory?

Answer

A

Shared within work group, much faster than global, used for work group communication
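As a rough illustration of work-group communication through local memory, the plain C sketch below emulates one work-group serially. It is not OpenCL code: `GROUP_SIZE`, `work_group_sum`, and `local_scratch` are hypothetical names, and in a real kernel the scratch array would be declared `__local` with the phases separated by `barrier(CLK_LOCAL_MEM_FENCE)`.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 8  /* hypothetical work-group size; must be a power of two */

/* Serial emulation of one work-group summing GROUP_SIZE floats.
   Each inner loop plays the role of all work-items running one phase;
   the end of a loop stands in for a work-group barrier. */
float work_group_sum(const float *global_in, size_t group_id)
{
    float local_scratch[GROUP_SIZE];  /* stands in for __local memory */

    assert(global_in != NULL);

    /* Phase 1: each work-item copies one element from global to local. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid)
        local_scratch[lid] = global_in[group_id * GROUP_SIZE + lid];

    /* Phase 2: tree reduction carried out entirely in local memory. */
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2)
        for (size_t lid = 0; lid < stride; ++lid)
            local_scratch[lid] += local_scratch[lid + stride];

    return local_scratch[0];
}
```

The point of the pattern is that global memory is read once per element, and every later exchange between work-items goes through the faster shared scratch array.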

Question 5

Q

What is private memory?

Answer

A

Per-work-item, fastest access, usually implemented as registers

Question 6

Q

What is constant memory?

Answer

A

Read-only, faster than global, for data that doesn’t change

Question 7

Q

What is memory coalescing?

Answer

A

GPU optimization where adjacent work-items (threads) access adjacent memory locations, so the hardware can combine their accesses into fewer memory transactions
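A toy model can make the effect concrete. The plain C sketch below counts how many aligned memory segments a group of adjacent work-items touches for a given stride. The function name, the 128-byte segment, and the group of 32 work-items used in the note afterwards are assumptions for illustration, not properties of any particular GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Count the distinct aligned segments touched when work-item i reads
   element (i * stride) of an array that starts on a segment boundary.
   Assumes elements never straddle a segment boundary. */
size_t segments_touched(size_t n_items, size_t stride,
                        size_t elem_bytes, size_t segment_bytes)
{
    assert(n_items > 0 && stride > 0 && elem_bytes > 0 && segment_bytes > 0);

    size_t count = 0;
    size_t last_segment = (size_t)-1;  /* sentinel: no segment seen yet */

    /* Addresses only increase with i, so counting changes of segment
       index is enough to count distinct segments. */
    for (size_t i = 0; i < n_items; ++i) {
        size_t segment = (i * stride * elem_bytes) / segment_bytes;
        if (segment != last_segment) {
            ++count;
            last_segment = segment;
        }
    }
    return count;
}
```

Under these assumed numbers, `segments_touched(32, 1, 4, 128)` returns 1 (fully coalesced), while a stride of 32 elements returns 32, one segment per work-item.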

Question 8

Q

What is the GPU memory hierarchy?

Answer

A

From fastest and smallest to slowest and largest: Private (registers) → Local → Global → Host memory

Question 9

Q

What is cache coherency in GPUs?

Answer

A

Ensuring consistent views when multiple work items access same location

Question 10

Q

What is false sharing in GPUs?

Answer

A

Work items accessing different data in same cache line, causing unnecessary invalidation
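Whether two elements can collide in this way depends only on their byte offsets and the cache line size. The plain C sketch below checks that and computes a padding stride that keeps per-work-item slots on separate lines. The function names and the 64-byte line used in the note afterwards are hypothetical; real line sizes vary by device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when elements a and b of an array fall in the same cache line.
   Assumes the array starts on a cache-line boundary. */
bool same_cache_line(size_t a, size_t b, size_t elem_bytes, size_t line_bytes)
{
    assert(elem_bytes > 0 && line_bytes > 0);
    return (a * elem_bytes) / line_bytes == (b * elem_bytes) / line_bytes;
}

/* Smallest element stride that places consecutive slots on different lines. */
size_t padded_stride(size_t elem_bytes, size_t line_bytes)
{
    assert(elem_bytes > 0 && line_bytes > 0);
    return (line_bytes + elem_bytes - 1) / elem_bytes;  /* ceiling division */
}
```

With 4-byte elements and an assumed 64-byte line, elements 0 and 1 share a line, while spacing slots `padded_stride(4, 64)` elements apart (16) puts each on its own line.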

Question 11

Q

What is register overflow (register spilling)?

Answer

A

When a kernel needs more registers than are available per work-item, the excess values spill to slower memory, typically backed by global memory

Question 12

Q

What is static local memory allocation?

Answer

A

Declared inside the kernel with a size fixed at compile time: __local float temp[128];

Question 13

Q

What is dynamic local memory allocation?

Answer

A

Declared as a __local pointer kernel argument, with the size chosen by the host at run time: __kernel void func(__local float *temp)

Question 14

Q

How do you allocate dynamic local memory?

Answer

A

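On the host, call clSetKernelArg(kernel, arg_index, size, NULL), where size is the buffer size in bytes and the NULL value tells the runtime to reserve local memory rather than copy data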
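The size argument is a byte count, not an element count, which is an easy detail to get wrong. The plain C sketch below computes it for a scratch buffer holding one float per work-item. The helper name `local_scratch_bytes` is hypothetical, and the OpenCL call appears only in a comment so the sketch builds without an OpenCL SDK.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of local memory needed for one float per work-item.
   The host would pass the result as the third argument:
       clSetKernelArg(kernel, arg_index, bytes, NULL);
   The NULL value marks the argument as a __local buffer. */
size_t local_scratch_bytes(size_t work_group_size)
{
    assert(work_group_size > 0);
    return work_group_size * sizeof(float);
}
```

For a work-group of 128 work-items this yields `128 * sizeof(float)` bytes, matching the static `temp[128]` declaration from the earlier card.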

Question 15

Q

What is the global memory access pattern?

Answer

A

Use __global qualifier, slowest but most flexible

Question 16

Q

What is the local memory access pattern?

Answer

A

Use __local qualifier, fastest for work-group communication

Question 17

Q

Why is private memory automatic?

Answer

A

Variables declared inside a kernel without an address space qualifier are private by default

Question 18

Q

What is constant memory for?

Answer

A

Read-only data that benefits from caching: __constant qualifier

Question 19

Q

What is unified memory?

Answer

A

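CPU and GPU memory appear as a single address space (shared virtual memory in OpenCL 2.0+; unified virtual addressing in CUDA 4.0+, with managed unified memory added in CUDA 6.0+)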

Question 20

Q

What is the advantage of unified memory?

Answer

A

Simplifies programming: the runtime moves data between host and device automatically, so explicit copies are not needed

Question 21

Q

What is the disadvantage of unified memory?

Answer

A

Less control over when and where data moves, which can make performance tuning harder

(21 cards)