16-GPU-Memory-Types Flashcards

(21 cards)

1
Q

What are the four GPU memory types?

A

Global (slow, accessible everywhere), Local (fast, shared within a work group), Private (fastest, per work item), Constant (fast, read-only)

2
Q

Why do GPUs need multiple memory types?

A

High-throughput design: each memory type is optimized for a different access pattern and performance need

3
Q

What is global memory?

A

Accessible by all work items, slower but largest capacity, not always cached

4
Q

What is local memory?

A

Shared within work group, much faster than global, used for work group communication
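
The communication pattern described above can be modeled on the host in plain C. This is only a sketch under stated assumptions: `GROUP_SIZE` and `group_sum` are hypothetical names, the `scratch` array stands in for a `__local` buffer, and each sequential loop stands in for all work items of one group running a phase that a real kernel would separate with `barrier(CLK_LOCAL_MEM_FENCE)`.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 8  /* hypothetical work-group size */

static_assert((GROUP_SIZE & (GROUP_SIZE - 1)) == 0,
              "GROUP_SIZE must be a power of two for the halving loop");

/* Models one work group summing GROUP_SIZE values through a shared
 * scratch array, which stands in for __local memory. */
float group_sum(const float *global_in)
{
    float scratch[GROUP_SIZE];  /* stand-in for: __local float scratch[GROUP_SIZE] */

    /* Phase 0: each work item (lid) copies one element from "global" memory. */
    for (size_t lid = 0; lid < GROUP_SIZE; ++lid)
        scratch[lid] = global_in[lid];
    /* A real kernel would call barrier(CLK_LOCAL_MEM_FENCE) here. */

    /* Tree reduction: each active work item adds a value that another
     * work item wrote in the previous phase. */
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid)
            scratch[lid] += scratch[lid + stride];
        /* A real kernel would place another barrier here. */
    }
    return scratch[0];  /* work item 0 holds the group's total */
}
```

Every intermediate value is exchanged through `scratch`, so in a real kernel only the initial reads would touch global memory.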

5
Q

What is private memory?

A

Per work item, fastest access, usually implemented as registers

6
Q

What is constant memory?

A

Read-only, faster than global, for data that doesn’t change

7
Q

What is memory coalescing?

A

GPU optimization where adjacent threads access adjacent memory locations for efficiency
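
To make the effect concrete, the following plain C sketch counts how many memory segments a group of work items touches for a given access stride. The numbers are assumptions chosen for illustration, not properties of any particular device: `GROUP_WIDTH` (32 work items issuing together), `SEGMENT_BYTES` (128-byte transactions), and `ELEM_BYTES` (4-byte elements) are all hypothetical, as is the function name `segments_touched`.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_WIDTH   32   /* hypothetical: work items issuing a load together */
#define SEGMENT_BYTES 128  /* hypothetical: size of one memory transaction */
#define ELEM_BYTES    4    /* hypothetical: size of one element (e.g. a float) */

/* Counts the distinct segments touched when work item i reads element
 * i * stride from a buffer that starts on a segment boundary. */
size_t segments_touched(size_t stride)
{
    assert(stride > 0);
    size_t count = 0;
    size_t last_segment = (size_t)-1;  /* sentinel: no segment seen yet */

    for (size_t i = 0; i < GROUP_WIDTH; ++i) {
        size_t byte_offset = i * stride * ELEM_BYTES;
        size_t segment = byte_offset / SEGMENT_BYTES;
        /* Offsets only grow with i, so a new segment always differs
         * from the previous one. */
        if (segment != last_segment) {
            ++count;
            last_segment = segment;
        }
    }
    return count;
}
```

Under these assumptions, `segments_touched(1)` is 1 (adjacent work items share one transaction), while `segments_touched(32)` is 32 (every work item needs its own).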

8
Q

What is the GPU memory hierarchy?

A

Private (registers) → Local → Global → Host memory

9
Q

What is cache coherency in GPUs?

A

Ensuring consistent views when multiple work items access same location

10
Q

What is false sharing in GPUs?

A

Work items accessing different data in same cache line, causing unnecessary invalidation
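
A small plain C sketch of the condition described above, plus one common mitigation (spacing per-work-item slots one cache line apart). `CACHE_LINE_BYTES` is an assumed value, and `shares_cache_line` and `padded_stride` are hypothetical helper names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64  /* hypothetical cache line size */

/* True when two elements of the same array fall in the same cache line,
 * assuming the array starts on a cache line boundary. */
bool shares_cache_line(size_t index_a, size_t index_b, size_t elem_bytes)
{
    assert(elem_bytes > 0);
    return (index_a * elem_bytes) / CACHE_LINE_BYTES
        == (index_b * elem_bytes) / CACHE_LINE_BYTES;
}

/* Number of elements to leave between per-work-item slots so that each
 * slot starts in a different cache line. */
size_t padded_stride(size_t elem_bytes)
{
    assert(elem_bytes > 0);
    return (CACHE_LINE_BYTES + elem_bytes - 1) / elem_bytes;  /* round up */
}
```

With 4-byte elements and the assumed 64-byte line, neighbouring indices 0 and 1 share a line, while slots spaced `padded_stride(4)` elements apart do not.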

11
Q

What is register overflow?

A

When a kernel needs more registers than are available, the excess values spill to slower memory, typically backed by global memory (also called register spilling)

12
Q

What is static local memory allocation?

A

Declared in kernel with fixed size: __local float temp[128]

13
Q

What is dynamic local memory allocation?

A

Passed as kernel argument: __kernel void func(__local float *temp)

14
Q

How to allocate dynamic local memory?

A

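clSetKernelArg(kernel, arg_index, size, NULL), where size is the allocation in bytes and the NULL value pointer tells the runtime to allocate local memory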
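
The size argument is a byte count, not an element count. The sketch below shows one way to compute it for a `__local float *` parameter; `local_scratch_bytes` and its parameters are hypothetical names, and the snippet assumes the host `float` matches the device `float` (as `cl_float` does). The OpenCL call appears only in a comment so the snippet compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of local memory needed when each work item in a group owns
 * floats_per_item floats of a shared scratch buffer. */
size_t local_scratch_bytes(size_t work_group_size, size_t floats_per_item)
{
    assert(work_group_size > 0 && floats_per_item > 0);
    return work_group_size * floats_per_item * sizeof(float);
}

/* Intended use on the host (needs the OpenCL headers and a valid kernel):
 *
 *   clSetKernelArg(kernel, arg_index, local_scratch_bytes(128, 1), NULL);
 */
```

Passing the element count (128) instead of the byte count would allocate only a quarter of the intended buffer when `float` is 4 bytes.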

15
Q

What is the global memory access pattern?

A

Use __global qualifier, slowest but most flexible

16
Q

What is the local memory access pattern?

A

Use __local qualifier, fastest for work-group communication

17
Q

Why is private memory automatic?

A

Variables declared inside a kernel without an address space qualifier are private by default

18
Q

What is constant memory for?

A

Read-only data that benefits from caching: __constant qualifier

19
Q

What is unified memory?

A

CPU and GPU memory appear as a single address space (OpenCL 2.0+ shared virtual memory; CUDA unified virtual addressing since 4.0, managed memory since 6.0)

20
Q

What is the advantage of unified memory?

A

Simplifies programming, automatic data movement

21
Q

What is the disadvantage of unified memory?

A

Less control over performance optimization