What are OpenCL devices?
Belong to platforms; represent GPUs, CPUs, or other accelerators
What are OpenCL contexts?
Coordinate interaction between host and devices; a context can contain one or more devices from a single platform
What are command queues?
Used to submit work (kernel launches, memory transfers) to a device; usually one queue per device
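The three cards above can be sketched in host code. This is a minimal, illustrative setup assuming a single GPU; error checking is trimmed, and a real program should test every returned `err` against CL_SUCCESS.

```c
#include <CL/cl.h>

/* Sketch: find the first platform's first GPU, then create
 * one context and one command queue for it. */
void setup(void)
{
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;

    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);
}
```

clCreateCommandQueue() is the OpenCL 1.x call; on OpenCL 2.0+ runtimes the replacement is clCreateCommandQueueWithProperties().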
What is simpleOpenContext_GPU()?
Helper function that finds first GPU and creates context/queue
What is device memory allocation?
Use clCreateBuffer() to allocate GPU memory
What are memory flags?
CL_MEM_READ_ONLY, CL_MEM_WRITE_ONLY, CL_MEM_READ_WRITE, CL_MEM_COPY_HOST_PTR
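A sketch combining the two cards above: allocating device buffers with different flags. The names `ctx`, `h_in`, `N`, `d_in`, and `d_out` are illustrative, not from the original.

```c
/* Read-only input buffer, filled from host memory at creation time
 * via CL_MEM_COPY_HOST_PTR (no separate write call needed). */
cl_mem d_in  = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                              N * sizeof(float), h_in, &err);

/* Write-only output buffer, uninitialized on the device. */
cl_mem d_out = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY,
                              N * sizeof(float), NULL, &err);
```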
What is a kernel?
Function executed on the device, declared with the __kernel qualifier
What is get_global_id(0)?
Returns the current work item's global index in dimension 0
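The two cards above come together in a minimal kernel sketch (the name `vecAdd` is illustrative): each work item uses its global index to process one element.

```c
/* OpenCL C device code: one work item per output element. */
__kernel void vecAdd(__global const float *a,
                     __global const float *b,
                     __global float *c)
{
    int i = get_global_id(0);   /* this work item's global index */
    c[i] = a[i] + b[i];
}
```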
What is kernel compilation?
OpenCL kernels are compiled at runtime: clCreateProgramWithSource() creates the program, clBuildProgram() compiles it
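A sketch of the runtime-compilation step, assuming `ctx`, `device`, and a source string `source` already exist; the kernel name "vecAdd" is illustrative.

```c
/* Create the program object from source, build it for the device,
 * then extract a kernel object by name. */
cl_program program = clCreateProgramWithSource(ctx, 1, &source, NULL, &err);
err = clBuildProgram(program, 1, &device, NULL, NULL, NULL);
if (err != CL_SUCCESS) {
    /* On a build failure, the compiler log explains why. */
    char log[4096];
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG,
                          sizeof(log), log, NULL);
}
cl_kernel kernel = clCreateKernel(program, "vecAdd", &err);
```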
What is compileKernelFromFile()?
Helper function that reads, compiles, and creates kernel from file
What are kernel arguments?
Set with clSetKernelArg() before enqueueing
What is clEnqueueNDRangeKernel()?
Launches kernel with specified work dimensions and sizes
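The two cards above in sequence, as a sketch: bind each argument by index, then enqueue a 1-D launch. `kernel`, `queue`, `d_in`, `d_out`, and `N` are assumed from earlier steps.

```c
/* Arguments are set by position, matching the kernel's parameter list. */
clSetKernelArg(kernel, 0, sizeof(cl_mem), &d_in);
clSetKernelArg(kernel, 1, sizeof(cl_mem), &d_out);

/* Launch N work items in groups of 64; in OpenCL 1.x the local size
 * must evenly divide the global size. */
size_t global = N;
size_t local  = 64;
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, &local,
                       0, NULL, NULL);
```

Passing NULL for the local size lets the runtime pick a work-group size itself.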
What are work items?
Basic unit of GPU execution, maps to hardware thread
What is work group size?
Number of work items per work group, affects performance
What is NDRange?
N-dimensional range of all work items in kernel launch
What is the hierarchy: work items and work groups?
Work items are grouped into work groups; synchronization and local-memory sharing are possible only within a group
What are local indices?
get_local_id() gives position within work group
What are global indices?
get_global_id() gives position in entire NDRange
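A device-code sketch relating the two index spaces for a 1-D launch (the kernel name and buffer are illustrative):

```c
__kernel void whereAmI(__global int *group_of)
{
    int gid = get_global_id(0);   /* position in the whole NDRange   */
    int lid = get_local_id(0);    /* position within this work group */
    int wg  = get_group_id(0);    /* index of this work group        */

    /* For a 1-D launch: gid == wg * get_local_size(0) + lid */
    group_of[gid] = wg;
}
```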
What is the advantage of local memory?
Faster than global, shared within work group
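A common pattern using local memory, sketched below: each work item stages one element into a shared `__local` tile, the group synchronizes, then everyone reads from the fast copy. Names are illustrative.

```c
__kernel void stageAndUse(__global const float *in,
                          __global float *out,
                          __local  float *tile)
{
    int gid = get_global_id(0);
    int lid = get_local_id(0);

    tile[lid] = in[gid];            /* each item loads one element      */
    barrier(CLK_LOCAL_MEM_FENCE);   /* wait until the whole group wrote */
    out[gid] = tile[lid] * 2.0f;    /* now read from local memory       */
}
```

The `__local` buffer is sized from the host via clSetKernelArg() with a NULL argument pointer.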
What is memory coalescing?
GPU memory optimization: when adjacent work items access adjacent addresses, the hardware combines them into fewer, wider memory transactions
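A sketch contrasting coalesced and strided access patterns (kernel names are illustrative): the first lets neighboring work items read neighboring addresses; the second scatters them.

```c
__kernel void coalesced(__global const float *in, __global float *out)
{
    int i = get_global_id(0);
    out[i] = in[i];              /* neighbors read neighboring addresses */
}

__kernel void strided(__global const float *in, __global float *out,
                      int stride)
{
    int i = get_global_id(0);
    out[i] = in[i * stride];     /* neighbors read far-apart addresses  */
}
```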
What is the copy pattern: host to device?
clEnqueueWriteBuffer(), or CL_MEM_COPY_HOST_PTR at buffer-creation time
What is the copy pattern: device to host?
clEnqueueReadBuffer() with CL_TRUE for blocking
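A sketch of the blocking read-back, assuming `queue`, `d_out`, `h_out`, and `N` from earlier steps:

```c
/* CL_TRUE makes the call block until the transfer finishes,
 * so h_out is safe to use on the very next line. */
clEnqueueReadBuffer(queue, d_out, CL_TRUE, 0,
                    N * sizeof(float), h_out, 0, NULL, NULL);
```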
What is the typical GPU program flow?
Allocate memory → copy to GPU → launch kernel → copy results back
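The whole flow above, condensed into one host-side sketch. It assumes `ctx`, `queue`, and `device` were created earlier; the `scale` kernel and all variable names are illustrative, and error checks are trimmed.

```c
const char *src =
    "__kernel void scale(__global float *x) {\n"
    "    int i = get_global_id(0);\n"
    "    x[i] *= 2.0f;\n"
    "}\n";

float  h_x[1024];                /* host data (assumed initialized) */
size_t n = 1024, bytes = sizeof(h_x);
cl_int err;

/* 1. allocate device memory + copy host data in one step */
cl_mem d_x = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                            bytes, h_x, &err);

/* 2. compile the kernel at runtime */
cl_program p = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
clBuildProgram(p, 1, &device, NULL, NULL, NULL);
cl_kernel k = clCreateKernel(p, "scale", &err);

/* 3. bind arguments and launch one work item per element */
clSetKernelArg(k, 0, sizeof(cl_mem), &d_x);
clEnqueueNDRangeKernel(queue, k, 1, NULL, &n, NULL, 0, NULL, NULL);

/* 4. blocking copy of the results back to the host */
clEnqueueReadBuffer(queue, d_x, CL_TRUE, 0, bytes, h_x, 0, NULL, NULL);
```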