What is task parallelism?
Running different tasks concurrently on different processors, rather than applying one operation across partitioned data
What is the difference from data parallelism?
Data parallelism: same operation on different data; Task parallelism: different operations
What is latency hiding?
Overlapping communication with computation to hide network delays
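What is the difference between blocking and non-blocking operations?
A blocking call returns only once the operation is complete (or its buffer is safe to reuse); a non-blocking call returns immediately and is completed later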
What is MPI_Wait()?
Blocks until the non-blocking operation identified by the given request completes
What is MPI_Test()?
Checks whether a non-blocking operation has completed, sets a flag with the result, and returns immediately without blocking
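The wait/test distinction and latency hiding can be sketched with a Python analogy. This is not MPI: a `concurrent.futures` future stands in for a non-blocking request, `future.done()` plays the role of MPI_Test() and `future.result()` the role of MPI_Wait(). The names `overlap_demo`, `slow_receive`, and `release` are hypothetical and exist only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def overlap_demo():
    """Start an operation, poll it, do other work, then wait for it."""
    release = threading.Event()  # hypothetical stand-in for "message arrived"

    def slow_receive():
        release.wait()           # simulates a transfer that is still in flight
        return "payload"

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_receive)  # returns immediately (non-blocking start)
        done_early = future.done()          # like MPI_Test(): poll, never blocks
        partial = sum(range(1000))          # computation overlapped with the transfer
        release.set()                       # the simulated transfer finishes
        data = future.result()              # like MPI_Wait(): blocks until complete
        done_late = future.done()
    return done_early, partial, data, done_late


print(overlap_demo())  # (False, 499500, 'payload', True)
```

The computation between the poll and the wait is where latency is hidden: it runs while the simulated transfer is still outstanding.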
What are command queues?
Queues through which the host submits commands (kernels, data transfers) to a device such as a GPU for asynchronous execution
What are events?
Objects that track the status of enqueued commands and express dependencies between GPU operations
What is event-based synchronization?
Using events to chain operations, e.g. a kernel waits on the event of the data transfer that feeds it
What is the work-span model?
Estimates maximum parallel speedup based on work (total operations) and span (longest dependency chain)
What is work in work-span model?
Total operations needed (serial time)
What is span in work-span model?
Time on infinite processors (longest dependency chain)
What is the work-span speedup limit?
S ≤ work/span
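A minimal sketch of the bound above, assuming work and span are measured in the same units (for example, unit-cost operations). The function name `speedup_bound` and the sample numbers are hypothetical.

```python
def speedup_bound(work, span):
    """Upper bound on parallel speedup from the work-span model: S <= work / span."""
    if span <= 0:
        raise ValueError("span must be positive")
    return work / span


# 120 unit operations in total, longest dependency chain of 15 operations
print(speedup_bound(120, 15))  # 8.0
```

No number of processors can push the speedup past this value, because the operations on the longest dependency chain must still run one after another.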
What are superscalar sequences?
Runtime schedules tasks based on dependencies, not explicit synchronization
What is the advantage of task graphs?
Automatic optimization, cleaner code, runtime handles synchronization
What is the disadvantage of task graphs?
Less control over execution order
What is OpenCL profiling?
Measuring the execution time of enqueued commands, typically kernels, using the events associated with them
What is CL_QUEUE_PROFILING_ENABLE?
A command-queue property that enables timing measurements for commands submitted to that queue
What is clGetEventProfilingInfo()?
Retrieves profiling timestamps for an event, such as CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END
How is elapsed time computed from profiling timestamps?
Timestamps are in nanoseconds, so elapsed seconds = (end - start) * 1e-9
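The conversion can be sketched as plain arithmetic, assuming the start and end timestamps have already been read from an event. The function name `elapsed_seconds` and the sample timestamp values are hypothetical, not output from a real device.

```python
NS_TO_S = 1e-9  # profiling timestamps are counters in nanoseconds


def elapsed_seconds(start_ns, end_ns):
    """Convert a pair of nanosecond timestamps to elapsed seconds."""
    return (end_ns - start_ns) * NS_TO_S


start_ns = 1_000_000_000   # made-up start timestamp
end_ns = 1_002_500_000     # made-up end timestamp, 2.5 ms later
print(f"{elapsed_seconds(start_ns, end_ns) * 1e3:.3f} ms")  # 2.500 ms
```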
What is the benefit of multiple command queues?
Independent operations in different queues can overlap, e.g. a data transfer in one queue with a kernel in another
What is the task graph representation?
Directed acyclic graph: nodes=tasks, edges=dependencies
What is critical path?
Longest path through the task graph; its length is the span and determines the minimum execution time
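A small sketch of finding the critical path length in a task graph, assuming each task has a known cost and a list of tasks it depends on. The task names, costs, and the helper `critical_path_length` are hypothetical.

```python
from functools import lru_cache


def critical_path_length(durations, deps):
    """Length of the longest dependency chain in a DAG of tasks.

    durations: task name -> cost
    deps:      task name -> list of tasks that must finish first
    """
    @lru_cache(maxsize=None)
    def finish(task):
        # Earliest finish time = own cost + latest finish among prerequisites
        return durations[task] + max((finish(d) for d in deps.get(task, ())), default=0)

    return max(finish(task) for task in durations)


# Hypothetical graph: load -> {kernel_a, kernel_b} -> reduce
durations = {"load": 2, "kernel_a": 5, "kernel_b": 3, "reduce": 1}
deps = {"kernel_a": ["load"], "kernel_b": ["load"], "reduce": ["kernel_a", "kernel_b"]}

work = sum(durations.values())                 # 11
span = critical_path_length(durations, deps)   # 8, via load -> kernel_a -> reduce
print(work, span, work / span)                 # 11 8 1.375
```

Here kernel_b is off the critical path, so speeding it up would not shorten the minimum execution time; only shortening load, kernel_a, or reduce would.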