Chapter 4 Flashcards

(34 cards)

1
Q

What are the four steps in the Parallel Design Process?

A
  • PARTITION
  • COMMUNICATE
  • AGGLOMERATE
  • MAP

These steps help in analyzing and designing parallel algorithms.

2
Q

What is a dependency in the context of parallel computing?

A

A situation where one operation must complete before another can begin

Dependencies are considered the enemy of parallelism.

3
Q

True or false: Only RAW dependencies limit parallelism.

A

TRUE

WAR and WAW are name dependencies that can be eliminated by renaming or privatization; only RAW is a true dependency.

4
Q

What are the three types of data dependencies?

A
  • RAW (Read After Write)
  • WAR (Write After Read)
  • WAW (Write After Write)

Each type has different implications for parallel execution.

5
Q

What is a loop-carried dependency?

A

A dependency where iteration i depends on results from iteration i-k

These dependencies are primary barriers to loop parallelization.

6
Q

What is the key insight regarding S2 and S3 in the dependency example?

A

S2 and S3 can run in parallel because they have no dependency on each other

This illustrates how dependencies affect execution order.

7
Q

What are Bernstein’s Conditions for Parallelism?

A
  • Condition 1: W1 ∩ R2 = ∅ (no RAW)
  • Condition 2: R1 ∩ W2 = ∅ (no WAR)
  • Condition 3: W1 ∩ W2 = ∅ (no WAW)

Here R and W denote each task's read and write sets.

These conditions help determine if two tasks can run in parallel.

8
Q

What is data decomposition in parallel computing?

A

Partitioning the data and applying the same operation to each partition

This strategy is useful for uniform workloads.

9
Q

What is the block distribution strategy?

A

Divide data into contiguous chunks

Best for uniform workloads where each element takes the same time to process.

10
Q

What is the cyclic distribution strategy?

A

Assign elements in round-robin fashion

Best for non-uniform workloads where processing time varies by element position.

11
Q

What is the block-cyclic distribution strategy?

A

Assign blocks of size B in round-robin fashion

Used in ScaLAPACK and many HPC applications.

12
Q

What is functional (task) decomposition?

A

Partitioning the computation/functions

Different processors execute different tasks; in real applications this is often combined with data decomposition.

13
Q

What is pipeline decomposition?

A

A special form of functional decomposition where stages operate on a stream of data

It improves throughput; the latency of a single item is not reduced.

14
Q

What is recursive decomposition?

A

Dividing a problem into subproblems of the same type and solving them in parallel

Commonly used in divide-and-conquer algorithms.

15
Q

What does a dependency graph show?

A

Which operations must precede others

It helps visualize dependencies and critical paths.

16
Q

What is the critical path in a dependency graph?

A

The longest path through the graph, determining the minimum time to complete all tasks

Its length sets a lower bound on execution time and therefore an upper bound on speedup (total work / critical-path length).

17
Q

What are the three techniques to eliminate or reduce dependencies?

A
  • Variable Renaming
  • Privatization
  • Reduction

These techniques help enable parallelism by addressing different types of dependencies.

18
Q

What is the first step in the PCAM model for parallel algorithm design?

A

Partitioning

Decompose problem into tasks (Data or functional).

19
Q

In OpenMP, what directive is used to indicate a parallel for loop?

A

#pragma omp parallel for

This directive allows multiple iterations of the loop to be executed in parallel.

20
Q

What does the reduction clause do in OpenMP?

A

Combines values from multiple threads into a single value

Example: #pragma omp parallel for reduction(+:sum) for summing elements.

21
Q

What is the purpose of loop fission?

A

Split loops to separate dependencies

This can allow for parallel execution of independent parts.

22
Q

What is the goal of load balancing in parallel computing?

A

Even distribution of computation

This helps to maximize resource utilization and minimize idle time.

23
Q

True or false: Fine-grained tasks have a small size and many tasks.

A

TRUE

Fine-grained tasks typically have higher overhead due to more synchronization.

24
Q

What are the communication patterns in parallel computing?

A
  • Point-to-Point
  • Broadcast
  • Scatter
  • Gather
  • All-to-All
  • Reduction

These patterns determine how tasks exchange data.

25
Q

What does loop interchange achieve?

A

Reorders loops to change dependency structure

This can allow for better parallelization by reducing dependencies.
26
Q

In the context of parallel programming, what does granularity refer to?

A

The size of tasks in a parallel program

It can be fine-grained (small tasks) or coarse-grained (large tasks).
27
Q

What is the OpenMP directive for static scheduling with chunks of 4?

A

#pragma omp for schedule(static, 4)

Chunks of 4 iterations are assigned to threads in round-robin order, fixed at loop entry.
28
Q

Fill in the blank: The PCAM model stands for P, C, A, M.

A

Partitioning, Communication, Agglomeration, Mapping

This model provides a systematic approach to parallel algorithm design.
29
Q

What is the dependency type in the following code snippet: for (i = 0; i < n; i++) { a[i] = b[i] + c[i]; }

A

Fully parallel

There are no dependencies between iterations.
30
Q

What is the dependency type in the following code snippet: for (i = 1; i < n; i++) { a[i] = a[i-1] * 2; }

A

Sequential only

Each iteration depends on the result of the previous one.
31
Q

What is the OpenMP directive for parallelizing matrix multiplication?

A

#pragma omp parallel for collapse(2)

This allows both outer loops (i and j) to be executed in parallel.
32
Q

What is the dependency type in the following code snippet: for (i = 0; i < n; i++) { sum = sum + a[i]; }

A

Parallel (use reduction)

The sum can be computed in parallel using a reduction clause.
33
Q

What is the dependency type in the following code snippet: for (i = 2; i < n; i++) { a[i] = a[i-2] + b[i]; }

A

2-way parallel

The even-index and odd-index chains are independent, so the two chains can run in parallel; each chain remains sequential internally.
34
Q

What is the dependency type in the following code snippet: for (i = 0; i < n; i++) { a[i] = a[i] + 1; }

A

Fully parallel

Each iteration is independent of the others.