CAQA6e_ILP Flashcards

(50 cards)

1
Q

Approach used in server and desktop processors and not used as extensively in PMD (personal mobile device) processors.

A

Hardware-based dynamic approaches

2
Q

When exploiting instruction-level parallelism, the goal is to ___________.

A

minimize CPI

3
Q

Pipeline CPI = ?

A

Ideal Pipeline CPI + Structural Stalls + Data Hazard Stalls + Control Stalls

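The CPI equation above is additive, so it can be sanity-checked with a toy calculation. The stall rates below are invented for illustration, not measurements from any real machine.

```python
# Sketch of the Pipeline CPI equation from the card above.
# All stall values are hypothetical, chosen only to illustrate the sum.

def pipeline_cpi(ideal_cpi, structural, data_hazard, control):
    """Pipeline CPI = ideal CPI + per-instruction stalls from each source."""
    return ideal_cpi + structural + data_hazard + control

# Example: ideal CPI of 1.0 plus made-up per-instruction stall rates.
cpi = pipeline_cpi(1.0, structural=0.05, data_hazard=0.20, control=0.15)
print(cpi)  # ≈ 1.4
```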
4
Q

What is the typical size of a basic block?

A

3-6 instructions

5
Q

What are the challenges posed by a data dependence?

A
  • The instructions cannot execute simultaneously
  • Possible data hazard
6
Q

This determines whether a dependence is detected and whether it causes a stall

A

Pipeline organization

7
Q

What does a data dependence convey?

A
  1. Possibility of a hazard
  2. Order in which results must be calculated
  3. Upper bound on exploitable instruction level parallelism
8
Q

This dependence occurs when two instructions use the same register or memory name but with no flow of data between them.

A

Name dependence

9
Q

Not a true data dependence but is a problem when reordering instructions.

A

Name dependence

10
Q

This dependence occurs when instruction j writes a register or memory location that instruction i reads (i executes before j)

A

Antidependence

11
Q

What must be preserved for antidependence?

A

Initial ordering

12
Q

This dependence occurs when instructions i and j write the same register or memory location

A

Output dependence

13
Q

What must be preserved for output dependence?

A

Ordering

14
Q

To resolve name dependencies, what must be used?

A

Register renaming techniques

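A minimal sketch of how renaming removes the name dependences from the previous cards: every write is given a fresh physical register, so WAR and WAW hazards disappear while true (RAW) data flow is preserved. The register names (F0, F6, ...) and the table layout are illustrative assumptions, not any particular machine's design.

```python
# Minimal register-renaming sketch (hypothetical registers F0, F6, F8).
# Each new write gets a fresh physical register, removing WAR and WAW
# name dependences while preserving true (RAW) data flow.

def rename(instructions):
    """instructions: list of (dest, [sources]). Returns the renamed list."""
    mapping = {}                                  # architectural -> physical
    fresh = (f"P{i}" for i in range(100))         # supply of fresh names
    out = []
    for dest, srcs in instructions:
        # Sources read the most recent physical name (preserves RAW).
        new_srcs = [mapping.get(s, s) for s in srcs]
        # Each write targets a fresh physical register (removes WAR/WAW).
        mapping[dest] = next(fresh)
        out.append((mapping[dest], new_srcs))
    return out

# F6 is written twice (WAW) and read before the second write (WAR);
# after renaming, the two writes target different physical registers.
code = [("F6", ["F0", "F2"]),   # F6 <- F0 op F2
        ("F8", ["F6", "F4"]),   # F8 <- F6 op F4  (RAW on F6)
        ("F6", ["F1", "F3"])]   # F6 <- F1 op F3  (WAW/WAR on F6)
for inst in rename(code):
    print(inst)
```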
15
Q

What are the data hazards?

A

RAW, WAW, WAR

16
Q

This dependence determines the ordering of an instruction i with respect to a branch instruction

A

Control Dependence

17
Q

An instruction control-dependent on a branch cannot be moved before the branch so that its execution is _______________________.

A

No longer controlled by the branch

18
Q

This separates a dependent instruction from the source instruction by the pipeline latency of the source instruction.

A

Pipeline scheduling

19
Q

When unrolling a loop by k and the trip count n is not known to be a multiple of k, this technique generates a pair of loops: one that executes n mod k iterations, and an unrolled loop containing k copies of the body.

A

Strip Mining

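The strip-mining idea can be sketched in a few lines: run the n mod k leftover iterations one at a time, then run the unrolled loop with k copies of the body. The body here (summing an array) is just a stand-in, and k is fixed at 4 for illustration.

```python
# Sketch of unrolling by k = 4 with a strip-mined prologue: the first
# n mod k iterations run one at a time, the rest in an unrolled loop.
# The loop body (array summation) is an arbitrary stand-in.

def strip_mined_sum(a):
    k = 4                       # unroll factor; body below has 4 copies
    n = len(a)
    total = 0
    i = 0
    # Prologue: n mod k leftover iterations, one at a time.
    for _ in range(n % k):
        total += a[i]
        i += 1
    # Main loop: k copies of the body per iteration.
    while i < n:
        total += a[i]
        total += a[i + 1]
        total += a[i + 2]
        total += a[i + 3]
        i += k
    return total

print(strip_mined_sum(list(range(10))))  # 45, same as sum(range(10))
```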
20
Q

For each branch this predicts taken or not taken.

A

Basic 2-bit predictor

21
Q

If the prediction is wrong two consecutive times, ______.

A

change prediction
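The last two cards describe a 2-bit saturating counter; here is a minimal sketch under one common encoding (states 0–1 predict not taken, 2–3 predict taken — the encoding and the weakly-not-taken reset state are assumptions, not mandated by the technique).

```python
# Minimal 2-bit saturating-counter branch predictor. Assumed encoding:
# 0,1 predict not taken; 2,3 predict taken. The prediction only flips
# after two consecutive mispredictions, as the card above states.

class TwoBitPredictor:
    def __init__(self, state=1):       # start weakly not-taken (assumption)
        self.state = state

    def predict(self):
        return self.state >= 2         # True = predict taken

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor(state=3)           # strongly taken
for actual in (False, False):          # two mispredictions in a row
    print(p.predict(), end=" ")        # still predicts True both times
    p.update(actual)
print(p.predict())                     # False: prediction has now flipped
```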

22
Q

Multiple 2-bit predictors for each branch, one for each possible combination of outcomes of the preceding n branches.

A

Correlating predictor

23
Q

Combines a correlating (global) predictor with a local predictor.

A

Tournament Predictor

24
Q

Would ideally need a predictor for each combination of branch and branch history.

A

Tagged Hybrid Predictors

25
Q

What is the solution to Tagged Hybrid Predictors creating huge tables?

A

Use hash tables whose hash value is based on the branch address and branch history.
26
Q

Longer histories may lead to an increased _______________, so use _________________.

A

chance of hash collisions; multiple tables with increasingly shorter histories
27
Q

Rearranges the order of instructions to reduce stalls while maintaining data flow.

A

Dynamic Scheduling
28
Q

Advantages of dynamic scheduling

A
  • Compiler doesn't need knowledge of the microarchitecture
  • Handles cases where dependences are unknown at compile time
29
Q

Disadvantages of dynamic scheduling

A
  • Substantial increase in hardware complexity
  • Complicates exceptions
30
Q

What does dynamic scheduling imply?

A

Out-of-order execution and out-of-order completion
31
Q

Introduces register renaming in hardware, minimizing WAW and WAR hazards.

A

Tomasulo's Approach
32
Q

Register renaming is provided by __________.

A

reservation stations (RS)
33
Q

Execute instructions along predicted execution paths, but only commit the results if the prediction was correct.

A

Hardware-based speculation
34
Q

Allowing an instruction to update the register file only when the instruction is no longer speculative

A

Instruction commit
35
Q

Needs an additional piece of hardware to prevent any irrevocable action until an instruction commits.

A

Hardware-Based Speculation
36
Q

Holds the result of an instruction between completion and commit

A

Reorder buffer (ROB)
37
Q

When a mispredicted branch reaches the head of the ROB, ___________.

A

discard all entries (flush the ROB)
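The last two cards can be sketched together: instructions complete out of order but commit in order from the ROB head, and a mispredicted branch at the head flushes everything younger. The entry fields (`done`, `mispredict`) are simplifying assumptions, not a real microarchitecture's format.

```python
# Toy reorder buffer: out-of-order completion, in-order commit from the
# head, and a flush of all younger entries when a mispredicted branch
# reaches the head. Entry fields are simplified assumptions.
from collections import deque

class ROB:
    def __init__(self):
        self.entries = deque()

    def issue(self, name):
        self.entries.append({"name": name, "done": False, "mispredict": False})
        return self.entries[-1]          # handle used to mark completion

    def commit(self):
        """Commit completed instructions from the head, in program order."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["mispredict"]:
                self.entries.clear()     # discard all younger entries
                break
            committed.append(head["name"])
        return committed

rob = ROB()
i1 = rob.issue("add")
i2 = rob.issue("beq")        # will turn out to be mispredicted
i3 = rob.issue("mul")        # on the wrong path
i3["done"] = True            # completes out of order, cannot commit yet
i1["done"] = True
i2["done"] = True
i2["mispredict"] = True
print(rob.commit())          # ['add'] — beq flushes, mul never commits
print(len(rob.entries))      # 0
```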
38
Q

These are not recognized until the instruction is ready to commit

A

Exceptions
39
Q

To achieve CPI < 1, need to ______________.

A

complete multiple instructions per clock
40
Q

Solutions to achieve multiple instructions per clock

A
  • Statically scheduled superscalar processors
  • VLIW processors
  • Dynamically scheduled superscalar processors
41
Q

This packages multiple operations into one instruction

A

VLIW Processors
42
Q

What might a single VLIW instruction contain?

A
  • One integer operation (or branch)
  • Two independent floating-point operations
  • Two independent memory references
43
Q

Disadvantages of VLIW?

A
  • Statically finding parallelism
  • Code size
  • No hazard detection hardware
  • Binary code compatibility
44
Q

What is the bottleneck in dynamically scheduled superscalars?

A

Issue logic
45
Q

How can reservation station (RS) allocation be simplified?

A

Limit the number of instructions of a given class that can be issued in a "bundle"
46
Q

Adding target instructions to the buffer to deal with the longer decode time required by a larger buffer

A

Branch folding
47
Q

What causes the buffer to potentially mispredict the return address of a procedure call?

A

Procedure returns, when the procedure is called from multiple sites
48
Q

This uses a buffer of return addresses organized as a stack

A

Return Address Predictor
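A minimal sketch of the return-address stack: calls push their return address, returns pop the predicted target, so returns predict correctly even when the procedure is called from several sites. The depth limit and overflow policy (drop the oldest entry) are illustrative assumptions; the addresses are made-up labels.

```python
# Minimal return-address stack: calls push the return address, returns
# pop the predicted target. Depth and overflow policy are assumptions.

class ReturnAddressStack:
    def __init__(self, depth=8):
        self.stack = []
        self.depth = depth          # real designs use a small fixed depth

    def on_call(self, return_addr):
        if len(self.stack) == self.depth:
            self.stack.pop(0)       # overflow: drop the oldest entry
        self.stack.append(return_addr)

    def predict_return(self):
        return self.stack.pop() if self.stack else None

ras = ReturnAddressStack()
ras.on_call(0x400)                  # call from site A
ras.on_call(0x500)                  # nested call from site B
print(hex(ras.predict_return()))    # 0x500 — correct despite two call sites
print(hex(ras.predict_return()))    # 0x400
```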
49
Q

Designing a monolithic unit that performs branch prediction, instruction prefetch, and instruction memory access and buffering.

A

Integrated Instruction Fetch Unit
50
Q

Deals with instructions crossing cache lines

A

Instruction memory access and buffering