CAQA6e_ILP Flashcards

(50 cards)

1
Q

Approach used in server and desktop processors and not used as extensively in PMD (personal mobile device) processors.

A

Hardware-based dynamic approaches

2
Q

When exploiting instruction-level parallelism, the goal is to ___________.

A

minimize CPI

3
Q

Pipeline CPI = ?

A

Ideal Pipeline CPI + Structural Stalls + Data Hazard Stalls + Control Stalls

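The CPI equation above is additive, so it can be sanity-checked with a toy calculation. The stall rates below are invented for illustration, not measurements from any real machine.

```python
# Sketch of the Pipeline CPI equation from the card above.
# All stall values are hypothetical, chosen only to illustrate the sum.

def pipeline_cpi(ideal_cpi, structural, data_hazard, control):
    """Pipeline CPI = ideal CPI + per-instruction stalls from each source."""
    return ideal_cpi + structural + data_hazard + control

# Example: ideal CPI of 1.0 plus made-up per-instruction stall rates.
cpi = pipeline_cpi(1.0, structural=0.05, data_hazard=0.20, control=0.15)
print(cpi)  # ≈ 1.4
```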
4
Q

What is the typical size of a basic block?

A

3-6 instructions

5
Q

What are the challenges posed by a data dependence?

A
  • The instructions cannot execute simultaneously
  • Possible data hazard
6
Q

This determines whether a dependence is detected and whether it causes a stall

A

Pipeline organization

7
Q

What does a data dependence convey?

A
  1. Possibility of a hazard
  2. Order in which results must be calculated
  3. Upper bound on exploitable instruction level parallelism
8
Q

This dependence occurs when two instructions use the same register or memory name but with no flow of data between them.

A

Name dependence

9
Q

Not a true data dependence but is a problem when reordering instructions.

A

Name dependence

10
Q

This dependence occurs when instruction j writes a register or memory location that instruction i reads (i executes before j)

A

Antidependence

11
Q

What must be preserved for antidependence?

A

Initial ordering

12
Q

This dependence occurs when instructions i and j write the same register or memory location

A

Output dependence

13
Q

What must be preserved for output dependence?

A

Ordering

14
Q

To resolve name dependencies, what must be used?

A

Register renaming techniques

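A minimal sketch of how renaming removes the name dependences from the previous cards: every write is given a fresh physical register, so WAR and WAW hazards disappear while true (RAW) data flow is preserved. The register names (F0, F6, ...) and the table layout are illustrative assumptions, not any particular machine's design.

```python
# Minimal register-renaming sketch (hypothetical registers F0, F6, F8).
# Each new write gets a fresh physical register, removing WAR and WAW
# name dependences while preserving true (RAW) data flow.

def rename(instructions):
    """instructions: list of (dest, [sources]). Returns the renamed list."""
    mapping = {}                                  # architectural -> physical
    fresh = (f"P{i}" for i in range(100))         # supply of fresh names
    out = []
    for dest, srcs in instructions:
        # Sources read the most recent physical name (preserves RAW).
        new_srcs = [mapping.get(s, s) for s in srcs]
        # Each write targets a fresh physical register (removes WAR/WAW).
        mapping[dest] = next(fresh)
        out.append((mapping[dest], new_srcs))
    return out

# F6 is written twice (WAW) and read before the second write (WAR);
# after renaming, the two writes target different physical registers.
code = [("F6", ["F0", "F2"]),   # F6 <- F0 op F2
        ("F8", ["F6", "F4"]),   # F8 <- F6 op F4  (RAW on F6)
        ("F6", ["F1", "F3"])]   # F6 <- F1 op F3  (WAW/WAR on F6)
for inst in rename(code):
    print(inst)
```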
15
Q

What are the data hazards?

A

RAW, WAW, WAR

16
Q

This dependence determines the ordering of an instruction i with respect to a branch instruction

A

Control Dependence

17
Q

An instruction control-dependent on a branch cannot be moved before the branch so that its execution is _______________________.

A

No longer controlled by the branch

18
Q

This separates a dependent instruction from the source instruction by the pipeline latency of the source instruction.

A

Pipeline scheduling

19
Q

When unrolling a loop by k and the trip count n is not known to be a multiple of k, this technique generates a pair of loops: one that executes n mod k iterations, and an unrolled loop containing k copies of the body.

A

Strip Mining

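The strip-mining idea can be sketched in a few lines: run the n mod k leftover iterations one at a time, then run the unrolled loop with k copies of the body. The body here (summing an array) is just a stand-in, and k is fixed at 4 for illustration.

```python
# Sketch of unrolling by k = 4 with a strip-mined prologue: the first
# n mod k iterations run one at a time, the rest in an unrolled loop.
# The loop body (array summation) is an arbitrary stand-in.

def strip_mined_sum(a):
    k = 4                       # unroll factor; body below has 4 copies
    n = len(a)
    total = 0
    i = 0
    # Prologue: n mod k leftover iterations, one at a time.
    for _ in range(n % k):
        total += a[i]
        i += 1
    # Main loop: k copies of the body per iteration.
    while i < n:
        total += a[i]
        total += a[i + 1]
        total += a[i + 2]
        total += a[i + 3]
        i += k
    return total

print(strip_mined_sum(list(range(10))))  # 45, same as sum(range(10))
```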
20
Q

For each branch this predicts taken or not taken.

A

Basic 2-bit predictor

21
Q

If the prediction is wrong two consecutive times, ______.

A

change prediction
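The last two cards describe a 2-bit saturating counter; here is a minimal sketch under one common encoding (states 0–1 predict not taken, 2–3 predict taken — the encoding and the weakly-not-taken reset state are assumptions, not mandated by the technique).

```python
# Minimal 2-bit saturating-counter branch predictor. Assumed encoding:
# 0,1 predict not taken; 2,3 predict taken. The prediction only flips
# after two consecutive mispredictions, as the card above states.

class TwoBitPredictor:
    def __init__(self, state=1):       # start weakly not-taken (assumption)
        self.state = state

    def predict(self):
        return self.state >= 2         # True = predict taken

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor(state=3)           # strongly taken
for actual in (False, False):          # two mispredictions in a row
    print(p.predict(), end=" ")        # still predicts True both times
    p.update(actual)
print(p.predict())                     # False: prediction has now flipped
```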

22
Q

Multiple 2-bit predictors for each branch, one for each possible combination of outcomes of the preceding n branches.

A

Correlating predictor

23
Q

Combines a correlating (global) predictor with a local predictor.

A

Tournament Predictor

24
Q

Would ideally need a predictor for each combination of branch and branch history.

A

Tagged Hybrid Predictors

25
Q

What is the solution to Tagged Hybrid Predictors creating huge tables?

A

Use hash tables whose hash value is based on the branch address and branch history.
26
Q

Longer histories may lead to an increased _______________, so use _________________.

A

chance of hash collisions; multiple tables with increasingly shorter histories
27
Q

Rearranges the order of instructions to reduce stalls while maintaining data flow.

A

Dynamic Scheduling
28
Q

Advantages of dynamic scheduling

A
  • Compiler doesn't need knowledge of the microarchitecture
  • Handles cases where dependences are unknown at compile time
29
Q

Disadvantages of dynamic scheduling

A
  • Substantial increase in hardware complexity
  • Complicates exceptions
30
Q

What does dynamic scheduling imply?

A

Out-of-order execution and out-of-order completion
31
Q

Introduces register renaming in hardware, minimizing WAW and WAR hazards.

A

Tomasulo's Approach
32
Q

Register renaming is provided by __________.

A

reservation stations (RS)
33
Q

Execute instructions along predicted execution paths, but only commit the results if the prediction was correct.

A

Hardware-based speculation
34
Q

Allowing an instruction to update the register file only when the instruction is no longer speculative

A

Instruction commit
35
Q

Needs an additional piece of hardware to prevent any irrevocable action until an instruction commits.

A

Hardware-Based Speculation
36
Q

Holds the result of an instruction between completion and commit

A

Reorder buffer (ROB)
37
Q

When a mispredicted branch reaches the head of the ROB, ___________.

A

discard all entries (flush the ROB)
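The last two cards can be sketched together: instructions complete out of order but commit in order from the ROB head, and a mispredicted branch at the head flushes everything younger. The entry fields (`done`, `mispredict`) are simplifying assumptions, not a real microarchitecture's format.

```python
# Toy reorder buffer: out-of-order completion, in-order commit from the
# head, and a flush of all younger entries when a mispredicted branch
# reaches the head. Entry fields are simplified assumptions.
from collections import deque

class ROB:
    def __init__(self):
        self.entries = deque()

    def issue(self, name):
        self.entries.append({"name": name, "done": False, "mispredict": False})
        return self.entries[-1]          # handle used to mark completion

    def commit(self):
        """Commit completed instructions from the head, in program order."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["mispredict"]:
                self.entries.clear()     # discard all younger entries
                break
            committed.append(head["name"])
        return committed

rob = ROB()
i1 = rob.issue("add")
i2 = rob.issue("beq")        # will turn out to be mispredicted
i3 = rob.issue("mul")        # on the wrong path
i3["done"] = True            # completes out of order, cannot commit yet
i1["done"] = True
i2["done"] = True
i2["mispredict"] = True
print(rob.commit())          # ['add'] — beq flushes, mul never commits
print(len(rob.entries))      # 0
```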
38
Q

These are not recognized until the instruction is ready to commit

A

Exceptions
39
Q

To achieve CPI < 1, need to ______________.

A

complete multiple instructions per clock
40
Q

Solutions to achieve multiple instructions per clock

A
  • Statically scheduled superscalar processors
  • VLIW processors
  • Dynamically scheduled superscalar processors
41
Q

This packages multiple operations into one instruction

A

VLIW Processors
42
Q

What might a single VLIW instruction contain?

A
  • One integer operation (or branch)
  • Two independent floating-point operations
  • Two independent memory references
43
Q

Disadvantages of VLIW?

A
  • Statically finding parallelism
  • Code size
  • No hazard detection hardware
  • Binary code compatibility
44
Q

What is the bottleneck in dynamically scheduled superscalars?

A

Issue logic
45
Q

How can reservation station (RS) allocation be simplified?

A

Limit the number of instructions of a given class that can be issued in a "bundle"
46
Q

Adding target instructions to the buffer to deal with the longer decode time required by a larger buffer

A

Branch folding
47
Q

What causes the buffer to potentially mispredict the return address of a procedure call?

A

Procedure returns, when the procedure is called from multiple sites
48
Q

This uses a buffer of return addresses organized as a stack

A

Return Address Predictor
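A minimal sketch of the return-address stack: calls push their return address, returns pop the predicted target, so returns predict correctly even when the procedure is called from several sites. The depth limit and overflow policy (drop the oldest entry) are illustrative assumptions; the addresses are made-up labels.

```python
# Minimal return-address stack: calls push the return address, returns
# pop the predicted target. Depth and overflow policy are assumptions.

class ReturnAddressStack:
    def __init__(self, depth=8):
        self.stack = []
        self.depth = depth          # real designs use a small fixed depth

    def on_call(self, return_addr):
        if len(self.stack) == self.depth:
            self.stack.pop(0)       # overflow: drop the oldest entry
        self.stack.append(return_addr)

    def predict_return(self):
        return self.stack.pop() if self.stack else None

ras = ReturnAddressStack()
ras.on_call(0x400)                  # call from site A
ras.on_call(0x500)                  # nested call from site B
print(hex(ras.predict_return()))    # 0x500 — correct despite two call sites
print(hex(ras.predict_return()))    # 0x400
```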
49
Q

Designing a monolithic unit that performs branch prediction, instruction prefetch, and instruction memory access and buffering.

A

Integrated Instruction Fetch Unit
50
Q

Deals with instructions crossing cache lines

A

Instruction memory access and buffering