Microprocessor Speed
1) Pipelining: the processor moves instructions through a pipeline whose stages all work simultaneously (one instruction can be fetched while another is being executed)
2) Branch Prediction: the processor looks ahead in the instruction code fetched from memory and predicts which branch, or group of instructions, is likely to be processed next
3) Superscalar Execution: multiple parallel pipelines are used, so more than one instruction can be issued per clock cycle
4) Data Flow Analysis: the processor analyzes which instructions depend on each other's results to create an optimized schedule of instructions
5) Speculative Execution: using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program, holding the results in temporary locations; this keeps the execution engines as busy as possible
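The benefit of pipelining (item 1) can be sketched with a simple timing model. This is an illustrative idealization, not from the notes: with k stages and n instructions, an unpipelined processor takes n*k cycles, while an ideal pipeline takes k + (n - 1) cycles (fill the pipe once, then retire one instruction per cycle).

```python
def unpipelined_cycles(n: int, k: int) -> int:
    # Each of n instructions takes all k stages before the next starts.
    return n * k

def pipelined_cycles(n: int, k: int) -> int:
    # k cycles to fill the pipe, then one instruction completes per cycle.
    return k + (n - 1)

def pipeline_speedup(n: int, k: int) -> float:
    return unpipelined_cycles(n, k) / pipelined_cycles(n, k)

# 100 instructions through a 5-stage pipe: 500 vs 104 cycles
print(pipeline_speedup(100, 5))  # ~4.81, approaching k for large n
```

For large n the speedup approaches k, which is why hazards (branches, data dependencies) that stall the pipe motivate items 2, 4, and 5.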
Improvement in Chip Organization
1) Increase hardware speed of the processor: shrink the logic gates and raise the clock rate
2) Increase size and speed of caches: dedicating part of the processor chip to the cache makes cache access time drop significantly
3) Change processor organization and architecture: increase the effective speed of instruction execution through parallelism (several processes running simultaneously; large, complex tasks broken into smaller pieces)
Problems with Clock Speed and Logic Density
1) Power: power density increases with logic density and clock speed, making the heat difficult to dissipate
2) RC delay: as wires get thinner, resistance increases, and as wires run closer together, capacitance increases; the resulting RC delay slows signal propagation
3) Memory latency and throughput: memory speeds lag processor speeds, so the processor cannot depend on memory keeping up and spends time waiting for it
Increasing Performance (without increasing clock speed)
MIC (Many Integrated Core) vs GPU (Graphics Processing Unit)
MIC: many general-purpose cores integrated on a single chip; performance increases by spreading work across the cores
GPU: many cores designed to perform parallel operations on graphics data
Amdahl’s Law
- Speedup = time to execute on 1 processor / time to execute on N parallel processors
- Speedup = 1 / ((1 - f) + f/N), where f = fraction of code that executes in parallel, 1 - f = fraction that executes sequentially, N = # of cores
- Example with f = 0.4: Speedup = 1 / (0.6 + 0.4/N). As N goes to infinity, 0.4/N goes to 0, so the maximum speedup is 1/0.6 ≈ 1.67
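The law and the worked example above can be checked with a short sketch (the function name is my own, not from the notes):

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Speedup of a program whose parallelizable fraction is f, run on n cores."""
    return 1.0 / ((1.0 - f) + f / n)

# Worked example from the notes: f = 0.4 (60% of the work is sequential)
for n in (2, 8, 1000):
    print(n, round(amdahl_speedup(0.4, n), 3))
# The speedup approaches 1/0.6 ≈ 1.667 no matter how many cores are added.
```

The takeaway is that the sequential fraction (1 - f) caps the achievable speedup regardless of core count.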
Clock Cycle
CPI / Execution Time / MIPS / MFLOPS
CPI = Σ (CPIi × Ii) / Ic, where Ii = number of executed instructions of type i and Ic = total instruction count
T = Ic × CPI × τ, where τ = clock cycle time (1/f)
MIPS (millions of instructions per second) = Ic / (T × 10^6) = f / (CPI × 10^6)
MFLOPS (millions of floating-point operations per second) = (number of executed floating-point operations in a program) / (execution time × 10^6)
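The CPI, T, and MIPS formulas above can be worked through on a hypothetical instruction mix. The clock rate, instruction counts, and per-type CPIs below are made up for illustration:

```python
clock_hz = 400e6  # assumed 400 MHz clock, so cycle time tau = 1 / clock_hz

# (instruction count Ii, cycles per instruction CPIi) for each type
mix = [
    (45_000, 1),  # arithmetic/logic
    (32_000, 2),  # load/store, cache hit
    (15_000, 4),  # branch
    (8_000, 8),   # memory reference, cache miss
]

ic = sum(count for count, _ in mix)            # Ic: total instructions
cpi = sum(count * c for count, c in mix) / ic  # average CPI
t = ic * cpi * (1 / clock_hz)                  # execution time T = Ic * CPI * tau
mips = clock_hz / (cpi * 1e6)                  # MIPS = f / (CPI * 10^6)

print(f"CPI = {cpi:.2f}, T = {t*1e3:.3f} ms, MIPS = {mips:.1f}")
```

Note that MIPS computed from f/CPI and from Ic/(T × 10^6) agree, since T = Ic × CPI / f.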
Arithmetic vs Harmonic Means
Taking the harmonic mean of rates (MIPS or MFLOPS values) is better than taking the arithmetic mean, because the harmonic mean is inversely proportional to total execution time, which is what we actually want the summary to track
n = number of programs
p1, p2, …, pn = the calculated MFLOPS or MIPS values
Arithmetic mean = (Σ pi) / n; Harmonic mean = n / Σ (1/pi)
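A small sketch makes the difference concrete. The two benchmark rates below are made up; the key assumption is that each program does the same amount of work (say 1 Mflop), so total execution time is proportional to Σ(1/pi):

```python
rates = [100.0, 400.0]  # MFLOPS of two benchmarks, each doing 1 Mflop of work

arith = sum(rates) / len(rates)                  # arithmetic mean of rates
harm = len(rates) / sum(1.0 / p for p in rates)  # harmonic mean of rates

total_time = sum(1.0 / p for p in rates)  # seconds to run both benchmarks
overall_rate = len(rates) / total_time    # total work / total time

print(arith, harm, overall_rate)
```

The harmonic mean (160) equals the true overall rate, while the arithmetic mean (250) is pulled upward by the single fast program even though half the workload runs at only 100 MFLOPS.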