What kind of scheduling policies can a scheduler have?
When does the scheduler run?
Design of run queue and scheduling algorithm are tightly coupled!
Describe priority scheduling
Run the highest-priority ready task first. The run queue must support picking by priority, e.g. one queue per priority level, or a priority-ordered structure.
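As a sketch (task names and priorities invented), a priority run queue can be built on a heap so the scheduler always pops the highest-priority ready task:

```python
import heapq

# heapq is a min-heap, so store negated priorities to pop the
# highest-priority task first. Tasks here are made up for illustration.
runqueue = []
for prio, name in [(1, "logger"), (10, "ui"), (5, "worker")]:
    heapq.heappush(runqueue, (-prio, name))

order = [heapq.heappop(runqueue)[1] for _ in range(3)]
print(order)   # highest priority first: ['ui', 'worker', 'logger']
```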
What is priority inversion?
A low-priority thread acquires a lock; a higher-priority thread then runs and needs that same lock. Because the high-priority thread is stuck waiting, the low-priority thread can run to completion before it, inverting the intended priority order.
One solution is to temporarily boost the priority of the thread holding the mutex to that of the waiting higher-priority thread, so it runs until it releases the mutex, then restore its original (low) priority. This is why it helps to track which thread owns a mutex.
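A rough sketch of that priority-inheritance boost, with an invented `Task`/`Mutex` API (a real kernel mutex is far more involved):

```python
# Illustrative sketch of priority inheritance, not a real mutex.
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority        # higher number = higher priority
        self.base_priority = priority   # restored when the lock is released

class Mutex:
    def __init__(self):
        self.owner = None               # tracking the owner enables boosting

    def lock(self, task):
        if self.owner is None:
            self.owner = task
            return True                 # acquired
        # Contended: boost the owner to the waiter's priority so it can
        # finish its critical section and release the lock sooner.
        if task.priority > self.owner.priority:
            self.owner.priority = task.priority
        return False                    # caller must wait

    def unlock(self):
        self.owner.priority = self.owner.base_priority  # undo the boost
        self.owner = None

low = Task("low", priority=1)
high = Task("high", priority=10)
m = Mutex()
m.lock(low)          # low-priority task takes the lock
m.lock(high)         # high-priority task contends -> low gets boosted
print(low.priority)  # 10 (boosted)
m.unlock()
print(low.priority)  # 1 (restored)
```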
What is run to completion scheduling?
Once a task is assigned to the CPU, it runs until it finishes; the scheduler only runs again when a task completes or yields.
Describe round robin scheduling
Go through tasks in round-robin order. If a task yields due to I/O, switch to the next task and continue round-robinning the remaining tasks until the I/O completes.
Can also do round robin with interleaving which is known as time slicing.
Describe Timeslicing
timeslice = max amount of uninterrupted time to give a task (aka, time quantum)
task may run for less time than the timeslice (e.g. it must wait on I/O, so it switches out, or a higher-priority task becomes runnable)
The ideal time slice length is different for I/O bound vs CPU bound tasks.
What are the costs and benefits of timeslicing?
+ short tasks finish sooner
+ more responsive
+ lengthy I/O ops initiated sooner
- interrupt, schedule, context switch = overhead for each switch
Can minimize these overheads by keeping timeslice time >> context switch time
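The round-robin-with-timeslice behavior above can be sketched as a queue simulation (task names and run times are invented):

```python
from collections import deque

def round_robin(tasks, timeslice):
    """Simulate round-robin; tasks is a list of (name, remaining_time).
    Returns the order in which tasks finish. Purely illustrative."""
    queue = deque(tasks)
    finished = []
    while queue:
        name, remaining = queue.popleft()
        run = min(timeslice, remaining)      # may run less than a full slice
        remaining -= run
        if remaining == 0:
            finished.append(name)
        else:
            queue.append((name, remaining))  # preempted: back of the queue
    return finished

print(round_robin([("T1", 4), ("T2", 1), ("T3", 2)], timeslice=2))
# With a timeslice of 2, the short tasks T2 and T3 finish before T1.
```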
What are the ideal timeslice lengths for cpu bound and I/O bound workloads?
CPU Bound: longer timeslices (fewer context switches, so less overhead; throughput matters more than response time)
I/O Bound: shorter timeslices (I/O requests get issued sooner, keeping both the CPU and the devices busy; better responsiveness)
How do you calculate cpu utilization?
[cpu_running_time/(cpu_running_time + context_switching_overheads)] * 100
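Plugging invented numbers into this formula shows why longer timeslices raise utilization:

```python
def cpu_utilization(useful_ms, overhead_ms):
    """Utilization = useful work / (useful work + context-switch overhead)."""
    return useful_ms / (useful_ms + overhead_ms) * 100

# 10 ms of useful work done in 1 ms slices, with a 0.1 ms switch per slice:
print(cpu_utilization(10 * 1.0, 10 * 0.1))   # ~90.9%
# Same 10 ms of work in one long slice: only one switch's worth of overhead.
print(cpu_utilization(10 * 1.0, 1 * 0.1))    # ~99.0%
```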
Describe the multi-level feedback queue (MLFQ).
Have three run queues, but they are treated as a single queue data structure:
1) TS = 8ms, most I/O intensive, highest priority
2) TS = 16ms, medium I/O intensive (mix I/O and CPU)
3) TS = infinity (CPU intensive, lowest priority, FCFS)
+ get timeslicing benefits for I/O bound tasks
+ avoid timeslicing overhead for CPU bound tasks
Ultimately this acts like a decision-making pipeline: tasks that use up their entire timeslice get pushed down to a lower queue, and tasks can always be bumped back up if they start yielding before the timeslice expires.
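A toy sketch of that demote/promote feedback, using the three timeslice levels above (the `ran()` reporting interface is invented for illustration):

```python
TIMESLICES = [8, 16, float("inf")]   # ms per level; level 0 = highest priority

class MLFQ:
    def __init__(self):
        self.level = {}               # task name -> current queue level

    def add(self, name):
        self.level[name] = 0          # new tasks start at the top level

    def ran(self, name, used_ms):
        """Report how long the task ran before stopping."""
        lvl = self.level[name]
        if used_ms >= TIMESLICES[lvl]:          # used its whole slice: CPU-hungry
            self.level[name] = min(lvl + 1, 2)  # demote
        else:                                   # yielded early (e.g. for I/O)
            self.level[name] = max(lvl - 1, 0)  # bump back up

sched = MLFQ()
sched.add("task")
sched.ran("task", 8)         # used the full 8 ms slice -> demoted
print(sched.level["task"])   # 1
sched.ran("task", 3)         # yielded early -> promoted back up
print(sched.level["task"])   # 0
```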
Describe the linux O(1) scheduler
Preemptive, priority-based scheduler whose task selection and insertion both take constant time. 140 priority levels (0-139; 0-99 real-time, 100-139 timesharing), with one run queue per level in two arrays: active and expired. Tasks that use up their timeslice move to the expired array, and the two arrays swap once active is empty. Feedback: priority is adjusted based on sleep time (longer sleep = more interactive = priority boost).
Describe the Completely Fair Scheduler (CFS)
Scheduler for all non-real-time tasks. Runqueue is a red-black tree ordered by vruntime (time spent running on the CPU). Always pick the leftmost node (lowest vruntime); vruntime advances faster for low-priority tasks and slower for high-priority tasks. Selecting a task is O(1), adding one is O(log n).
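A minimal sketch of CFS's pick-next rule, using a heap in place of the kernel's red-black tree (task names and slice lengths invented):

```python
import heapq

# Always run the task with the smallest vruntime; running raises its
# vruntime, so tasks naturally take turns.
class CFS:
    def __init__(self):
        self.tree = []                  # (vruntime, name)

    def add(self, name, vruntime=0.0):
        heapq.heappush(self.tree, (vruntime, name))

    def run_next(self, slice_ms):
        vruntime, name = heapq.heappop(self.tree)  # leftmost = least runtime
        vruntime += slice_ms            # vruntime grows as the task runs
        heapq.heappush(self.tree, (vruntime, name))
        return name

cfs = CFS()
cfs.add("A")
cfs.add("B")
print([cfs.run_next(10) for _ in range(4)])   # ['A', 'B', 'A', 'B']
```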
What are the problems with the O(1) scheduler?
Once tasks land in the expired array they don't run again until every active task has used up its timeslice, so interactive tasks can suffer long delays and jitter, and the scheduler makes no fairness guarantees. These problems led to its replacement by CFS.
Why is it important to try and schedule threads back onto the same CPU?
Cache-Affinity - likely to have everything you need in the cache so you will be more efficient.
If you swap CPUs, then you’ll have a cold cache.
What is NUMA scheduling?
Non-Uniform Memory Access: each memory node is faster to access from the CPUs on its local socket than from remote ones. NUMA-aware scheduling keeps a task on a CPU close to the memory node holding its state.
What is hyperthreading?
Multiple hardware execution contexts (sets of registers) per core; switching between them takes on the order of cycles rather than a full software context switch. This can even hide memory access latency (hundreds of cycles of access time), because a switch to the other hyperthreaded execution context costs only a few cycles, so it can make sense to switch contexts even during a single memory access.
Why is it good to mix CPU and Memory bound tasks together on a single CPU?
Because you can get better overall utilization.
If both threads are CPU bound, only one can actually make progress at a time
T1: GXGXGX
T2: XGXGXG
If both threads are Mem bound, lots of wasted cycles:
T1: GXXXGXXXG
T2: XGXXXGXXX
If mixed:
T1: GXGGGXGGGX (CPU Bound)
T2: XGXXXGXXXG (Mem Bound)
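Counting the slots in timelines like the ones above where at least one thread shows G (useful work) confirms the intuition; this helper is purely illustrative:

```python
def pipeline_utilization(t1, t2):
    """Fraction of time slots where at least one thread makes progress.
    'G' = useful work, 'X' = stalled/idle, as in the timelines above."""
    busy = sum(1 for a, b in zip(t1, t2) if "G" in (a, b))
    return busy / len(t1) * 100

print(pipeline_utilization("GXGXGX", "XGXGXG"))        # CPU+CPU: 100.0
print(pipeline_utilization("GXXXGXXXG", "XGXXXGXXX"))  # Mem+Mem: ~55.6
print(pipeline_utilization("GXGGGXGGGX", "XGXXXGXXXG"))# mixed:   100.0
```

The CPU+CPU pair also reaches 100%, but each thread only gets half the cycles; in the mixed pair, the CPU-bound thread runs almost unimpeded while the memory-bound one fills the gaps.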
How can the CPU scheduler make informed decisions about scheduling?
By use of hardware counters - there are typically several and they are different for different architectures.
Help to estimate the kind of resources a task needs (CPU, memory).
Can look at the number of cache hits/misses from the HW counters to help guess whether it's a CPU-bound or memory-bound workload.
Why can’t CPI (cycles per instruction) be used as a scheduling technique?
Because real workloads may not have enough variation in CPI to be indicative of workload category. Two workloads can have similar average CPI but very different standard deviations, so it isn't cut and dried.
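An illustrative pair of made-up CPI samples with the same mean but very different spread, showing why the average alone can't distinguish the workloads:

```python
import statistics

# Invented per-sample CPI measurements: same average, different spread.
uniform_workload = [5.5, 6.5, 5.5, 6.5]    # tight around 6
mixed_workload   = [1.0, 11.0, 2.0, 10.0]  # also averages 6

print(statistics.mean(uniform_workload))   # 6.0
print(statistics.mean(mixed_workload))     # 6.0
print(statistics.stdev(uniform_workload))  # small
print(statistics.stdev(mixed_workload))    # much larger
```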
What is a TLB?
Translation Lookaside Buffer: a hardware cache of recent virtual-to-physical address translations, so most lookups avoid walking the page table.
What does the hardware memory management unit (MMU) do?
Translates virtual addresses to physical addresses, and generates faults on illegal accesses (no valid mapping, permission violations, or the page not being present in memory).
Describe how page tables work.
A per-process map from virtual pages to physical page frames. A virtual address is split into a virtual page number (VPN) and an offset; the VPN indexes the page table to find the physical frame number (PFN), and PFN + offset gives the physical address.
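A toy single-level translation with 4 KB pages (the page-table contents here are made up):

```python
PAGE_BITS = 12                    # 4 KB pages: 2^12 bytes
PAGE_SIZE = 1 << PAGE_BITS

page_table = {0: 7, 1: 3, 2: 9}   # VPN -> PFN, invented mappings

def translate(vaddr):
    vpn = vaddr >> PAGE_BITS          # top bits select the page-table entry
    offset = vaddr & (PAGE_SIZE - 1)  # low bits are unchanged by translation
    if vpn not in page_table:
        raise KeyError("page fault")  # no valid mapping: MMU raises a fault
    return (page_table[vpn] << PAGE_BITS) | offset

print(hex(translate(0x1ABC)))   # VPN 1 -> PFN 3, so 0x3ABC
```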
What is allocation on first touch?
DRAM space for a page is only actually allocated the first time the page is referenced. Declaring it in the virtual address space isn't enough; nothing is created in physical memory until you perform some kind of access to it.