What is scheduling in the context of an OS?
It usually means the assignment of threads to processors
What are the problem variants for scheduling?
What are the goals and our assumptions for scheduling for multiprogramming?
Goals include:
Assumptions include:
What are the scheduling levels in relation to thread states? And what is the focus in multiprogramming scheduling?
Here we focus on short-term scheduling.
What are the standard strategies for multiprogramming scheduling? And discuss the preference for short/long threads in each.
Categorize scheduling strategies by preemption (w/ or w/o), priorities (w/ or w/o) and service-time dependency.
04_scheduling slide 27
What is the difference between static and dynamic thread priorities?
Static priorities express the absolute importance of a task (low overhead, O(1) complexity). They are subject to priority inversion.
Dynamic priorities: the thread priority is adapted over time, e.g. so that short-running tasks can get preference over long-running tasks. They can be realized on top of an implementation of static priorities.
What is priority inversion?
Priority inversion is a scheduling scenario in which a high-priority task is indirectly superseded by a lower-priority task, effectively inverting the assigned priorities. This violates the priority model that high-priority tasks can only be prevented from running by higher-priority tasks. Inversion typically occurs when a high-priority task blocks on a resource held by a low-priority task, and that low-priority task is then preempted by a medium-priority task.
See 04_scheduling slide 32,33
What is the difference between normalized response times for shorter processes and longer processes?
Shorter Processes: priority with preemption (constant) < priority (linear) < no priority (exponential)
Longer Processes: no priority < priority < priority with preemption (all exponential)
See 04_scheduling slides 29,30,31
What are the solutions for priority inversion?
See 04_scheduling slide 34
Define and model the time related variables for scheduling.
t: mean service time (pure execution time)
w: mean waiting time
r: mean response time (including waiting times): r = w+t
r_n: normalized response time: r_n = r/t
ntt: normalized turnaround time: ntt = turnaround time / service time (analogous to r_n)
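As a quick sanity check, the relations above can be computed directly (the numbers here are made up):

```python
# A thread with mean service time t = 4 and mean waiting time w = 6:
t = 4.0   # mean service time (pure execution time)
w = 6.0   # mean waiting time
r = w + t        # mean response time: 10.0
r_n = r / t      # normalized response time: 2.5
print(r, r_n)    # the thread spends 2.5x its service time in the system
```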
What is CPU burst and CPU burst time?
A CPU burst is a phase in which a thread continuously uses the CPU (between I/O operations); the duration of such a phase is the CPU burst time.
Similarly, an I/O burst is a phase in which a thread performs I/O operations.
How to estimate the CPU Burst?
The behavior of the most recent past is a good predictor for the short-term future. Therefore, we can estimate the CPU burst time of the i'th compute phase of a thread by looking at the execution times recorded in the previous compute phases 1..i-1 of that thread.
The formula is in 04_scheduling slide 40
S[n+1] = (1/n) * T[n] + ((n-1)/n) * S[n] = (1/n) * sum_1_n(T[i])
Because younger more recent values usually characterize the future behavior better than older ones, they should obtain higher weight so we can apply exponential averaging to the formula above:
S[n+1] = a * T[n] + (1-a) S[n] with 0 < a < 1.
If a > 1/n, younger values get higher weights.
When enrolling this recursion, the weights of older measurements fade out exponentially. High a leads to better responsiveness (response to change in behavior), smaller a leads to better stability (to filter outliers). The first estimate S[1] can be initialized with 0 to give new threads high priority. See 04_scheduling slides 42 and 43 for the effect of exponential decay of weights.
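The exponential-averaging estimator above can be sketched as follows (a minimal sketch; the function name and the sample burst values are hypothetical):

```python
def exp_average(bursts, a=0.5, s1=0.0):
    """Estimate successive CPU burst times by exponential averaging.

    Implements S[n+1] = a * T[n] + (1 - a) * S[n] with 0 < a < 1.
    s1 is the initial estimate S[1]; initializing it with 0 gives
    new threads high priority, as noted above.
    """
    s = s1
    estimates = [s]             # S[1]
    for t in bursts:            # measured bursts T[1], T[2], ...
        s = a * t + (1 - a) * s
        estimates.append(s)     # S[2], S[3], ...
    return estimates

# With a = 0.5 the weight of each older measurement halves per step:
print(exp_average([6, 4, 6, 4], a=0.5))
```

A larger `a` tracks changes in behavior faster (responsiveness); a smaller `a` filters outliers better (stability), matching the trade-off described above.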
What is the problem with multiprocessor scheduling?
We have to decide not only when to execute a task but also where (on which processor).
Additionally, even if all processors are identical, these processors/cores might share limited resources: caches may be shared, memory bandwidth is limited and shared by cores, and logical cores share the execution units within one physical core.
What is the concept of processor affinity in multiprocessor scheduling?
If a thread was running on a processor, the corresponding caches were filled with data belonging to it. If the thread is scheduled again soon, there is a chance that significant parts of the cache still belong to it. Processor affinity therefore means that the scheduler favors the recently used processor. However, it may happen that a thread with higher priority waits for its favored processor while another thread with lower priority is running.
What is Gang scheduling / Coscheduling? What are the properties of the threads in a gang/group?
Scheduling of a group of threads to run on a set of processors at the same time, on a one-to-one basis, with synchronized time slices and preemption; idle processors in the set are forbidden to execute tasks not belonging to the group. Threads of the scheduling group must not relinquish the CPU and will not be blocked.
This creates the illusion that the tasks of the gang scheduled/coscheduled group are running exclusively in the system.
A group of threads is coscheduled on a set of processors
The threads in the same group are usually heavily communicating (no need to block one, as it will soon receive a response from another thread in the group), sharing data (benefiting from shared processor caches), and have contrary resource demands (less resource contention when they are scheduled together).
Give an example of a coscheduler?
TACO (Topology Aware Coscheduling): builds synchronization domains as sets of processors with their own runqueue. A master CPU is responsible for enforcing thread switches on all CPUs of the synchronization domain; each CPU picks a thread out of the runqueue and notifies the master. Finally, the sets of coscheduled threads are mapped to synchronization domains.
What is the difference between centralized, decentralized, semi-decentralized and hierarchical scheduler design?
Centralized: one runqueue for all processors. This global knowledge makes some types of scheduling easier (like coscheduling and scheduling with absolute priorities) and some harder (e.g. realizing processor affinity).
Decentralized: one runqueue per processor core. This scales to very large systems but needs load balancing across the runqueues to be versatile. The distributed knowledge makes global decisions nearly impossible, but processor affinity is very cheap. (This is what most current operating systems do.)
Semi-Decentralized: Multiple cores share a runqueue. Tradeoff between both extremes.
Hierarchical: A hierarchy of runqueues, where runqueues further up in the hierarchy represent larger fractions of the system. This allows for more scalable coscheduling as multiple small coscheduled sets can be processed independently. Without coscheduled sets, this is like decentralized scheduler.
What is real time scheduling? What is the additional property of threads in real time systems to be considered in scheduling?
In real-time systems, the goal of scheduling is different: measurements need to be evaluated and, based on them, actions must be taken within short time constraints.
Threads in real time systems are associated with deadlines, at which they must be finished. They are usually critical for the functioning of the whole system and they need to be considered in scheduling decisions.
What is the difference between hard and soft real time systems?
In hard real-time systems, violation of a deadline means failure of the system and cannot be tolerated (e.g. an airbag system). Here, off-line scheduling algorithms are employed to guarantee that deadlines can be met.
In soft real time systems, violation of the deadline means quality reduction but can be tolerated (transmission and processing of video streams).
What is lateness in the context of real time system threads? And what is the goal of soft real time scheduling in relation to lateness?
A thread in a real-time system has a specification of the earliest time it can start and the latest time it must finish (its deadline). Exceeding the deadline is called lateness; the lateness is the amount of time by which the deadline is exceeded (completion time minus deadline).
The goal here is to minimize the maximum lateness, so that as many deadlines as possible are respected.
What are periodical threads in the context of real time systems?
In real time systems, periodical threads have to be finished in due time. Each thread is characterized by its period/rate.
What are the scheduling strategies in soft real time systems to minimize maximum lateness for non-periodic threads on a single processor?
Earliest Due Date (EDD) - no start times, no preemption, single processor, no thread dependencies: threads are executed in the order of their deadlines, which requires sorting and can be done in O(n log n). When start times are introduced, the problem becomes NP-hard.
Earliest Deadline First (EDF) - start times, preemption, single processor, no thread dependencies: when a newly arriving thread has an earlier deadline than the running one, the running thread is preempted and the thread with the earlier deadline is started.
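A minimal sketch of EDD for the non-periodic, no-start-times case (the thread tuples and values are made up; a negative maximum lateness means all deadlines are met):

```python
def edd_schedule(threads):
    """Earliest Due Date: run threads in order of increasing deadline.

    threads: list of (name, service_time, deadline); no start times,
    no preemption, single processor. Returns the execution order and
    the maximum lateness (completion time minus deadline).
    """
    order = sorted(threads, key=lambda th: th[2])   # sort by deadline, O(n log n)
    now, max_late = 0, float("-inf")
    for name, service, deadline in order:
        now += service                              # completion time of this thread
        max_late = max(max_late, now - deadline)    # lateness of this thread
    return [th[0] for th in order], max_late

order, late = edd_schedule([("A", 3, 9), ("B", 2, 4), ("C", 1, 6)])
print(order, late)   # runs B, C, A; max lateness -2, so every deadline is met
```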
What are the scheduling strategies in soft real time systems to minimize maximum lateness for periodic threads on a single processor?
Rate Monotonic Scheduling (RMS): Assign a static priority such that thread with smallest period gets highest priority.
Assumptions:
- T_i periodical with period length p_i
- Deadline: d_i = p_i
- T_i is ready again immediately after p_i
- T_i has a constant execution time b_i ( <= p_i)
- The smaller the period, the higher the priority.
The following RMS criterion must hold for n threads: sum_1_n (b_i / p_i) <= n * (2**(1/n) - 1). The left side is the required processor utilization (must be <= 1) and the right side is an upper bound that must hold in order to guarantee a feasible schedule.
If the criterion is satisfied, use the static priority to schedule the threads.
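The RMS criterion can be checked directly (a sketch; the task sets below are made up):

```python
def rms_feasible(tasks):
    """Sufficient RMS schedulability test.

    tasks: list of (b_i, p_i) pairs (constant execution time, period),
    with deadline d_i = p_i. Feasible under RMS if the total utilization
    U = sum(b_i / p_i) does not exceed n * (2**(1/n) - 1).
    """
    n = len(tasks)
    utilization = sum(b / p for b, p in tasks)
    return utilization <= n * (2 ** (1 / n) - 1)

# U = 0.25 + 0.2 + 0.2 = 0.65 <= 3 * (2**(1/3) - 1) ~ 0.78:
print(rms_feasible([(1, 4), (1, 5), (2, 10)]))   # True
```

Note the test is sufficient, not necessary: a task set that fails the bound may still be schedulable.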
Time Demand Analysis (TDA): calculate the worst-case time demand for each thread and compare it to the available time until its deadline; a thread is schedulable if there is a point in time at which enough time is available to finish.
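A sketch of a TDA-style worst-case response time iteration, assuming tasks are given as (b_i, p_i) pairs sorted by rate-monotonic priority (smallest period first) and d_i = p_i:

```python
import math

def tda_feasible(tasks):
    """Time Demand Analysis for static (rate monotonic) priorities.

    tasks: list of (b_i, p_i), sorted by priority (smallest period first),
    deadline d_i = p_i. For each task, iterate the worst-case time demand
        R = b_i + sum over higher-priority tasks j of ceil(R / p_j) * b_j
    until it converges (worst-case response time found) or exceeds the
    deadline (infeasible).
    """
    for i, (b, p) in enumerate(tasks):
        r = b
        while True:
            # demand = own execution plus preemptions by higher-priority tasks
            demand = b + sum(math.ceil(r / pj) * bj for bj, pj in tasks[:i])
            if demand > p:        # demand exceeds the deadline: infeasible
                return False
            if demand == r:       # fixed point: worst-case response time <= d_i
                break
            r = demand
    return True

print(tda_feasible([(1, 4), (1, 5), (2, 6)]))   # True
```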
Earliest Deadline First (EDF): threads are assigned a dynamic priority such that the thread with the closest deadline gets the highest priority. Preemption is allowed. The following criterion must hold for n threads: sum_1_n (b_i / p_i) <= 1.
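The EDF utilization test is a one-liner; note that a task set can fail the RMS bound and still be feasible under EDF (the example values are made up):

```python
def edf_feasible(tasks):
    """EDF schedulability for periodic tasks with d_i = p_i:
    feasible iff the total utilization sum(b_i / p_i) <= 1."""
    return sum(b / p for b, p in tasks) <= 1

# U = 0.5 + 0.4 + 0.1 = 1.0: fails the RMS bound (~0.78) but is fine for EDF.
print(edf_feasible([(2, 4), (2, 5), (1, 10)]))   # True
```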