The Completely Fair Scheduler (CFS) is a process scheduler that was merged into the 2.6.23 release of the Linux kernel and is the default scheduler. It handles CPU resource allocation for executing processes, and aims to maximize overall CPU utilization while also maximizing interactive performance. In contrast to the previous O(1) scheduler used in older Linux 2.6 kernels, the CFS implementation is not based on run queues. Instead, a red–black tree implements a "timeline" of future task execution. Additionally, the scheduler uses nanosecond-granularity accounting for the atomic units by which an individual process's share of the CPU is allocated. This precise knowledge also means that no specific heuristics are required to determine, for example, the interactivity of a process. Like the old O(1) scheduler, CFS uses a concept called "sleeper fairness", which considers sleeping or waiting tasks equivalent to those on the runqueue. This means that interactive tasks which spend most of their time waiting for user input or other events get a comparable share of CPU time when they need it.
Algorithm
The data structure used for the scheduling algorithm is a red-black tree in which the nodes are scheduler-specific sched_entity structures. These are derived from the general task_struct process descriptor, with added scheduler elements. The nodes are indexed by processor "execution time" in nanoseconds. A "maximum execution time" is also calculated for each process to represent the time the process would have expected to run on an "ideal processor". This is the time the process has been waiting to run, divided by the total number of processes. When the scheduler is invoked to run a new process:
The leftmost node of the scheduling tree is chosen, and sent for execution.
If the process simply completes execution, it is removed from the system and scheduling tree.
If the process reaches its maximum execution time or is otherwise stopped, it is reinserted into the scheduling tree based on its new spent execution time.
The new leftmost node will then be selected from the tree, repeating the iteration.
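The four steps above can be sketched as a small simulation. This is an illustrative model, not kernel code: the function name run_fair_queue, the fixed slice_ns time slice (standing in for the per-process maximum execution time), and the task names are assumptions made for the example. Python's heapq module is used in place of a red-black tree; it is a different data structure, but it also keeps the entry with the smallest key available first.

```python
import heapq


def run_fair_queue(work_ns, slice_ns):
    """Simulate the selection loop.

    work_ns maps a task name to the CPU time it still needs (nanoseconds).
    slice_ns is a fixed time slice, a simplification of the per-process
    "maximum execution time". Returns the order in which tasks ran.
    """
    # Each entry is (spent_ns, name). The entry with the smallest spent
    # time plays the role of the leftmost node of the scheduling tree.
    timeline = [(0, name) for name in sorted(work_ns)]
    heapq.heapify(timeline)
    remaining = dict(work_ns)
    order = []

    while timeline:
        # Step 1: choose the task that has spent the least time so far.
        spent, name = heapq.heappop(timeline)
        ran = min(slice_ns, remaining[name])
        remaining[name] -= ran
        order.append(name)

        # Step 2: a task that has completed is simply dropped.
        if remaining[name] == 0:
            continue

        # Step 3: otherwise reinsert it, keyed by its new spent time.
        heapq.heappush(timeline, (spent + ran, name))
        # Step 4: the loop then selects the new smallest entry.

    return order


order = run_fair_queue(
    {"a": 3_000_000, "b": 1_000_000, "c": 2_000_000},
    slice_ns=1_000_000,
)
print(order)  # ['a', 'b', 'c', 'a', 'c', 'a']
```

Ties on spent time are broken by task name here only because tuples compare element by element; that ordering is an artifact of the sketch.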
If the process spends a lot of its time sleeping, then its spent time value is low and it automatically gets a priority boost when it finally needs the CPU. Hence such tasks do not get less processor time than tasks that are constantly running. The fair queuing CFS scheduler has a scheduling complexity of O(log N), where N is the number of tasks in the runqueue. Choosing a task can be done in constant time, but reinserting a task after it has run requires O(log N) operations, because the runqueue is implemented as a red-black tree.
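The effect of a low spent time on a waking task can be illustrated with a toy example. It follows the simplified description given here, in which a sleeping task keeps its spent time unchanged; any further adjustments a real kernel makes on wake-up are not modeled. The helper pick_next, the task names and the time values are hypothetical.

```python
def pick_next(spent_ns):
    """Return the runnable task with the least spent execution time."""
    return min(spent_ns, key=lambda name: (spent_ns[name], name))


# The editor runs briefly (200 microseconds), then sleeps waiting for input.
editor_spent_ns = 200_000

# While the editor sleeps, only the compiler is runnable and it
# accumulates five slices of one millisecond each.
runnable = {"compiler": 0}
for _ in range(5):
    task = pick_next(runnable)
    runnable[task] += 1_000_000

# The editor wakes up and rejoins with its small spent time intact,
# so it sorts ahead of the compiler and is chosen next.
runnable["editor"] = editor_spent_ns
next_task = pick_next(runnable)
print(next_task)  # editor
```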
History
Con Kolivas's work with scheduling, most significantly his implementation of "fair scheduling" named Rotating Staircase Deadline, inspired Ingo Molnár to develop CFS as a replacement for the earlier O(1) scheduler; Molnár credited Kolivas in his announcement. CFS is an implementation of a well-studied, classic scheduling algorithm called weighted fair queuing. Originally invented for packet networks, fair queuing had previously been applied to CPU scheduling under the name stride scheduling. CFS is the first implementation of a fair queuing process scheduler widely used in a general-purpose operating system.
The Linux kernel received a patch for CFS in November 2010, for the 2.6.38 kernel, that made the scheduler "fairer" for use on desktops and workstations. Developed by Mike Galbraith using ideas suggested by Linus Torvalds, the patch implements a feature called autogrouping that significantly boosts interactive desktop performance. The algorithm puts parent processes in the same task group as child processes. This solved the problem of slow interactive response times on multi-core and multi-CPU systems while they were running other tasks that use many CPU-intensive threads. A simple explanation is that, with this patch applied, one can still watch a video, read email and perform other typical desktop activities without glitches or choppiness while, say, compiling the Linux kernel or encoding video.
In 2016, the Linux scheduler was patched for better multicore performance, based on the suggestions outlined in the paper "The Linux Scheduler: A Decade of Wasted Cores".