Skip to content
This repository was archived by the owner on Apr 13, 2023. It is now read-only.

Commit

Permalink
sched_ext: Implement core-sched support
Browse files Browse the repository at this point in the history
The core-sched support is composed of the following parts:

* task_struct->scx.core_sched_at is added. This is a timestamp which can be
  used to order tasks. Depending on whether the BPF scheduler implements
  custom ordering, it tracks either global FIFO ordering of all tasks or
  local-DSQ ordering within the dispatched tasks on a CPU.

* prio_less() is updated to call scx_prio_less() when comparing SCX tasks.
  scx_prio_less() calls ops.core_sched_before() if available or uses the
  core_sched_at timestamp. For global FIFO ordering, the BPF scheduler
  doesn't need to do anything. Otherwise, it should implement
  ops.core_sched_before() which reflects the ordering.

* When core-sched is enabled, balance_scx() balances all SMT siblings so
  that they all have tasks dispatched if necessary before pick_task_scx() is
  called. pick_task_scx() picks between the current task and the first
  dispatched task on the local DSQ based on availability and the
  core_sched_at timestamps. Note that FIFO ordering is expected among the
  already dispatched tasks whether running or on the local DSQ, so this path
  always compares core_sched_at instead of calling into
  ops.core_sched_before().

qmap_core_sched_before() is added to scx_example_qmap. It scales the
distances from the heads of the queues to compare the tasks across different
priority queues and seems to behave as expected.

v2: * Sched core added the const qualifiers to prio_less task arguments.
      Explicitly drop them for ops.core_sched_before() task arguments. BPF
      enforces access control through the verifier, so the qualifier isn't
      actually operative and only gets in the way when interacting with
      various helpers.

Signed-off-by: Tejun Heo <[email protected]>
Reviewed-by: David Vernet <[email protected]>
Reviewed-by: Josh Don <[email protected]>
  • Loading branch information
htejun committed Apr 13, 2023
1 parent 049c8bb commit 6399c79
Show file tree
Hide file tree
Showing 7 changed files with 342 additions and 17 deletions.
21 changes: 21 additions & 0 deletions include/linux/sched/ext.h
Original file line number Diff line number Diff line change
Expand Up @@ -316,6 +316,24 @@ struct sched_ext_ops {
*/
bool (*yield)(struct task_struct *from, struct task_struct *to);

/**
* core_sched_before - Task ordering for core-sched
* @a: task A
* @b: task B
*
* Used by core-sched to determine the ordering between two tasks. See
* Documentation/admin-guide/hw-vuln/core-scheduling.rst for details on
* core-sched.
*
* Both @a and @b are runnable and may or may not currently be queued on
* the BPF scheduler. Should return %true if @a should run before @b.
* %false if there's no required ordering or @b should run before @a.
*
* If not specified, the default is ordering them according to when they
* became runnable.
*/
bool (*core_sched_before)(struct task_struct *a,struct task_struct *b);

/**
* set_weight - Set task weight
* @p: task to set weight for
Expand Down Expand Up @@ -628,6 +646,9 @@ struct sched_ext_entity {
struct task_struct *kf_tasks[2]; /* see SCX_CALL_OP_TASK() */
atomic64_t ops_state;
unsigned long runnable_at;
#ifdef CONFIG_SCHED_CORE
u64 core_sched_at; /* see scx_prio_less() */
#endif

/* BPF scheduler modifiable fields */

Expand Down
2 changes: 1 addition & 1 deletion kernel/Kconfig.preempt
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@ config SCHED_CORE

config SCHED_CLASS_EXT
bool "Extensible Scheduling Class"
depends on BPF_SYSCALL && BPF_JIT && !SCHED_CORE
depends on BPF_SYSCALL && BPF_JIT
help
This option enables a new scheduler class sched_ext (SCX), which
allows scheduling policies to be implemented as BPF programs to
Expand Down
12 changes: 11 additions & 1 deletion kernel/sched/core.c
Original file line number Diff line number Diff line change
Expand Up @@ -163,7 +163,12 @@ static inline int __task_prio(const struct task_struct *p)
if (p->sched_class == &idle_sched_class)
return MAX_RT_PRIO + NICE_WIDTH; /* 140 */

return MAX_RT_PRIO + MAX_NICE; /* 120, squash fair */
#ifdef CONFIG_SCHED_CLASS_EXT
if (p->sched_class == &ext_sched_class)
return MAX_RT_PRIO + MAX_NICE + 1; /* 120, squash ext */
#endif

return MAX_RT_PRIO + MAX_NICE; /* 119, squash fair */
}

/*
Expand Down Expand Up @@ -192,6 +197,11 @@ static inline bool prio_less(const struct task_struct *a,
if (pa == MAX_RT_PRIO + MAX_NICE) /* fair */
return cfs_prio_less(a, b, in_fi);

#ifdef CONFIG_SCHED_CLASS_EXT
if (pa == MAX_RT_PRIO + MAX_NICE + 1) /* ext */
return scx_prio_less(a, b, in_fi);
#endif

return false;
}

Expand Down
Loading

0 comments on commit 6399c79

Please sign in to comment.