Commits · 1a40bb31fcf1 · Dmitry Baryshkov / msm

Sep 26, 2024

scx_flatcg: Use a user DSQ for fallback instead of SCX_DSQ_GLOBAL · c9c809f4

Tejun Heo authored 6 months ago


scx_flatcg was using SCX_DSQ_GLOBAL for fallback handling. However, it is
assuming that SCX_DSQ_GLOBAL isn't automatically consumed, which was true a
while ago but is no longer the case. Also, there are further changes planned
for SCX_DSQ_GLOBAL which will disallow explicit consumption from it. Switch
to a user DSQ for fallback.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>

Sep 25, 2024

tools/sched_ext: Receive misc updates from SCX repo · a748db0c

Tejun Heo authored 6 months ago

Receive misc tools/sched_ext updates from https://github.com/sched-ext/scx
to sync userspace bits.

- LSP macros to help language servers.

- bpf_cpumask_weight() declaration and cast_mask() helper.

- Cosmetic updates to scx_flatcg.bpf.c.

Signed-off-by: Tejun Heo <tj@kernel.org>

sched_ext: Add __COMPAT helpers for features added during v6.12 devel cycle · 1e123fd7

Tejun Heo authored 6 months ago


cgroup support and scx_bpf_dispatch[_vtime]_from_dsq() are newly added since
8bb30798 ("sched_ext: Fixes incorrect type in bpf_scx_init()") which is
the current earliest commit targeted by BPF schedulers. Add compat helpers
for them and apply them in the example schedulers.

These will be dropped after a few kernel releases. The exact backward
compatibility window hasn't been decided yet.

Signed-off-by: Tejun Heo <tj@kernel.org>

Sep 04, 2024

sched_ext: Add a cgroup scheduler which uses flattened hierarchy · a4103eac

Tejun Heo authored 6 months ago


This patch adds scx_flatcg example scheduler which implements hierarchical
weight-based cgroup CPU control by flattening the cgroup hierarchy into a
single layer by compounding the active weight share at each level.
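
As a rough illustration of the compounding described above, the sketch below
multiplies a cgroup's share of the active weight at each level on the path
from the root down to the cgroup. It is a simplified floating-point model,
not the code used by scx_flatcg; struct level, its fields and flat_share()
are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of one level on the path from the root to a cgroup:
 * the weight of the cgroup (or ancestor) at that level, and the sum of the
 * weights of all active cgroups at that level, itself included.
 */
struct level {
	unsigned int weight;
	unsigned int active_sum;
};

/*
 * Compound the per-level shares into a single flattened share.
 * Returns a fraction in (0, 1]; an empty path yields 1.0.
 */
static double flat_share(const struct level *path, size_t depth)
{
	double share = 1.0;

	for (size_t i = 0; i < depth; i++) {
		/* An active level must have a non-zero weight sum. */
		assert(path[i].active_sum > 0);
		share *= (double)path[i].weight / path[i].active_sum;
	}
	return share;
}
```

For instance, a cgroup holding weight 100 out of an active sum of 200, under
a parent holding weight 100 out of an active sum of 400, ends up with a
flattened share of 0.25 * 0.5 = 0.125. A scheduler can then compare such
shares in a single layer instead of walking the hierarchy on each decision.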

This flattening of hierarchy can bring a substantial performance gain when
the cgroup hierarchy is nested multiple levels deep. In a simple benchmark
using wrk[8] on apache serving a CGI script calculating sha1sum of a small
file, it outperforms CFS by ~3% with CPU controller disabled and by ~10%
with two apache instances competing with 2:1 weight ratio nested four levels
deep.

However, the gain comes at the cost of not being able to properly handle
a thundering herd of cgroups. For example, if many cgroups which are nested
behind a low priority parent cgroup wake up around the same time, they may
be able to consume more CPU cycles than they are entitled to. In many use
cases, this isn't a real concern, especially given the performance gain.
Also, there are ways to mitigate the problem further by e.g. introducing an
extra scheduling layer on cgroup delegation boundaries.

v5: - Updated to specify SCX_OPS_HAS_CGROUP_WEIGHT instead of
      SCX_OPS_KNOB_CGROUP_WEIGHT.

v4: - Revert reference counted kptr for cgv_node as the change caused easily
      reproducible stalls.

v3: - Updated to reflect the core API changes including ops.init/exit_task()
      and direct dispatch from ops.select_cpu(). Fixes and improvements
      including additional statistics.

    - Use reference counted kptr for cgv_node instead of xchg'ing against
      stash location.

    - Dropped '-p' option.

v2: - Use SCX_BUG[_ON]() to simplify error handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
