CPU-affinity utilities shared across the crate.
Two helpers for reading and parsing per-task CPU affinity:
- parse_cpu_list decodes the kernel cpulist string format ("0-3,5,7-9") emitted by /proc/<pid>/status:Cpus_allowed_list and /sys/devices/system/cpu/online.
- read_affinity calls sched_getaffinity(2) with a dynamically-sized buffer so CONFIG_NR_CPUS > 1024 hosts are handled correctly (libc’s fixed cpu_set_t is only 1024 bits).
Both produce a sorted, deduplicated Vec<u32> of CPU ids and route
garbled or over-cap input to None. They are used by the per-thread
profiler (ctprof) and by the VM topology planner
(vmm::host_topology). The function shape is generic enough that
either subsystem could have owned it; the implementations live here so
that neither has to depend on the other for a CPU-list helper.
§Why this is NOT crate::topology::parse_cpu_list
crate::topology carries its own parse_cpu_list (returns
Result<Vec<usize>>) and parse_cpu_list_lenient (returns
Vec<usize>, never fails). The split is deliberate, not a
duplicate to consolidate:
- Threat model. This module’s parser ingests /proc/<tid>/status data captured from arbitrary tasks on the host. A hostile or corrupt Cpus_allowed_list: value like 0-4294967295 would allocate 16 GiB without the MAX_CPU_RANGE_EXPANSION cap. The topology parser ingests operator-supplied VM config; no untrusted-input concerns, no cap needed.
- Return shape. Option<Vec<u32>> here vs Result<Vec<usize>> / Vec<usize> in topology. The capture path needs to distinguish “no data” (None) from “data but garbled” (also None for now, with an explicit comment); the topology path needs anyhow::Error for upstream ? propagation and Vec<usize> to interop with sysfs APIs that speak usize.
- Dedup semantics. This module dedups duplicates produced by overlapping ranges (0-2,1 → [0,1,2]); the topology parser preserves duplicates so callers detecting operator config errors (e.g. accidentally listing the same CPU twice) can surface them.
Unifying the two behind a generic helper would require either collapsing one set of invariants into the other or carrying both behaviors through a config struct — neither produces a cleaner end result than the current cohabitation.
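The three properties listed above (the expansion cap, the Option return, and dedup of overlapping ranges) can be illustrated with a minimal sketch. This is not the crate's implementation: the function name parse_cpu_list_sketch, the cap value 65_536, the choice to apply the cap to the cumulative total, and the rejection of reversed ranges such as 3-1 are all assumptions made for illustration.

```rust
/// Hypothetical cap value; the real MAX_CPU_RANGE_EXPANSION is not stated in these docs.
const MAX_CPU_RANGE_EXPANSION: usize = 65_536;

/// Sketch of a capped, deduplicating cpulist parser (assumed shape, not the crate's code).
fn parse_cpu_list_sketch(s: &str) -> Option<Vec<u32>> {
    let s = s.trim();
    if s.is_empty() {
        return None;
    }
    let mut cpus: Vec<u32> = Vec::new();
    for token in s.split(',') {
        let token = token.trim();
        // A token is either a single id ("5") or an inclusive range ("0-3").
        let (lo, hi) = match token.split_once('-') {
            Some((a, b)) => (a.parse::<u32>().ok()?, b.parse::<u32>().ok()?),
            None => {
                let v = token.parse::<u32>().ok()?;
                (v, v)
            }
        };
        if lo > hi {
            return None; // assumption: reversed ranges count as malformed
        }
        // Check the cap BEFORE expanding, so a range like 0-4294967295
        // is rejected without allocating anything.
        let span = u64::from(hi - lo) + 1;
        let room = (MAX_CPU_RANGE_EXPANSION - cpus.len()) as u64;
        if span > room {
            return None;
        }
        cpus.extend(lo..=hi);
    }
    // Overlapping ranges may have produced duplicates; sort, then drop them.
    cpus.sort_unstable();
    cpus.dedup();
    Some(cpus)
}
```

Under these assumptions, parse_cpu_list_sketch("0-2,1") yields Some([0, 1, 2]) and parse_cpu_list_sketch("0-4294967295") yields None without expanding the range.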
Constants§
- AFFINITY_INITIAL_BITS
  Initial number of CPU bits the affinity buffer starts at. 8192 is the x86_64 CONFIG_NR_CPUS ceiling (NR_CPUS_RANGE_END with CPUMASK_OFFSTACK; also the MAXSMP default), so no x86_64 host exceeds it and the overwhelming majority resolve on the first syscall.
- AFFINITY_MAX_BITS
  Maximum number of CPU bits read_affinity is willing to allocate for. 262144 bits = 32 KiB of mask data, well above the largest in-production CONFIG_NR_CPUS this project targets. Capping bounds the worst-case allocation and bounds the retry loop’s iteration count (log2(AFFINITY_MAX_BITS / AFFINITY_INITIAL_BITS) = 5 doublings).
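The sizing arithmetic quoted for the two constants can be checked with a short sketch. The constant values are taken from the text above; the helper name doubling_schedule is hypothetical, and the assumption that the retry loop exactly doubles the buffer each round is inferred from the "doublings" wording rather than from the source code.

```rust
const AFFINITY_INITIAL_BITS: usize = 8192;
const AFFINITY_MAX_BITS: usize = 262_144;

/// Buffer sizes (in bits) that a doubling retry loop would try, in order.
/// Hypothetical helper, for illustration only.
fn doubling_schedule() -> Vec<usize> {
    let mut sizes = Vec::new();
    let mut bits = AFFINITY_INITIAL_BITS;
    while bits <= AFFINITY_MAX_BITS {
        sizes.push(bits);
        bits *= 2;
    }
    sizes
}

/// Bytes of mask data needed to hold `bits` CPU bits.
fn mask_bytes(bits: usize) -> usize {
    bits / 8
}
```

The schedule has six entries (8192 through 262144), which means five doublings after the first attempt, and mask_bytes(AFFINITY_MAX_BITS) is 32768 bytes, i.e. 32 KiB.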
Functions§
- parse_cpu_list
  Parse a cpulist string of the form "0-3,5,7-9" into a sorted deduped vec of CPU ids. None on empty input or any malformed token (partial results are rejected so the caller can distinguish “no data” from “data but garbled”).
- read_affinity
  Read the effective CPU affinity for a task via the sched_getaffinity(2) syscall. The kernel gates sched_getaffinity on security_task_getscheduler(p) only; under the default DAC config this is unrestricted (any task may read any other task’s affinity), while an active LSM (SELinux/Yama) may return EPERM. Returns sorted CPU ids. None on syscall failure (EPERM, ESRCH) or when the kernel’s mask exceeds AFFINITY_MAX_BITS (hosts beyond 262144 CPUs).
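The grow-and-retry behavior described for read_affinity might look roughly like the sketch below. To keep it free of FFI, the syscall is replaced by a caller-supplied closure, so the Probe enum, read_affinity_sketch, and mask_to_cpus are all hypothetical names. The mapping of "buffer too small" to EINVAL follows the sched_getaffinity(2) man page; treating every other errno as a hard failure, and viewing the mask as little-endian u64 words, are assumptions.

```rust
const AFFINITY_INITIAL_BITS: usize = 8192;
const AFFINITY_MAX_BITS: usize = 262_144;

/// Outcome of one probe; stands in for the syscall's return value and errno.
enum Probe {
    Ok,       // mask was written into the buffer
    TooSmall, // stands in for EINVAL: buffer shorter than the kernel's mask
    Failed,   // stands in for EPERM, ESRCH, and other hard failures
}

/// Collect the indices of set bits, in ascending order.
/// Assumes bit N of the mask lives in word N / 64 at bit position N % 64.
fn mask_to_cpus(mask: &[u64]) -> Vec<u32> {
    let mut cpus = Vec::new();
    for (word_idx, &word) in mask.iter().enumerate() {
        for bit in 0..64 {
            if word & (1u64 << bit) != 0 {
                cpus.push((word_idx * 64 + bit) as u32);
            }
        }
    }
    cpus
}

/// Retry with a doubled buffer until the probe succeeds, fails hard,
/// or the buffer would exceed AFFINITY_MAX_BITS.
fn read_affinity_sketch<F>(mut probe: F) -> Option<Vec<u32>>
where
    F: FnMut(&mut [u64]) -> Probe,
{
    let mut bits = AFFINITY_INITIAL_BITS;
    while bits <= AFFINITY_MAX_BITS {
        let mut mask = vec![0u64; bits / 64];
        match probe(&mut mask) {
            Probe::Ok => return Some(mask_to_cpus(&mask)),
            Probe::TooSmall => bits *= 2,
            Probe::Failed => return None,
        }
    }
    None // kernel mask is larger than the cap allows
}
```

In a real implementation the closure body would be the sched_getaffinity call itself, passing the buffer length in bytes. Separating it out here only serves to make the retry bound easy to exercise: a probe that always reports TooSmall is abandoned after the 262144-bit attempt.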