pub fn read_affinity(tid: i32) -> Option<Vec<u32>>
Read the effective CPU affinity for a task via the
sched_getaffinity(2) syscall. The kernel gates
sched_getaffinity on security_task_getscheduler(p) only —
under the default DAC config this is unrestricted (any task may
read any other task’s affinity); an active LSM (SELinux/Yama)
may return EPERM. Returns sorted CPU ids, or
None on syscall failure (EPERM, ESRCH) or when the kernel’s
mask exceeds AFFINITY_MAX_BITS (hosts beyond 262144 CPUs).
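As a rough illustration of the "sorted CPU ids" return value, the sketch below expands a kernel-style bitmask (an array of unsigned long words, bit n of word i meaning CPU i * BITS_PER_LONG + n) into a list of ids. The name mask_to_cpus is hypothetical and is not taken from this crate; the real helper may decode the mask differently.

```rust
use std::os::raw::c_ulong;

/// Hypothetical helper (not part of this crate): expand a kernel-style
/// bitmask into CPU ids. Words are visited in order and bits from low to
/// high, so the result comes out sorted without an extra sort step.
fn mask_to_cpus(words: &[c_ulong]) -> Vec<u32> {
    let bits = c_ulong::BITS; // 64 on LP64 Linux, 32 on ILP32
    let mut cpus = Vec::new();
    for (i, &word) in words.iter().enumerate() {
        let mut w = word;
        while w != 0 {
            let bit = w.trailing_zeros(); // index of the lowest set bit
            cpus.push(i as u32 * bits + bit);
            w &= w - 1; // clear the lowest set bit
        }
    }
    cpus
}

fn main() {
    // CPUs 0 and 2 in the first word, first CPU of the second word.
    let mask: [c_ulong; 2] = [0b101, 0b1];
    println!("{:?}", mask_to_cpus(&mask));
}
```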
§Dynamic buffer sizing
The kernel’s SYSCALL_DEFINE3(sched_getaffinity)
(kernel/sched/syscalls.c) rejects a caller buffer shorter
than nr_cpu_ids / BITS_PER_BYTE with EINVAL. The x86_64
CONFIG_NR_CPUS maximum is 8192 (NR_CPUS_RANGE_END with
CPUMASK_OFFSTACK; without it the max is 512); other
architectures may allow higher (large NUMA / partitioning
hardware). libc’s fixed-size libc::cpu_set_t is only 1024 bits
wide, so calling sched_getaffinity with
size_of::<cpu_set_t>() on a kernel whose nr_cpu_ids exceeds 1024
(possible only when CONFIG_NR_CPUS > 1024) fails with EINVAL
even when the caller has legitimate access.
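The length checks described above can be modelled with a little arithmetic. The sketch below is a model of those two checks, not kernel code, and both function names are hypothetical. It shows that a 128-byte cpu_set_t passes for nr_cpu_ids up to 1024 and is rejected beyond that.

```rust
use std::mem::size_of;
use std::os::raw::c_ulong;

/// Model of the two length checks (hypothetical name, not kernel code):
/// the buffer must cover nr_cpu_ids bits and its byte length must be a
/// multiple of sizeof(unsigned long).
fn kernel_accepts(len_bytes: usize, nr_cpu_ids: usize) -> bool {
    let covers_all_cpus = len_bytes * 8 >= nr_cpu_ids;
    let word_multiple = len_bytes % size_of::<c_ulong>() == 0;
    covers_all_cpus && word_multiple
}

/// Smallest byte length that passes both checks for `nr_cpu_ids` CPUs.
fn min_mask_bytes(nr_cpu_ids: usize) -> usize {
    let word_bits = size_of::<c_ulong>() * 8;
    let words = (nr_cpu_ids + word_bits - 1) / word_bits; // round up
    words * size_of::<c_ulong>()
}

fn main() {
    let cpu_set_bytes = 1024 / 8; // cpu_set_t: 1024 bits = 128 bytes
    println!("512 CPUs:  {}", kernel_accepts(cpu_set_bytes, 512));
    println!("8192 CPUs: {}", kernel_accepts(cpu_set_bytes, 8192));
    println!("bytes needed for 8192 CPUs: {}", min_mask_bytes(8192));
}
```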
This helper avoids the cap by allocating a dynamically sized
Vec<c_ulong>, an array of kernel unsigned long. That is the
wire format the syscall expects: it is correctly aligned, and its
byte length is always a multiple of sizeof(unsigned long), which
satisfies the kernel’s second validation. On EINVAL the buffer
doubles and the call retries, capped at AFFINITY_MAX_BITS = 262144
(32 KiB of mask data, which covers every real-world CONFIG_NR_CPUS
setting and bounds the worst-case allocation).
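The doubling strategy can be sketched as follows. This is a minimal sketch and not the crate's implementation. The syscall is abstracted behind a hypothetical probe closure so the loop can be exercised without FFI, and the starting size INITIAL_BITS = 1024 is an assumption; only AFFINITY_MAX_BITS = 262144 comes from the text above.

```rust
use std::os::raw::c_ulong;

const EINVAL: i32 = 22; // Linux errno value
const AFFINITY_MAX_BITS: usize = 262_144; // ceiling named in the docs
const INITIAL_BITS: usize = 1024; // assumed starting size, not from the source

/// `probe` is a hypothetical stand-in for the raw syscall: it fills the
/// buffer and returns Ok(()) on success or Err(errno) on failure.
fn read_mask_with_retry<F>(mut probe: F) -> Option<Vec<c_ulong>>
where
    F: FnMut(&mut [c_ulong]) -> Result<(), i32>,
{
    let word_bits = c_ulong::BITS as usize;
    let mut bits = INITIAL_BITS;
    while bits <= AFFINITY_MAX_BITS {
        let mut buf: Vec<c_ulong> = vec![0; bits / word_bits];
        match probe(&mut buf) {
            Ok(()) => return Some(buf),
            Err(EINVAL) => bits *= 2, // buffer too small: double and retry
            Err(_) => return None,    // EPERM, ESRCH, or anything else
        }
    }
    None // ceiling reached without success
}

fn main() {
    // Fake kernel with nr_cpu_ids = 8192: rejects shorter buffers.
    let nr_cpu_ids = 8192usize;
    let word_bits = c_ulong::BITS as usize;
    let mut calls = 0;
    let mask = read_mask_with_retry(|buf| {
        calls += 1;
        if buf.len() * word_bits >= nr_cpu_ids {
            buf[0] = 0b11; // pretend CPUs 0 and 1 are allowed
            Ok(())
        } else {
            Err(EINVAL)
        }
    });
    println!("calls = {}, succeeded = {}", calls, mask.is_some());
}
```

With these assumed constants the loop tries 1024, 2048, 4096, and so on up to 262144 bits, so a permanently failing EINVAL costs at most nine calls before None is returned.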
§Error-class handling
- EINVAL → buffer too small. Double and retry until the ceiling is reached, then surface None.
- EPERM / ESRCH → real access / process-identity failures. Return None so the caller falls back to the procfs Cpus_allowed_list: path. That field is emitted in /proc/<tid>/status and is governed by procfs DAC (open / directory-traversal permission), not the syscall’s security_task_getscheduler LSM hook, so it can succeed where an active LSM denied the syscall.
- Any other error → return None. The procfs fallback will produce the correct value or its own None.
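The three error classes above could be expressed as a small classifier. The enum and function names below are hypothetical and only illustrate the split; the errno constants are the standard Linux values.

```rust
// Standard Linux errno values.
const EPERM: i32 = 1;
const ESRCH: i32 = 3;
const EINVAL: i32 = 22;

/// Hypothetical classification of a failed sched_getaffinity call.
#[derive(Debug, PartialEq)]
enum ErrorClass {
    /// EINVAL: grow the buffer and retry, up to the ceiling.
    BufferTooSmall,
    /// EPERM / ESRCH: return None; the caller may try procfs.
    AccessOrIdentity,
    /// Anything else: return None as well.
    Other,
}

fn classify(errno: i32) -> ErrorClass {
    match errno {
        EINVAL => ErrorClass::BufferTooSmall,
        EPERM | ESRCH => ErrorClass::AccessOrIdentity,
        _ => ErrorClass::Other,
    }
}

fn main() {
    for errno in [EINVAL, EPERM, ESRCH, 5] {
        println!("errno {} -> {:?}", errno, classify(errno));
    }
}
```

Only BufferTooSmall leads to a retry; the other two classes both end in None, but keeping them distinct makes it possible to log or test them separately.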
The previous implementation did not make this split: it collapsed every error to None indistinguishably, so EINVAL on a >1024-CPU host was treated the same as EPERM. Every caller then had to rely on the procfs fallback for correctness, which made the syscall path effectively useless on the very hosts where affinity data matters most (NUMA boxes with more than 1000 CPUs).
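For reference, the procfs fallback mentioned above reads the Cpus_allowed_list: field of /proc/<tid>/status, which uses a comma-separated list of ids and inclusive ranges (for example 0-3,8). The parser below is a hypothetical sketch of such a fallback, not this crate's code; it takes the file contents as a string so it can be exercised without touching procfs.

```rust
/// Hypothetical parser for the `Cpus_allowed_list:` line of
/// /proc/<tid>/status. Returns None if the field is missing or malformed.
fn parse_cpus_allowed_list(status: &str) -> Option<Vec<u32>> {
    let list = status
        .lines()
        .find_map(|l| l.strip_prefix("Cpus_allowed_list:"))?;
    let mut cpus: Vec<u32> = Vec::new();
    for part in list.trim().split(',').filter(|p| !p.is_empty()) {
        match part.split_once('-') {
            // Inclusive range such as "0-3".
            Some((lo, hi)) => {
                let lo = lo.trim().parse::<u32>().ok()?;
                let hi = hi.trim().parse::<u32>().ok()?;
                if lo > hi {
                    return None;
                }
                cpus.extend(lo..=hi);
            }
            // Single id such as "8".
            None => cpus.push(part.trim().parse::<u32>().ok()?),
        }
    }
    cpus.sort_unstable();
    cpus.dedup();
    Some(cpus)
}

fn main() {
    let status = "Name:\tbash\nCpus_allowed:\tff\nCpus_allowed_list:\t0-3,8,10-11\n";
    println!("{:?}", parse_cpus_allowed_list(status));
}
```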