#[non_exhaustive]
pub struct PidsLimits {
    pub max: Option<u64>,
}
Pids controller limits (pids.max). None is the default
(inherit from parent — typically "max", no ceiling).
Per the kernel’s pids_max_write, existing tasks are NOT killed
when the limit lands below the current task count; only future
fork() / clone() calls are blocked once the cgroup’s task
count meets the limit. Useful for fork-bomb / task-count-ceiling
tests.
§Per-WorkType thread-budget guidance
pids.max counts every task (process AND thread) inside the
cgroup. Sizing the limit below the workload’s natural task
budget produces silent fork failures that surface as
WorkloadConfig-level workers refusing to start.
Most variants spawn exactly one task per worker — their
worker_main dispatch arm neither spawns
helper threads nor forks children. Two exceptions run internal
helper threads inside the worker process: Schbench
(message_threads message threads, each spawning
worker_threads worker threads, plus a control thread) and
Taobench (client_threads client threads + slow_threads
dispatcher threads); their per-worker task counts are
config/CPU-sized, not 1. Per-worker budget therefore depends on
CloneMode (whether each worker
is a process or a thread sharing the parent’s tgid), the
variant’s internal helper-thread topology, and whether the
variant transiently forks short-lived children inside its own
loop. The columns below capture all three:
| Variant | Steady-state tasks | Transient peak |
|---|---|---|
| SpinWait, YieldHeavy, Mixed | 1/worker | — |
| Bursty, IdleChurn | 1/worker | — |
| IoSyncWrite, IoRandRead, IoConvoy | 1/worker | — |
| CachePressure, CacheYield, CachePipe | 1/worker | — |
| PageFaultChurn | 1/worker | — |
| AffinityChurn, PolicyChurn, NiceSweep | 1/worker | — |
| NumaWorkingSetSweep, NumaMigrationChurn, CgroupChurn | 1/worker | — |
| Sequence | 1/worker | — |
| AluHot, SmtSiblingSpin, IpcVariance | 1/worker | — |
| PipeIo, FutexPingPong, AsymmetricWaker, SignalStorm | 1/worker | — |
| FutexFanOut, FanOutCompute | 1/worker | — |
| ThunderingHerd, MutexContention, WakeChain | 1/worker | — |
| PriorityInversion, ProducerConsumerImbalance | 1/worker | — |
| RtStarvation, PreemptStorm, EpollStorm | 1/worker | — |
| CrossAffinityChurn, TimerLatency, NetTraffic, IrqWake | 1/worker | — |
| ForkExit | 1/worker | +1/worker (waitpid’d before next iter) |
| CgroupAttachStorm | 1/worker | +1/worker (forked child per iter, _exits + auto-reaped) |
| Schbench, Taobench | >1/worker (internal helper threads, config/CPU-sized) | — |
| Custom | 1/worker | depends on user closure (see below) |
CloneMode::Fork (the default): each worker is a separate
process placed in the cgroup. For the single-task variants, the
cgroup’s task count for one WorkSpec is exactly num_workers; for
ForkExit the instantaneous peak is 2 × num_workers (each parent
forks one child, waits on it with waitpid, and repeats).
CloneMode::Thread: every worker is a thread sharing the
test runner’s tgid. The pids controller counts each thread as
a task, so the cgroup’s task count for one WorkSpec is
num_workers + 1 (workers + the parent task). ForkExit is
rejected at spawn time under Thread mode (see
WorkType::ForkExit).
Custom: the framework runs the user closure in a single
task per worker (1/worker, identical to every other variant).
Any fork/clone the closure issues inside its loop adds to the
cgroup’s task count for as long as the resulting child lives;
pids.max must reserve headroom equal to the closure’s peak
child count per worker. Under CloneMode::Fork the framework
reaps closure-spawned descendants at teardown via
killpg(worker_pid, SIGKILL) against the worker’s per-process
group, so transient children are bounded by the closure
itself. Under CloneMode::Thread the worker shares the test
runner’s pgid and killpg-based cleanup is unavailable, so
the closure owns whatever helpers it spawns and must reap
them explicitly before returning the
WorkerReport.
Sizing rule: pids.max ≥ Σ(steady-state + transient) for
every WorkSpec in the cgroup,
plus headroom for cgroup.procs migration scratch tasks and
any payload-binary helper processes the test attaches via
CgroupDef::workload (e.g. stress-ng spawns one task per
--cpu N). Tests with composed WorkSpec groups must sum
across every group — the framework does NOT auto-derive a
budget from the work spec.
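Because the framework does not derive the budget, the sizing rule can be worked out by hand or with a small helper. The following is a minimal sketch of that arithmetic, not framework code: SpecBudget, spec_peak, and min_pids_max are hypothetical names, and the per-worker figures are assumed to be read off the table above by the test author.

```rust
/// Hypothetical per-WorkSpec budget description (illustrative only,
/// not part of the framework's API).
#[derive(Clone, Copy)]
pub struct SpecBudget {
    /// Number of workers in the spec.
    pub num_workers: u64,
    /// Steady-state tasks per worker (1 for most variants).
    pub steady_per_worker: u64,
    /// Transient extra tasks per worker (1 for ForkExit, 0 for most).
    pub transient_per_worker: u64,
    /// True for thread mode, where the parent task is counted as well.
    pub thread_mode: bool,
}

/// Peak number of tasks one spec can contribute to the cgroup.
pub fn spec_peak(s: SpecBudget) -> u64 {
    let per_worker = s.steady_per_worker + s.transient_per_worker;
    let parent = if s.thread_mode { 1 } else { 0 };
    s.num_workers * per_worker + parent
}

/// Lower bound for pids.max: the sum of every spec's peak plus
/// caller-chosen headroom (migration scratch tasks, payload helpers).
pub fn min_pids_max(specs: &[SpecBudget], headroom: u64) -> u64 {
    specs.iter().map(|s| spec_peak(*s)).sum::<u64>() + headroom
}
```

For example, a four-worker ForkExit spec under fork mode peaks at 8 tasks, and a four-worker SpinWait spec under thread mode peaks at 5. The sketch adds the parent task once per thread-mode spec, which is conservative if several specs share one parent.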
§Parent-cgroup hierarchical charging
pids.max is a per-cgroup ceiling, but every fork/clone
charges every ancestor up to (but not including) the
unified-hierarchy root. The kernel’s pids_can_fork calls
pids_try_charge, which loops
for (p = pids; parent_pids(p); p = parent_pids(p)) and
charges each level (kernel/cgroup/pids.c) — root is NOT
charged per the loop’s parent_pids(p) termination
condition. EAGAIN propagates from the FIRST level
(leaf-to-root traversal order) whose post-charge counter
exceeds its limit, so a child cgroup with pids.max = 1024
still hits EAGAIN when a parent two levels up sits at its
own ceiling.
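The charge walk can be modeled with a small simulation. This is a simplified sketch, not kernel code: Level and try_charge are hypothetical names, the path slice is assumed to be ordered leaf first with the root excluded, and the rollback on failure is an assumption about how the kernel undoes a partially applied charge.

```rust
/// One pids-controlled cgroup on the path from the leaf towards the root.
/// Hypothetical model type; `limit: None` stands for "max" (unlimited).
pub struct Level {
    pub count: u64,
    pub limit: Option<u64>,
}

/// Charge `num` new tasks to every level, leaf first.
/// Returns Err(index) for the first level whose post-charge count
/// exceeds its limit; in that case all counts are restored.
pub fn try_charge(path: &mut [Level], num: u64) -> Result<(), usize> {
    for i in 0..path.len() {
        path[i].count += num;
        let over = match path[i].limit {
            Some(limit) => path[i].count > limit,
            None => false,
        };
        if over {
            // Undo the charge on every level touched so far, including this one.
            for lvl in &mut path[..=i] {
                lvl.count -= num;
            }
            return Err(i);
        }
    }
    Ok(())
}
```

With a leaf at 10 of 1024, an unlimited middle level, and an ancestor at 100 of 100, try_charge returns Err(2): the fork fails even though the leaf has ample room.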
Sizing rule for nested test trees: the effective limit is
min(pids.max) along the path from the test cgroup up to the
pids-controlled root, NOT just the value set on the test
cgroup itself. When ktstr runs under a delegated parent slice
(systemd user.slice, container runtime cgroup, ktstr’s own
build sandbox), inspect the parent’s pids.max before sizing
the test cgroup — a generous test-cgroup setting is silently
shadowed by a tighter ancestor.
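One way to check for a shadowing ancestor is to read pids.max at each level of the path and take the minimum. The sketch below covers only the parsing and the minimum, assuming the caller has already read each file's contents (for example from the cgroup's directory under /sys/fs/cgroup); parse_pids_max and effective_limit are hypothetical helper names, not framework functions.

```rust
use std::num::ParseIntError;

/// Parse the contents of a pids.max file.
/// "max" means unlimited and maps to None; anything else must be a decimal count.
pub fn parse_pids_max(raw: &str) -> Result<Option<u64>, ParseIntError> {
    let s = raw.trim();
    if s == "max" {
        Ok(None)
    } else {
        s.parse::<u64>().map(Some)
    }
}

/// Smallest finite limit along the path, or None if every level is unlimited.
pub fn effective_limit(levels: &[Option<u64>]) -> Option<u64> {
    levels.iter().flatten().copied().min()
}
```

A test cgroup set to 1024 under an ancestor set to 256 yields an effective limit of 256. The remaining headroom also depends on how many tasks each ancestor already holds, which this sketch does not model.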
§pids.max(0) is rejected at apply_setup, not type-level
Some(0) would silently halt every fork/clone inside the
cgroup, including the worker spawn itself for CloneMode::Fork
and the ForkExit per-iteration child fork. The kernel accepts
the value (it’s a legitimate pids_max_write input), so
apply_setup adds the bail at scenario-setup time; promoting
it to a type-level invariant (e.g. NonZeroU64) would force
every numeric literal through a non-const constructor and
ripple into every test fixture. The runtime bail keeps the
surface ergonomic while still surfacing the foot-cannon at
scenario-setup time (before any worker spawns).
Set via CgroupDef::pids_max or
CgroupDef::pids_unlimited. Construct directly only when
copying a PidsLimits across CgroupDefs — the builder
methods are the preferred entry point because they route the
per-knob value through the framework’s validation seam at
apply_setup.
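Fields (Non-exhaustive)§
This struct is marked as non-exhaustive. Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
max: Option<u64>
pids.max task-count ceiling. None writes the literal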
string "max" (the kernel’s PIDS_MAX_STR sentinel for
unlimited). Some(n) writes the decimal n. The kernel
rejects negative or >= PIDS_MAX (PID_MAX_LIMIT + 1, typically ~4M on 64-bit) values with
EINVAL; the framework’s apply_setup rejects Some(0)
before the syscall (a 0 limit silently halts every fork
or clone inside the cgroup, blocking both worker spawn
under CloneMode::Fork and ForkExit’s per-iteration
child fork).
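The mapping from the field value to the written string, and the documented Some(0) rejection, can be sketched as follows. Both functions are hypothetical illustrations of the behavior described above; the framework's actual apply_setup code and its error type are not shown in this documentation.

```rust
/// String that would be written to pids.max for a given field value:
/// None becomes the literal "max", Some(n) becomes decimal n.
pub fn render_pids_max(max: Option<u64>) -> String {
    match max {
        None => "max".to_string(),
        Some(n) => n.to_string(),
    }
}

/// Hypothetical pre-write check mirroring the documented rejection of Some(0).
/// The String error is a placeholder for whatever error type the framework uses.
pub fn check_pids_max(max: Option<u64>) -> Result<(), String> {
    match max {
        Some(0) => Err("pids.max = 0 would block every fork/clone in the cgroup".to_string()),
        _ => Ok(()),
    }
}
```

Running the check before rendering keeps a zero limit from ever reaching the kernel, which would otherwise accept it.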
Trait Implementations§
impl Clone for PidsLimits
fn clone(&self) -> PidsLimits
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for PidsLimits
impl Default for PidsLimits
fn default() -> PidsLimits
impl PartialEq for PidsLimits
impl Eq for PidsLimits
impl StructuralPartialEq for PidsLimits
Auto Trait Implementations§
impl Freeze for PidsLimits
impl RefUnwindSafe for PidsLimits
impl Send for PidsLimits
impl Sync for PidsLimits
impl Unpin for PidsLimits
impl UnwindSafe for PidsLimits
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.