Type-safe wrappers for per-thread metric values.
Each registered metric in crate::ctprof_compare::CTPROF_METRICS
has a kernel-source-grounded semantic category — counter,
cumulative-time, peak high-water (ns and bytes), instantaneous
gauge, byte count, ordinal scalar, categorical, or cpuset. The
aggregation
pipeline reduces values per category: counters sum, peaks take
max, gauges take max, ordinals carry a [min, max] range,
categoricals carry the mode (most-frequent value), and cpusets
carry an affinity summary.
§Temporal window
Every counter / cumulative-time / peak / byte-count newtype
defined here represents a value that the kernel accumulates
across the THREAD LIFETIME — from thread birth to the
moment of the procfs read. All of these fields share the
same window because they live in the same task_struct and
tick along with the same task. That shared window is what
makes ratios across fields well-defined (e.g.
cpu_efficiency = run_time_ns / (run_time_ns + wait_time_ns)
is a meaningful fraction because both numerator and
denominator measure the same task’s same lifetime).
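As a minimal sketch of the ratio above, assuming both inputs are lifetime totals in nanoseconds read for the same thread (the function name and the `Option` return for a thread that has neither run nor waited are illustrative choices, not this crate's API):

```rust
/// Fraction of on-CPU time over on-CPU plus runqueue-wait time.
/// Both inputs are assumed to be lifetime totals in nanoseconds for
/// the same thread, so they share one temporal window.
fn cpu_efficiency(run_time_ns: u64, wait_time_ns: u64) -> Option<f64> {
    // Widen to u128 so the denominator cannot overflow.
    let total = run_time_ns as u128 + wait_time_ns as u128;
    if total == 0 {
        // Neither ran nor waited: the fraction is undefined.
        return None;
    }
    Some(run_time_ns as f64 / total as f64)
}
```

A thread with 750 ns of run time and 250 ns of wait time yields `Some(0.75)`.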
Cross-file read skew during one capture pass is negligible for
long-lived threads. The capture pipeline reads /proc/<tid>/stat,
then /sched, then /io, etc., with a few hundred microseconds of
drift between reads. Against cumulative-from-birth totals that
grow over hours or days of thread runtime, the small in-flight
delta during the read is rounding noise. This argument assumes
the accumulator has had time to integrate: threads captured very
early in their lifetime carry larger relative read-skew error,
but their absolute contribution to any group aggregate is
correspondingly small (a thread alive for 500 µs cannot
meaningfully move a group total even if its individual reads are
skewed by 100 µs).
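The skew argument can be checked with a little arithmetic. The helper below is a hypothetical illustration, not part of this module: it divides an assumed read skew by a thread's lifetime, both in nanoseconds.

```rust
/// Relative error that a given read skew introduces against a
/// lifetime total. Both arguments are in nanoseconds.
/// Returns None for a zero lifetime, where the ratio is undefined.
fn relative_skew(skew_ns: u64, lifetime_ns: u64) -> Option<f64> {
    if lifetime_ns == 0 {
        return None;
    }
    Some(skew_ns as f64 / lifetime_ns as f64)
}
```

Taking 300 µs as a stand-in for "a few hundred microseconds", a thread that has lived for one hour sees a relative error below 1e-7. The 500 µs thread with 100 µs of skew sees a relative error of 0.2, but its absolute error is still capped at 100 µs, which is why it cannot distort a group total.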
crate::ctprof_compare runs in two modes that both
preserve the shared-window property: SHOW renders one
snapshot’s lifetime totals; COMPARE subtracts two snapshots
captured at different wall-clock instants to scope the values
to the (capture-A, capture-B) interval. In both modes every
field carries the same temporal window, so cross-field ratios
and per-thread totals stay well-defined.
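A minimal sketch of the COMPARE subtraction, using a stand-in `MonotonicNs` defined locally. The `delta_since` method and its `None` result for a counter that appears to go backwards are assumptions made for illustration; the real comparison code is not shown in this documentation.

```rust
/// Local stand-in for a cumulative-time newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MonotonicNs(u64);

impl MonotonicNs {
    /// Value accumulated between an earlier capture and this one.
    /// Returns None if the later reading is smaller than the earlier
    /// one, which this sketch treats as invalid input.
    fn delta_since(self, earlier: MonotonicNs) -> Option<MonotonicNs> {
        self.0.checked_sub(earlier.0).map(MonotonicNs)
    }
}
```

Because every cumulative field is subtracted over the same (capture-A, capture-B) pair, the deltas share one interval and ratios between them remain well-defined.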
Two newtypes break this convention deliberately: GaugeNs
(an instantaneous reading such as the scheduler’s current
slice) and GaugeCount (a current count such as
signal_struct->nr_threads). Neither accumulates from thread
birth, so the per-newtype docs call out the gauge family
separately.
§Type-system enforcement
Encoding the category into the type system surfaces
category-mismatched aggregation as a compile error. The
crate::ctprof_compare::AggRule dispatch routes each
variant through the typed newtype’s reduction trait: Sum*
through Summable::sum_across, Max* through
Maxable::max_across, Range* through
Rangeable::range_across, and Mode* through
Modeable::mode_across. A registry entry that pairs a
peak field with a sum reduction (e.g. t.wait_max, a
PeakNs, bound to a Sum* rule whose accessor must return a
Summable value) therefore fails to compile rather than
producing a meaningless sum in which one thread with a 1 s
peak is indistinguishable from 1000 threads with 1 ms peaks.
This module defines the newtypes and traits the dispatch consumes.
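The compile-time rejection can be sketched with the usual sealed-trait pattern. Everything below is a simplified stand-in: the type and trait names mirror this module, but the bodies, the saturating addition, and the tuple-struct layout are assumptions made for illustration.

```rust
mod sealed {
    // Public trait in a private module: code outside this crate
    // cannot name it, so it cannot add new Summable implementors.
    pub trait SummableSealed {}
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MonotonicCount(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PeakNs(pub u64);

pub trait Summable: sealed::SummableSealed + Sized {
    fn sum_across<I: IntoIterator<Item = Self>>(iter: I) -> Self;
}

impl sealed::SummableSealed for MonotonicCount {}

impl Summable for MonotonicCount {
    fn sum_across<I: IntoIterator<Item = Self>>(iter: I) -> Self {
        // Saturating addition is a choice made for this sketch only.
        MonotonicCount(iter.into_iter().map(|c| c.0).fold(0u64, u64::saturating_add))
    }
}

// PeakNs has no Summable impl, so uncommenting the next line inside a
// function would be a compile error (no `sum_across` found for PeakNs):
// let bad = PeakNs::sum_across(vec![PeakNs(1_000_000_000)]);
```

The design point is that the mistake is caught where the registry entry is written, not when a report is rendered.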
§The newtypes
- MonotonicCount: pure counter (only ever goes up across a thread’s lifetime). Examples: nr_wakeups, nr_migrations, voluntary_csw.
- DeadCounter: same wire shape as MonotonicCount but tagged for kernel counters whose update path is permanently dead (the field exists in task_struct but no kernel writer touches it on any current code path; nr_wakeups_idle, nr_migrations_cold, nr_wakeups_passive all match this shape today). Captured for parity with /proc/<tid>/sched line numbers but does NOT implement any reduction trait (Summable / Maxable / Rangeable / Modeable): the value is structurally zero, so every reduction is trivially zero, and rendering it through any of the live reductions implies “we measured a thing” when in fact we measured a kernel-side dead pointer. The registry-level accommodation (a no-op aggregation arm or registry removal) is the migration batch’s problem; this newtype’s job is to make the dead-counter status visible at the field declaration so the migration can’t accidentally pair it with a Summable-bound AggRule variant.
- MonotonicNs: cumulative-time counter, ns. Examples: run_time_ns, wait_sum, voluntary_sleep_ns, block_sum, iowait_sum, core_forceidle_sum.
- PeakNs: lifetime high-water mark, ns. The kernel updates these via if (delta > stat->max) stat->max = delta inside update_stats_* wrappers (kernel/sched/stats.c) and inline schedstat updates in kernel/sched/fair.c (e.g. slice_max in set_next_entity, exec_max in update_se). Summing peaks is a category error: 1 thread × 1s peak carries different meaning than 1000 threads × 1ms peak. Examples: wait_max, sleep_max, block_max, exec_max, slice_max.
- PeakBytes: lifetime high-water mark, bytes (per-process hiwater_rss / hiwater_vm from struct taskstats via the genetlink path). Same Maxable-only contract as PeakNs but Bytes-typed, so it renders on the IEC byte ladder (B → KiB → MiB → GiB → TiB) instead of the ns ladder.
- GaugeNs: instantaneous gauge sampled at capture time, ns. fair_slice_ns is the canonical example. Summing gauges is a category error: N nearly-identical instantaneous samples sum to N×gauge with no physical meaning.
- GaugeCount: gauge-family unitless count (u64) that can go up AND down at runtime. Carries the same Maxable-only contract as GaugeNs but renders as a plain count rather than a nanosecond ladder. nr_threads (the process-wide thread count from signal_struct->nr_threads) is the canonical example: threads spawn and exit so the value is not monotonic, and the registry reduces it by Max across a group rather than Sum. Distinct from GaugeNs because “thread count” and “current slice in nanoseconds” do not share a unit; routing nr_threads through GaugeNs would render it on the ns auto-scale ladder, which is a unit lie.
- ClockTicks: USER_HZ-scaled time. Examples: utime_clock_ticks, stime_clock_ticks. Auto-scale ladder is ticks → Kticks → Mticks (decimal SI), distinct from ns (also decimal SI, different unit) and bytes (IEC binary).
- Bytes: byte counts. Examples: allocated_bytes, read_bytes, wchar. Auto-scale ladder is IEC binary (B → KiB → MiB → GiB → TiB).
- OrdinalI32 / OrdinalU32 / OrdinalU64: bounded scalar, range-aggregated (no sum). OrdinalI32 examples: nice ([-20, 19]), priority (CFS=[0, 39], RT=[-2, -100], DL=-101), processor (last CPU the task ran on; signed for symmetry with nice: the kernel’s task_cpu() returns unsigned int (include/linux/sched.h), but ktstr stores i32 to share the OrdinalI32 wrapper with the genuinely-signed nice and priority fields). OrdinalU32 is for u32-backed ordinal fields like rt_priority (real-time priority, 0..99 in practice for SCHED_FIFO / SCHED_RR; the kernel declares unsigned int task_struct::rt_priority in include/linux/sched.h, so a u32 matches the kernel field width exactly). OrdinalU64 is reserved for future ordinal metrics whose kernel-side type genuinely exceeds u32::MAX; no field uses it today.
- CategoricalString: string-valued, mode-aggregated. policy is the only example. The state char and ext_enabled bool fields stay unwrapped on crate::ctprof::ThreadState; the crate::ctprof_compare::AggRule::ModeChar and crate::ctprof_compare::AggRule::ModeBool accessors coerce them through String via to_string() at the call site. If a second bool field appears, promote both to a dedicated CategoricalBool wrapper rather than continuing the ad-hoc coercion.
- CpuSet: Vec<u32> of CPU IDs, affinity-aggregated. cpu_affinity is the only example.
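To make the ladder distinction concrete, here is a hedged sketch of an IEC binary auto-scaler. The function name, the two-decimal precision, and the exact output format are assumptions; the crate's real renderer is not shown in this documentation.

```rust
/// Render a byte count on the IEC binary ladder (B, KiB, MiB, GiB, TiB).
/// Each step divides by 1024; the ladder stops at TiB.
fn render_iec(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut idx = 0;
    while value >= 1024.0 && idx < UNITS.len() - 1 {
        value /= 1024.0;
        idx += 1;
    }
    if idx == 0 {
        // Plain bytes are printed as an exact integer.
        format!("{} B", bytes)
    } else {
        format!("{:.2} {}", value, UNITS[idx])
    }
}
```

Under these assumptions `render_iec(1536)` produces `"1.50 KiB"`. A nanosecond or tick ladder would use the same loop with a divisor of 1000 and its own unit suffixes, which is why routing a value through the wrong newtype yields a wrong suffix.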
§The marker traits
- Summable: sum across a group. Implemented by the four counter newtypes (MonotonicCount, MonotonicNs, ClockTicks, Bytes). NOT implemented by PeakNs / GaugeNs / GaugeCount / OrdinalI32 / OrdinalU32 / OrdinalU64 / CategoricalString / CpuSet. The trait is sealed via sealed::SummableSealed so a downstream crate cannot add impl Summable for PeakNs to bypass the category invariant.
- Maxable: reduce by max. Implemented by PeakNs (max-of-peak is “worst peak any contributor saw across its lifetime”), GaugeNs (max-of-gauge is “longest current slice in the bucket”), and GaugeCount (max-of-count is “biggest current count any contributor carried”). NOT implemented by Summable cumulative counters (MonotonicCount / MonotonicNs / ClockTicks / Bytes): max-across-snapshots on a lifetime accumulator reduces to “the last snapshot’s value”, which is mostly noise relative to the lifetime-integrated quantity it reports. NOT implemented by ordinals (those carry a [min, max] range, not a single max), nor by CategoricalString (string max has no useful semantic), nor by CpuSet (the affinity reduction is a custom summary, not a bare max). Sealed via sealed::MaxableSealed. max_across returns Option<Self>: None for an empty iterator (so callers can distinguish “no contributors” from “all contributors had zero”), Some(largest) otherwise. The parallel Summable::try_sum_across returns Option<Self> with the same empty-iterator semantics. The try_ prefix (rather than checked_) avoids colliding with the stdlib’s overflow-detection naming convention: this is an empty-iterator check, not an arithmetic check.
- Modeable: reduce by mode (most-frequent value). Implemented by CategoricalString only. Sealed via sealed::ModeableSealed.
- Rangeable: reduce by [min, max]. Implemented by OrdinalI32, OrdinalU32, and OrdinalU64. Sealed via sealed::RangeableSealed. range_across returns Option<Range<Self>>; the Range newtype enforces min ≤ max at construction so a downstream consumer cannot observe a swapped pair.
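The empty-iterator and ordering guarantees described for max and range reductions can be sketched as follows. This uses free functions and local stand-in types rather than the sealed traits, and it assumes `Range` orders its two arguments on construction; whether the real constructor swaps or rejects a reversed pair is not stated here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct PeakNs(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct OrdinalI32(i32);

/// Inclusive interval; the only constructor orders its arguments,
/// so min <= max always holds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Range<T> {
    min: T,
    max: T,
}

impl<T: Ord + Copy> Range<T> {
    fn new(a: T, b: T) -> Self {
        if a <= b { Range { min: a, max: b } } else { Range { min: b, max: a } }
    }
}

/// None for an empty group, Some(largest) otherwise.
fn max_across<I: IntoIterator<Item = PeakNs>>(iter: I) -> Option<PeakNs> {
    iter.into_iter().max()
}

/// None for an empty group, Some([min, max]) otherwise.
fn range_across<I: IntoIterator<Item = OrdinalI32>>(iter: I) -> Option<Range<OrdinalI32>> {
    let mut it = iter.into_iter();
    let first = it.next()?;
    let (lo, hi) = it.fold((first, first), |(lo, hi), v| (lo.min(v), hi.max(v)));
    Some(Range::new(lo, hi))
}
```

With this shape, an empty group yields `None` while a group of all-zero peaks yields `Some(PeakNs(0))`, so a caller can tell the two cases apart.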
Reductions are exposed as trait methods on
Summable / Maxable / Rangeable / Modeable.
Callers must import the relevant trait (or use ktstr::metric_types::*;) to call T::sum_across(...) /
T::max_across(...) / T::range_across(...) /
T::mode_across(...). The traits double as compile-time
markers — a generic site that wants “any summable type” can
take T: Summable and statically reject PeakNs.
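Of the four reductions, mode is the only one that needs a tie-breaking rule; the CategoricalString documentation states that ties break alphabetically. The sketch below is one way to get that behavior, written as a free function over plain `String` values. The `Option` return for an empty group, the use of byte-wise string ordering for "alphabetically", and the policy names in the usage note are all assumptions.

```rust
use std::collections::BTreeMap;

/// Most frequent value in the group; ties go to the value that
/// sorts first. Returns None for an empty group.
fn mode_across<I: IntoIterator<Item = String>>(iter: I) -> Option<String> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for v in iter {
        *counts.entry(v).or_insert(0) += 1;
    }
    // BTreeMap yields keys in ascending order. Replacing the current
    // best only on a strictly larger count lets the first key win ties.
    let mut best: Option<(String, usize)> = None;
    for (value, n) in counts {
        let is_better = match &best {
            Some((_, best_n)) => n > *best_n,
            None => true,
        };
        if is_better {
            best = Some((value, n));
        }
    }
    best.map(|(value, _)| value)
}
```

For a group holding one "SCHED_RR" and one "SCHED_FIFO" value, this returns "SCHED_FIFO" because it sorts first.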
§Wire-format compatibility
Every wrapper carries #[serde(transparent)] so the JSON
representation matches the unwrapped primitive. The
crate::ctprof::ThreadState migration to these
newtypes preserves wire format — existing
snapshot files (.ctprof.zst) deserialize unchanged.
§What this module is NOT
- It is NOT a unit-of-measure system. There is no MonotonicNs * MonotonicNs = MonotonicNs²; these wrappers carry semantic category, not algebraic dimensionality.
- It is NOT a runtime-typed value enum (that lives next to the crate::ctprof_compare::AggRule dispatch). This module only defines the building-block newtypes.
Structs§
- Bytes: Byte count, IEC-binary auto-scaled (B → KiB → MiB → GiB → TiB). Accumulated by the kernel (or jemalloc, for the per-thread TSD allocator counters) from thread birth.
- CategoricalString: Categorical string-valued field. Group reduction takes the mode (most-frequent value); ties break alphabetically per the existing aggregate(AggRule::Mode, ...) rule.
- ClockTicks: USER_HZ-scaled tick counter, accumulated by the kernel from thread birth. The kernel exposes user-mode and kernel-mode CPU time, plus delayacct blkio delay, in ticks of the userspace-visible USER_HZ frequency. Auto-scale ladder is ticks → Kticks → Mticks (decimal SI), kept distinct from ns and bytes so the rendered cell carries the correct unit suffix.
- CpuSet: CPU affinity set. Group reduction produces a crate::ctprof_compare::AffinitySummary carrying the num_cpus range plus a uniform-cpuset flag.
- DeadCounter: Kernel counter whose update path is permanently dead. The field exists in task_struct (and is exposed via /proc/<tid>/sched) but no kernel writer touches it on any current code path.
- GaugeCount: Gauge-family unitless count (u64). Distinct from MonotonicCount: a MonotonicCount only ever goes UP over a thread’s lifetime (integrated from birth), while a GaugeCount is sampled at capture time and can go up AND down at runtime as the underlying state changes. Distinct from GaugeNs: same Maxable-only contract, but renders as a unitless count rather than a nanosecond ladder.
- GaugeNs: Instantaneous gauge sampled at capture time, nanoseconds. Distinct from PeakNs: a gauge is a snapshot of the CURRENT value of a kernel field, not a lifetime maximum. fair_slice_ns reads the per-thread slice line from /proc/<tid>/sched, which carries the scheduler’s current timeslice for the task: a point-in-time reading, not a thread-lifetime accumulator. Cross-field ratios with MonotonicNs / MonotonicCount / etc. produce a quantity with mixed temporal interpretation (numerator integrates from thread birth, denominator samples the present), so callers should treat such ratios as a rough hint rather than a well-defined fraction.
- MonotonicCount: Pure monotonic counter that only ever goes up over a thread’s lifetime, accumulated by the kernel from thread birth to the moment of the procfs read. Sum across a group; delta across snapshots scopes the value to the inter-capture interval.
- MonotonicNs: Cumulative-time counter, nanoseconds, accumulated by the kernel from thread birth. Same temporal-window shape as MonotonicCount but tagged for the ns auto-scale ladder (ns → µs → ms → s).
- OrdinalI32: Bounded ordinal scalar (i32). Range-aggregated across a group: the cell carries the observed [min, max] interval, not a sum. Sum is meaningless for ordinals: adding two nice values doesn’t produce a third nice value.
- OrdinalU32: Bounded ordinal scalar (u32). Same range-aggregation contract as OrdinalI32 but for unsigned 32-bit fields.
- OrdinalU64: Bounded ordinal scalar (u64). Same range-aggregation contract as OrdinalI32 but for unsigned 64-bit fields. No registry metric uses this width today; reserved for future ordinal metrics whose kernel-side type genuinely exceeds u32::MAX.
- PeakBytes: Lifetime high-water mark, bytes. Same Maxable-only contract as PeakNs but Bytes-typed so the renderer routes through the IEC binary auto-scale ladder (B → KiB → MiB → GiB → TiB) instead of the ns ladder.
- PeakNs: Lifetime high-water mark, nanoseconds. The kernel updates these as a max-against-prior in update_stats_* / update_se / set_next_entity paths (kernel/sched/stats.c, kernel/sched/fair.c); the value at any procfs read is the largest single window the thread has accumulated since its birth. Group reduction takes max across contributors so the rendered cell surfaces the worst single window any thread experienced over its lifetime.
- Range: Inclusive [min, max] interval over a Rangeable type.
Traits§
- Maxable: Marker for newtypes that can be reduced by max across a group.
- Modeable: Marker for newtypes reduced by mode (most-frequent value). Implemented by CategoricalString.
- Rangeable: Marker for newtypes reduced by [min, max] range. Implemented by OrdinalI32, OrdinalU32, and OrdinalU64.
- Summable: Marker for newtypes that can be summed across a group.