Group, aggregate, and render the comparison between two
CtprofSnapshots.
Design summary: the per-thread profiler emits
one snapshot per run. Comparison proceeds in three steps.
First, it groups threads within each snapshot by a single
axis (pcomm, cgroup, comm, or comm-exact), or by all
pattern-aware axes at once (GroupBy::All); see GroupBy.
Second, it aggregates every metric per the rule on its
CtprofMetricDef. Third, it matches groups across the two
snapshots and emits one row per (group, metric) pair.
Groups present on only one side surface as unmatched
entries rather than imaginary zero-valued rows: a row is
missing because the process did not exist, not because it
did zero work.
No judgment labels. The comparison prints raw numbers and
percent delta; interpretation (regression vs improvement) is
scheduler-specific and left to the user. This deliberately
diverges from the gauntlet stats comparison in crate::stats,
which DOES classify each metric (Finding::kind,
CompareReport::{regressions, improvements, unchanged}):
ctprof_compare emits no verdict.
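The matching behaviour in the design summary can be illustrated with a minimal sketch. It assumes each snapshot has already been reduced to one aggregated value per group for a single metric. The names `Row`, `Matched`, and `match_groups` are hypothetical and are not this module's API. Returning no percentage for a zero baseline is also an assumption; the text above does not say how that case is handled.

```rust
use std::collections::BTreeMap;

/// One matched row: a group present in both snapshots (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Row {
    pub group: String,
    pub baseline: u64,
    pub candidate: u64,
    /// Signed raw delta, candidate minus baseline.
    pub delta: i128,
    /// Percent change relative to baseline. `None` when baseline is zero
    /// (an assumption made for this sketch).
    pub delta_pct: Option<f64>,
}

/// Result of matching one metric across two snapshots (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct Matched {
    pub rows: Vec<Row>,
    pub only_baseline: Vec<String>,
    pub only_candidate: Vec<String>,
}

/// Match groups by key. Groups seen on one side only are reported as
/// unmatched instead of being paired with a made-up zero value.
pub fn match_groups(
    baseline: &BTreeMap<String, u64>,
    candidate: &BTreeMap<String, u64>,
) -> Matched {
    let mut out = Matched::default();
    for (group, &b) in baseline {
        match candidate.get(group) {
            Some(&c) => {
                let delta = i128::from(c) - i128::from(b);
                let delta_pct = if b == 0 {
                    None
                } else {
                    Some(delta as f64 / b as f64 * 100.0)
                };
                out.rows.push(Row {
                    group: group.clone(),
                    baseline: b,
                    candidate: c,
                    delta,
                    delta_pct,
                });
            }
            None => out.only_baseline.push(group.clone()),
        }
    }
    for group in candidate.keys() {
        if !baseline.contains_key(group) {
            out.only_candidate.push(group.clone());
        }
    }
    out
}
```

The sketch attaches no verdict to a row: it carries the raw values, the signed delta, and the percent delta, and leaves interpretation to the reader.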
Structs
- AffinitySummary - CPU-affinity aggregation result.
- CompareOptions - Options controlling compare.
- CtprofCompareArgs - Arguments for the ktstr ctprof compare subcommand.
- CtprofDiff - Full comparison result.
- CtprofMetricDef - One metric exposed by the comparison pipeline.
- DerivedMetricDef - Definition of a derived metric: a function that consumes the already-aggregated input metrics for a group and produces a single scalar (with its own unit and operator-facing description).
- DerivedRow - One row in the derived-metrics table: (matched group, derivation) with the computed scalar from both sides.
- DiffRow - One row in the comparison table: (group, metric) pair with aggregated values from both sides.
- DisplayOptions - Aggregate display options for the renderer. Plumbed as a single struct through write_diff so a future addition lands in one place without growing every signature. The show-side entry (write_show in src/bin/ktstr.rs) keeps a flatter signature for historical reasons but mirrors the same field semantics: --wrap, --sections, and --metrics reach show via wrap / sections / metrics parameters that share the same helpers (new_wrapped_table, Section::cli_name).
- FudgedPair - A pair of cgroup groups fudged together by thread population overlap. Fudging joins a baseline cgroup to a candidate cgroup when their per-cgroup thread-type sets share enough population (Jaccard similarity ≥ 0.90), so a renamed-but-otherwise-identical scope under a shifted path is rejoined for diffing instead of surfacing as separate orphans.
- GroupByOrDefault - Newtype wrapper around GroupBy that defaults to GroupBy::Pcomm. Separate type so CompareOptions::default() does not need to spell out every field.
- SortKey - One key in a multi-key --sort-by spec. Names a metric from CTPROF_METRICS or CTPROF_DERIVED_METRICS and the sort direction for that key. Direction defaults to descending (largest delta first) so the common operator request, “show me the biggest regressions first”, is the unmarked form.
- ThreadGroup - Aggregated metrics for every thread matched by one group key.
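The Jaccard threshold mentioned under FudgedPair can be sketched as follows. This is a plain set-based illustration: the names `jaccard`, `should_fudge`, and `FUDGE_THRESHOLD` are hypothetical, and the real implementation may weight thread types by population rather than treating them as simple set members. Returning 0.0 for two empty sets is an assumption.

```rust
use std::collections::BTreeSet;

/// Threshold quoted in the FudgedPair description.
pub const FUDGE_THRESHOLD: f64 = 0.90;

/// Jaccard similarity |A ∩ B| / |A ∪ B| over two sets.
/// Two empty sets yield 0.0 here (an assumption, so that empty
/// groups never pair with each other).
pub fn jaccard<T: Ord>(a: &BTreeSet<T>, b: &BTreeSet<T>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    let intersection = a.intersection(b).count();
    intersection as f64 / union as f64
}

/// True when two groups overlap enough to be joined for diffing.
pub fn should_fudge<T: Ord>(a: &BTreeSet<T>, b: &BTreeSet<T>) -> bool {
    jaccard(a, b) >= FUDGE_THRESHOLD
}
```

With 20 thread types on one side and 19 of the same types on the other, the similarity is 19/20 = 0.95 and the pair would be joined; two groups sharing 15 of 25 distinct types (0.60) would stay separate.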
Enums
- AggRule - Aggregation rule for a single metric.
- Aggregated - Aggregated metric value for a single super::ThreadGroup.
- Column - One column slot in the rendered diff/show table. The renderer iterates the resolved Column vec to build both the header row and each data row, dispatching cell construction per variant. Order in the slice is the rendered order; the renderer never re-sorts.
- DerivedValue - Output value of a derived metric.
- DisplayFormat - Per-row display layout for write_diff.
- GroupBy - Grouping key for ctprof compare.
- ScaleLadder - Closed enumeration of auto-scale ladders driving format dispatch.
- Section - One sub-table emitted by write_diff / write_show. --sections filters which sub-tables render: every section not in the filter is suppressed before its emission gate (zero-suppression, group-by-cgroup gating, etc.) runs, so a section that would otherwise emit when its data is present stays silent when omitted from the filter.
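The ordering described for Section (filter first, emission gate second) might look like the sketch below. `DemoSection`, `section_enabled`, and `should_emit` are hypothetical stand-ins; the real Section variants are not listed on this page, and the `has_data` flag is a simplified placeholder for whatever emission gate a section actually has. Treating an empty filter as "every section renders" follows the parse_sections description.

```rust
/// Hypothetical stand-in for the real `Section` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DemoSection {
    Primary,
    Derived,
    CgroupStats,
}

/// An empty filter means every section is enabled.
pub fn section_enabled(filter: &[DemoSection], section: DemoSection) -> bool {
    filter.is_empty() || filter.contains(&section)
}

/// The filter check runs before the emission gate, so a filtered-out
/// section stays silent even when it has data to show.
pub fn should_emit(filter: &[DemoSection], section: DemoSection, has_data: bool) -> bool {
    if !section_enabled(filter, section) {
        return false;
    }
    // Placeholder for the per-section gate (zero-suppression and so on).
    has_data
}
```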
Statics
- CTPROF_DERIVED_METRICS - Registry of derived metrics. Each entry consumes one or more already-aggregated input metrics from CTPROF_METRICS and produces a single scalar with its own unit. See the per-entry doc strings for the formula and kernel-source rationale.
- CTPROF_METRICS - Registry of per-thread metrics. Order here is the default display order for rows that have no numeric delta to sort by (ties fall back to registry order). Names are the ASCII short-form used in capture code; long-form display is the same, with no translation layer.
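The registry-order fallback noted for CTPROF_METRICS can be pictured with a small sketch. Everything here is hypothetical: `DEMO_REGISTRY` and its metric names are invented, rows are reduced to a name and an optional signed delta, and placing rows without a numeric delta after the numeric ones is an assumption rather than documented behaviour.

```rust
use std::cmp::Reverse;

/// Invented stand-in for the registry; real entries carry more fields.
pub const DEMO_REGISTRY: [&str; 4] = ["metric_a", "metric_b", "metric_c", "metric_d"];

/// Position of a metric in the demo registry; unknown names sort last.
fn registry_index(name: &str) -> usize {
    DEMO_REGISTRY
        .iter()
        .position(|n| *n == name)
        .unwrap_or(usize::MAX)
}

/// Order rows by delta, largest first. Rows with equal deltas, and rows
/// with no numeric delta at all, fall back to registry order.
pub fn order_rows(mut rows: Vec<(&'static str, Option<i64>)>) -> Vec<&'static str> {
    // `None < Some(_)`, so reversing puts numeric deltas first (descending)
    // and delta-less rows last; the registry index breaks ties.
    rows.sort_by_key(|&(name, delta)| (Reverse(delta), registry_index(name)));
    rows.into_iter().map(|(name, _)| name).collect()
}
```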
Functions
- aggregate
- build_cgroup_key_map - Build the post-flatten-path → final-tightened-key map for GroupBy::Cgroup under auto-normalization. Walks the union of paths from both snapshots’ threads and cgroup_stats so that Layer 3 (tighten) sees every contributor to a given Layer-2 skeleton group. Returns the map keyed by post-flatten path; consumers (build_groups, flatten_cgroup_stats) look up the final key for any path they see.
- build_groups
- cgroup_cell - Render a (baseline, candidate, delta) cell for the cgroup-enrichment secondary table emitted under super::GroupBy::Cgroup. The ladder parameter routes each scalar through auto_scale (private to this module) so a 7.5 GiB memory_current row reads 7.500GiB → 8.250GiB (+768.000MiB) instead of 8053063680 → 8858370048 (+805306368). Each cell scales independently: baseline, candidate, and delta may pick different prefixes when their magnitudes cross thresholds.
- cgroup_limits_cell - Render a baseline → candidate cell for cpu.max (quota, period) pairs. When both pairs are equal, renders once via format_cpu_max; otherwise renders as <a> → <b>. Mirrors cgroup_optional_limit_cell’s equality-collapse policy.
- cgroup_optional_limit_cell - Render a baseline → candidate cell for an Option<u64> limit (e.g. memory.max, memory.high, pids.max). None reads as max (no limit) per format_optional_limit; a step from concrete to max between snapshots renders as <value> → max.
- collect_smaps_rollup - Walk a snapshot’s threads and pull non-empty smaps_rollup maps off the leader threads (tid == tgid; non-leader threads land at empty map per the leader-dedup contract).
- collect_smaps_rollup_hierarchical
- color_derived_cells - Wrap a string-cell row in comfy_table::Cell values with blue foreground so derived-metric rows render visually distinct from the per-thread primary table when stdout is a TTY. Operators scanning a long compare or show output can locate the ## Derived metrics rows at a glance instead of relying on the section header alone.
- color_diff_cell - Color a diff-table cell based on its column type, the row’s raw delta (its sign selects the color), and the row’s delta_pct (the fraction compared against the bold threshold). Delta/% cells: yellow for positive (increase), magenta for negative (decrease). Uptime: green/yellow/red gradient. Other columns: default.
- colored_header - Build a colored header row: cyan foreground so headers are visually distinct from data rows.
- colored_header_with_sort
- compare - Compare two snapshots and produce a CtprofDiff. Sequences the comparison phases in data-flow order: build the per-side thread groups, emit matched + one-sided rows, fudge one-sided cgroups together (producing the fudged_key_pairs consumed by the uptime and enrichment phases), sort the one-sided lists, fill uptime%, order the rows, and attach enrichment. The fudged-pair (bkey, ckey) threading is the only cross-phase data dependency: built by apply_cgroup_fudge, read by fill_uptime_pct and attach_enrichment.
- compile_flatten_patterns
- flatten_cgroup_path - Collapse dynamic segments of a cgroup path per every pattern in patterns. A pattern is a glob matched with glob’s default MatchOptions (require_literal_separator = false), so * is NOT segment-bounded: it matches across / just like **. The literal portions are preserved and the wildcard portions are replaced with the wildcard token itself. Example: pattern /kubepods/*/workload applied to /kubepods/pod-abc/workload produces /kubepods/*/workload, so two runs with different pod IDs collapse onto the same key.
- flatten_cgroup_stats
- format_cpu_max - Render a cpu.max pair as <quota>/<period> where quota is either max (no cap) or the auto-scaled µs value. Period is always present (default 100_000 µs per default_bw_period_us() at kernel/sched/sched.h:441). The <quota>/<period> separator is THIS crate’s display convention: the kernel itself emits raw integers in cat cpu.max (space-separated, no auto-scale); we auto-scale via format_scaled_u64 for human-friendly output, which also widens the visual delimiter from the kernel’s space to a slash.
- format_derived_delta_cell - Format the signed delta cell for a derived row. Mirrors format_derived_value_cell but always carries an explicit + / - sign so the operator can read directionality at a glance. Ratios render with three decimals (+0.100 is +10pp); other ladders route through auto_scale and pick up the scaled unit suffix.
- format_derived_value_cell - Format a derived-metric value cell for the ## Derived metrics table. Ratio rows (is_ratio: true, ScaleLadder::None) render with three decimals (0.873); ns / B / ticks ladders route through the same auto-scale ladder as the main table. Negative values (e.g. a negative live_heap_estimate) carry their explicit minus sign through the format.
- format_optional_limit - Render an Option<u64> cgroup limit as either max (no limit / kernel emitted the literal max token) or the auto-scaled value. Used for memory.max, memory.high, pids.max. Mirrors the kernel’s own display: cat memory.max prints max when no cap is set, a u64 byte count otherwise.
- format_psi_avg_cell - Render a baseline→candidate→delta cell for a PSI average field. baseline and candidate are centi-percent (0..=10000 covering 0.00..=100.00 %); the cell renders each as N.NN% and computes a signed delta (+|-D.DD%). Mirrors cgroup_cell’s structure but does NOT route through the auto-scale ladder: a pressure percentage is dimensionless and topping out at 100 means there’s nothing to scale.
- format_psi_avg_centi_percent - Convert a centi-percent value (0..=10000) to its display form N.NN%. The centi-percent representation is 1:1 with the kernel’s LOAD_INT.LOAD_FRAC 2-decimal-digit emission at kernel/sched/psi.c:1284; preserve that precision on display.
- format_scaled_u64 - Auto-scale a u64 value at the given ladder and render it as a cell. Helper for format_value_cell; the Sum and Max arms share this exact logic. Also used by the ctprof show renderer for the cgroup-stats secondary table, where each scalar stands alone (no baseline/candidate pair to fold into a delta cell).
- format_value_cell - Format a per-row baseline / candidate cell for super::write_diff. Numeric aggregates (Aggregated::Sum / Aggregated::Max) run through auto_scale so large values render in a readable magnitude (1.235ms instead of 1234567ns). When the scaled unit equals the ladder’s base unit (no step-up was triggered), the original integer value is rendered verbatim; this avoids polluting small numbers with a .000 suffix. Non-numeric aggregates (OrdinalRange, Mode, Affinity) fall through to the Aggregated std::fmt::Display impl unchanged because no scaling applies; the ladder is ScaleLadder::None for these and the suffix is empty.
- limit_sections - Truncate each ## <heading> section to at most limit lines. Sections are delimited by lines starting with ##. Content before the first section header passes through untruncated (typically the file-path header row).
- metric_display_name - Borrow the metric’s bare name from the registry. The &'static str lifetime piggybacks on CtprofMetricDef::name’s static-string storage: callers may borrow the static name without allocation; render sites that need owned Strings allocate at the table-cell boundary (see super::render at the metric_display_name(metric_def).to_string() call site and super::runner::write_metric_list).
- metric_tags - Render a metric’s bracketed gating tags as a single space-separated string. Returns the empty string when sched_class is None, is_dead is false, AND config_gates is empty.
- parse_columns - Parse a CLI --columns spec into a typed Column vec. Format: comma-separated names matching Column::cli_name. Whitespace around each name is trimmed. Empty input parses to an empty Vec, and the caller falls back to the format default.
- parse_metrics - Parse a CLI --metrics spec into a typed Vec<&'static str> of registry names. Format: comma-separated names that must each match a name field from either CTPROF_METRICS or CTPROF_DERIVED_METRICS. Whitespace around each name is trimmed. Empty input parses to an empty Vec, which the caller treats as “every metric renders” via DisplayOptions::is_metric_enabled, mirroring parse_sections’s empty-input semantic.
- parse_sections - Parse a CLI --sections spec into a typed Section vec. Format: comma-separated names matching Section::cli_name. Whitespace around each name is trimmed. Empty input parses to an empty Vec, which the caller treats as “every section renders” via DisplayOptions::is_section_enabled.
- parse_sort_by - Parse a --sort-by CLI value into a list of SortKeys. Spec format: metric1[:dir1],metric2[:dir2],... where each metric is a name from CTPROF_METRICS or CTPROF_DERIVED_METRICS and dir is asc or desc (case-insensitive; :DESC, :Asc, :asc all work). Direction defaults to desc (largest delta first, the operator “show me the largest changes” default).
- pattern_display_label - Compute the operator-facing display label for a pattern-aware group, given the union of baseline+candidate member comms. For buckets with ≥ 2 distinct member names, runs grex over the sorted union to emit a regex that exactly matches the constituent thread names. For singleton or all-identical buckets, returns the join key unchanged so the rendered label equals what would have shown under literal grouping.
- pattern_key - Compute the token-normalized skeleton for a name string.
- print_diff - Render CtprofDiff as a table on stdout. Thin wrapper over write_diff so the non-test caller keeps the ergonomics of a one-line call; tests drive write_diff into a String buffer.
- print_metric_list - Print the metric-list discovery output to stdout. Thin wrapper over write_metric_list so the CLI keeps the one-line call ergonomics; tests drive the writer into a String buffer.
- run_compare - Entry point for the compare CLI. Parses --sort-by first, then loads both snapshots, computes the diff, prints the table, and returns 0 on success. Exits non-zero only on I/O or parse errors; a non-empty diff is data, not a failure.
- run_metric_list - Entry point for the ctprof metric-list subcommand. Always returns Ok(0): discovery output is informational and never fails.
- warn_cgroup_only_sections_under_non_cgroup - Emit a stderr warning when an explicit --sections filter names a cgroup-only section while --group-by is not GroupBy::Cgroup. Without the warning, the section would silently render zero rows (its outer-gate suppresses it), leaving the operator wondering whether their snapshot lacked the data or their flag was misconfigured.
- write_diff - Render CtprofDiff into w. The formatter layer lives here so tests can inspect exactly what print_diff would emit without shelling through stdout capture. Write errors propagate as std::fmt::Error; callers that write into an infallible sink (String) can unwrap or ignore.
- write_metric_list - Render the metric-list discovery output: a tag legend (sched_class / config_gates / [dead]) followed by a per-metric table whose rows show name | tags | description. Tag legend is keyed off the closed-set vocabulary the registry pin test guards (registry_tag_vocabulary_is_closed), so adding a new allowed class or gate fails the test until both the legend and the closed-set table are updated together.
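The --sort-by spec format described under parse_sort_by (metric1[:dir1],metric2[:dir2],... with a case-insensitive direction defaulting to desc) can be sketched as below. This is an illustration, not the crate's parser: `DemoSortKey` and `parse_sort_spec` are hypothetical names, the list of known metrics is passed in instead of read from the registries, and trimming whitespace and skipping empty segments are assumptions.

```rust
/// Hypothetical stand-in for the real `SortKey`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DemoSortKey {
    pub metric: String,
    /// `true` for descending, which is the default direction.
    pub descending: bool,
}

/// Parse a spec such as "run_time,wait_time:asc" against a list of
/// known metric names.
pub fn parse_sort_spec(spec: &str, known: &[&str]) -> Result<Vec<DemoSortKey>, String> {
    let mut keys = Vec::new();
    for part in spec.split(',') {
        let part = part.trim();
        if part.is_empty() {
            // Assumption: empty segments are ignored rather than rejected.
            continue;
        }
        let (name, dir) = match part.split_once(':') {
            Some((n, d)) => (n.trim(), Some(d.trim())),
            None => (part, None),
        };
        if !known.iter().any(|k| *k == name) {
            return Err(format!("unknown metric '{name}'"));
        }
        // Direction is matched case-insensitively; a missing one means desc.
        let dir_lower = dir.map(|d| d.to_ascii_lowercase());
        let descending = match dir_lower.as_deref() {
            None | Some("desc") => true,
            Some("asc") => false,
            Some(other) => return Err(format!("unknown sort direction '{other}'")),
        };
        keys.push(DemoSortKey {
            metric: name.to_string(),
            descending,
        });
    }
    Ok(keys)
}
```

Under these assumptions, "run_time, wait_time:ASC" yields two keys, the first descending by default and the second ascending, while an unknown metric name or direction produces an error.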