Module ctprof_compare

Group, aggregate, and render the comparison between two CtprofSnapshots.

Design summary: the per-thread profiler emits one snapshot per run. Comparison proceeds in three steps. It groups threads within each snapshot by a single axis (pcomm, cgroup, comm, or comm-exact) or by all pattern-aware axes at once (GroupBy::All; see GroupBy). It aggregates every metric per the rule on its CtprofMetricDef. It then matches groups across the two snapshots and emits one row per (group, metric) pair. Groups present on only one side surface as unmatched entries rather than imaginary zero-valued rows: a row is missing because the process did not exist, not because it did zero work.
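The matching step in the design summary can be illustrated with a minimal sketch. The names `MatchResult` and `match_groups` are hypothetical and each group is reduced to one `u64` value; the real `CtprofDiff` carries more than this.

```rust
use std::collections::BTreeMap;

/// Hypothetical result shape, for illustration only.
#[derive(Debug, PartialEq)]
pub struct MatchResult {
    /// (group key, baseline value, candidate value) for keys on both sides.
    pub matched: Vec<(String, u64, u64)>,
    /// Keys that exist only in the baseline snapshot.
    pub only_baseline: Vec<String>,
    /// Keys that exist only in the candidate snapshot.
    pub only_candidate: Vec<String>,
}

/// Match groups by key. One-sided groups are reported as unmatched
/// instead of being paired with a made-up zero value.
pub fn match_groups(
    baseline: &BTreeMap<String, u64>,
    candidate: &BTreeMap<String, u64>,
) -> MatchResult {
    let mut out = MatchResult {
        matched: Vec::new(),
        only_baseline: Vec::new(),
        only_candidate: Vec::new(),
    };
    for (key, &b) in baseline {
        match candidate.get(key) {
            Some(&c) => out.matched.push((key.clone(), b, c)),
            None => out.only_baseline.push(key.clone()),
        }
    }
    for key in candidate.keys() {
        if !baseline.contains_key(key) {
            out.only_candidate.push(key.clone());
        }
    }
    out
}
```

Keeping the one-sided keys in their own lists is what lets a renderer distinguish "process did not exist" from "process did zero work".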

No judgment labels. The comparison prints raw numbers and percent delta; interpretation (regression vs improvement) is scheduler-specific and left to the user. This deliberately diverges from the gauntlet stats comparison in crate::stats, which DOES classify each metric (Finding::kind, CompareReport::{regressions, improvements, unchanged}): ctprof_compare emits no verdict.

Structs

AffinitySummary
CPU-affinity aggregation result.
CompareOptions
Options controlling compare.
CtprofCompareArgs
Arguments for the ktstr ctprof compare subcommand.
CtprofDiff
Full comparison result.
CtprofMetricDef
One metric exposed by the comparison pipeline.
DerivedMetricDef
Definition of a derived metric: a function that consumes the already-aggregated input metrics for a group and produces a single scalar (with its own unit and operator-facing description).
DerivedRow
One row in the derived-metrics table: (matched group, derivation) with the computed scalar from both sides.
DiffRow
One row in the comparison table: (group, metric) pair with aggregated values from both sides.
DisplayOptions
Aggregate display options for the renderer. Plumbed as a single struct through write_diff so a future addition lands in one place without growing every signature. The show-side entry (write_show in src/bin/ktstr.rs) keeps a flatter signature for historical reasons but mirrors the same field semantics — --wrap, --sections, --metrics reach show via wrap / sections / metrics parameters that share the same helpers (new_wrapped_table, Section::cli_name).
FudgedPair
A pair of cgroup groups fudged together by thread population overlap. Fudging joins a baseline cgroup to a candidate cgroup when their per-cgroup thread-type sets share enough population (Jaccard similarity ≥ 0.90) — a renamed-but-otherwise-identical scope under a shifted path is rejoined for diffing instead of surfacing as separate orphans.
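As a rough illustration of the FudgedPair threshold, the sketch below computes plain set Jaccard similarity over thread-type names. The function names are hypothetical, and the real implementation may weight by thread population rather than by set membership.

```rust
use std::collections::BTreeSet;

/// Jaccard similarity |A ∩ B| / |A ∪ B| over two sets of names.
/// Assumption: two empty sets are treated as dissimilar (0.0).
pub fn jaccard(a: &BTreeSet<&str>, b: &BTreeSet<&str>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    let intersection = a.intersection(b).count();
    intersection as f64 / union as f64
}

/// Apply the 0.90 threshold quoted in the FudgedPair description.
pub fn should_fudge(a: &BTreeSet<&str>, b: &BTreeSet<&str>) -> bool {
    jaccard(a, b) >= 0.90
}
```

With ten thread types on one side and nine of the same on the other, the similarity is 9/10 and the pair is joined; at eight of ten it is not.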
GroupByOrDefault
Newtype wrapper around GroupBy that defaults to GroupBy::Pcomm. Separate type so CompareOptions::default() does not need to spell out every field.
SortKey
One key in a multi-key --sort-by spec. Names a metric from CTPROF_METRICS or CTPROF_DERIVED_METRICS and the sort direction for that key. Direction defaults to descending (largest delta first) so the common operator request — “show me the biggest regressions first” — is the unmarked form.
ThreadGroup
Aggregated metrics for every thread matched by one group key.

Enums

AggRule
Aggregation rule for a single metric.
Aggregated
Aggregated metric value for a single super::ThreadGroup.
Column
One column slot in the rendered diff/show table. The renderer iterates the resolved Column vec to build both the header row and each data row, dispatching cell construction per variant. Order in the slice is the rendered order — the renderer never re-sorts.
DerivedValue
Output value of a derived metric.
DisplayFormat
Per-row display layout for write_diff.
GroupBy
Grouping key for ctprof compare.
ScaleLadder
Closed enumeration of auto-scale ladders driving format dispatch.
Section
One sub-table emitted by write_diff / write_show. --sections filters which sub-tables render — every section not in the filter is suppressed before its emission gate (zero-suppression, group-by-cgroup gating, etc.) runs, so a section that would otherwise emit when its data is present stays silent when omitted from the filter.

Statics

CTPROF_DERIVED_METRICS
Registry of derived metrics. Each entry consumes one or more already-aggregated input metrics from CTPROF_METRICS and produces a single scalar with its own unit. See the per-entry doc strings for the formula and kernel-source rationale.
CTPROF_METRICS
Registry of per-thread metrics. Order here is the default display order for rows that have no numeric delta to sort by (ties fall back to registry order). Names are the ASCII short-form used in capture code; long-form display is the same — no translation layer.

Functions

aggregate
build_cgroup_key_map
Build the post-flatten-path → final-tightened-key map for GroupBy::Cgroup under auto-normalization. Walks the union of paths from both snapshots’ threads and cgroup_stats so that Layer 3 (tighten) sees every contributor to a given Layer-2 skeleton group. Returns the map keyed by post-flatten path; consumers (build_groups, flatten_cgroup_stats) look up the final key for any path they see.
build_groups
cgroup_cell
Render a (baseline, candidate, delta) cell for the cgroup-enrichment secondary table emitted under super::GroupBy::Cgroup. The ladder parameter routes each scalar through auto_scale (private to this module) so a 7.5 GiB memory_current row reads 7.500GiB → 8.250GiB (+768.000MiB) instead of 8053063680 → 8858370048 (+805306368). Each cell scales independently — baseline, candidate, and delta may pick different prefixes when their magnitudes cross thresholds.
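The memory_current example for cgroup_cell can be reproduced with a small sketch. It assumes a binary byte ladder (B, KiB, MiB, GiB, TiB) with three decimals; `scale_bytes_sketch` and `cgroup_cell_sketch` are hypothetical names, and the module's private auto_scale may differ in detail.

```rust
/// Scale a byte count on a binary ladder with three decimals.
/// Assumption: values below 1024 keep their integer form with a "B" suffix.
/// The f64 conversion loses precision for very large values; fine for a sketch.
pub fn scale_bytes_sketch(v: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut scaled = v as f64;
    let mut idx = 0;
    while scaled >= 1024.0 && idx < UNITS.len() - 1 {
        scaled /= 1024.0;
        idx += 1;
    }
    if idx == 0 {
        format!("{}{}", v, UNITS[0])
    } else {
        format!("{:.3}{}", scaled, UNITS[idx])
    }
}

/// Render "baseline → candidate (±delta)". Each of the three values is
/// scaled independently, so they may end up with different prefixes.
pub fn cgroup_cell_sketch(baseline: u64, candidate: u64) -> String {
    let (sign, magnitude) = if candidate >= baseline {
        ('+', candidate - baseline)
    } else {
        ('-', baseline - candidate)
    };
    format!(
        "{} → {} ({}{})",
        scale_bytes_sketch(baseline),
        scale_bytes_sketch(candidate),
        sign,
        scale_bytes_sketch(magnitude)
    )
}
```

Calling `cgroup_cell_sketch(8053063680, 8858370048)` yields `7.500GiB → 8.250GiB (+768.000MiB)`, where the delta lands on MiB while both sides land on GiB.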
cgroup_limits_cell
Render a baseline → candidate cell for cpu.max (quota, period) pairs. When both pairs are equal, renders once via format_cpu_max; otherwise renders as <a> → <b>. Mirrors cgroup_optional_limit_cell’s equality-collapse policy.
cgroup_optional_limit_cell
Render a baseline → candidate cell for an Option<u64> LIMIT (e.g. memory.max, memory.high, pids.max). None reads as max (no limit) per format_optional_limit; a step from concrete to max between snapshots renders as <value> → max.
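A minimal sketch of the equality-collapse policy for optional limits follows. The function names are hypothetical, and for brevity it prints raw integers where the real helpers auto-scale.

```rust
/// Render an optional limit: `None` reads as "max" (no limit).
/// Assumption: concrete values are shown as raw integers here.
pub fn optional_limit_sketch(limit: Option<u64>) -> String {
    match limit {
        None => "max".to_string(),
        Some(v) => v.to_string(),
    }
}

/// Render a baseline → candidate cell, collapsing to a single value
/// when both sides are equal.
pub fn optional_limit_cell_sketch(baseline: Option<u64>, candidate: Option<u64>) -> String {
    if baseline == candidate {
        optional_limit_sketch(baseline)
    } else {
        format!(
            "{} → {}",
            optional_limit_sketch(baseline),
            optional_limit_sketch(candidate)
        )
    }
}
```

A step from a concrete limit to no limit, such as `optional_limit_cell_sketch(Some(1024), None)`, renders as `1024 → max`.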
collect_smaps_rollup
Walk a snapshot’s threads and pull non-empty smaps_rollup maps off the leader threads (tid == tgid). Non-leader threads carry an empty map per the leader-dedup contract.
collect_smaps_rollup_hierarchical
color_derived_cells
Wrap a string-cell row in [comfy_table::Cell]s with blue foreground so derived-metric rows render visually distinct from the per-thread primary table when stdout is a TTY. Operators scanning a long compare or show output can locate the ## Derived metrics rows at a glance instead of relying on the section header alone.
color_diff_cell
Color a diff-table cell based on its column type, the row’s raw delta (its sign selects the color), and the row’s delta_pct (the fraction compared against the bold threshold). Delta/% cells: yellow for positive (increase), magenta for negative (decrease). Uptime: green/yellow/red gradient. Other columns: default.
colored_header
Build a colored header row — cyan foreground so headers are visually distinct from data rows.
colored_header_with_sort
compare
Compare two snapshots and produce a CtprofDiff. Sequences the comparison phases in data-flow order: build the per-side thread groups, emit matched + one-sided rows, fudge one-sided cgroups together (producing the fudged_key_pairs consumed by the uptime and enrichment phases), sort the one-sided lists, fill uptime%, order the rows, and attach enrichment. The fudged-pair (bkey, ckey) threading is the only cross-phase data dependency: built by apply_cgroup_fudge, read by fill_uptime_pct and attach_enrichment.
compile_flatten_patterns
flatten_cgroup_path
Collapse dynamic segments of a cgroup path per every pattern in patterns. A pattern is a glob matched with glob’s default MatchOptions (require_literal_separator = false), so * is NOT segment-bounded — it matches across / just like **. The literal portions are preserved and the wildcard portions are replaced with the wildcard token itself. Example: pattern /kubepods/*/workload applied to /kubepods/pod-abc/workload produces /kubepods/*/workload, so two runs with different pod IDs collapse onto the same key.
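The collapse behaviour of flatten_cgroup_path can be approximated without the glob crate. The sketch below is std-only and makes several assumptions: only `*` is supported, a pattern must match the whole path, the first matching pattern wins, and the output for a match is the pattern text itself. `wildcard_match` and `flatten_path_sketch` are hypothetical names.

```rust
/// Match `text` against `pattern`, where `*` matches any run of bytes,
/// including `/` (it is not segment-bounded).
pub fn wildcard_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position after the most recent `*` and the text index it is anchored at.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more byte.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}

/// Collapse a path onto the first pattern that matches it; paths that
/// match no pattern pass through unchanged.
pub fn flatten_path_sketch(path: &str, patterns: &[&str]) -> String {
    for pattern in patterns {
        if wildcard_match(pattern, path) {
            return (*pattern).to_string();
        }
    }
    path.to_string()
}
```

Under these assumptions, `/kubepods/pod-abc/workload` and `/kubepods/pod-xyz/workload` both flatten to `/kubepods/*/workload`, so the two runs share one group key.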
flatten_cgroup_stats
format_cpu_max
Render a cpu.max pair as <quota>/<period> where quota is either max (no cap) or the auto-scaled µs value. Period is always present (default 100_000 µs per default_bw_period_us() at kernel/sched/sched.h:441). The <quota>/<period> separator is THIS crate’s display convention — the kernel itself emits raw integers in cat cpu.max (space-separated, no auto-scale); we auto-scale via format_scaled_u64 for human-friendly output, which also widens the visual delimiter from the kernel’s space to a slash.
format_derived_delta_cell
Format the signed delta cell for a derived row. Mirrors format_derived_value_cell but always carries an explicit +/- sign so the operator can read directionality at a glance. Ratios render with three decimals (+0.100 is +10pp); other ladders route through auto_scale and pick up the scaled unit suffix.
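For ratio rows, the always-signed three-decimal form maps directly onto Rust's `+` format flag. A minimal sketch, with `ratio_delta_cell_sketch` as a hypothetical name:

```rust
/// Format the delta between two ratios with an explicit sign and three
/// decimals. The `+` flag makes the formatter print a sign for
/// non-negative values as well.
pub fn ratio_delta_cell_sketch(baseline: f64, candidate: f64) -> String {
    format!("{:+.3}", candidate - baseline)
}
```

A ratio moving from 0.5 to 0.75 renders as `+0.250`, which reads as +25 percentage points.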
format_derived_value_cell
Format a derived-metric value cell for the ## Derived metrics table. Ratio rows (is_ratio: true, ScaleLadder::None) render with three decimals (0.873); ns / B / ticks ladders route through the same auto-scale ladder as the main table. Negative values (e.g. a negative live_heap_estimate) carry their explicit minus sign through the format.
format_optional_limit
Render an Option<u64> cgroup limit as either max (no limit / kernel emitted the literal max token) or the auto-scaled value. Used for memory.max, memory.high, pids.max. Mirrors the kernel’s own display: cat memory.max prints max when no cap is set, a u64 byte count otherwise.
format_psi_avg_cell
Render a baseline→candidate→delta cell for a PSI average field. baseline and candidate are centi-percent (0..=10000 covering 0.00..=100.00 %); the cell renders each as N.NN% and computes a signed delta (+|-D.DD%). Mirrors cgroup_cell’s structure but does NOT route through the auto-scale ladder — a pressure percentage is dimensionless and topping out at 100 means there’s nothing to scale.
format_psi_avg_centi_percent
Convert a centi-percent value (0..=10000) to its display form N.NN%. The centi-percent representation is 1:1 with the kernel’s LOAD_INT.LOAD_FRAC 2-decimal-digit emission at kernel/sched/psi.c:1284 — preserve that precision on display.
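The centi-percent rendering used by the PSI helpers is pure integer arithmetic. The sketch below uses hypothetical function names and assumes a zero delta renders with a `+` sign.

```rust
/// Render a centi-percent value (0..=10000) as N.NN%.
/// Integer division and remainder keep the two decimals exact.
pub fn centi_percent_sketch(v: u32) -> String {
    format!("{}.{:02}%", v / 100, v % 100)
}

/// Render the signed delta between two centi-percent values as ±D.DD%.
pub fn centi_percent_delta_sketch(baseline: u32, candidate: u32) -> String {
    let delta = i64::from(candidate) - i64::from(baseline);
    let sign = if delta < 0 { '-' } else { '+' };
    let magnitude = delta.unsigned_abs();
    format!("{}{}.{:02}%", sign, magnitude / 100, magnitude % 100)
}
```

For example, 1234 renders as `12.34%`, and a move from 1234 to 1000 renders as `-2.34%`.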
format_scaled_u64
Auto-scale a u64 value at the given ladder and render it as a cell. Helper for format_value_cell — the Sum and Max arms share this exact logic. Also used by the ctprof show renderer for the cgroup-stats secondary table, where each scalar stands alone (no baseline/candidate pair to fold into a delta cell).
format_value_cell
Format a per-row baseline / candidate cell for super::write_diff. Numeric aggregates (Aggregated::Sum / Aggregated::Max) run through auto_scale so large values render in a readable magnitude (1.235ms instead of 1234567ns). When the scaled unit equals the ladder’s base unit (no step-up was triggered), the original integer value is rendered verbatim — this avoids polluting small numbers with a .000 suffix. Non-numeric aggregates (OrdinalRange, Mode, Affinity) fall through to the Aggregated std::fmt::Display impl unchanged because no scaling applies; the ladder is ScaleLadder::None for these and the suffix is empty.
limit_sections
Truncate each ## <heading> section to at most limit lines. Sections are delimited by lines starting with ## . Content before the first section header passes through untruncated (typically the file-path header row).
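The per-section truncation in limit_sections can be sketched as a single pass over the lines. The name `limit_sections_sketch` is hypothetical, and the sketch assumes the `## ` header line counts toward the limit, which the description does not state.

```rust
/// Keep at most `limit` lines of each section. A section starts at a
/// line beginning with "## ". Lines before the first header always pass
/// through.
pub fn limit_sections_sketch(text: &str, limit: usize) -> String {
    let mut out = String::new();
    let mut in_section = false;
    let mut used = 0usize;
    for line in text.lines() {
        if line.starts_with("## ") {
            // A new header resets the per-section line budget.
            in_section = true;
            used = 0;
        }
        if in_section {
            if used >= limit {
                continue;
            }
            used += 1;
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}
```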
metric_display_name
Borrow the metric’s bare name from the registry. The &'static str lifetime piggybacks on CtprofMetricDef::name’s static-string storage — callers may borrow the static name without allocation; render sites that need owned Strings allocate at the table-cell boundary (see super::render at the metric_display_name(metric_def).to_string() call site and super::runner::write_metric_list).
metric_tags
Render a metric’s bracketed gating tags as a single space-separated string. Returns the empty string when sched_class is None, is_dead is false, AND config_gates is empty.
parse_columns
Parse a CLI --columns spec into a typed Column vec. Format: comma-separated names matching Column::cli_name. Whitespace around each name is trimmed. Empty input parses to an empty Vec — caller falls back to the format default.
parse_metrics
Parse a CLI --metrics spec into a typed Vec<&'static str> of registry names. Format: comma-separated names that must each match a name field from either CTPROF_METRICS or CTPROF_DERIVED_METRICS. Whitespace around each name is trimmed. Empty input parses to an empty Vec — caller treats that as “every metric renders” via DisplayOptions::is_metric_enabled, mirroring parse_sections’s empty-input semantic.
parse_sections
Parse a CLI --sections spec into a typed Section vec. Format: comma-separated names matching Section::cli_name. Whitespace around each name is trimmed. Empty input parses to an empty Vec — caller treats that as “every section renders” via DisplayOptions::is_section_enabled.
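parse_columns, parse_metrics, and parse_sections share one shape: split on commas, trim, validate against a closed set, and treat an empty result as "no filter". The sketch below uses hypothetical names and assumes empty items between commas are skipped; the real parsers may reject them instead.

```rust
/// Parse a comma-separated spec against a closed set of allowed names.
/// Returns the matching `&'static str` entries in the order given.
pub fn parse_names_sketch(
    spec: &str,
    allowed: &[&'static str],
) -> Result<Vec<&'static str>, String> {
    let mut out = Vec::new();
    for raw in spec.split(',') {
        let name = raw.trim();
        if name.is_empty() {
            continue; // assumption: blank items are ignored
        }
        match allowed.iter().find(|a| **a == name) {
            Some(a) => out.push(*a),
            None => return Err(format!("unknown name: {}", name)),
        }
    }
    Ok(out)
}

/// An empty filter enables everything; otherwise only listed names render.
pub fn is_enabled_sketch(filter: &[&str], name: &str) -> bool {
    filter.is_empty() || filter.contains(&name)
}
```

Returning an empty Vec for empty input keeps the "render everything" decision in one place, the enabled check, rather than in each parser.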
parse_sort_by
Parse a --sort-by CLI value into a list of SortKeys. Spec format: metric1[:dir1],metric2[:dir2],... where each metric is a name from CTPROF_METRICS or CTPROF_DERIVED_METRICS and dir is asc or desc (case-insensitive: :DESC, :Asc, and :asc all work). Direction defaults to desc (largest delta first), matching the common operator request to see the largest changes first.
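The --sort-by spec format can be illustrated with a small parser sketch. `SortKeySketch` and `parse_sort_by_sketch` are hypothetical; unlike the real parser, this one does not validate metric names against the registries, and it skips blank items.

```rust
/// Hypothetical stand-in for `SortKey`.
#[derive(Debug, PartialEq)]
pub struct SortKeySketch {
    pub metric: String,
    pub descending: bool,
}

/// Parse `metric1[:dir1],metric2[:dir2],...`. Direction is `asc` or
/// `desc`, case-insensitive, and defaults to descending.
pub fn parse_sort_by_sketch(spec: &str) -> Result<Vec<SortKeySketch>, String> {
    let mut keys = Vec::new();
    for part in spec.split(',') {
        let part = part.trim();
        if part.is_empty() {
            continue; // assumption: blank items are ignored
        }
        let (metric, dir) = match part.split_once(':') {
            Some((m, d)) => (m.trim(), Some(d.trim())),
            None => (part, None),
        };
        let lowered = dir.map(str::to_ascii_lowercase);
        let descending = match lowered.as_deref() {
            None | Some("desc") => true,
            Some("asc") => false,
            Some(other) => return Err(format!("unknown sort direction: {}", other)),
        };
        keys.push(SortKeySketch {
            metric: metric.to_string(),
            descending,
        });
    }
    Ok(keys)
}
```

With placeholder metric names, `metric_a:ASC, metric_b` parses to an ascending key on `metric_a` followed by a descending key on `metric_b`.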
pattern_display_label
Compute the operator-facing display label for a pattern-aware group, given the union of baseline+candidate member comms. For buckets with ≥ 2 distinct member names, runs grex over the sorted union to emit a regex that exactly matches the constituent thread names. For singleton or all-identical buckets, returns the join key unchanged so the rendered label equals what would have shown under literal grouping.
pattern_key
Compute the token-normalized skeleton for a name string.
print_diff
Render CtprofDiff as a table on stdout. Thin wrapper over write_diff so the non-test caller keeps the ergonomics of a one-line call; tests drive write_diff into a String buffer.
print_metric_list
Print the metric-list discovery output to stdout. Thin wrapper over write_metric_list so the CLI keeps the one-line call ergonomics; tests drive the writer into a String buffer.
run_compare
Entry point for the compare CLI. Parses --sort-by first, then loads both snapshots, computes the diff, prints the table, and returns 0 on success. Exits non-zero only on I/O or parse errors; a non-empty diff is data, not a failure.
run_metric_list
Entry point for the ctprof metric-list subcommand. Always returns Ok(0) — discovery output is informational and never fails.
warn_cgroup_only_sections_under_non_cgroup
Emit a stderr warning when an explicit --sections filter names a cgroup-only section while --group-by is not GroupBy::Cgroup. Without the warning, the section would silently render zero rows (its outer-gate suppresses it), leaving the operator wondering whether their snapshot lacked the data or their flag was misconfigured.
write_diff
Render CtprofDiff into w. The formatter layer lives here so tests can inspect exactly what print_diff would emit without shelling through stdout capture. Write errors propagate as std::fmt::Error — callers that write into an infallible sink (String) can unwrap or ignore.
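The writer/printer split described for write_diff and print_diff is a common pattern built on `std::fmt::Write`. The sketch below shows that pattern with a hypothetical row type and output format; it is not the real table layout.

```rust
use std::fmt::{self, Write};

/// Render rows into any `fmt::Write` sink. Write errors propagate as
/// `fmt::Error`, so a `String` sink never fails in practice.
pub fn write_rows_sketch<W: Write>(w: &mut W, rows: &[(&str, u64, u64)]) -> fmt::Result {
    for (name, baseline, candidate) in rows {
        writeln!(w, "{}: {} -> {}", name, baseline, candidate)?;
    }
    Ok(())
}

/// Thin stdout wrapper: render into a `String`, then print it once.
pub fn print_rows_sketch(rows: &[(&str, u64, u64)]) {
    let mut buf = String::new();
    write_rows_sketch(&mut buf, rows).expect("writing to a String cannot fail");
    print!("{}", buf);
}
```

Tests can call the writer with a `String` and compare the buffer directly, with no stdout capture involved.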
write_metric_list
Render the metric-list discovery output: a tag legend (sched_class / config_gates / [dead]) followed by a per-metric table whose rows show name | tags | description. Tag legend is keyed off the closed-set vocabulary the registry pin test guards (registry_tag_vocabulary_is_closed), so adding a new allowed class or gate fails the test until both the legend and the closed-set table are updated together.