populate_run_distribution_metrics

Function populate_run_distribution_metrics 

pub fn populate_run_distribution_metrics(stats: &mut ScenarioStats)

Populate run-level DERIVED distributional metrics into stats.ext_metrics: every registered MetricKind::Distribution, MetricKind::WorstLowest, MetricKind::WakeLatencyTailRatio, and MetricKind::WorstCrossNodeRatio. This is the SOLE within-run producer of those metrics’ values — they carry no per-phase sample slice and no cross-cgroup merge fold, and their registry accessors are |_| None, so MetricDef::read reads the value written here from ext_metrics.

DISTRIBUTION (the 5 wake / run-delay aggregates): pools the RAW sample vectors held in stats.phases[].per_cgroup across EVERY phase and EVERY cgroup into one combined set, then recomputes the percentile / CV / mean / extreme over it. The result is the statistic of the union, NOT a max or mean of per-cgroup reductions (the percentile of a union is not the max of per-source percentiles). The ns→µs scale is applied ONCE here (the carriers store raw ns, per PhaseCgroupStats::run_delays_ns).

The wake pool is population-WEIGHTED: each phase carrier’s samples carry weight wake_sample_total / wake_latencies_ns.len(), so a phase whose reservoir hit the cap contributes by its true population rather than its capped length (the cross-PHASE de-skew). The weighted pool is then reduced via the weighted percentile / moments. The run-delay pool is unweighted: it is per-worker and never reservoir-capped, so length IS population. Below the wake cap every weight is 1.0, so the weighted P99 / median / mean / worst are byte-identical to the unweighted concat; the weighted CV matches only within ~1e-9, because it sums the mean in f64 where the unweighted path sums in u64 (a weighted variance cannot keep the u64 sum).
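To make the population weighting concrete, here is a minimal sketch of pooling carriers and taking a weighted percentile. The `Carrier` struct and the `pool_weighted` / `weighted_percentile` functions are hypothetical names for illustration only, and the nearest-rank weighted percentile shown is an assumption; the crate's own reduction may define the percentile differently.

```rust
/// Hypothetical stand-in for one phase carrier: a (possibly reservoir-capped)
/// sample slice plus the true number of events observed in that phase.
struct Carrier {
    samples_ns: Vec<u64>,
    sample_total: u64,
}

/// Pool every carrier into (value, weight) pairs. Each sample carries weight
/// sample_total / samples.len(), so a capped reservoir counts by population.
fn pool_weighted(carriers: &[Carrier]) -> Vec<(u64, f64)> {
    let mut pool = Vec::new();
    for c in carriers {
        if c.samples_ns.is_empty() {
            continue; // an empty carrier contributes nothing to the pool
        }
        let w = c.sample_total as f64 / c.samples_ns.len() as f64;
        pool.extend(c.samples_ns.iter().map(|&v| (v, w)));
    }
    pool
}

/// Nearest-rank weighted percentile (assumed definition): the smallest value
/// whose cumulative weight reaches q * total weight. Returns None on an
/// empty pool.
fn weighted_percentile(pool: &[(u64, f64)], q: f64) -> Option<u64> {
    if pool.is_empty() {
        return None;
    }
    let mut sorted = pool.to_vec();
    sorted.sort_by_key(|&(v, _)| v);
    let total: f64 = sorted.iter().map(|&(_, w)| w).sum();
    let target = q * total;
    let mut cumulative = 0.0;
    for &(v, w) in &sorted {
        cumulative += w;
        if cumulative >= target {
            return Some(v);
        }
    }
    sorted.last().map(|&(v, _)| v)
}
```

With one uncapped carrier holding `[10, 20]` (population 2) and one capped carrier holding `[100, 200]` out of a population of 200, the weighted median under this definition is 100, while treating every sample as weight 1.0 gives 20. When no carrier is capped, every weight is 1.0 and the two agree.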

CARRIER-LESS FOLD (graceful degradation): a cgroup whose raw samples are NOT in the pool is NOT dropped. This covers a backdrop epoch that fell on BASELINE or the inter-step gap (no paired host bucket, so no carrier) and a cgroup whose carrier was stripped/empty (strip_phase_cgroup_samples). Its surviving per-cgroup CgroupStats reduction folds worst-wins into the pooled value (max, since every Distribution metric is LowerBetter, registry-gated). The CgroupStats reductions are never stripped: stats.cgroups[] is the already-reduced cgroup_stats(reports) output, a SEPARATE reduction path from the per-phase carriers, so a carrier-less cgroup always has a source. When EVERY carrier is empty (a fully-stripped run) the pool is empty and the result degenerates to the max over every cgroup’s reduction, i.e. the pre-Item-7 cross-cgroup max.

NOTE the value CLASS of a folded cgroup differs from a pooled one for the P99 / Median / Mean / CV reductions: a pooled cgroup contributes its samples to the statistic of the union; a carrier-less cgroup contributes its per-cgroup reduction worst-wins (a worst-cgroup proxy, not pooled). For the SampleReduction::Worst reduction the two COINCIDE (max-of-union == max-of-per-cgroup-maxes), so the carrier-less fold is exact there, not a proxy.

A second asymmetry is specific to CV and comes from the population weighting: the POOLED CV normalizes its variance and mean by Σ per-sample weights (the reconstructed population), while a carrier-less cgroup’s folded CV is cgroup_stats’s UNWEIGHTED CV (n = all_latencies.len()). The two coincide below the cap (all weights 1.0) and diverge above it. The mix is sound: a carrier-less cgroup has no per-phase weight data to population-weight (its carrier is absent by definition), and both feed the same LowerBetter worst-wins max.

Backdrop step-phase carriers now join the pool directly (per-epoch expansion in collect_handles); only the carrier-less cases above fold worst-wins.
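The worst-wins fold described above can be sketched as follows. `fold_worst_wins` and `worst_of` are hypothetical helpers written for this illustration, not functions from the crate; the sketch assumes the pooled value is an `Option<f64>` that is `None` when the pool is empty.

```rust
/// Fold the surviving per-cgroup reductions of carrier-less cgroups into the
/// pooled value. For a LowerBetter metric the worst value is the maximum.
/// `pooled` is None when every carrier was empty (a fully stripped run).
fn fold_worst_wins(pooled: Option<f64>, carrier_less: &[f64]) -> Option<f64> {
    carrier_less.iter().copied().fold(pooled, |acc, v| {
        Some(match acc {
            Some(a) => a.max(v),
            None => v,
        })
    })
}

/// Maximum of a sample slice, used to show that the Worst reduction is exact:
/// the max of a union equals the max of the per-source maxes.
fn worst_of(samples: &[u64]) -> Option<u64> {
    samples.iter().copied().max()
}
```

With an empty pool the fold returns the plain maximum over the per-cgroup reductions, which corresponds to the degenerate fully-stripped case. For percentile, mean, and CV reductions the folded value remains a worst-cgroup proxy; only the maximum is preserved exactly under this fold.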

WORSTLOWEST (the 2 iteration efficiencies): the lowest (worst) cgroup’s efficiency, computed per-cgroup from the stats.cgroups[] COUNTERS via CgroupStats::iterations_per_worker / CgroupStats::iterations_per_cpu_sec and the None-aware lowest-wins fold (a measured Some(0.0) — starvation — wins; a no-data None is skipped; an all-None cohort writes no key, preserving absence as a missing ext entry rather than a 0.0). The counters survive stripping, so WorstLowest needs no fallback branch.
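A minimal sketch of the None-aware lowest-wins fold and the "write no key" behavior follows. The `lowest_wins` and `write_worst_lowest` functions are hypothetical, and the `BTreeMap<String, f64>` is only an assumed stand-in for stats.ext_metrics, whose real type is not shown here.

```rust
use std::collections::BTreeMap;

/// None-aware lowest-wins fold: a measured Some(0.0) wins, a None is skipped,
/// and an all-None cohort yields None.
fn lowest_wins(per_cgroup: &[Option<f64>]) -> Option<f64> {
    per_cgroup
        .iter()
        .flatten() // drop the None entries, keep the measured values
        .copied()
        .fold(None, |acc: Option<f64>, v| {
            Some(match acc {
                Some(a) => a.min(v),
                None => v,
            })
        })
}

/// Write the folded value only when at least one cgroup had data, so that
/// absence stays a missing entry instead of becoming 0.0.
fn write_worst_lowest(
    ext_metrics: &mut BTreeMap<String, f64>, // assumed stand-in type
    key: &str,
    per_cgroup: &[Option<f64>],
) {
    if let Some(v) = lowest_wins(per_cgroup) {
        ext_metrics.insert(key.to_string(), v);
    }
}
```

Keeping the fold in `Option` space is what lets a starved cgroup (`Some(0.0)`) be distinguished from one with no data (`None`); collapsing both to `0.0` before folding would lose that distinction.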

Runs post-merge at the eval layer beside populate_run_pooled_iterations_per_cpu_sec, AFTER the per-cgroup carriers are folded into stats.phases and BEFORE the sidecar write, so stats.phases[].per_cgroup is fully merged and stats.cgroups is the final per-cgroup roll-up.