pub fn populate_run_pooled_taobench(stats: &mut ScenarioStats)
Inject the whole-run taobench engine’s qps + hit Rate components into
stats.ext_metrics, pooled across the run’s WorkType::Taobench cgroups.
Each cgroup carries its workers’ merged whole-run aggregate
(crate::assert::CgroupStats::taobench_whole); this folds those across
cgroups (Σ ops, MAX wall window — the window is shared by the concurrent
cohorts, per TaobenchStats::merge) and writes the six total_taobench_*
Counter components (ops, fast_ops, slow_ops, wall_sec, plus the
command-time get_cmds / get_hits), from which
crate::stats::derive_rate_metrics derives taobench_total_ops_per_sec,
taobench_fast_ops_per_sec, taobench_slow_ops_per_sec, the response-time
taobench_hit_fraction (Σfast/Σcompleted), and the command-time
taobench_command_hit_rate (Σhits/Σcmds). The whole-run Rate keys are
registered METRICS, so — unlike the per-phase taobench_*_qps
(MetricKind::PerPhase, invisible to the whole-run cross-run fold) — they
reach the perf-delta --noise-adjust spread analysis. The open-loop
serve-latency distribution is a SEPARATE pool
(populate_run_pooled_taobench_distribution, the *_us_whole keys).
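The cross-cgroup fold described above can be sketched roughly as follows. This is a minimal illustration, not the crate's code: `WholeRun`, its field names, and `pool` are hypothetical stand-ins for the per-cgroup aggregate and for TaobenchStats::merge. Only the rule itself (sum the counters, take the MAX wall window) comes from the text.

```rust
/// Hypothetical stand-in for one cgroup's merged whole-run aggregate.
/// Field names are illustrative assumptions, not the crate's definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct WholeRun {
    pub ops: u64,
    pub fast_ops: u64,
    pub slow_ops: u64,
    pub get_cmds: u64,
    pub get_hits: u64,
    pub elapsed_ns: u64,
}

impl WholeRun {
    /// Sum every counter; keep the MAX wall window, because concurrent
    /// cohorts share one window rather than running back to back.
    pub fn merge(self, other: WholeRun) -> WholeRun {
        WholeRun {
            ops: self.ops + other.ops,
            fast_ops: self.fast_ops + other.fast_ops,
            slow_ops: self.slow_ops + other.slow_ops,
            get_cmds: self.get_cmds + other.get_cmds,
            get_hits: self.get_hits + other.get_hits,
            elapsed_ns: self.elapsed_ns.max(other.elapsed_ns),
        }
    }
}

/// Pool over all cgroups. `None` entries model non-Taobench cgroups;
/// the result is `None` when no cgroup carries a Taobench aggregate.
pub fn pool(cgroups: &[Option<WholeRun>]) -> Option<WholeRun> {
    cgroups.iter().flatten().copied().reduce(WholeRun::merge)
}
```

Returning `Option` from the fold keeps "no Taobench cgroup" distinguishable from "Taobench ran and measured zero", which is the distinction the absent keys preserve.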

This function MUST run post-merge (after every cgroup-bearing merge has populated
stats.cgroups), exactly like populate_run_pooled_iterations_per_cpu_sec:
an earlier call would pool over an incomplete cgroup set. A run with no
Taobench cgroup writes nothing (the pool is None) — the keys stay absent,
keeping a non-taobench run distinct from a measured zero.

Both-or-neither (the derive_rate_metrics co-location invariant): all six
components are inserted together, gated on a measured wall window
(elapsed_ns > 0) — the qps denominator. The three per-second Rates then
derive unconditionally (wall_sec > 0); taobench_hit_fraction =
total_taobench_fast_ops / total_taobench_ops derives iff ops completed
(total_taobench_ops > 0) and taobench_command_hit_rate =
total_taobench_get_hits / total_taobench_get_cmds iff lookups issued
(get_cmds > 0); derive_rate_metrics skips a zero-denominator rate, so a
window-but-no-ops run gets qps=0 keys but no false hit fraction / hit rate.
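The gating and derivation rules above can be sketched as follows. This is an illustration under stated assumptions, not the real implementation: `Metrics` is a hypothetical stand-in for stats.ext_metrics, `Pooled`, `write_components`, `rate`, and `derive_rates` are invented for the example, and the key names follow the total_taobench_* / taobench_* pattern named in the text.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `stats.ext_metrics`.
pub type Metrics = BTreeMap<String, f64>;

/// Hypothetical pooled aggregate; field names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct Pooled {
    pub ops: u64,
    pub fast_ops: u64,
    pub slow_ops: u64,
    pub get_cmds: u64,
    pub get_hits: u64,
    pub elapsed_ns: u64,
}

/// Insert all six components together, or none of them.
pub fn write_components(m: &mut Metrics, pool: Option<Pooled>) {
    let Some(p) = pool else { return }; // no Taobench cgroup: keys stay absent
    if p.elapsed_ns == 0 {
        return; // no measured wall window: no qps denominator
    }
    m.insert("total_taobench_ops".into(), p.ops as f64);
    m.insert("total_taobench_fast_ops".into(), p.fast_ops as f64);
    m.insert("total_taobench_slow_ops".into(), p.slow_ops as f64);
    m.insert("total_taobench_get_cmds".into(), p.get_cmds as f64);
    m.insert("total_taobench_get_hits".into(), p.get_hits as f64);
    // ns -> s is applied exactly once, here.
    m.insert("total_taobench_wall_sec".into(), p.elapsed_ns as f64 / 1e9);
}

/// Bare num/den; skipped when a component is missing or the denominator is zero.
fn rate(m: &Metrics, num: &str, den: &str) -> Option<f64> {
    let (n, d) = (*m.get(num)?, *m.get(den)?);
    (d > 0.0).then(|| n / d)
}

/// Derive the five Rate keys from whichever components are present.
pub fn derive_rates(m: &mut Metrics) {
    const RULES: [(&str, &str, &str); 5] = [
        ("taobench_total_ops_per_sec", "total_taobench_ops", "total_taobench_wall_sec"),
        ("taobench_fast_ops_per_sec", "total_taobench_fast_ops", "total_taobench_wall_sec"),
        ("taobench_slow_ops_per_sec", "total_taobench_slow_ops", "total_taobench_wall_sec"),
        ("taobench_hit_fraction", "total_taobench_fast_ops", "total_taobench_ops"),
        ("taobench_command_hit_rate", "total_taobench_get_hits", "total_taobench_get_cmds"),
    ];
    for (out, num, den) in RULES {
        if let Some(v) = rate(m, num, den) {
            m.insert(out.to_string(), v);
        }
    }
}
```

With a measured window but zero completed ops, this sketch yields the three per-second keys at 0 and leaves both hit keys absent, matching the behaviour described above.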
The ns→s conversion (/1e9) is applied ONCE here, not in derive_rate_metrics
(which computes a bare num/den), mirroring total_cpu_time_sec. Across runs the
components SUM-fold (Counter), so each Rate re-pools as Σnumerator / Σdenominator
over the cohort: the aggregate throughput / hit rate, not a mean of per-run values.
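A small sketch of why SUM-folding the components across runs differs from averaging per-run rates. `RunCounters`, `pooled_rate`, and `mean_of_rates` are hypothetical names for illustration, and the numbers used afterwards are made up, not measurements.

```rust
/// Hypothetical per-run pair of Counter components (ops and wall seconds).
#[derive(Debug, Clone, Copy)]
pub struct RunCounters {
    pub ops: f64,
    pub wall_sec: f64,
}

/// Pooled rate: sum the numerators and denominators first, then divide once.
pub fn pooled_rate(runs: &[RunCounters]) -> Option<f64> {
    let ops: f64 = runs.iter().map(|r| r.ops).sum();
    let wall: f64 = runs.iter().map(|r| r.wall_sec).sum();
    (wall > 0.0).then(|| ops / wall)
}

/// Mean of per-run rates, shown only for contrast with the pooled form.
pub fn mean_of_rates(runs: &[RunCounters]) -> Option<f64> {
    let rates: Vec<f64> = runs
        .iter()
        .filter(|r| r.wall_sec > 0.0)
        .map(|r| r.ops / r.wall_sec)
        .collect();
    (!rates.is_empty()).then(|| rates.iter().sum::<f64>() / rates.len() as f64)
}
```

For two invented runs of 100 ops in 1 s and 900 ops in 3 s, the pooled rate is 1000 / 4 = 250 ops/s, while the mean of the per-run rates is (100 + 300) / 2 = 200 ops/s. The pooled form weights each run by its wall time, which is what summing Counter components gives.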