pub fn build_phase_buckets_with_stimulus(
    samples: &SampleSeries,
    stimulus_events: &[StimulusEvent],
) -> Vec<PhaseBucket>
Phase buckets attributed against the guest stimulus timeline, then
enriched with stimulus-event-derived per-phase iteration_rate.
Unlike the plain build_phase_buckets (which groups by the
bridge-stamped step_index), this re-groups each periodic capture by
the guest step whose stimulus window contains the capture’s
workload-relative boundary offset (Sample::boundary_offset_ms).
That offset is derived from the boundary schedule rather than the
fire time, so it is immune to the deferred-fire burst that makes
every capture stamp the same late CURRENT_STEP (the
phases.len() == 1 collapse). Captures with no offset (on-demand /
fixture) fall back to their stamped step_index. Because the bucket
windows are then workload-relative, the run-relative monitor samples
are shifted by the stimulus/monitor clock skew before windowing.
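
A minimal sketch of the window-based attribution described above. `StepWindow`, `Capture`, and `attribute_step` are hypothetical stand-ins, not the crate's real types. The half-open window and the fallback for an offset that lands in no window are assumptions made for illustration.

```rust
/// Hypothetical stand-in for one guest step's stimulus window,
/// expressed in workload-relative milliseconds.
#[derive(Debug, Clone, Copy)]
pub struct StepWindow {
    pub step_index: usize,
    pub start_ms: u64,
    pub end_ms: u64, // exclusive (assumption)
}

/// Hypothetical stand-in for a capture; only the fields used here.
#[derive(Debug, Clone, Copy)]
pub struct Capture {
    /// Step stamped by the bridge at fire time (may be late).
    pub step_index: usize,
    /// Schedule-derived offset; `None` for on-demand / fixture captures.
    pub boundary_offset_ms: Option<u64>,
}

/// Pick the step whose window contains the capture's boundary offset.
/// Falls back to the stamped step when there is no offset, and
/// (assumption) also when no window contains the offset.
pub fn attribute_step(capture: &Capture, windows: &[StepWindow]) -> usize {
    capture
        .boundary_offset_ms
        .and_then(|off| {
            windows
                .iter()
                .find(|w| w.start_ms <= off && off < w.end_ms)
                .map(|w| w.step_index)
        })
        .unwrap_or(capture.step_index)
}
```

In this sketch a capture stamped with a late step but carrying an offset inside step 0's window is attributed to step 0, which is the behavior that avoids the single-phase collapse.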
Additionally synthesizes a capture-free PhaseBucket
(sample_count == 0) for any stimulus StepStart-step that
captured no periodic samples — the uniform whole-workload boundary
placement (compute_periodic_boundaries_ns) is step-agnostic, so a
short interior step can land zero captures and otherwise leave no
bucket, silently dropping its capture-independent iteration_rate.
The synthesized bucket carries the step’s full stimulus window so its
iteration_rate (from StepStart/StepEnd deltas) and
avg_imbalance_ratio (from in-window monitor samples) are still
recovered. The returned vec therefore holds one bucket per
(captured phase ∪ StepStart-step), sorted by step_index — NOT
one-per-captured-phase, so len() is no longer “number of captured
phases”.
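
The union rule above can be sketched as follows. `Bucket` and `union_buckets` are hypothetical names, and the real `PhaseBucket` carries far more than a sample count; this only illustrates why the result length can exceed the number of captured phases.

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced stand-in for `PhaseBucket`.
#[derive(Debug, Clone, PartialEq)]
pub struct Bucket {
    pub step_index: usize,
    pub sample_count: usize,
}

/// Build one bucket per (captured step ∪ StepStart step), sorted by step.
/// `captured` holds (step_index, sample_count) pairs; `step_start_steps`
/// holds every step that emitted a StepStart event.
pub fn union_buckets(captured: &[(usize, usize)], step_start_steps: &[usize]) -> Vec<Bucket> {
    // BTreeMap keeps the keys ordered, so the output is sorted by step_index.
    let mut by_step: BTreeMap<usize, usize> = BTreeMap::new();
    for &(step, count) in captured {
        *by_step.entry(step).or_insert(0) += count;
    }
    // A step with a StepStart but no captures gets a zero-capture bucket.
    for &step in step_start_steps {
        by_step.entry(step).or_insert(0);
    }
    by_step
        .into_iter()
        .map(|(step_index, sample_count)| Bucket { step_index, sample_count })
        .collect()
}
```

With captures in steps 0 and 2 and StepStart events for steps 0, 1, and 2, the sketch yields three buckets, the middle one with `sample_count == 0`.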
The iteration_rate enrichment lets
crate::timeline::Timeline::from_phase_buckets render the per-phase
throughput annotation without going through the legacy
crate::timeline::Timeline::build path.
For each StepStart[k] -> StepEnd[k] pair with
total_iterations: Some(_), the per-phase rate is
(later - earlier) / duration_s where duration_s is the
elapsed-ms delta BETWEEN THE TWO STIMULUS EVENTS (guest clock),
not the PhaseBucket sample window. The rate is attributed to the
step the EARLIER event starts (prev.step_index); the attribution
loop skips any is_step_end (or is_terminal) prev, so only a
StepStart is ever the earlier member. Phases that don’t overlap a
stimulus pair keep their PhaseBucket.metrics map unchanged (no
iteration_rate key).
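
A sketch of the pair walk described above, under simplifying assumptions: `Kind`, `Event`, and `step_rates` are hypothetical stand-ins for the crate's event type and attribution loop, and the sketch returns a ready ratio for readability even though the real producer emits the two components separately. Skipping zero-length windows is also an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    StepStart,
    StepEnd,
    Terminal,
}

/// Hypothetical, reduced stand-in for `StimulusEvent`.
#[derive(Debug, Clone, Copy)]
pub struct Event {
    pub kind: Kind,
    pub step_index: usize,
    pub elapsed_ms: u64, // guest clock
    pub total_iterations: Option<u64>,
}

/// Return (step_index, iterations per second) for each adjacent pair
/// whose earlier member is a StepStart.
pub fn step_rates(events: &[Event]) -> Vec<(usize, f64)> {
    let mut out = Vec::new();
    for pair in events.windows(2) {
        let (prev, next) = (pair[0], pair[1]);
        // Guard: a StepEnd or terminal event is never the earlier member,
        // which drops the inter-step StepEnd[k] -> StepStart[k+1] pair.
        if prev.kind != Kind::StepStart {
            continue;
        }
        let (earlier, later) = match (prev.total_iterations, next.total_iterations) {
            (Some(e), Some(l)) => (e, l),
            _ => continue,
        };
        // Duration is the delta between the two stimulus events,
        // not the bucket's sample window.
        let duration_ms = next.elapsed_ms.saturating_sub(prev.elapsed_ms);
        if duration_ms == 0 {
            continue; // assumption: skip degenerate windows
        }
        let iterations = later.saturating_sub(earlier) as f64;
        let seconds = duration_ms as f64 / 1000.0; // ms -> s
        out.push((prev.step_index, iterations / seconds));
    }
    out
}
```

Because the rate is attributed to `prev.step_index`, a step with no StepStart that has a successor simply produces no entry, matching the "metrics map unchanged" case.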
SEMANTICS: total_iterations is the sum of the worker handles
alive at each event (see
crate::timeline::StimulusEvent::total_iterations). Each step’s
rate is its STEP-LOCAL StepStart[k] -> StepEnd[k] delta — the
step’s own workers measured over its own hold — so a bucket is
sourced ONLY by its own pair (the is_step_end guard drops the
inter-step StepEnd[k] -> StepStart[k+1] pair entirely). This
measures BOTH fresh-per-step workers (which read ~0 at each
StepStart, so the old cross-step delta produced no rate) and
persistent (Backdrop) workers (excluding the inter-step teardown
wall-time a cross-step window would span). On a clean run the
(StepEnd[N], terminal) pair is guard-skipped and the trailing
is_terminal event is not consumed; it supplies a step’s right
boundary ONLY for legacy/synthetic data carrying a ScenarioEnd
frame but no StepEnd frames. A sched-died step has neither frame
(its early return skips both emissions), so the dead step’s
StepStart is never a prev with a successor and it reports no
rate.
iteration_rate is registered as MetricKind::Rate with the Counter
components total_phase_iterations / total_phase_duration_sec and
HigherBetter polarity (more throughput is better). The per-step
producer below emits those two components (the iteration delta and the
window seconds — the ms→s /1000 applied at the component, since
derive_rate_metrics does a bare num/den) rather than a ready ratio,
and the derive_rate_metrics post-pass re-derives iteration_rate =
Σiterations / Σseconds at every in-map aggregation level. Its per-run
run-scalar fold (one run’s per-phase values → that run’s ext_metrics)
runs through populate_run_ext_metrics_from_phases, which SUMS the
Counter components across phases (a synthesized zero-capture phase’s
components are summed in, not zero-weighted out — the run-aggregate
completion of the per-step rate handling) and re-derives the rate. The
cross-sidecar-run rollup group_and_average_by likewise re-pools via
its derive_rate_metrics post-pass. iteration_rate has no cross-cgroup
axis to re-pool: it is derived from run-level phase buckets and
host-injected into the run ext_metrics by
populate_run_ext_metrics_from_phases (the eval layer) AFTER the
cross-cgroup merge, so AssertResult::merge’s worst-case
(min/max-by-polarity) ext_metrics fold never sees its components. The
rate whose components ARE per-cgroup is the separate
pooled iterations_per_cpu_sec, re-pooled across a run’s cgroups by
populate_run_pooled_iterations_per_cpu_sec (reading stats.cgroups
post-merge).
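
To illustrate why the components are summed and the rate re-derived rather than averaging per-phase ratios, here is a small sketch. `PhaseComponents` and `pooled_rate` are hypothetical names, not the crate's API, and returning `None` for a zero denominator is an assumption (the text above describes a bare num/den).

```rust
/// Hypothetical stand-in for one phase's Counter components.
#[derive(Debug, Clone, Copy)]
pub struct PhaseComponents {
    pub iterations: f64,   // stands in for total_phase_iterations
    pub duration_sec: f64, // stands in for total_phase_duration_sec
}

/// Pool the components across phases, then derive the rate once:
/// Σiterations / Σseconds.
pub fn pooled_rate(phases: &[PhaseComponents]) -> Option<f64> {
    let iterations: f64 = phases.iter().map(|p| p.iterations).sum();
    let seconds: f64 = phases.iter().map(|p| p.duration_sec).sum();
    if seconds > 0.0 {
        Some(iterations / seconds)
    } else {
        None // assumption: no rate when there is no measured time
    }
}
```

For phases of 1000 iterations over 2 s and 250 iterations over 1 s, pooling gives 1250 / 3 iterations per second, whereas a plain mean of the per-phase ratios (500 and 250) would give 375 and over-weight the shorter phase.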
Live caller: evaluate_vm_result at src/test_support/eval/mod.rs
— has both the SampleSeries and the stimulus_events vec in scope.