build_phase_buckets_with_stimulus

Function build_phase_buckets_with_stimulus 

pub fn build_phase_buckets_with_stimulus(
    samples: &SampleSeries,
    stimulus_events: &[StimulusEvent],
) -> Vec<PhaseBucket>

Phase buckets attributed against the guest stimulus timeline, then enriched with stimulus-event-derived per-phase iteration_rate.

Unlike the plain build_phase_buckets (which groups by the bridge-stamped step_index), this re-groups each periodic capture by the guest step whose stimulus window contains the capture’s workload-relative boundary offset (Sample::boundary_offset_ms). That offset is derived from the boundary schedule rather than the fire time, so it is immune to the deferred-fire burst that makes every capture stamp the same late CURRENT_STEP (the phases.len() == 1 collapse). Captures with no offset (on-demand / fixture) fall back to their stamped step_index. Because the bucket windows are then workload-relative, the run-relative monitor samples are shifted by the stimulus/monitor clock skew before windowing.
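The window lookup described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `StepWindow`, `Capture`, and `attribute_step` are hypothetical stand-ins, the half-open `[start_ms, end_ms)` window is an assumption, and so is the choice to fall back to the stamped index when an offset lands in no window.

```rust
/// Hypothetical stand-in for one guest step's stimulus window,
/// expressed in workload-relative milliseconds.
#[derive(Debug, Clone, Copy)]
pub struct StepWindow {
    pub step_index: usize,
    pub start_ms: u64,
    /// Exclusive upper bound (an assumption of this sketch).
    pub end_ms: u64,
}

/// Hypothetical stand-in for a capture: the bridge-stamped step plus the
/// optional workload-relative boundary offset.
#[derive(Debug, Clone, Copy)]
pub struct Capture {
    pub stamped_step_index: usize,
    pub boundary_offset_ms: Option<u64>,
}

/// Pick the step whose window contains the capture's boundary offset.
/// Captures without an offset (on-demand / fixture) keep their stamped step.
/// Offsets that match no window also keep the stamped step (assumption).
pub fn attribute_step(capture: &Capture, windows: &[StepWindow]) -> usize {
    capture
        .boundary_offset_ms
        .and_then(|off| {
            windows
                .iter()
                .find(|w| w.start_ms <= off && off < w.end_ms)
                .map(|w| w.step_index)
        })
        .unwrap_or(capture.stamped_step_index)
}
```

In this sketch a capture stamped with a late step during a deferred-fire burst is still attributed to the earlier step when its boundary offset falls in that step's window, which is the property the paragraph above relies on.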

Additionally synthesizes a capture-free PhaseBucket (sample_count == 0) for every stimulus step that has a StepStart but captured no periodic samples. The uniform whole-workload boundary placement (compute_periodic_boundaries_ns) is step-agnostic, so a short interior step can land zero captures; without a synthesized bucket that step would have no bucket at all, and its capture-independent iteration_rate would be silently dropped. The synthesized bucket carries the step’s full stimulus window, so its iteration_rate (from StepStart/StepEnd deltas) and avg_imbalance_ratio (from in-window monitor samples) are still recovered. The returned vec therefore holds one bucket per (captured phase ∪ StepStart-step), sorted by step_index. It is NOT one-per-captured-phase, so len() is no longer “number of captured phases”.
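The union of captured phases and StepStart steps can be sketched as below. The `Bucket` type, the `(step_index, start_ms, end_ms)` tuple shape, and `fill_missing_steps` are hypothetical names used only for illustration; the real PhaseBucket carries more fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced stand-in for a phase bucket.
#[derive(Debug, Clone, PartialEq)]
pub struct Bucket {
    pub step_index: usize,
    pub sample_count: usize,
    pub start_ms: u64,
    pub end_ms: u64,
}

/// Return one bucket per (captured phase ∪ step with a StepStart),
/// sorted by step_index. Steps with no captures get a zero-sample
/// bucket spanning their full stimulus window.
pub fn fill_missing_steps(
    captured: Vec<Bucket>,
    step_windows: &[(usize, u64, u64)],
) -> Vec<Bucket> {
    // BTreeMap keeps the result ordered by step_index.
    let mut by_step: BTreeMap<usize, Bucket> =
        captured.into_iter().map(|b| (b.step_index, b)).collect();

    for &(step_index, start_ms, end_ms) in step_windows {
        // Only synthesize when no captured bucket exists for this step.
        by_step.entry(step_index).or_insert(Bucket {
            step_index,
            sample_count: 0,
            start_ms,
            end_ms,
        });
    }
    by_step.into_values().collect()
}
```

Callers that previously read `len()` as the number of captured phases would, under this sketch, need to count buckets with `sample_count > 0` instead.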

The iteration_rate enrichment lets crate::timeline::Timeline::from_phase_buckets render the per-phase throughput annotation without going through the legacy crate::timeline::Timeline::build path.

For each StepStart[k] -> StepEnd[k] pair whose events carry total_iterations: Some(_), the per-phase rate is (later - earlier) / duration_s, where later and earlier are the total_iterations values of the two events and duration_s is the elapsed-ms delta BETWEEN THE TWO STIMULUS EVENTS (guest clock) expressed in seconds, not the PhaseBucket sample window. The rate is attributed to the step the EARLIER event starts (prev.step_index); the attribution loop skips any prev that is_step_end (or is_terminal), so only a StepStart is ever the earlier member. Phases that don’t overlap a stimulus pair keep their PhaseBucket.metrics map unchanged (no iteration_rate key).
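A minimal sketch of the pairing rule follows. `Kind`, `Event`, and `step_rates` are hypothetical stand-ins for the real stimulus types; the saturating subtraction and the skip on a zero-length window are assumptions of this sketch, not documented behavior.

```rust
/// Hypothetical event kinds mirroring StepStart / StepEnd / terminal.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    StepStart,
    StepEnd,
    Terminal,
}

/// Hypothetical, reduced stand-in for a stimulus event.
#[derive(Debug, Clone, Copy)]
pub struct Event {
    pub kind: Kind,
    pub step_index: usize,
    pub elapsed_ms: u64,
    pub total_iterations: Option<u64>,
}

/// For each adjacent (prev, next) pair whose prev is a StepStart, compute
/// (later - earlier) / duration_s and attribute it to prev.step_index.
pub fn step_rates(events: &[Event]) -> Vec<(usize, f64)> {
    let mut out = Vec::new();
    for pair in events.windows(2) {
        let (prev, next) = (pair[0], pair[1]);
        // Guard: a StepEnd or terminal event is never the earlier member,
        // which drops the inter-step StepEnd[k] -> StepStart[k+1] pair.
        if prev.kind != Kind::StepStart {
            continue;
        }
        if let (Some(earlier), Some(later)) = (prev.total_iterations, next.total_iterations) {
            let duration_ms = next.elapsed_ms.saturating_sub(prev.elapsed_ms);
            if duration_ms == 0 {
                continue; // assumption: no rate for a zero-length window
            }
            let delta = later.saturating_sub(earlier) as f64;
            let duration_s = duration_ms as f64 / 1000.0; // guest-clock ms to seconds
            out.push((prev.step_index, delta / duration_s));
        }
    }
    out
}
```

For a StepStart at 0 ms reading 0 iterations and its StepEnd at 2000 ms reading 100, the sketch yields 100 / 2.0 = 50 iterations per second for that step, independent of how many periodic captures landed in it.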

SEMANTICS: total_iterations is summed over the worker handles alive at each event (see crate::timeline::StimulusEvent::total_iterations). Each step’s rate is its STEP-LOCAL StepStart[k] -> StepEnd[k] delta, that is, the step’s own workers measured over its own hold, so a bucket is sourced ONLY by its own pair (the is_step_end guard drops the inter-step StepEnd[k] -> StepStart[k+1] pair entirely). This measures BOTH kinds of worker: fresh-per-step workers, which read ~0 at each StepStart (so the old cross-step delta produced no rate for them), and persistent (Backdrop) workers, for which it excludes the inter-step teardown wall-time that a cross-step window would span. On a clean run the (StepEnd[N], terminal) pair is guard-skipped and the trailing is_terminal event is not consumed; it supplies a step’s right boundary ONLY for legacy/synthetic data carrying a ScenarioEnd frame but no StepEnd frames. A sched-died step has neither frame (its early return skips both emissions), so the dead step’s StepStart is never a prev with a successor and it reports no rate.

iteration_rate is registered as MetricKind::Rate with the Counter components total_phase_iterations / total_phase_duration_sec and HigherBetter polarity (more throughput is better). The per-step producer below emits those two components rather than a ready ratio: the iteration delta and the window seconds, with the ms→s /1000 conversion applied at the component because derive_rate_metrics does a bare num/den. The derive_rate_metrics post-pass then re-derives iteration_rate = Σiterations / Σseconds at every in-map aggregation level.

Its per-run run-scalar fold (one run’s per-phase values → that run’s ext_metrics) runs through populate_run_ext_metrics_from_phases, which SUMS the Counter components across phases and re-derives the rate. A synthesized zero-capture phase’s components are summed in, not zero-weighted out; this is the run-aggregate completion of the per-step rate handling. The cross-sidecar-run rollup group_and_average_by likewise re-pools via its derive_rate_metrics post-pass.

iteration_rate has no cross-cgroup axis to re-pool: it is derived from run-level phase buckets and host-injected into the run ext_metrics by populate_run_ext_metrics_from_phases (the eval layer) AFTER the cross-cgroup merge, so AssertResult::merge’s worst-case (min/max-by-polarity) ext_metrics fold never sees its components. The rate whose components ARE per-cgroup is the separate pooled iterations_per_cpu_sec, re-pooled across a run’s cgroups by populate_run_pooled_iterations_per_cpu_sec (reading stats.cgroups post-merge).
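The reason for carrying components instead of a ready ratio can be illustrated with a small sketch. `PhaseComponents` and `pooled_rate` are hypothetical names, and returning `None` for a zero denominator is an assumption of this sketch rather than documented behavior of derive_rate_metrics.

```rust
/// Hypothetical pair of Counter components for one phase:
/// the iteration delta and the window length in seconds.
#[derive(Debug, Clone, Copy)]
pub struct PhaseComponents {
    pub iterations: f64,
    pub duration_sec: f64,
}

/// Re-derive the pooled rate as Σiterations / Σseconds.
/// Returns None when no time was accumulated (assumption).
pub fn pooled_rate(phases: &[PhaseComponents]) -> Option<f64> {
    let num: f64 = phases.iter().map(|p| p.iterations).sum();
    let den: f64 = phases.iter().map(|p| p.duration_sec).sum();
    if den > 0.0 {
        Some(num / den)
    } else {
        None
    }
}
```

With phases of 100 iterations over 2 s and 30 iterations over 1 s, pooling gives 130 / 3 ≈ 43.3 iterations per second, whereas averaging the two per-phase ratios (50 and 30) would give 40 and would over-weight the short phase.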

Live caller: evaluate_vm_result at src/test_support/eval/mod.rs — has both the SampleSeries and the stimulus_events vec in scope.