DREAMi-QME: An Offline, Reproducible Lindblad Engine for Controlled Open-Quantum Dynamics

Author: Jordon Morgan-Griffiths

Sim: https://dakariuish.itch.io/dreami

Affiliation: Founder, Independent Researcher, THE UISH (Independent)

Keywords: Lindblad master equation, open quantum systems, quantum control, fidelity, quantum Fisher information, reproducibility

Abstract

DREAMi-QME is a single-file, offline simulator for open-quantum dynamics under time-dependent controls. It implements the Lindblad master equation with configurable drift, control generators, and noise channels, then emits audited artifacts (time-series, manifest with seed and parameters, file hashes). Its purpose is not to “prove” a control policy; it is to generate physics-faithful trajectories that a label-neutral Validator can test and that ARLIT can assess for scale structure. This paper defines the model, numerics, physicality checks, metrics (mean and final fidelity, time-to-threshold, QFI), protocols to prevent p-hacking, and a simple error budget. In qubit regimes with dephasing and amplitude damping, the engine delivers stable, trace-preserving evolution and deterministic reproducibility from manifest + seed alone. This paper is about the engine. Policy wins and losses belong elsewhere.


  1. Motivation and Scope

Why DREAMi-QME exists
If you want control results anyone will trust, you must meet three requirements before you even think about p-values:

(1) a sound open-system model,
(2) time-dependent controls that couple realistically to the Hamiltonian, and
(3) auditable outputs that a third party can re-run identically.

Most toolchains limp on (3): missing seeds, hidden defaults, no manifest, no hashes, inconsistent export formats, and runs that cannot be reproduced. DREAMi-QME fixes that by design. It is a reproducibility-first Lindblad engine: deterministic numerics; physics guardrails; and a forensics trail (manifest + seed + file hashes) that makes cheating expensive and verification cheap.

What DREAMi-QME is (and is not)
• It is a physics engine that evolves density matrices for open quantum systems under user-defined controls and noise.
• It is not a “policy that wins.” It does not claim performance of Q-TRACE, QEC, or any alternative. It only produces trajectories that are later judged by the (separate) Validator.
• It is single-file, offline, and auditable. Artifacts (CSV, PNG, JSON manifest) are sufficient for an independent rerun that must bit-match the originals.
• It interoperates cleanly: the Validator handles label-neutral A/B statistics; ARLIT tests whether the information signal exhibits scale structure rather than a narrow, overfit pocket.

Threat model (what reviewers will attack, and how this engine pre-empts it)
• “Hidden knobs”: All parameters (T, Δt, H0, control operators, noise rates, seeds, integrator) are recorded in the manifest.
• “Cherry-picked runs”: The manifest includes seeds and a session ID; batch scripts can enumerate them.
• “Numerical artifacts”: Engine enforces Hermiticity, trace preservation, and positivity (or fails fast with explicit flags). Step-halving convergence is an expected report, not an afterthought.
• “Unclear metric definitions”: Mean fidelity, final-time fidelity, and time-to-threshold are defined precisely here and exported explicitly.
• “Scale brittleness”: DREAMi-QME exports the per-window summaries ARLIT needs; if the effect vanishes under rescaling, ARLIT will show it.

Design goals (what success looks like)
• Physics-faithful: standard Lindblad form with user-set Hamiltonian and dissipators; optional drift frames; physically meaningful units.
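• Deterministic: all randomness rooted in a manifest seed; the same platform and engine version should yield identical CSV and hashes, and different platforms should agree within the tolerance stated in Section 3.3.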
• Guardrailed: numerical steps that endanger positivity or trace prompt backoff or failure, logged in the artifacts.
• Portable: single HTML build for demos; scriptable headless mode for batch generation.
• Transparent: no implied defaults that change behavior silently; everything materialized into the manifest.

Non-goals (what this paper will not do)
• No performance claims for any control policy. Those appear in Validator results (separate).
• No hardware claims. Simulated data only; hardware introduces calibration, leakage, T1/T2 drift, crosstalk, etc.
• No special pleading for one noise model. We present standard channels but allow user-defined dissipators.

Positioning vs existing tools
• There are faster HPC solvers and larger Hilbert-space stacks. DREAMi-QME optimizes for auditability and offline integrity. It is the engine you can hand to a skeptical reviewer with the guarantee: “Recreate our CSV from the manifest or reject the run.”

Scope of Section 2 (what follows)
We define the state, dynamics, Hamiltonian/control structure, dissipators, targets, and metrics. We also spell out assumptions (Markovianity), common frames (lab vs rotating), control constraints (amplitude and slew), and pitfalls (positivity, secular approximation limits, stiff regimes).


  2. Model (What is being simulated)

2.1 State and basic objects
• State ρ(t): a d×d density matrix, Hermitian (ρ = ρ†), positive semidefinite (all eigenvalues ≥ 0), trace 1.
• Observables O: expectation ⟨O⟩ = Tr[O ρ].
• Target state ρ★: often a pure state |0⟩⟨0| for state-prep; could be a mixed or encoded logical state.

2.2 Dynamics: Lindblad master equation
We assume time-local Markovian evolution generated by the Gorini–Kossakowski–Sudarshan–Lindblad (GKSL) form. The equation of motion is:

dρ/dt = −i [H(t), ρ] + Σ_j γ_j ( L_j ρ L_j† − ½ { L_j† L_j, ρ } )

where:
• H(t) is the (possibly time-dependent) Hamiltonian; ħ = 1.
• L_j are Lindblad (jump) operators with non-negative rates γ_j ≥ 0.
• [A, B] = AB − BA (commutator), {A,B} = AB + BA (anti-commutator).

Interpretation
• The unitary part −i [H, ρ] captures coherent dynamics and control.
• The dissipative part models irreversible coupling to the environment (relaxation, dephasing, etc.).
• The generator yields a completely positive, trace-preserving (CPTP) evolution map when all γ_j ≥ 0.
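A minimal sketch of the right-hand side above in Python with NumPy, offered as an illustration rather than the engine's own code. The function name lindblad_rhs, the operator values, and the rates are assumptions chosen for the example, as is the basis ordering (|0⟩, |1⟩) with σ− = |0⟩⟨1|.

```python
import numpy as np

def lindblad_rhs(rho, H, jump_ops, rates):
    """Right-hand side of the GKSL equation with hbar = 1."""
    drho = -1j * (H @ rho - rho @ H)                  # unitary part: -i [H, rho]
    for L, gamma in zip(jump_ops, rates):
        if gamma < 0:
            raise ValueError("Lindblad rates must be non-negative")
        LdL = L.conj().T @ L
        # dissipator: gamma * (L rho L^dag - 1/2 {L^dag L, rho})
        drho = drho + gamma * (L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL))
    return drho

# Illustrative single-qubit setup; every number here is an arbitrary assumption.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)        # sigma_minus = |0><1|
H = 0.5 * 1.0 * sz + 0.3 * (0.5 * sx)                 # H0 + u_x * (sigma_x / 2)
rho = np.array([[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]], dtype=complex)
drho = lindblad_rhs(rho, H, [sm, sz], [0.05, 0.02])
```

Because both the commutator and each dissipator term are traceless, Tr(drho) should vanish up to rounding, and drho should be Hermitian whenever ρ and H are; these two properties make a cheap unit test for any implementation.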

2.3 Hamiltonian structure: drift + controlled terms
We decompose H(t) into static drift and driven controls:

H(t) = H0 + Σ_k u_k(t) H_k

• H0 (drift): includes detunings, static couplings, cross-terms that are always present. Example (single qubit): H0 = (Δ/2) σ_z.
• {H_k} (control generators): the operators actuated by control waveforms (e.g., σ_x/2, σ_y/2, σ_z/2, or multi-qubit couplings like σ_x⊗σ_x).
• u_k(t) (control waveforms): arbitrary time functions—piecewise-constant (pulse sequences), smooth splines, or policy outputs (e.g., Q-TRACE).

Engineering constraints on u_k(t) you should record and enforce:
• Amplitude bounds: |u_k(t)| ≤ u_max (hardware and safety).
• Bandwidth/slew: |du_k/dt| ≤ s_max (limits aliasing and staircase artifacts).
• Discretization: if using piecewise-constant controls, define the sub-step grid on which u_k(t) is sampled (Δt_ctrl). In stiff regimes, consider sub-stepping relative to the integrator step (Δt_int).
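One way to enforce the amplitude and slew bounds on a piecewise-constant waveform is sketched below. The function name enforce_bounds, the order of operations (amplitude clip first, then slew limit), and the clip counter are assumptions for illustration; the engine's actual clipping policy is whatever the manifest declares.

```python
import numpy as np

def enforce_bounds(u, dt_ctrl, u_max, s_max):
    """Clip |u| to u_max, then limit sample-to-sample change to s_max * dt_ctrl.

    Returns the bounded waveform and the number of samples that were altered,
    so that every clip can be logged.
    """
    u = np.asarray(u, dtype=float)
    out = np.empty_like(u)
    max_jump = s_max * dt_ctrl
    n_clips = 0
    prev = None
    for i, val in enumerate(u):
        new = min(max(val, -u_max), u_max)                           # amplitude bound
        if prev is not None:
            new = min(max(new, prev - max_jump), prev + max_jump)    # slew bound
        if new != val:
            n_clips += 1
        out[i] = new
        prev = new
    return out, n_clips

bounded, n_clips = enforce_bounds([0.0, 2.0, 2.0, -2.0], dt_ctrl=1.0, u_max=1.0, s_max=0.5)
```

Returning the clip count alongside the waveform keeps the bound enforcement visible in the artifacts instead of silently reshaping the control.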

Frames and approximations (be explicit)
• Lab frame vs rotating frame: it’s acceptable to move to a frame where H0 is simplified (e.g., remove large carrier frequency) provided you document the unitary transformation and adjust dissipators consistently.
• Rotating-wave/secular approximations: if employed, state the conditions (|Δ| large vs coupling) under which they hold; if those conditions are not met, do not apply the approximation.

2.4 Dissipators (noise channels)
Standard single-qubit channels (extendable to multi-qubit):

• Amplitude damping (energy relaxation): L↓ = σ− with rate γ↓ = 1/T1.
Effect: population decays toward |0⟩; off-diagonals damp as well.

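• Pure dephasing: Lφ = σ_z with rate γφ. In the convention of Section 2.2, off-diagonals decay at rate 2γφ, so γφ = 1/(2Tφ) if Tφ denotes the 1/e coherence decay time.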
Effect: off-diagonals decay; populations unchanged.

• General user-defined channels: any L_j with γ_j ≥ 0.
Examples: thermalization (L↑ = σ+ with γ↑ determined by temperature), correlated dephasing (Z⊗Z), leakage (projectors to non-computational levels), amplitude-damping on multiple qubits, etc.
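To anchor the rate conventions, the sketch below applies a single dissipator term from Section 2.2 to a test state and reads off the instantaneous decay rates. The helper name dissipator, the test state, and γ = 0.1 are illustrative assumptions; the basis ordering is (|0⟩, |1⟩) with σ− = |0⟩⟨1|.

```python
import numpy as np

def dissipator(rho, L, gamma):
    """One GKSL term: gamma * (L rho L^dag - 1/2 {L^dag L, rho})."""
    LdL = L.conj().T @ L
    return gamma * (L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL))

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)       # sigma_minus = |0><1|
rho = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)
gamma = 0.1                                          # arbitrary test rate

d_phi = dissipator(rho, sz, gamma)                   # pure dephasing
d_damp = dissipator(rho, sm, gamma)                  # amplitude damping

# Dephasing with L = sigma_z: coherence changes as -2 * gamma * rho_01, populations fixed.
coherence_rate_dephasing = d_phi[0, 1] / rho[0, 1]
# Damping with L = sigma_minus: |1> drains at gamma * rho_11, coherence at -gamma/2 * rho_01.
population_flow = d_damp[0, 0]
coherence_rate_damping = d_damp[0, 1] / rho[0, 1]
```

With these conventions the coherence decays at 2γ under σ_z dephasing and at γ/2 under amplitude damping, which is worth keeping in mind when converting measured T1 and T2 values into rates.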

Notes and pitfalls
• Correlated noise: if you include multi-qubit Lindblad operators (e.g., Z⊗Z), state the calibration source or motivation (measured crosstalk, common-mode fluctuations).
• Non-Markovianity: GKSL assumes a memoryless bath. If your data suggests colored noise (1/f, long tails), GKSL is an approximation; document it.
• Rate scales: tabulate γ_j relative to control bandwidth; extremely fast controls with large γ_j can produce stiffness (see numerics guardrails).

2.5 Target(s) and cost functions
Common target: ρ★ = |0⟩⟨0| (state preparation). Other targets:
• Arbitrary pure |ψ★⟩⟨ψ★| (rotation-prepared states, Bell states).
• Mixed or encoded states (thermal states, encoded logical states).
• Process goals (gate synthesis) are out of scope for this paper but the same engine applies to Choi-state dynamics if you model channels.

Primary metrics DREAMi-QME outputs natively:
• Fidelity trajectory F(t) = (Tr √{ √ρ★ ρ(t) √ρ★ })². For pure |0⟩, this reduces to F(t) = ⟨0|ρ(t)|0⟩.
• Mean fidelity (window average): F̄ = (1/(N+1)) Σ_i F(t_i).
• Final-time fidelity: F_final = F(t_N).
• Time-to-threshold: T_hit(τ) = min{ t_i : F(t_i) ≥ τ } (NaN if not reached).
• Quantum Fisher Information (QFI) for parameter θ: 𝓕_θ(ρ). Useful for sensitivity/scale analyses and consumed by ARLIT.
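The fidelity definitions above can be sketched with eigendecompositions alone. This is an illustrative implementation, not the engine's code; the names psd_sqrt, uhlmann_fidelity, and summarize are hypothetical, and small negative eigenvalues from rounding are clipped to zero inside the square roots only.

```python
import numpy as np

def psd_sqrt(A):
    """Matrix square root of a Hermitian positive semidefinite matrix."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)                  # guard against rounding-level negatives
    return (V * np.sqrt(w)) @ V.conj().T

def uhlmann_fidelity(rho, target):
    """F = (Tr sqrt( sqrt(target) rho sqrt(target) ))^2."""
    s = psd_sqrt(target)
    inner = s @ rho @ s
    inner = 0.5 * (inner + inner.conj().T)     # remove rounding-level asymmetry
    w = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(w)) ** 2)

def summarize(F):
    """Mean over all N+1 samples and the final-time value."""
    F = np.asarray(F, dtype=float)
    return {"mean": float(F.mean()), "final": float(F[-1])}

target = np.array([[1, 0], [0, 0]], dtype=complex)             # |0><0|
rho = np.array([[0.7, 0.1], [0.1, 0.3]], dtype=complex)
F_now = uhlmann_fidelity(rho, target)                          # equals <0|rho|0> for this target
stats = summarize([0.5, 0.8, 0.9, 1.0])
```

For a pure target the general formula collapses to the projector expectation, so the two routes can be cross-checked against each other in tests.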

2.6 Quantum Fisher Information (QFI) details
Definition (SLD form): 𝓕_θ(ρ) = Tr(ρ L_θ²), with ∂_θ ρ = (L_θ ρ + ρ L_θ)/2.
Two practical computation routes supported:
• Spectral route: diagonalize ρ = Σ_i λ_i |i⟩⟨i| and use closed-form expressions for L_θ’s matrix elements. Stable when eigenvalues are not extremely small; regularize near zero eigenvalues to avoid division blow-ups.
• Finite-difference route: approximate ∂_θ ρ with a symmetric difference at step δ (report δ; verify step-stability). This is slower but robust when spectral decomposition is ill-conditioned.
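A sketch of the spectral route follows, using the standard closed form 𝓕_θ = 2 Σ_{i,j} |⟨i|∂_θρ|j⟩|² / (λ_i + λ_j), with terms skipped when λ_i + λ_j falls below a cutoff. The function name qfi_spectral and the cutoff name eps_spec are assumptions, and the test case (a qubit rotated about z) is chosen only because its QFI is known analytically.

```python
import numpy as np

def qfi_spectral(rho, drho, eps_spec=1e-12):
    """SLD quantum Fisher information from rho and its parameter derivative drho."""
    lam, V = np.linalg.eigh(rho)
    d = V.conj().T @ drho @ V                  # drho expressed in the eigenbasis of rho
    denom = lam[:, None] + lam[None, :]
    mask = denom > eps_spec                    # regularization: drop near-zero denominators
    return float(2.0 * np.sum(np.abs(d[mask]) ** 2 / denom[mask]))

# Test family: rho(theta) = U rho0 U^dag with U = exp(-i theta G), G = sigma_z / 2,
# so drho = -i [G, rho]. For the pure state |+> the QFI is 4 Var(G) = 1.
G = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
qfi_pure = qfi_spectral(plus, -1j * (G @ plus - plus @ G))

# Mixing with the identity (weight p on |+>) reduces the QFI to p**2.
p = 0.8
mixed = p * plus + (1 - p) * 0.5 * np.eye(2)
qfi_mixed = qfi_spectral(mixed, -1j * (G @ mixed - mixed @ G))
```

The mask is where the regularization margin enters; recording eps_spec alongside the result lets a reviewer see how close the state came to rank deficiency.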

When to care:
• If you’re probing parameter identifiability or scale structure (ARLIT), QFI(t) and its window integrals are informative.
• If your goal is pure state prep speed, QFI is secondary to F(t) and T_hit(τ), but exporting it costs little and helps establish structure vs overfit.

2.7 Assumptions, units, and parameterization (be explicit)
• Units: set ħ = 1. Express time in the same units as your control sampling grid. State units for Δ, γ_j, and control amplitudes (all as angular frequencies).
• Bounds: document u_max and (if imposed) slew bounds; otherwise readers will assume unconstrained controls.
• Initial conditions: ρ(0). If you include jitter or mixedness, tie it to the seed and record its distribution in the manifest.
• Side effects: any smoothing, windowing, filtering of u_k(t) must be recorded (parameters + order) because it changes effective bandwidth.
• Frames: if you simulate in a rotating frame, state the transformation U(t) and how it is applied so reviewers can map back to lab observables.

2.8 Worked examples (single qubit, to anchor expectations)
• Drift-dominated regime: H0 = (Δ/2) σ_z with |Δ| ≫ |u_k|. Controls must be narrowband and longer; expect slower T_hit(0.99) and heavier dephasing penalties.
• Control-dominated regime: |u_x| or |u_y| ≫ |Δ|. Fast state prep is possible; if γφ is high, you’ll see diminishing returns unless you shorten T.
• Damping-limited regime: γ↓ large. Even perfect control cannot exceed the relaxation envelope; mean fidelity benefits from reaching the target quickly and then holding with minimal drive (drive can heat).
• Mixed noise: both γ↓ and γφ non-negligible. Expect trade-offs between speed (to beat damping) and smoothness (to avoid coherence loss). DREAMi-QME is the engine to generate these curves; the Validator later decides if “policy A beats B.”

2.9 Multi-qubit note (what changes, what doesn’t)
• State dimension grows as d = 2^n; ρ becomes 2^n × 2^n.
• H0 can include ZZ, XX, or cross-coupling terms; controls may be local or entangling (e.g., tunable couplers).
• Dissipators can be local (σ−⊗I), correlated (Z⊗Z), or leakage projectors to non-computational levels.
• Numerics get heavier; positivity checks become stricter to avoid spurious negativity from coarse steps.
• Everything else—guardrails, manifest discipline, metrics, QFI export—remains the same.

2.10 What could go wrong (and how DREAMi-QME surfaces it)
• Positivity breach from aggressive Δt or stiff dissipators → engine halves the step and retries; persistent failure flags and aborts the run with a clear reason in the manifest/log.
• Trace drift under long horizons → explicit renormalization with a warning counter; if the counter trips often, your step is too large (and that fact is visible in the artifacts).
• Numerical instability near rank-deficient ρ for spectral QFI → regularization margin is recorded; you can switch to finite-difference QFI and report the δ used.
• Hidden randomness → impossible by design: any random element pulls from the session seed stored in the manifest.
• Silent frame changes → barred: if you choose a rotating frame, the manifest records the frame and parameters.

2.11 Exactly how this feeds Validator and ARLIT
• Validator consumption: it takes two control policies (e.g., Q-TRACE vs baseline), runs each arm through this engine with the same manifest (apart from the control law), and outputs Δ metrics (mean fidelity, final fidelity, time-to-0.99), bootstrap CIs, and an optional permutation p-value. Side randomization is handled by the Validator; the engine stays policy-agnostic.
• ARLIT consumption: it takes QFI(t) and other multi-resolution summaries exported by the engine, learns a renormalizer on a train window, and checks out-of-sample flatness. If ARLIT fails, your “effect” is likely a narrow sweet spot, not a scale-robust structure.



  3. Numerics and Integrity

Overview. DREAMi-QME is a Lindblad integrator built to keep three invariants under control at every step: trace ≈ 1, Hermiticity preserved, and positivity not violated beyond numerical slack. Everything in this section exists to guarantee those invariants while remaining reproducible from a manifest and seed.

3.1 Time stepping

State representation. The engine integrates the density operator ρ(t) in either (a) operator form with explicit matrix operations, or (b) Liouville (vectorized) form where ρ is column-stacked and the generator L_t acts as a superoperator. The choice is recorded in the manifest; Liouville form is preferred for clarity and for easy composition of Hamiltonian and dissipative parts.

Discretization options. You choose one in the manifest.

• Fixed-step RK4 (default). A classic fourth-order Runge–Kutta on ρ with per-substage evaluations of the time-local generator L_t[·]. Fixed step is ideal for controlled sweeps and for reproducibility because the sample grid is identical across runs.

• Strang splitting for stiff dissipators (optional). When dissipators dominate (large γ_j or fast relaxation), the generator splits naturally into H (Hamiltonian part) and D (dissipator). Strang splitting composes exp(Δt D/2) exp(Δt H) exp(Δt D/2), which improves stability and reduces commutator error without resorting to tiny steps.

• Exponential step via expm/Krylov (optional for small d). For few-level systems, a matrix exponential of the superoperator (or a Krylov approximation thereof) can provide excellent stability. Each step costs more than an RK4 step, which adds up over many calls, but the update is CPTP to numerical precision when the generator L_t is constant over the step. When L_t varies within the step (time-dependent controls), the exponential is applied on sub-intervals where L_t is approximately constant; those sub-intervals are recorded.
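The fixed-step RK4 option can be sketched in operator form as below. This illustrates the technique under stated assumptions, not the engine's source: the names lindblad_rhs, rk4_step, and H_of_t are hypothetical, and the check uses pure amplitude damping because the excited population then follows exp(−γ t) exactly.

```python
import numpy as np

def lindblad_rhs(rho, H, jump_ops, rates):
    """GKSL right-hand side with hbar = 1."""
    drho = -1j * (H @ rho - rho @ H)
    for L, gamma in zip(jump_ops, rates):
        LdL = L.conj().T @ L
        drho = drho + gamma * (L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL))
    return drho

def rk4_step(rho, t, dt, H_of_t, jump_ops, rates):
    """One classical RK4 step; H_of_t is evaluated at each substage time."""
    def f(tt, r):
        return lindblad_rhs(r, H_of_t(tt), jump_ops, rates)
    k1 = f(t, rho)
    k2 = f(t + 0.5 * dt, rho + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, rho + 0.5 * dt * k2)
    k4 = f(t + dt, rho + dt * k3)
    return rho + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Check against a known result: amplitude damping only, no Hamiltonian.
sm = np.array([[0, 1], [0, 0]], dtype=complex)       # sigma_minus = |0><1|
gamma, dt, n_steps = 0.1, 0.01, 1000                 # arbitrary test values, T = 10
rho = np.array([[0, 0], [0, 1]], dtype=complex)      # start in |1><1|

def H_zero(t):
    return np.zeros((2, 2), dtype=complex)

for n in range(n_steps):
    rho = rk4_step(rho, n * dt, dt, H_zero, [sm], [gamma])
excited = rho[1, 1].real                             # compare with exp(-gamma * T)
```

RK4 is linear in ρ, so it preserves the trace up to rounding, but it does not guarantee positivity; that is why the per-step physicality checks in Section 3.2 remain necessary.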

Safety-capped variable-step (optional). If enabled, DREAMi-QME proposes a step size from a local error estimate (embedded RK or norm of the residual), but caps any increase by a factor (e.g., ×1.5) and halves on error. The cap is essential for reproducibility and to prevent step “runaway.” Variable-step runs are still deterministic because the decision rule depends only on state, parameters, and seed.
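The capped step rule can be written as a pure function of the current step and error estimate, which is what keeps it deterministic. The sketch below is one plausible form, not the engine's rule: the name propose_step, the safety factor 0.9, and the exponent 1/(order + 1) are conventional choices assumed here.

```python
def propose_step(dt, err, tol, grow_cap=1.5, safety=0.9, order=4):
    """Return (next_dt, accepted). Halve on error; otherwise grow by at most grow_cap."""
    if err > tol:
        return 0.5 * dt, False                       # reject: retry with half the step
    if err == 0.0:
        return grow_cap * dt, True                   # no estimate to scale from; use the cap
    factor = safety * (tol / err) ** (1.0 / (order + 1))
    return min(factor, grow_cap) * dt, True          # accept: growth limited by the cap

rejected = propose_step(0.1, err=1e-3, tol=1e-6)
grown = propose_step(0.1, err=1e-12, tol=1e-6)
```

Since the function has no hidden state, replaying the same sequence of error estimates reproduces the same step sequence, which can then be logged in the manifest.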

Sampling cadence vs integration step. The integrator step Δt_int can be smaller than the reporting cadence Δt_out. DREAMi-QME integrates at Δt_int and outputs at integer multiples to avoid aliasing. Both values sit in the manifest; reviewers can check that sampling is not so coarse that features of F(t) are missed.

Convergence evidence. A simulation is not credible without a step-size sanity check. The engine provides a step-halving protocol: repeat the run at Δt_int and Δt_int/2 with identical controls and seed, then compute shifts in mean fidelity and final fidelity. The manifest stores both runs’ hashes. A typical acceptance band is mean fidelity shift ≤ 1e-4 absolute and final fidelity shift ≤ 1e-4; users can tighten this for sensitive studies.
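The step-halving protocol can be sketched end to end on a small test problem. Everything specific here is an assumption for illustration: the function names, the constant drive, the rates, and the horizon. Only the comparison rule (mean and final fidelity shifts against a 1e-4 band) comes from the text.

```python
import numpy as np

def rhs(rho, H, jump_ops, rates):
    drho = -1j * (H @ rho - rho @ H)
    for L, g in zip(jump_ops, rates):
        LdL = L.conj().T @ L
        drho = drho + g * (L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL))
    return drho

def run_fidelity(dt_int, dt_out, T, H, jump_ops, rates, rho0, target):
    """Fixed-step RK4 at dt_int; sample F = Tr[target rho] (pure target) every dt_out."""
    n_sub = int(round(dt_out / dt_int))      # integrator steps per output sample
    n_out = int(round(T / dt_out))
    rho = rho0.copy()
    F = [np.trace(target @ rho).real]
    for _ in range(n_out):
        for _ in range(n_sub):
            k1 = rhs(rho, H, jump_ops, rates)
            k2 = rhs(rho + 0.5 * dt_int * k1, H, jump_ops, rates)
            k3 = rhs(rho + 0.5 * dt_int * k2, H, jump_ops, rates)
            k4 = rhs(rho + dt_int * k3, H, jump_ops, rates)
            rho = rho + (dt_int / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        F.append(np.trace(target @ rho).real)
    return np.array(F)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)
H = 0.5 * 0.2 * sz + 0.5 * 1.0 * sx                   # detuning 0.2, constant drive 1.0
rho0 = np.array([[0, 0], [0, 1]], dtype=complex)      # start in |1><1|
target = np.array([[1, 0], [0, 0]], dtype=complex)    # |0><0|
args = (0.1, 10.0, H, [sm, sz], [0.02, 0.01], rho0, target)

F_a = run_fidelity(0.05, *args)                       # Run A: dt_int
F_b = run_fidelity(0.025, *args)                      # Run B: dt_int / 2
delta_mean = abs(F_a.mean() - F_b.mean())
delta_final = abs(F_a[-1] - F_b[-1])
converged = bool(delta_mean <= 1e-4 and delta_final <= 1e-4)
```

Both runs share the same output grid, so the mean is taken over identical sample times and the deltas reflect integration error alone.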

3.2 Physicality checks (per step)

Trace preservation. After each update, the engine computes Tr ρ. If |Tr ρ − 1| ≤ ε_tr (default 1e-10), the step passes. If it exceeds ε_tr but remains within a soft bound (e.g., ≤ 1e-8), the engine renormalizes ρ → ρ / Tr ρ and emits a warning into the log (and a counter into the manifest). If it exceeds the soft bound, the engine backs off the step (e.g., halves Δt_int), recomputes, and, if necessary, fails the run with a clear “trace control breached” flag.

Hermiticity enforcement. Numerical noise can slightly break Hermiticity. The engine symmetrizes ρ ← (ρ + ρ†)/2 after every step. The Frobenius norm of the anti-Hermitian residue ‖ρ − ρ†‖_F, measured before symmetrization, is tracked; if it exceeds ε_H (default 1e-12 per step, 1e-9 accumulated), the step is retried with a smaller Δt_int. Persistent failure ends the run.

Positivity control. Positivity is monitored by the minimum eigenvalue λ_min of ρ. The allowed slack is ε_+ (default 1e-10). If λ_min ≥ −ε_+, the step passes. If λ_min ∈ [−10 ε_+, −ε_+), the step is retried with Δt_int/2. If λ_min < −10 ε_+, the run fails with a “positivity breach” flag. DREAMi-QME does not silently “clip” negative eigenvalues because that hides dynamics; it prefers backoff or failure so reviewers can see the problem. The manifest records the count of backoffs and any failure mode.

Dissipator sanity checks. For each Lindblad channel (L_j, γ_j), the engine verifies γ_j ≥ 0 at parse time. If time-scheduled rates γ_j(t) are supplied, the schedule is sampled on the integration grid and stored; any negative value triggers an immediate error. When frames are used, the engine checks that L_j are specified in the declared frame (lab vs rotating) to avoid inconsistent modeling.
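The trace, Hermiticity, and positivity rules above can be collected into a single per-step check. The sketch below is an assumed arrangement of those rules, not the engine's code: the function name check_step and the four action labels are hypothetical, while the default tolerances are the ones quoted in this section.

```python
import numpy as np

def check_step(rho, eps_tr=1e-10, soft_tr=1e-8, eps_h=1e-12, eps_pos=1e-10):
    """Return (rho_checked, action); action is 'pass', 'renormalized', 'backoff' or 'fail'."""
    # Hermiticity: measure the residue first, then symmetrize.
    if np.linalg.norm(rho - rho.conj().T, "fro") > eps_h:
        return rho, "backoff"
    rho = 0.5 * (rho + rho.conj().T)

    # Trace: pass, renormalize inside the soft bound, otherwise back off.
    action = "pass"
    tr = np.trace(rho).real
    if abs(tr - 1.0) > soft_tr:
        return rho, "backoff"
    if abs(tr - 1.0) > eps_tr:
        rho = rho / tr
        action = "renormalized"

    # Positivity: never clip; back off on mild breaches, fail on hard ones.
    lam_min = np.linalg.eigvalsh(rho)[0]
    if lam_min < -10.0 * eps_pos:
        return rho, "fail"
    if lam_min < -eps_pos:
        return rho, "backoff"
    return rho, action

ok_state, ok_action = check_step(np.diag([0.5, 0.5]).astype(complex))
fixed_state, fixed_action = check_step(np.diag([0.5, 0.5 + 5e-9]).astype(complex))
```

A caller would count the "renormalized" and "backoff" outcomes and write those counters to the manifest, so that a run which passed only after many interventions is visibly different from a clean one.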

3.3 Determinism and seeds

Single source of randomness. All stochastic elements—initialization jitter, noisy policy exploration, randomized control choices—derive from a single session seed (64-bit integer) recorded in the manifest. Substreams (for different modules) are deterministically derived from that seed (e.g., via a counter-based PRNG). No call in the engine draws from an unseeded system entropy source.
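One way to derive deterministic substreams from a single session seed is NumPy's SeedSequence with a counter-based bit generator (Philox). This is a sketch of the idea, not a statement of what the engine uses internally; the function name make_streams, the stream names, and the seed value are hypothetical.

```python
import numpy as np

def make_streams(session_seed, names):
    """Derive one independent, reproducible generator per named module."""
    root = np.random.SeedSequence(session_seed)
    children = root.spawn(len(names))                # deterministic child seeds
    return {name: np.random.Generator(np.random.Philox(child))
            for name, child in zip(names, children)}

names = ["init_jitter", "policy_noise"]
streams = make_streams(20240517, names)
jitter = streams["init_jitter"].normal(0.0, 1e-3, size=3)

# Rebuilding the streams from the same seed reproduces the same draws.
jitter_again = make_streams(20240517, names)["init_jitter"].normal(0.0, 1e-3, size=3)
```

Because each module draws from its own stream, adding or removing random calls in one module does not shift the numbers seen by another, which keeps reruns comparable when only part of the pipeline changes.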

Cross-platform reproducibility. Floating-point differences across BLAS/LAPACK implementations can, in principle, change bitwise results. DREAMi-QME aims for bit-identical CSV given the same platform and version; it aims for value-identical (within 1e-12 absolute on primary observables) across platforms. The manifest optionally records OS, BLAS, and CPU flags to aid reviewers.

Logging and provenance. Every warning (trace renormalization, step backoff, schedule clipping), every integrator choice, and every tolerance is included in the run log and summarized in the manifest. If a run fails, the failure code and last good timestamp appear in the manifest. This is how you prevent “silent success.”

3.4 Numerical conditioning and performance

Precision. Double precision is standard. Single precision is not recommended because positivity control becomes fragile for small populations; if a user insists, the manifest must say so and the acceptance tolerances should be loosened accordingly. Mixed precision is unsupported in the core.

Operator scaling. For very large H(t) norms or stiff dissipators, the engine can scale operators or rescale time internally for the exponential stepper; any such rescaling is declared in the manifest to avoid the “hidden frame” problem.

Sparsity and structure. For multi-qubit models with tensor-product structure, operators are stored sparsely and multiplied using structure-aware kernels. This is an implementation detail; from the user’s perspective it changes only performance, not physics.

3.5 Frames, units, and aliasing

Frames. The engine runs either in the lab frame or a declared rotating frame. It never changes frames implicitly. If a rotating frame is used, H_0 and H_k must already include the transformation; L_j should be expressed in the same frame.

Units. Time unit (e.g., microseconds), frequency unit (e.g., MHz or rad/s), and amplitude units are explicit. The manifest carries a single “unit table,” so t, Δt, T, γ_j, and control amplitudes are all interpretable outside the code.

Aliasing. If controls are fast relative to Δt_out, aliasing can corrupt mean fidelity. DREAMi-QME therefore requires Δt_out ≤ (1/10) of the smallest control timescale or an explicit waiver in the manifest; otherwise it refuses to run.

3.6 What the engine refuses to do

It will not silently stabilize a pathological setup (e.g., negative rates, nonsensical units); it will not “clip and continue” on hard positivity breaches; and it will not average away bad numerics. Failure is a feature: it surfaces unrealistic configurations early, with a readable reason.

  4. Metrics and Outputs

Purpose. The engine’s job is to produce trajectories and well-defined observables that downstream tools can consume. It does not decide winners; it equips a validator with the raw material to do so and equips ARLIT with summaries for scale-law tests.

4.1 Core observables

Fidelity trajectory F(t). Computed as the Uhlmann fidelity between ρ(t) and the declared target ρ★. For pure targets (e.g., |0⟩⟨0|), this reduces to a projector expectation value: F(t) = ⟨0|ρ(t)|0⟩. DREAMi-QME samples F(t) on the output grid t_0, …, t_N with cadence Δt_out. If interpolation is ever used (e.g., for a reported event time), the method (linear by default) is noted.

Mean fidelity F̄. Defined as the arithmetic average over all sample points: F̄ = (1/(N+1)) Σ_i F(t_i). This is sensitive to sampling cadence; therefore Δt_out is included in the manifest and is expected to be ≤ the smallest relevant control timescale/10, as noted above.

Final-time fidelity F_final. The fidelity at t_N. This matters when protocols emphasize endpoint preparation quality or holding behavior.

Time-to-threshold T_hit(τ). The earliest sample time t_i with F(t_i) ≥ τ. If F never reaches τ, T_hit is NaN and an indicator flag is set. For better temporal resolution, linear interpolation between t_i and t_{i+1} is used when F crosses τ between samples; the interpolation rule is documented in the manifest. Thresholds used (e.g., τ = 0.95, 0.99) are listed explicitly to prevent post hoc shopping.
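A sketch of the threshold-crossing rule, including the NaN case and the optional linear interpolation, is given below. The function name time_to_threshold and the interpolate flag are assumptions for illustration.

```python
import numpy as np

def time_to_threshold(t, F, tau, interpolate=True):
    """First time F reaches tau; NaN if it never does."""
    t = np.asarray(t, dtype=float)
    F = np.asarray(F, dtype=float)
    hits = np.nonzero(F >= tau)[0]
    if hits.size == 0:
        return float("nan")                          # threshold not reached
    i = int(hits[0])
    if i == 0 or not interpolate:
        return float(t[i])                           # earliest qualifying sample
    # F[i-1] < tau <= F[i], so the denominator is strictly positive.
    frac = (tau - F[i - 1]) / (F[i] - F[i - 1])
    return float(t[i - 1] + frac * (t[i] - t[i - 1]))

t = [0.0, 1.0, 2.0, 3.0]
F = [0.5, 0.9, 1.0, 0.99]
t_interp = time_to_threshold(t, F, 0.95)                       # between t = 1 and t = 2
t_sample = time_to_threshold(t, F, 0.95, interpolate=False)    # sample time only
t_never = time_to_threshold(t, F, 1.5)                         # NaN
```

Returning NaN rather than the horizon T keeps unreached thresholds from being mistaken for slow successes in downstream statistics.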

Quantum Fisher Information QFI(t; θ). For a chosen parameter θ (e.g., detuning Δ, a control amplitude bound, or a noise rate), DREAMi-QME can report QFI(t). Two methods exist:

• Spectral SLD route: diagonalize ρ(t), solve for L_θ from ∂_θ ρ = (L_θ ρ + ρ L_θ)/2, and compute Tr[ρ L_θ²]. Stable when eigenvalues are not tiny; includes a safeguard for near-zero eigenvalues (discard or regularize terms below a manifest-declared ε_spec).

• Finite-difference route: compute ρ(θ + δ) and ρ(θ − δ) at the same time t and estimate ∂_θ ρ ≈ [ρ(θ + δ) − ρ(θ − δ)]/(2 δ). The step δ is in the manifest; users can sweep δ to check stability, and the engine can auto-reject δ if QFI varies by more than a tolerance across neighboring δ values.
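The finite-difference route and its δ-stability check can be sketched on a toy state family with a known answer. In the engine, ρ(θ ± δ) would come from separate simulation runs; here a closed-form ρ(θ) stands in for them. The names qfi_spectral, rho_of_theta, and qfi_finite_difference are hypothetical, and the family (a partially depolarized |+⟩ rotated about z, with analytic QFI p²) is an assumption chosen for testability.

```python
import numpy as np

def qfi_spectral(rho, drho, eps_spec=1e-12):
    """QFI = 2 * sum_{i,j} |<i|drho|j>|^2 / (lam_i + lam_j), skipping tiny denominators."""
    lam, V = np.linalg.eigh(rho)
    d = V.conj().T @ drho @ V
    denom = lam[:, None] + lam[None, :]
    mask = denom > eps_spec
    return float(2.0 * np.sum(np.abs(d[mask]) ** 2 / denom[mask]))

def rho_of_theta(theta, p=0.8):
    """Toy family: weight p on |+> rotated about z by theta, weight 1 - p on I/2."""
    c = 0.5 * p * np.exp(-1j * theta)
    return np.array([[0.5, c], [np.conj(c), 0.5]], dtype=complex)

def qfi_finite_difference(theta, delta):
    """Symmetric difference for d(rho)/d(theta), then the spectral formula."""
    drho = (rho_of_theta(theta + delta) - rho_of_theta(theta - delta)) / (2.0 * delta)
    return qfi_spectral(rho_of_theta(theta), drho)

deltas = [1e-3, 1e-4, 1e-5]                          # sweep to check step stability
estimates = [qfi_finite_difference(0.3, d) for d in deltas]
spread = max(estimates) - min(estimates)             # small spread means delta is safe
```

A large spread across neighboring δ values signals either truncation error (δ too large) or rounding error (δ too small), which is the condition under which an auto-reject rule would fire.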

Integrated or windowed QFI. For ARLIT, the engine can produce integrated QFI over windows (e.g., sliding or dyadic windows) and multi-resolution summaries. Window definitions (lengths, overlaps) are in the manifest; this prevents “scale shopping.”
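Dyadic windowing can be sketched as follows: level k splits the run into 2^k equal, non-overlapping windows and integrates the signal over each with the trapezoid rule. The function name dyadic_window_integrals and the requirement that the number of sampling intervals be divisible by 2^k are assumptions of this sketch, not documented engine behavior.

```python
import numpy as np

def dyadic_window_integrals(t, q, n_levels):
    """Map level k -> list of trapezoid integrals of q over 2**k equal windows."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    n_int = len(t) - 1                               # number of sampling intervals
    out = {}
    for k in range(n_levels):
        n_win = 2 ** k
        if n_int % n_win:
            raise ValueError("number of intervals must be divisible by 2**k")
        w = n_int // n_win                           # intervals per window
        vals = []
        for j in range(n_win):
            sl = slice(j * w, (j + 1) * w + 1)       # windows share their boundary sample
            tt, qq = t[sl], q[sl]
            vals.append(float(np.sum(0.5 * (qq[1:] + qq[:-1]) * np.diff(tt))))
        out[k] = vals
    return out

t = np.linspace(0.0, 8.0, 9)
qfi_t = np.ones_like(t)                              # placeholder signal for the example
windows = dyadic_window_integrals(t, qfi_t, n_levels=3)
```

Because the windows tile the run exactly, the integrals at every level sum to the same total, which gives a simple consistency check before the summaries are handed to ARLIT.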

Additional optional observables. Purity Tr ρ², leakage population (if a leakage subspace is modeled), energy expectation Tr [H(t) ρ(t)], and control effort proxies (e.g., ∫ |u_k(t)| dt or ∫ u_k(t)² dt) can be computed and exported. These are not used by the validator by default but can be useful for diagnostics and energy-budget discussions.

4.2 CSV schema and manifest content

CSV (time series). The engine writes a tabular file with columns:

• t (time in declared units).
• F (fidelity to ρ★).
• QFI_theta (if enabled; may be multiple columns if several θ are requested).
• purity, leakage, energy, and any other requested observable.
• Optionally, control snapshots u_k(t_i) if “export_controls” is set (useful but can be large).

Every column has units and a short description in a header block. The CSV is designed to be consumed by the validator with no guessing.

Manifest (JSON). The manifest is the forensic spine. It contains:

• Versioning: engine version string and semantic hash, schema version.
• System: Hilbert space dimension d; basis conventions; frame (lab or rotating) and frame definition; units table (time, frequency, amplitude).
• Dynamics: H_0, the list of H_k (names and operator descriptions), control parameterization (segments, splines, or policy), control limits (amplitude caps, slew/jerk), and any clipping behavior.
• Noise: list of (L_j, γ_j) with textual operator definitions and numeric rates; if scheduled, the schedule definition and grid.
• Numerics: integrator choice (rk4, strang, expm/krylov), Δt_int, Δt_out, horizon T, tolerances (ε_tr, ε_H, ε_+), backoff factors, and whether variable-step is enabled with its safety cap.
• Seed and streams: session seed, substream derivation info if used.
• Outputs: which observables are computed (F, QFI, purity, leakage, etc.), which thresholds τ for T_hit, window definitions for integrated metrics.
• Provenance and platform: optional OS, BLAS/LAPACK, CPU flags, and build id to help explain tiny cross-platform differences.
• Hashes: SHA-256 of the CSV and of any PNGs; optional HMAC if tamper-evidence is required; timestamp of run.

4.3 Hashing, HMAC, and integrity

SHA-256. For each artifact (CSV, plot images), the engine computes a SHA-256 digest and records it in the manifest. This allows anyone to verify that the files have not been altered and that a re-run produced exactly the same outputs.

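HMAC (optional). A plain hash cannot detect tampering if someone replaces both the CSV and the hash recorded for it. If a keyed digest is desired (e.g., to prevent a third party from swapping in a modified CSV together with a recomputed hash), a secret key can be used to produce an HMAC that only key holders can generate or verify. This is rarely needed in open science but available when provenance must be tightly controlled.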
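Both digests are available in the Python standard library. The sketch below is illustrative only: the manifest fields, the file name run.csv, and the key are hypothetical, and a real run would hash the file bytes exactly as written to disk.

```python
import hashlib
import hmac
import json

def sha256_hex(data: bytes) -> str:
    """Unkeyed digest: anyone can recompute and compare it."""
    return hashlib.sha256(data).hexdigest()

def hmac_hex(key: bytes, data: bytes) -> str:
    """Keyed digest: only holders of the key can produce or check it."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify(csv_bytes: bytes, manifest: dict, name: str = "run.csv") -> bool:
    """Compare a recomputed hash with the manifest entry in constant time."""
    return hmac.compare_digest(sha256_hex(csv_bytes), manifest["hashes"][name])

csv_bytes = b"t,F\n0.0,0.5\n0.1,0.6\n"               # stand-in for the exported CSV
manifest = {
    "seed": 12345,
    "integrator": "rk4",
    "hashes": {"run.csv": sha256_hex(csv_bytes)},
}
manifest_text = json.dumps(manifest, sort_keys=True, indent=2)

intact = verify(csv_bytes, manifest)
tampered = verify(csv_bytes + b"0.2,0.99\n", manifest)
tag = hmac_hex(b"example-secret-key", csv_bytes)
```

Serializing the manifest with sorted keys keeps its own byte representation stable, which matters if the manifest itself is hashed or signed.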

4.4 Plots and visualization

Fidelity panel. A simple F(t) vs t plot with axes labeled with units, target line(s) at declared thresholds τ, and markers at threshold crossing times if applicable. The y-axis is constrained to [0, 1].

Race panel (optional). A visual “A/B race” is available for sanity checks even though the validator runs comparisons formally. It overlays two F(t) trajectories from two policies under identical conditions. The panel’s legend never names the policies; it labels them “Arm-L” and “Arm-R” unless the manifest explicitly asks to reveal identities. This prevents unconscious bias during exploratory analysis.

QFI panel (optional). QFI(t) vs t or integrated QFI vs window scale. ARLIT consumes the numeric summaries; the panel is a human sanity check.

4.5 Event handling and corner cases

Never reaching the threshold. If F(t) never crosses τ, the engine sets T_hit = NaN and flags “threshold not reached.” This avoids fabricating a race by extrapolation.

Multiple crossings and oscillations. If F crosses τ multiple times due to ringing, T_hit is the first crossing by definition. If a “hold time” at or above τ is also desired (e.g., must remain ≥ τ for at least ΔT_hold), the engine can compute and export that as an additional event; the hold window is in the manifest.

Discontinuities in controls. Piecewise-constant controls introduce kinks in H(t). The engine ensures that output sampling includes control breakpoints so that F(t) is not sampled only between discontinuities. If a policy emits discontinuous u_k(t) at arbitrary times, those times are added to the output grid.
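Merging control breakpoints into the output grid can be sketched with a sorted union. The function name output_grid and the rounding to 12 decimals (used here to avoid near-duplicate times caused by floating-point representation) are assumptions of this sketch.

```python
import numpy as np

def output_grid(T, dt_out, breakpoints=()):
    """Regular grid on [0, T] plus any control breakpoints that fall inside it."""
    n = int(round(T / dt_out))
    base = np.round(np.arange(n + 1) * dt_out, 12)
    inside = [b for b in breakpoints if 0.0 <= b <= T]
    bp = np.round(np.asarray(inside, dtype=float), 12)
    return np.union1d(base, bp)                      # sorted, duplicates removed

grid = output_grid(1.0, 0.25, breakpoints=[0.3, 0.5, 2.0])
```

The breakpoint at 0.5 coincides with a regular sample and is not duplicated, while the one at 2.0 lies beyond the horizon and is dropped.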

4.6 What the engine does not compute (by design)

Confidence intervals, p-values, and permutation tests belong to the validator, which compares paired arms fairly. DREAMi-QME provides the raw observables; it does not run statistics. Likewise, it does not infer classical Fisher information (requires a measurement model) unless a POVM and readout model are supplied, which is outside the engine’s scope.

4.7 Quality gates and acceptance criteria

For an exported run to be considered “engine-clean,” DREAMi-QME expects the following to be true (and records evidence):

• Trace control: max |Tr ρ − 1| over the run ≤ 1e-8 after at most occasional renormalizations, with the count of renormalizations ≤ a small integer (e.g., ≤ 3 per 10⁴ steps).
• Hermiticity: accumulated anti-Hermitian norm stays ≤ 1e-9.
• Positivity: no step requires more than one backoff; no hard breaches.
• Convergence: step-halving shifts on mean and final fidelity within the preset band (default 1e-4 absolute).
• Reproducibility: CSV and plots match hashes on re-run; the seed is present; the unit table is consistent.

4.8 Aggregation for downstream tools

While the validator reads the CSV directly, the engine also writes a tiny JSON “summary” for convenience containing mean fidelity, final fidelity, threshold hits for each τ, integrated QFI statistics, and the convergence deltas from step-halving if the user ran that protocol. This summary is redundant with CSV but saves time and prevents accidental recomputation differences.

Summary for Sections 3–4. DREAMi-QME integrates the Lindblad equation with explicit control and noise models, guards the three physical invariants at each step, and emits a complete, seeded artifact trail (CSV, manifest, hashes, and optional plots). Its observables—F(t), mean and final fidelity, time-to-threshold, and QFI (plus optional purity/leakage)—are defined tightly enough that a validator can compute statistics without ambiguity and ARLIT can test scale behavior without “shopping” windows. The engine fails loudly rather than masking numerical or modeling mistakes, because failure here is cheaper than retraction later.



  5. Protocols (Anti p-hacking by construction)

Goal. Lock analysis choices and run conditions before you see outcomes; ship enough context that anyone can re-run the exact experiment and get the same artifacts. If a result changes when you wiggle post hoc decisions, it wasn’t robust.

5.1 Pre-registration (before any run)
• Declare objectives: which phenomenon you are testing (e.g., “Does control law C raise mean fidelity at T = 120?”).
• Fix primary metric(s): choose exactly one primary and (optionally) two secondaries, with thresholds if applicable. Typical set:
– Primary: mean fidelity over [0, T] at cadence Δt_out.
– Secondary A: final-time fidelity at t = T.
– Secondary B: time-to-threshold T_hit at τ ∈ {0.95, 0.99}.
• Fix horizon and sampling: T, Δt_int, Δt_out; commit to a step-halving check (Δt_int vs Δt_int/2).
• Fix noise: list all (L_j, γ_j) and any schedules γ_j(t). No “adaptive” noise unless it is fully specified and deterministic.
• Fix controls: parameterization (segments/splines/policy), bounds (amplitude, slew), and any smoothing.
• Fix inclusion/exclusion: what constitutes an analyzable run (e.g., positivity breach → run discarded and rerun with smaller Δt_int, but counted as a failure in the manifest).
• Fix seeds: range or list of seeds; how many runs; how seeds map to repeats.
• Fix plots and summaries: which figures will be produced (F(t), race panel optional), and which CSV columns are required.

5.2 Manifest discipline (what must be locked)
• Units and frame: time unit, frequency unit, amplitude unit, lab vs rotating frame.
• System: dimension d, basis, H₀, H_k definitions.
• Noise: L_j definitions, γ_j values/schedules.
• Controls: u_k(t) parameterization, grid/knots, bounds, clipping policy, and whether clipping is allowed; if allowed, log every clip.
• Numerics: integrator (rk4/strang/expm), Δt_int, Δt_out, tolerances (ε_tr, ε_H, ε_+), backoff factor, variable-step flag and safety cap.
• Seed: 64-bit session seed and (if used) substream derivation method.
• Outputs: which observables (F, QFI, purity, leakage), which τ values for T_hit, window definitions for any integrated metrics.
• Provenance: engine version, OS/BLAS optional, timestamp.
• Integrity: SHA-256 for CSV and PNG; optional HMAC.
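To make the list above concrete, here is a minimal sketch of what a locked manifest could look like and how it could be serialized canonically so that the manifest itself can be hashed. The key names, nesting, and placeholder values ("arbitrary" units, "X.Y.Z" version) are hypothetical illustrations, not the engine's actual schema.

```python
import hashlib
import json

# Hypothetical manifest layout: key names and values are illustrative only.
manifest = {
    "units": {"time": "arbitrary", "frequency": "arbitrary", "amplitude": "arbitrary"},
    "frame": "rotating",
    "system": {"d": 2, "basis": "computational",
               "H0": "(Delta/2) sigma_z",
               "H_k": ["(1/2) sigma_x", "(1/2) sigma_y"]},
    "noise": [{"L": "sigma_minus", "gamma": 1e-3},
              {"L": "sigma_z", "gamma": 1e-3}],
    "controls": {"parameterization": "segments", "u_max": 0.1,
                 "clipping_allowed": False},
    "numerics": {"integrator": "rk4", "dt_int": 1e-3, "dt_out": 1e-1, "T": 120.0,
                 "eps_tr": 1e-8, "eps_H": 1e-9, "eps_pos": 1e-10},
    "seed": 1234567890123456789,          # 64-bit session seed
    "outputs": {"observables": ["F"], "tau": [0.95, 0.99]},
    "provenance": {"engine_version": "X.Y.Z"},
}

def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so equal manifests give equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

manifest_digest = hashlib.sha256(canonical_bytes(manifest)).hexdigest()
```

Sorting keys before hashing is a design choice: two manifests with the same content but different key order then produce the same digest, which keeps the integrity check about content rather than formatting.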

5.3 Blinding and role separation
• QME vs Validator. QME generates trajectories and never labels “winners.” The Validator consumes paired outputs and computes Δ, CIs, and a PASS/FAIL. Keep these roles separate to avoid unintentional bias (e.g., peeking at outcomes to tweak Δt_out).
• Identity reveal policy. If a race panel is rendered for human sanity, label arms “L/R” until analysis is locked. Only reveal policy names after statistics are computed.

5.4 Step-halving convergence protocol
• Run A: Δt_int, fixed T, identical manifest otherwise.
• Run B: Δt_int/2, identical everything.
• Compute Δ_mean = |F̄_A − F̄_B| (difference in mean fidelity) and Δ_final = |F_A(T) − F_B(T)|.
• Acceptance: both ≤ 1e−4 absolute (tighten if you need higher precision).
• Report: include both hashes and the deltas in a “convergence” row of your supplement.
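The gate above is a few lines of arithmetic. The sketch below assumes both runs were exported on the same output grid (same Δt_out and T) and that "mean fidelity" is the plain sample mean over that grid; the function names are illustrative.

```python
import numpy as np

def step_halving_deltas(F_a, F_b):
    """Return (delta_mean, delta_final) for two runs sampled on the same output grid."""
    F_a = np.asarray(F_a, dtype=float)
    F_b = np.asarray(F_b, dtype=float)
    if F_a.shape != F_b.shape:
        raise ValueError("runs must share the same output grid (same dt_out and T)")
    delta_mean = abs(F_a.mean() - F_b.mean())
    delta_final = abs(F_a[-1] - F_b[-1])
    return float(delta_mean), float(delta_final)

def passes_gate(F_a, F_b, tol=1e-4):
    """True if both deltas are within the absolute tolerance."""
    delta_mean, delta_final = step_halving_deltas(F_a, F_b)
    return delta_mean <= tol and delta_final <= tol
```

If the two runs were exported on different grids, resample onto the coarser one before comparing; silently truncating one series would hide exactly the discrepancy the gate is meant to catch.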

5.5 Threshold and window commitments
• Thresholds: declare τ values (e.g., 0.95, 0.99) up front; do not shop thresholds post hoc.
• Windows: if you report integrated metrics (e.g., integrated QFI over windows), declare window sizes and overlaps up front; use dyadic scales if you plan ARLIT.

5.6 Outlier and failure policy
• Positivity breaches: one backoff retry allowed; if breach persists, the run is marked “failed” and excluded from analysis, but the failure is counted in a run summary table.
• Trace drift: if renormalization events exceed a cap (e.g., > 3 per 10⁴ steps), the run is flagged; you may rerun with smaller Δt_int but you must report the flag.
• Unreached thresholds (T_hit): set to NaN; do not extrapolate. Count N_NaN in the summary.

5.7 Reproducibility checks
• Same-platform rerun: verify identical CSV bytes (matching SHA-256).
• Cross-platform rerun: verify numeric equivalence within 1e−12 on primary observables; report any tiny deviations and platform notes.

5.8 Governance: what you are not allowed to change after the fact
• Metrics, thresholds, T, Δt_int/out, noise, control bounds, and inclusion/exclusion rules.
• The only post hoc changes allowed: (i) bug fixes with a bumped engine version and explicit note; (ii) tighter tolerances or smaller Δt_int as a robustness test (never looser).

5.9 Deliverables per session
• Manifest JSON; CSV time series; PNG plots; summary JSON (mean fidelity F̄, F_final, T_hit values for each τ, convergence deltas); SHA-256 (and HMAC if used).
• A brief session note: number of seeds, number of failures (and reasons), number of renormalizations/backoffs.

5.10 Minimal public report (so others can rerun)
• Zip containing /manifest.json, /timeseries.csv, /plots/*.png, /sha256.txt, and /README with units and usage.
• One-page “how to rerun” with seed, command, and expected hashes.

Bottom line for protocols. Pre-register, lock the manifest, run, export, and let the Validator do the judging. If you are tempted to “just tweak Δt_out a bit,” you’ve already lost the argument—don’t.

  6. Validation (Engine, not policy)

Goal. Prove the engine is numerically sound, physically faithful within the stated model, and reproducible. These checks test the integrator and guardrails—not whether a control policy is good.

6.1 Convergence and stability
• Step convergence (required). Perform the Δt_int vs Δt_int/2 run. Acceptance: |ΔF̄| ≤ 1e−4 and |ΔF_final| ≤ 1e−4, where F̄ is the mean fidelity.
• Stability under horizon extension. Repeat with T doubled; qualitative behavior should persist, and invariants must remain within tolerance.
• Variable-step sanity (if enabled). Confirm that the variable-step path yields a mean fidelity F̄ within 1e−4 of the fixed-step reference at comparable work.

6.2 Invariant preservation
• Trace control. Max |Tr ρ − 1| over the run ≤ 1e−8 with ≤ 3 renormalizations per 10⁴ steps; record counts.
• Hermiticity. Accumulated ‖ρ − ρ†‖_F ≤ 1e−9.
• Positivity. No hard breaches; at most one backoff per step when λ_min dips into the slack band; record counts.
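The three invariants can be measured directly from ρ. The sketch below is one possible reading of the positivity policy, assuming λ_min ≥ −ε_+ counts as fine, the band −10 ε_+ ≤ λ_min < −ε_+ triggers a backoff, and anything lower is a hard breach; the function names and return labels are illustrative.

```python
import numpy as np

def measure_invariants(rho):
    """Return (trace deviation, Hermiticity defect, minimum eigenvalue) for a density matrix."""
    trace_dev = abs(np.trace(rho) - 1.0)
    herm_defect = np.linalg.norm(rho - rho.conj().T, ord="fro")
    # Use the Hermitian part so a tiny anti-Hermitian residue does not confuse eigvalsh.
    lam_min = np.linalg.eigvalsh(0.5 * (rho + rho.conj().T)).min()
    return float(trace_dev), float(herm_defect), float(lam_min)

def classify_positivity(lam_min, eps_pos=1e-10):
    """Map the minimum eigenvalue to 'ok', 'backoff', or 'hard_breach' (assumed band layout)."""
    if lam_min >= -eps_pos:
        return "ok"
    if lam_min >= -10.0 * eps_pos:
        return "backoff"
    return "hard_breach"
```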

6.3 Dissipator sanity and CPTP structure
• Non-negativity of rates. All γ_j ≥ 0; if scheduled, the sampled γ_j(t) are checked; any negative entry aborts.
• Channel composition. For multiple channels, verify that the composite dissipator is constructed as Σ_j γ_j D[L_j] with D the standard Lindblad form; no cross-terms unless explicitly modeled.
• Thermal detailed balance (optional test). For a thermal model with σ⁻ and σ⁺, check that the fixed point matches the Gibbs state for the declared rates.
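For reference, the composite generator described above can be written in a few lines. This is a sketch of the standard Lindblad form, not the engine's internal code; it assumes channels are passed as (L_j, γ_j) pairs and that σ⁻ = |0⟩⟨1| in the ordering (|0⟩, |1⟩), so damping drives population toward |0⟩.

```python
import numpy as np

def dissipator(L, rho):
    """Standard Lindblad form D[L]rho = L rho L^dagger - (1/2){L^dagger L, rho}."""
    LdL = L.conj().T @ L
    return L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)

def lindblad_rhs(H, channels, rho):
    """d rho/dt = -i[H, rho] + sum_j gamma_j D[L_j]rho, with channels = [(L_j, gamma_j), ...]."""
    for _, gamma in channels:
        if gamma < 0:
            raise ValueError("negative rate: every gamma_j must be >= 0")
    out = -1j * (H @ rho - rho @ H)
    for L, gamma in channels:
        out = out + gamma * dissipator(L, rho)   # plain sum, no cross-terms
    return out
```

With σ⁻ at rate γ↓ and σ⁺ at rate γ↑, the right-hand side vanishes on the diagonal state with populations p₁/p₀ = γ↑/γ↓, which is the detailed-balance check mentioned in the last bullet.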

6.4 Frames, units, and aliasing tests
• Frame audit. If a rotating frame is declared, verify that drift and control terms are consistent and that no hidden lab-frame remnants exist (e.g., by comparing with a lab-frame reference in a small case).
• Unit audit. Confirm that all reported times and rates match the unit table (basic dimensional checks).
• Aliasing check. Ensure Δt_out is fine enough relative to control bandwidth (≤ 1/10 of the smallest control timescale); if not, refuse the run.

6.5 Edge-case batteries
• Zero control. With u_k(t) = 0, confirm expected decay under noise (e.g., dephasing drives coherences to zero; amplitude damping drives to ground).
• Zero noise. With γ_j = 0, confirm unitary evolution and preservation of purity for pure initial states.
• Extreme detuning. Large Δ → fast precession; integrator handles the faster timescale without invariant drift.
• Short vs long horizons. T ∈ {short, nominal, long}; behavior scales sensibly; invariants hold.

6.6 Reproducibility
• Same-platform determinism. Two runs with the same manifest and seed produce identical CSV bytes; SHA-256 matches.
• Cross-platform reproducibility. On a second machine, observables match within numeric tolerance; if tiny differences appear (e.g., due to BLAS), document them and keep them below 1e−12 absolute on F and below 1e−10 on QFI (the latter is looser due to derivatives).

6.7 Numerical conditioning
• Conditioning of superoperators. Report condition numbers for key linear solves (e.g., SLD/QFI spectral routines); warn if ill-conditioned.
• Regularization policy for QFI. If spectral gaps collapse, document the ε_spec used to regularize tiny eigenvalues and show that QFI is stable to small changes in ε_spec.
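A minimal sketch of the spectral route with regularization follows. It uses the textbook eigenbasis formula QFI = 2 Σ_{i,j} |⟨i|∂_θρ|j⟩|² / (λ_i + λ_j), skips pairs with λ_i + λ_j ≤ ε_spec, and takes ∂_θρ from a central finite difference of step δ. Whether the engine uses exactly this route is not stated here; the function names and default values are assumptions.

```python
import numpy as np

def qfi_spectral(rho, drho, eps_spec=1e-12):
    """QFI from the eigen-decomposition of rho, skipping pairs with lam_i + lam_j <= eps_spec."""
    lam, vecs = np.linalg.eigh(rho)
    d_eig = vecs.conj().T @ drho @ vecs          # d rho / d theta in the eigenbasis of rho
    denom = lam[:, None] + lam[None, :]
    mask = denom > eps_spec
    terms = np.zeros_like(denom)
    terms[mask] = np.abs(d_eig[mask]) ** 2 / denom[mask]
    return 2.0 * float(terms.sum())

def qfi_finite_difference(rho_of_theta, theta, delta=1e-4, eps_spec=1e-12):
    """Central-difference derivative of rho(theta), then the spectral QFI formula."""
    drho = (rho_of_theta(theta + delta) - rho_of_theta(theta - delta)) / (2.0 * delta)
    return qfi_spectral(rho_of_theta(theta), drho, eps_spec)
```

A useful sanity case: for a qubit with Bloch vector of length r rotating in the x–y plane by angle θ, the QFI with respect to θ is r². Repeating the call with several ε_spec and δ values is the stability evidence the bullet asks for.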

6.8 Failure modes (and why they’re good)
• Hard positivity breach. The engine fails rather than silently clipping eigenvalues; the manifest captures the failure.
• Trace runaway. If renormalizations exceed the cap, fail and suggest Δt_int/2.
• Alias warning. If Δt_out is too coarse for the declared controls, refuse to run with a clear message.
These failures save you from publishing numerically driven artifacts.

6.9 Minimal acceptance gate for “engine-clean” runs
A session is acceptable if all of the following are true:
• Convergence gate passed (step-halving deltas within band).
• Invariants within tolerance with recorded (low) backoffs/renormalizations.
• No hard positivity breaches and no metric computed from extrapolated values.
• Reproducibility verified (hash match on same platform; numeric match cross-platform).
• Edge-case battery: zero control and zero noise tests pass at least once per engine version (include in CI).

6.10 Reporting checklist (append to supplement)
• Engine version, OS/BLAS (optional), unit table, frame.
• Manifest dump (H₀, H_k, control parameterization and bounds; L_j, γ_j).
• Numerics (integrator, Δt_int, Δt_out, tolerances, backoff).
• Convergence table (ΔF̄, ΔF_final) with both run hashes.
• Invariant counters (renormalizations, backoffs).
• Reproducibility note (hashes, cross-platform deltas).
• Edge-case results (zero control/noise sanity).
• Any QFI regularization parameters and stability evidence.

Bottom line for validation. You’re not proving a policy here. You’re proving the engine will not lie to you: numerically stable, physically consistent within the stated model, and reproducible down to the file hash. With that foundation, any downstream A/B claim stands—or falls—on its own merits.


  7. Example regimes (single-qubit, straight to the point)

Purpose. These regimes are not benchmarks for any control policy; they are sanity batteries for the engine. They exercise drift, drives, and noise over short/medium/long horizons with conservative step sizes. If your numerics or physicality checks are weak, one of these will break them.

System, operators, targets. Single qubit with Pauli basis. Drift H₀ = (Δ/2) σ_z. Control generators H_x = (1/2) σ_x, H_y = (1/2) σ_y. Target state ρ★ = |0⟩⟨0|. Fidelity F(t) is the population of |0⟩ for a pure target.

Controls and bounds. Drives are u_x(t), u_y(t). Unless otherwise stated, impose amplitude caps |u_x|, |u_y| ≤ U_max with U_max chosen to keep Rabi rates in a realistic range (e.g., ≤ 10% of the qubit transition frequency, so that a rotating-frame description stays reasonable). If you need to simulate aggressive pulses, declare that up front and expect to tighten the step.

Noise menus. Two channels, amplitude damping and dephasing: L↓ = σ⁻ with rate γ↓ ∈ [1e−4, 1e−2], and Lφ = σ_z with rate γφ ∈ [1e−4, 1e−2]. These are per-unit-time in your declared units. If you want a thermal steady state, add σ⁺ with rate γ↑ and tie γ↑/γ↓ to temperature; otherwise keep γ↑ = 0.

Horizons and steps. Horizons T ∈ {60, 120, 240}. Integration step Δt ∈ {1e−4, 5e−4, 1e−3}. Output cadence Δt_out should be an integer multiple of Δt (equal to Δt if you report every step, larger if you integrate more finely than you report). Do not under-sample; if your controls have substructure on a scale close to the output cadence, reduce Δt_out (and Δt if necessary) accordingly.

Regime A — Idle with noise (no drives).
Setup: u_x(t) = u_y(t) = 0. Choose three noise corners: light (γ↓ = γφ = 1e−4), medium (1e−3), heavy (1e−2). Start from |1⟩⟨1| (worst-case population opposite the target).
What you should see:
• With amplitude damping present, F(t) rises toward 1 with a rate set primarily by γ↓; dephasing only damps coherences and thus shapes transients, not the final population.
• With dephasing alone (γ↓ = 0, γφ > 0), F(t) should remain flat if you start in an energy eigenstate; coherences vanish, populations don’t move.
Why this matters: it checks that the dissipator does what it says on the tin and that stationary states are plausible. Any drift in the “no-drive, no-damping” case is a red flag for trace or Hermiticity errors.
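Regime A has a closed-form answer you can test against: starting from |1⟩⟨1| with damping rate γ↓, the target population is F(t) = 1 − exp(−γ↓ t), independent of γφ. The sketch below integrates the idle Lindblad equation with a plain RK4 step and is meant as an independent cross-check, not as the engine's integrator. It assumes σ⁻ = |0⟩⟨1| and omits the drift term, which commutes with a diagonal state; the coarse default step (0.05) is chosen only to keep the example fast.

```python
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sm = np.array([[0, 1], [0, 0]], dtype=complex)   # sigma_minus = |0><1| (assumed convention)

def rhs(rho, gamma_down, gamma_phi):
    """Idle Lindblad right-hand side: amplitude damping plus dephasing, no Hamiltonian term."""
    def D(L):
        LdL = L.conj().T @ L
        return L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return gamma_down * D(sm) + gamma_phi * D(sz)

def rk4_step(rho, dt, *rates):
    k1 = rhs(rho, *rates)
    k2 = rhs(rho + 0.5 * dt * k1, *rates)
    k3 = rhs(rho + 0.5 * dt * k2, *rates)
    k4 = rhs(rho + dt * k3, *rates)
    return rho + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def idle_fidelity(gamma_down, gamma_phi, T=60.0, dt=0.05):
    """Integrate from |1><1| and return (t, F) with F = <0|rho|0>."""
    n = int(round(T / dt))
    rho = np.array([[0, 0], [0, 1]], dtype=complex)
    F = np.empty(n + 1)
    F[0] = rho[0, 0].real
    for i in range(n):
        rho = rk4_step(rho, dt, gamma_down, gamma_phi)
        F[i + 1] = rho[0, 0].real
    return dt * np.arange(n + 1), F
```

Comparing `idle_fidelity(1e-2, 1e-2)` against `1 - np.exp(-1e-2 * t)` exercises the heavy-noise corner; calling it with `gamma_down=0.0` should return a flat F(t) = 0.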

Regime B — Resonant Rabi drive with dephasing.
Setup: Δ = 0, u_x(t) = Ω (constant), u_y(t) = 0. Pick Ω so one or two full Rabi periods fit into T (e.g., Ω = 0.1 if T = 120; with H_x = (1/2) σ_x the Rabi period is 2π/Ω ≈ 63). Use γφ ∈ {1e−4, 1e−3, 1e−2}, γ↓ = 0.
What you should see: F(t) oscillates at the Rabi frequency and decays toward 0.5 with a decay envelope set by γφ. Mean fidelity over a full number of periods should settle near 0.5 as dephasing grows.
Checks: the oscillation frequency should scale linearly with Ω. Dephasing by itself moves no population between |0⟩ and |1⟩; the decay toward 0.5 arises only in combination with the drive. If populations move with Ω = 0 and γ↓ = 0, your dissipator is wrong.
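A closed-form reference for Regime B is available under stated assumptions. Take the initial state to be |0⟩⟨0| (the setup above does not fix it), H = (Ω/2) σ_x, and L = σ_z at rate γφ, so coherences decay at 2γφ. The Bloch equations then reduce to z̈ + 2γφ ż + Ω² z = 0 with z(0) = 1, ż(0) = 0, and for γφ < Ω the solution is a damped oscillation at ω = √(Ω² − γφ²) under an envelope exp(−γφ t). The function name is illustrative.

```python
import numpy as np

def rabi_dephasing_fidelity(t, omega, gamma_phi):
    """F(t) = (1 + z(t))/2 for H = (omega/2) sigma_x, L = sigma_z at rate gamma_phi, start in |0>.

    Valid only in the underdamped case gamma_phi < omega.
    """
    if not gamma_phi < omega:
        raise ValueError("closed form assumes the underdamped case gamma_phi < omega")
    t = np.asarray(t, dtype=float)
    w = np.sqrt(omega ** 2 - gamma_phi ** 2)     # slightly shifted oscillation frequency
    z = np.exp(-gamma_phi * t) * (np.cos(w * t) + (gamma_phi / w) * np.sin(w * t))
    return 0.5 * (1.0 + z)
```

Note the small frequency shift: for γφ ≪ Ω the oscillation frequency is Ω to within γφ²/(2Ω), which is the sense in which "matches Ω within numerical error" should be read.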

Regime C — Off-resonant drive with damping (detuning present).
Setup: Δ ≠ 0 (e.g., Δ = 0.2), u_x(t) = Ω, u_y(t) = 0, with Ω ≤ Δ for visible Bloch precession plus relaxation. Choose γ↓ ∈ {1e−4, 1e−3}, γφ ∈ {1e−4, 1e−3}.
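What you should see: damped, tilted precession on the Bloch sphere. F(t) relaxes, on a timescale dominated by γ↓, toward a driven steady-state value that sits below 1 while the drive is on and approaches 1 only as Ω/Δ → 0; oscillatory overshoots and undershoots appear depending on Ω/Δ.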
Sensitivity: this regime is numerically tougher; if Δt is too large you’ll see phase error (frequency drift) or positivity warnings. It’s the right place to enforce your step-halving gate.

Regime D — Piecewise control (AWG-style segments).
Setup: a 3-segment pulse on u_x(t): [Ω, 0, −Ω] with equal segment lengths; u_y(t) = 0; Δ = 0. Use medium noise.
What you should see: two rotations in opposite directions with a “hold” in between; if your output grid doesn’t include the segment boundaries you will miss kinks and misreport F(t).
Why this matters: many experiments use stepped pulses; your sampler and integrator must align to those breaks.
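One way to keep the sampler aligned to segment breaks is to merge the edges into the output grid. The sketch below assumes piecewise-constant segments that are closed on the left (a sample exactly on an edge takes the value of the segment that starts there) and rounds merged times to 12 decimals to collapse near-duplicates; both choices, and the function names, are assumptions for illustration.

```python
import numpy as np

def piecewise_constant(t, edges, amplitudes):
    """u(t) on segments [edges[i], edges[i+1]); the final time belongs to the last segment."""
    edges = np.asarray(edges, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    idx = np.searchsorted(edges, t, side="right") - 1
    return amplitudes[np.clip(idx, 0, len(amplitudes) - 1)]

def output_grid_with_edges(T, dt_out, edges, decimals=12):
    """Uniform grid on [0, T] with spacing dt_out, merged with segment edges and the endpoint T."""
    n = int(np.floor(T / dt_out + 1e-9))
    uniform = dt_out * np.arange(n + 1)
    merged = np.concatenate([uniform, np.asarray(edges, dtype=float), [T]])
    return np.unique(np.round(merged, decimals))
```

For the three-segment pulse above with T = 120, `edges = np.linspace(0.0, 120.0, 4)` and `amplitudes = [Ω, 0, −Ω]`; the merged grid then contains t = 40 and t = 80 even when Δt_out does not divide them.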

Regime E — Short horizon vs long horizon.
Setup: pick one of the above (say Regime C) and run it at T = 60, 120, 240 with the same Δt.
What you should see: identical early-time behavior and stable late-time approach; no creep in trace or Hermiticity as T grows. Long horizons expose slow invariant drift.

Optional: QFI-oriented regimes.
If you care about information structure, declare θ (detuning Δ or a noise rate) and compute QFI(t). Use a light-noise, modest-drive configuration to avoid numerical degeneracy in eigenvalues. Sweep θ by a small δ (recorded in the manifest) and verify QFI stability vs δ.

Acceptance notes per regime.
• Idle: with γ↓ > 0, F(t) should approach 1 monotonically; with γφ only, F(t) should be flat if starting in |0⟩ or |1⟩.
• Resonant: oscillation frequency matches Ω (within numerical error); envelope matches γφ.
• Off-resonant: oscillations at √(Ω² + Δ²); damping rates dominated by γ↓ and γφ contributions to longitudinal and transverse relaxation; no positivity breaches.
• Piecewise: F(t) changes slope at segment edges; output grid includes those edges.
• Long horizon: invariant counters (renormalizations/backoffs) stay bounded; if they accumulate, shrink Δt.

What to record. For each run: Δ, Ω (or segment amplitudes), γ↓, γφ, T, Δt_int, Δt_out, seed, plus any warnings (renormalizations, backoffs). Include hashes for the CSV and plots.

Common pitfalls caught by these regimes.
• Hidden rotating frames (frequencies don’t match declared Ω).
• Aliasing (oscillations look slower/faster than they should).
• Step too large (positivity backoffs and trace drift).
• Mis-specified dissipator (dephasing moving populations, which it shouldn’t).

Bottom line for regimes. If your engine passes A–E cleanly with the declared horizons and steps, you have a numerically honest substrate for downstream A/B work. If it doesn’t, fix the engine before you touch a control policy.

  8. Error budget (where numbers can move)

Purpose. This is the map of credible error sources. If somebody challenges your results, you can point to each item, the bound you set, and the evidence you recorded. No hand-waving.

Discretization (time integration).
• Source: truncation error of RK4 or splitting/expm approximations.
• Symptom: phase drift in oscillations; small bias in mean fidelity; end-point F(T) off by O(Δt⁴).
• Control: step-halving gate with acceptance |ΔF̄| ≤ 1e−4 and |ΔF_final| ≤ 1e−4 (F̄ is the mean fidelity). If you fail, reduce Δt_int or move to splitting/expm for stiff cases.
• Reporting: always include the step-halving deltas and the hashes of both runs.

Stiffness and operator scales.
• Source: large γ_j or large Δ relative to chosen Δt_int; the Hamiltonian and dissipator operate on very different timescales.
• Symptom: repeated positivity backoffs, frequent renormalizations, or failure to converge.
• Control: Strang splitting (separate H and D), or smaller Δt_int.
• Reporting: count of backoffs; if the count grows with T, you are under-resolving the fastest timescale.

Model mismatch (Markovian assumption).
• Source: Lindblad assumes memoryless baths; real systems can have colored noise or structured environments.
• Symptom: hardware data disagree with simulator even when controls and noise levels look matched; long-time tails or recurrences not captured.
• Control: acknowledge the limit; if needed, move to non-Markovian models or embed auxiliary modes.
• Reporting: declare clearly that the engine is Markovian; do not retrofit extra channels after seeing outcomes to “explain” data.

Control bandwidth and representation.
• Source: piecewise-constant controls approximate smooth pulses; insufficient segment resolution; unbounded slew in a policy output.
• Symptom: staircase artifacts in F(t); aliasing; inflated mean fidelity due to under-resolved ringing.
• Control: sub-step the integrator within segments; enforce bounds on amplitude and slew; ensure Δt_out captures segment boundaries.
• Reporting: segment grid or spline knots; any clipping events; sampler alignment to edges.

Floating-point and linear algebra conditioning.
• Source: double-precision rounding; ill-conditioned spectral decompositions (eigenvalues near zero) when computing QFI; BLAS/LAPACK variability.
• Symptom: tiny cross-platform deltas in F(t) (should be ≤ 1e−12) and larger sensitivity in QFI (derivatives magnify noise).
• Control: stick to double precision; regularize tiny eigenvalues in QFI with a declared ε_spec; avoid unnecessary matrix inversions.
• Reporting: platform notes optional; record ε_spec and show QFI stability vs ε_spec and vs δ (finite-difference step).

Positivity slack and enforcement.
• Source: finite-step errors can push λ_min slightly negative.
• Symptom: occasional λ_min < 0 by magnitude ≲ ε_+.
• Control: ε_+ default 1e−10; backoff (halve Δt_int) on breach; fail hard if λ_min < −10 ε_+. Never “clip and continue” silently.
• Reporting: count of backoffs and any failures; if counts are non-zero, justify Δt_int choice.

Trace and Hermiticity drift.
• Source: round-off and truncation.
• Symptom: |Tr ρ − 1| creeping above ε_tr; ‖ρ − ρ†‖_F accumulating.
• Control: post-step symmetrization; renormalization if |Tr ρ − 1| exceeds ε_tr; reduce Δt_int if events recur.
• Reporting: counts and thresholds; if counts grow with T, tighten numerics.
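The post-step guard described above can be sketched as follows, assuming symmetrization runs on every step and renormalization only when the trace deviation exceeds ε_tr. The returned flag is what a renormalization counter would consume; the function name is illustrative.

```python
import numpy as np

def enforce_trace_and_hermiticity(rho, eps_tr=1e-8):
    """Symmetrize unconditionally; renormalize only if |Tr rho - 1| exceeds eps_tr."""
    rho = 0.5 * (rho + rho.conj().T)
    tr = np.trace(rho).real
    renormalized = bool(abs(tr - 1.0) > eps_tr)
    if renormalized:
        rho = rho / tr
    return rho, renormalized
```

Renormalizing only past the tolerance, rather than on every step, keeps the count meaningful: a non-zero counter then signals real drift instead of routine round-off.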

Threshold detection (time-to-τ).
• Source: discrete sampling; crossings between samples.
• Symptom: T_hit biased by up to Δt_out.
• Control: linear interpolation between the first pair of samples that straddle τ; cap Δt_out to keep interpolation error negligible relative to your claimed precision.
• Reporting: list τ values and the interpolation rule in the manifest; never extrapolate beyond the last sample.
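The interpolation rule can be stated in code. This sketch assumes T_hit means the first upward crossing of τ, returns t[0] if the series already starts at or above τ, and returns NaN when the threshold is never reached (no extrapolation). The function name is illustrative.

```python
import math
import numpy as np

def time_to_threshold(t, F, tau):
    """First time F reaches tau, linearly interpolated between straddling samples; NaN if never."""
    t = np.asarray(t, dtype=float)
    F = np.asarray(F, dtype=float)
    if F[0] >= tau:
        return float(t[0])
    hits = np.nonzero((F[:-1] < tau) & (F[1:] >= tau))[0]
    if hits.size == 0:
        return math.nan                      # unreached: never extrapolate
    i = hits[0]
    frac = (tau - F[i]) / (F[i + 1] - F[i])  # denominator > 0 by construction
    return float(t[i] + frac * (t[i + 1] - t[i]))
```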

Aliasing (output cadence too coarse).
• Source: Δt_out not small enough relative to control bandwidth.
• Symptom: apparent frequency shifts; inflated mean fidelity F̄ because oscillations are under-sampled.
• Control: require Δt_out ≤ (1/10) of the fastest control timescale or refuse the run.
• Reporting: declare Δt_out and the fastest control timescale; if you accept a waiver, say so explicitly.

Seed and reproducibility drift.
• Source: hidden randomness outside the session PRNG; nondeterministic libraries.
• Symptom: hashes don’t match on rerun with the same manifest.
• Control: one session seed for everything; deterministic substreams; pin library versions.
• Reporting: include the seed and engine version; provide SHA-256 for CSV and plots; optional HMAC for tamper-evidence.

Human factors (post hoc choices).
• Source: changing τ, swapping Δt_out, or filtering trajectories after peeking at outcomes.
• Symptom: pretty graphs, untrustworthy claims.
• Control: the protocols in Section 5—pre-register and lock; QME never decides winners.
• Reporting: show the pre-registered plan and the actual manifest; they should match.

Bottom line for the error budget. If someone challenges a number in your figure, you should be able to point to the exact bucket above, the guard you set (tolerance, step-halving delta, slack), and the evidence you logged (counts, hashes, stability sweeps). If you can’t, the criticism will stand—and it should.




  9. Reproducibility and forensics (make cheating expensive)

Goal. Anyone—hostile reviewer included—must be able to regenerate your exact artifacts from a single manifest and seed. If they cannot, the run does not count. This section defines what ships, how it’s verified, and how to investigate discrepancies.

9.1 What every run must ship
• manifest.json — the single source of truth. Includes: engine version, semantic schema version, frame and units, system dimension and basis, H₀ and the list of H_k (with human-readable operator descriptions), control parameterization and bounds (segments/splines/policy; amplitude/slew limits; clipping policy), noise channels (L_j definitions and γ_j values or schedules), numerics (integrator, Δt_int, Δt_out, T, tolerances ε_tr/ε_H/ε_+, variable-step settings and safety cap), seed and substream policy, enabled observables (F, QFI, purity/leakage), threshold list for T_hit, window specs for integrated metrics, provenance (timestamp; optional OS/BLAS/CPU flags), and integrity (SHA-256 for CSV and PNGs; optional HMAC).
• timeseries.csv — primary data: columns t, F, optional QFI_θ columns (one per θ), optional purity/leakage/energy, and (if requested) control snapshots u_k at output times. A short header block explains units, column semantics, and any interpolation rule used for event times.
• plots/*.png — at minimum a fidelity panel with thresholds drawn; optional race panel (arms anonymized) and QFI panel. Axes labeled with units; plot metadata includes the CSV hash to prevent plot/data drift.
• sha256.txt — line-separated digests for each artifact file (CSV and PNGs) and the digest of manifest.json itself.
• HMAC.txt (optional) — keyed digests if you require tamper-evidence beyond public hashes (e.g., for internal review stages).
• summary.json (optional but recommended) — a machine-readable recap: mean fidelity, final fidelity, T_hit per threshold, integrated QFI stats, convergence deltas from the step-halving check, counters for renormalizations and backoffs, and any failure flags.

9.2 How to verify a bundle
• Hash check: recompute SHA-256 for CSV and PNGs; match against sha256.txt. Mismatch → bundle invalid.
• Manifest replay: rerun the engine with manifest.json only. This must produce byte-identical timeseries.csv on the same platform/version. If not, the run is non-deterministic—reject.
• Cross-platform tolerance: rerun on a second machine. Primary observable differences (F at reported times) must be ≤ 1e−12 absolute; QFI may use a looser bound (e.g., ≤ 1e−10) due to spectral sensitivities. Report any deltas.
• Convergence evidence: if the manifest references a step-halving check, both runs and both hashes should be present. Compute |ΔF̄| (difference in mean fidelity) and |ΔF_final| and confirm they meet the gate.
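The hash check is straightforward to script. The sketch below assumes sha256.txt holds one "<hex digest>  <relative path>" entry per line (the layout produced by common sha256sum tools); the actual file layout of a bundle may differ, and the function names are illustrative.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large CSVs do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_bundle(bundle_dir, digest_file="sha256.txt"):
    """Return {relative_path: True/False}; assumes lines of the form '<hex digest>  <path>'."""
    bundle_dir = Path(bundle_dir)
    results = {}
    for line in (bundle_dir / digest_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")      # tolerate the binary-mode marker
        results[name] = sha256_of_file(bundle_dir / name) == expected.lower()
    return results
```

Any `False` in the returned mapping means the bundle is invalid under the rule above; there is no partial credit.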

9.3 Forensic trail (what to log during a run)
• Physicality counters: number of trace renormalizations, number of positivity backoffs, maximum |Tr ρ − 1|, minimum λ_min encountered, maximum ‖ρ − ρ†‖_F.
• Step controller events: number of step reductions/increases (if variable-step), min/max step used.
• Clipping: any control clipping events (time index, channel, requested vs applied amplitude).
• Schedules: sampled γ_j(t) min/max; flag any out-of-range entries (should be none).
• Exceptions: first failure timestamp and reason (positivity breach; trace runaway; aliasing refusal; negative rate; bad frame/unit).

9.4 Names, versions, and diffs
• File naming: sessionID_timestamp.csv; plots carry the CSV hash in their filename (e.g., fidelity_.png).
• Versioning: bump engine version on any behavior change (even if numerically tiny). Schema version guards the manifest structure.
• Diffs: if two runs differ, provide a “forensic diff” utility: compare manifests, highlight changed fields (Δt, T, γ_j, bounds), compare the first K lines of CSV, and list different warnings/counters.
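The manifest half of a "forensic diff" can be as small as a recursive dictionary comparison. This is a sketch of the idea, assuming manifests are loaded from JSON into nested dicts with string keys; it is not the engine's utility, and the names are illustrative.

```python
MISSING = "<missing>"

def diff_manifests(a, b, prefix=""):
    """Return a list of (dotted_path, value_in_a, value_in_b) for every differing field."""
    changes = []
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            path = f"{prefix}.{key}" if prefix else str(key)
            if key not in a:
                changes.append((path, MISSING, b[key]))
            elif key not in b:
                changes.append((path, a[key], MISSING))
            else:
                changes.extend(diff_manifests(a[key], b[key], path))
    elif a != b:
        changes.append((prefix, a, b))
    return changes
```

An empty list means the two manifests declare the same run; any entry under a numerics or noise key is the first place to look when hashes disagree.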

9.5 Rerun protocol (for third parties)
• Inputs: manifest.json only.
• Command: run engine in “replay mode,” which forbids any change to declared settings; refuse to start if the local engine version differs from the manifest’s version unless a “compatibility mode” is explicitly requested and documented.
• Expected outputs: identical CSV and plots; identical hashes. If “compatibility mode” is used, require a short report of numeric deltas.

9.6 Data retention and privacy
• Keep raw bundles and hashes for at least the duration of review, and indefinitely where practical.
• No personal data should appear; if internal HMAC is used, do not publish the key.
• If policy code is proprietary, that’s fine—the engine still records the waveforms u_k(t) (if you choose to export them) and the seed. A reviewer need not see your policy internals to validate physics.

9.7 Failure handling (make it loud, not silent)
• If any acceptance gate is missed (hash mismatch, convergence fail, invariant thresholds exceeded), the run is flagged “engine-unclean.” Do not include such a run in claims. Either fix the setup (smaller Δt, corrected frames/units) or omit the regime and state why.

Bottom line. With these rules, it’s cheaper to be honest than to fabricate. The bundle proves what ran; the manifest replays it; the hashes freeze it. If a critic can’t reproduce your CSV from your manifest, the run is not evidence.

  10. Implementation and minimal API (conceptual, not code)

Goal. Describe the lifecycle and interfaces without tying you to a specific programming language. This is enough for a reader to mentally map the engine’s behavior to their stack.

10.1 Lifecycle and states
• reset(manifest) → state
Initializes a simulation state from manifest.json. Parses operators, constructs superoperators or operator lists, sets units and frame, seeds all PRNG substreams, allocates logging and counters, and validates bounds/schedules. Fails early if any inconsistency is found (e.g., negative γ_j, missing units).

• step(state, t, Δt) → state
Advances the state by Δt using the chosen integrator. Applies physicality checks: symmetrize, trace control, positivity check with backoff. Updates counters and logs. In variable-step mode, a local error estimate may suggest a smaller/larger Δt, but increases are safety-capped to keep determinism.

• observe(state) → observables
Computes requested observables at the current time: F (requires ρ★), optional QFI_θ (spectral or finite-difference route; records ε_spec or δ), purity, leakage, and control snapshots. Returns a typed record with units and a “quality flag” field if any invariant was nudged (e.g., renormalization occurred on the last step).

• export(state) → artifacts
Writes timeseries.csv and computes its SHA-256, then renders plots whose filenames embed that CSV hash to bind visuals to data. Computes SHA-256 for the remaining artifacts (and HMAC if requested), composes manifest.json with counters and hashes, and emits summary.json.

10.2 Determinism contract
• Single session seed drives all stochastic elements; substreams derived deterministically (e.g., counter-based PRNG).
• No calls to non-deterministic RNGs.
• Variable-step decisions depend only on state and manifest, not on wall-clock or thread interleavings (use deterministic reduction order).
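As one possible realization of the seed contract, NumPy's SeedSequence can derive independent substreams from a single session seed, and Philox is a counter-based bit generator. Whether the engine uses NumPy at all is not stated here; the consumer names and the rule that substreams are assigned in list order are assumptions.

```python
import numpy as np

def make_substreams(session_seed, names):
    """Derive one reproducible generator per named consumer from a single session seed.

    Substreams are assigned in the order of `names`, so that order must itself be fixed
    (for example, recorded in the manifest).
    """
    root = np.random.SeedSequence(session_seed)
    children = root.spawn(len(names))
    return {name: np.random.Generator(np.random.Philox(child))
            for name, child in zip(names, children)}
```

Two sessions built with the same seed and the same name list produce identical draws per substream, and adding a new consumer at the end of the list does not disturb the existing ones.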

10.3 Error handling and codes
• E_NEGATIVE_RATE — a γ_j < 0 detected (or scheduled negative); refuse to start.
• E_ALIASING — Δt_out too coarse vs control bandwidth; refuse to start unless a waiver is explicitly set.
• E_POSITIVITY_HARD — repeated backoffs still leave λ_min < −10 ε_+; abort with timestamp.
• E_TRACE_RUNAWAY — renormalization count exceeds cap; suggest smaller Δt_int.
• E_BAD_FRAME — operator definitions inconsistent with declared frame/units.
All error codes appear in the manifest with human-readable messages.
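The two "refuse to start" codes map naturally onto pre-flight checks. The sketch below is illustrative only: the exception classes, the function name, and the waiver flag are hypothetical, while the rules themselves (γ_j ≥ 0; Δt_out ≤ 1/10 of the fastest control timescale unless waived) come from the text.

```python
class EngineError(Exception):
    """Base class; `code` carries the machine-readable error code."""
    code = "E_UNKNOWN"

class NegativeRateError(EngineError):
    code = "E_NEGATIVE_RATE"

class AliasingError(EngineError):
    code = "E_ALIASING"

def validate_before_start(gammas, dt_out, fastest_control_timescale, aliasing_waiver=False):
    """Raise before any integration step if a rate is negative or the output cadence is too coarse."""
    for j, gamma in enumerate(gammas):
        if gamma < 0:
            raise NegativeRateError(f"gamma[{j}] = {gamma} is negative")
    limit = fastest_control_timescale / 10.0
    if dt_out > limit and not aliasing_waiver:
        raise AliasingError(f"dt_out = {dt_out} exceeds {limit}; reduce dt_out or declare a waiver")
```

Carrying the code as a class attribute lets the exporter write both the code and the human-readable message into the manifest from a single caught exception.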

10.4 Logging
• Per-step (compact): min eigenvalue, trace deviation, backoff flag, renormalization flag (optional for large runs).
• Per-run (summary): counts, maxima/minima, hashes, convergence deltas, warnings.
• Export includes a plaintext log file if requested (useful for CI/CD).

10.5 Performance notes (without over-promising)
• Few-qubit focus in the single-file build; sparse/Krylov paths help as d grows.
• Exponential steppers are robust but expensive; prefer RK4 + Strang unless stiffness demands expm/Krylov.
• If you need GPU/HPC scale, treat this engine as the correctness oracle and re-implement kernels under test harnesses that must reproduce its outputs within tolerance.

10.6 Extensibility (what you can swap safely)
• Integrators: add new schemes behind the same step() contract; must pass the validation battery and convergence gates.
• Observables: add columns to CSV; declare them in the manifest’s “outputs.”
• Noise: add L_j families, but keep γ_j ≥ 0 and record schedules explicitly.
• Frames: allow additional frames only if they’re declared and do not introduce hidden rescalings.

Bottom line. The API is minimal on purpose. Reset from a manifest, step with physicality checks, observe well-defined quantities, export an audited bundle. If any module cannot honor this contract deterministically, it does not belong in this engine.

  11. Positioning

What DREAMi-QME is. A single-file, offline, audit-first Lindblad engine that prioritizes correctness and forensics over raw speed or feature sprawl. It exists to generate trajectories that downstream tools (a label-neutral validator and a scale-law auditor) can trust.

What it is not. It is not a monolithic “do everything” stack. It does not choose winners, does not optimize controls, and does not hide modeling choices behind defaults. It is deliberately opinionated about reproducibility and transparency.

Where it sits in the ecosystem. General-purpose quantum simulators and HPC toolkits can be faster and scale further. They are fine for exploration, but they often make it easy to lose the audit trail (implicit defaults, version drift, non-deterministic kernels, missing seeds). DREAMi-QME is the opposite: slower if needed, but surgically precise about what ran, with which parameters, and why the numbers are stable.

Why reviewers should care. Separating (A) physics generation, (B) statistical judgment, and (C) scale-law auditing is how you avoid “we can’t reproduce this” fights. DREAMi-QME owns (A) and only (A). The validator owns (B); ARLIT owns (C). Each component has its own acceptance gates. This separation of concerns is a feature, not a bureaucratic hurdle.

When to use it.
• When you need a correctness oracle before porting to a faster backend.
• When you plan to publish comparative control claims and want them to survive scrutiny.
• When you expect adversarial replication and want the cheapest possible path to “yes, it matches.”

How to talk about it.
• “We used DREAMi-QME (engine vX.Y.Z) to generate trajectories under a declared Lindblad model. We pre-registered metrics and thresholds, exported a manifest + CSV + hashes, and verified step-halving convergence. A separate validator computed A/B results; ARLIT checked scale behavior.”
That’s the template sentence that tells a reviewer you did the grown-up thing.

Bottom line. Plenty of tools can draw pretty curves. DREAMi-QME’s point is to make those curves count: seeded, hashed, replayable, and numerically honest. That’s how you get from “interesting” to “credible.”




  12. Limitations (no excuses)

Scope boundary. This paper is about a simulator. It produces trajectories from a declared Lindblad model under declared controls and noise. Nothing here establishes hardware truth. If you want device claims, you must run a device.

12.1 Simulation only
• No hardware calibration: Real devices bring T₁/T₂ estimates with uncertainty, drifts across hours, pulse distortions, mixer imbalance (I/Q skew), DAC quantization, AWG latency, and readout nonidealities (SPAM). None of that is modeled unless you explicitly encode it into H(t), u_k(t) constraints, or L_j.
• Leakage and higher levels: If the physical qubit leaks to |2⟩ or beyond, a two-level Lindblad will lie to you. DREAMi-QME can simulate higher d, but only if you declare the larger Hilbert space and the corresponding L_j. Otherwise, “no leakage” is an assumption, not a fact.
• Crosstalk and layout: Multi-qubit hardware exhibits cross-couplings, shared control lines, and spectator shifts. Unless you model them explicitly (multi-qubit H and L_j), they’re absent.
• Readout and classical FI: QFI is not a readout model. Converting QFI to real estimation error requires a POVM and a noise model for the measurement chain. That’s outside the engine.

12.2 Markovian assumption (Lindblad = memoryless)
• What it buys: complete positivity and trace preservation by construction; tractable numerics.
• What it hides: colored noise, 1/f dephasing, bath recurrences, non-exponential relaxations, and history effects. If your hardware shows long tails or coherence revivals, a time-local Lindblad won’t capture it.
• Extensions you would need: time-convolution master equations, pseudomode embeddings, HEOM, or explicit system–bath models. These are not in the core engine.
• Honesty check: if you fit γ_j to hardware data and still miss late-time behavior, that’s the Markovian wall—not a reason to tune γ_j after seeing outcomes.

12.3 Dimensionality and scaling
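• Few-qubit focus: The single-file build targets small Hilbert spaces. In the density-matrix representation, ρ needs O(d²) storage and each dense right-hand-side evaluation costs O(d³); in the superoperator representation the Liouvillian is d²×d², so storing or applying it densely costs O(d⁴). Don’t pretend this is an HPC workhorse.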
• What to do when d grows: move to sparse Liouvillians, Krylov–Arnoldi exponentials, parallelization, or tensor-network techniques—under a harness that reproduces the small-d engine within tolerance.
• Multi-qubit control realism: Bandwidth and routing constraints multiply with qubit count; you must declare them. If you don’t, don’t trust pretty multi-qubit curves.
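
To put a number on the scaling wall, here is a back-of-envelope sketch of the memory a dense d²×d² superoperator would need for n qubits. It assumes complex128 entries (16 bytes each) and d = 2^n; the function name is illustrative and not part of the engine.

```python
def dense_superoperator_bytes(n_qubits: int, bytes_per_entry: int = 16) -> int:
    """Bytes needed for a dense d^2 x d^2 Liouvillian (assumes complex128 entries)."""
    d = 2 ** n_qubits            # Hilbert-space dimension for n qubits
    side = d * d                 # the superoperator acts on vectorized rho (length d^2)
    return side * side * bytes_per_entry


for n in (1, 2, 5, 8, 10):
    d = 2 ** n
    mib = dense_superoperator_bytes(n) / 2 ** 20
    print(f"n={n:2d}  d={d:5d}  superoperator {d * d} x {d * d}  {mib:,.3f} MiB")
```

Under these assumptions, five qubits already cost 16 MiB for a single dense superoperator and ten qubits cost 16 TiB, which is why sparse or matrix-free methods are the realistic path at larger d.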

12.4 Control realism and electronics
• Bandwidth/slew/jerk: If you allow unbounded u_k(t), you’ll invent physically impossible pulses that “win” on paper. DREAMi-QME lets you cap and log; it does not guess realistic limits for you.
• Timing and latency: Real pulses have start/stop jitter, AWG latency, and finite rise/fall times. If you don’t model them (e.g., by convolving with an impulse response), you’re assuming ideal wires.
• Distortion and nonlinearity: Amplifier compression, mixer leakage, frequency pulling—none is modeled unless you bake it into H_k or into a filter on u_k(t).
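
The bullets above can be sketched as a pre-filter on u_k(t): an amplitude cap, a slew-rate cap, and a one-pole low-pass standing in for a finite rise time. This is a minimal sketch, not the engine's implementation; the function name, the counter keys, and the choice of a one-pole filter as the impulse response are all assumptions made for illustration.

```python
import math


def shape_control(u, dt, u_max, slew_max, tau):
    """Clip amplitude, limit slew rate, then apply a one-pole low-pass filter.

    u        : requested control samples (list of floats)
    dt       : sample spacing
    u_max    : amplitude cap, |u| <= u_max
    slew_max : cap on |du/dt|
    tau      : filter time constant (impulse response ~ exp(-t/tau))
    Returns (shaped_samples, counters); counters record how often a cap fired.
    """
    counters = {"amp_clips": 0, "slew_clips": 0}
    alpha = 1.0 - math.exp(-dt / tau)   # discrete one-pole filter gain
    max_step = slew_max * dt            # largest allowed change per sample
    limited = 0.0                       # assumption: the line starts at rest
    y = 0.0                             # filter state
    shaped = []
    for x in u:
        clipped = max(-u_max, min(u_max, x))
        if clipped != x:
            counters["amp_clips"] += 1
        step = clipped - limited
        if abs(step) > max_step:
            step = math.copysign(max_step, step)
            counters["slew_clips"] += 1
        limited += step
        y += alpha * (limited - y)      # finite rise/fall time
        shaped.append(y)
    return shaped, counters


# An "ideal" square pulse that asks for twice the allowed amplitude.
requested = [0.0] * 5 + [2.0] * 20 + [0.0] * 5
shaped, counters = shape_control(requested, dt=0.1, u_max=1.0, slew_max=5.0, tau=0.2)
print(counters, round(max(shaped), 4))
```

Logging the counters alongside the shaped samples keeps the clipping visible to a reviewer instead of silently changing the pulse.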

12.5 Numerical and identifiability limits
• Discretization ceilings: RK4 with too-large steps will phase-drift and can nick positivity. The engine will warn or fail. If you ignore those warnings, that’s on you.
• QFI conditioning: Near-pure or near-degenerate spectra make SLD calculations touchy. The engine offers ε_spec regularization and δ-sweeps (finite differences), but conditioning ultimately limits precision.
• Parameter identifiability: Fitting γ_j or Δ from simulator-only data is ill-posed without independent experiments (e.g., T₁/T₂); that’s not a simulator flaw, it’s an inference limit.
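
As an illustration of the discretization point, the sketch below runs a step-halving check on a case with a known answer: a qubit with H = (Δ/2)σ_z and pure dephasing L = √γ σ_z, where the coherence obeys ρ₀₁(t) = ρ₀₁(0)·exp((−iΔ − 2γ)t). It is a stand-alone pure-Python RK4, not the engine's integrator; the helper names and the parameter values are assumptions chosen for the example.

```python
import cmath
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def add(a, b, s=1.0):
    """Return a + s*b, entry by entry."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def lindblad_rhs(rho, H, Ls):
    """-i[H, rho] + sum_j (L rho L^dag - 1/2 {L^dag L, rho})."""
    comm = add(matmul(H, rho), matmul(rho, H), -1.0)
    out = [[-1j * x for x in row] for row in comm]
    for L in Ls:
        Ld = dagger(L)
        LdL = matmul(Ld, L)
        out = add(out, matmul(matmul(L, rho), Ld))
        out = add(out, add(matmul(LdL, rho), matmul(rho, LdL)), -0.5)
    return out


def rk4_evolve(rho, H, Ls, t_final, n_steps):
    """Fixed-step classical RK4 for time-independent H and L_j."""
    dt = t_final / n_steps
    for _ in range(n_steps):
        k1 = lindblad_rhs(rho, H, Ls)
        k2 = lindblad_rhs(add(rho, k1, dt / 2), H, Ls)
        k3 = lindblad_rhs(add(rho, k2, dt / 2), H, Ls)
        k4 = lindblad_rhs(add(rho, k3, dt), H, Ls)
        rho = add(rho, add(add(k1, k4), add(k2, k3), 2.0), dt / 6)
    return rho


detuning, gamma, t_final = 1.0, 0.05, 5.0              # illustrative values
H = [[detuning / 2, 0.0], [0.0, -detuning / 2]]
Ls = [[[math.sqrt(gamma), 0.0], [0.0, -math.sqrt(gamma)]]]
rho0 = [[0.5, 0.5], [0.5, 0.5]]                        # |+><+|

coarse = rk4_evolve(rho0, H, Ls, t_final, 50)          # dt = 0.1
fine = rk4_evolve(rho0, H, Ls, t_final, 100)           # dt = 0.05
exact_01 = 0.5 * cmath.exp((-1j * detuning - 2 * gamma) * t_final)

step_halving_delta = max(abs(coarse[i][j] - fine[i][j]) for i in range(2) for j in range(2))
err_coarse = abs(coarse[0][1] - exact_01)
err_fine = abs(fine[0][1] - exact_01)
trace_fine = (fine[0][0] + fine[1][1]).real

print(f"step-halving delta: {step_halving_delta:.3e}")
print(f"error vs exact: coarse {err_coarse:.3e}, fine {err_fine:.3e}")
print(f"trace deviation: {abs(trace_fine - 1.0):.1e}")
```

The step-halving delta is the quantity a run would compare against its declared tolerance. The analytic solution is only available in toy cases like this one, which is exactly why the halving comparison is the general-purpose gate.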

12.6 Model misspecification risks
• Wrong frame/units: If you declare a rotating frame but supply lab-frame operators, you’ll get nonsense that converges beautifully—to the wrong physics. DREAMi-QME cannot rescue that; it only records what you declared.
• Hidden constraints: If a policy depends on information not in the manifest (e.g., peeking at future u_k or at noise realizations), that’s leakage. The engine can’t detect policy-level cheating unless you export the control stream and let others inspect it.

12.7 External validity
• Sim → Lab gap: Passing the engine’s validation means your numerics are sane under your stated model. It does not mean the model matches hardware. Treat hardware as a separate study with calibration, tomography, readout modeling, and independent error bars.
• Transfer brittleness: A policy that “wins” robustly in DREAMi-QME may underperform on device if the real noise is non-Markovian, controls are bandwidth-limited, or leakage dominates. That is not a surprise; it’s a reminder to keep claims tight.

Blunt summary. This is a correctness-first simulator for a Markovian open-system model at small d. It will not hand you hardware wins. If you want those, model the hardware and validate on hardware. Anything else is storytelling.

  13. Broader impact and ethics

What improves. The engine makes falsification cheap: seeded runs, full manifests, and file hashes. That kills a lot of hype. You can no longer hide behind “implementation details” or “we lost the random seed.” Reviewers and competitors can hit “replay” and see exactly what you saw.

Primary positive impacts
• Reproducibility norm: Bundled artifacts (manifest + CSV + hashes) shift the default from “trust me” to “show me.”
• Cleaner claims: Separating physics (QME) from statistics (Validator) and from scale-law checks (ARLIT) blocks common p-hacking routes and post hoc window shopping.
• On-ramp to honest hardware studies: Once you’ve proved numerics and protocols in sim, you can carry the same manifest/seed discipline into device work (with added calibration files).
• Education and audit: Students and industry auditors can examine complete end-to-end runs without hunting for hidden settings or environment variables.

Risks and mitigations
• Over-reading simulator wins: People will still be tempted to tout sim-only improvements as “breakthroughs.” Mitigation: the paper’s separation-of-concerns doctrine and explicit “engine only” scope. Pair every sim claim with (i) pre-registered metrics, (ii) validator outcomes, and (iii) a plan to replicate on hardware.
• Benchmark gaming: Without control bounds and aliasing checks, it’s easy to “solve” a benchmark by injecting impossible pulses or by under-sampling oscillations. Mitigation: enforce amplitude/slew limits, Δt_out ≤ (1/10) of the fastest control scale, and log clipping/alias refusals.
• Data tampering: Pretty plots can be faked. Mitigation: bind plots to CSV with hashes; publish sha256.txt; optionally use HMAC internally.
• Privacy/IP: Manifests should not contain sensitive lab identifiers or secret policy internals. If a policy is proprietary, you can still export the resulting u_k(t) and the seed for review; no need to leak source code.
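
The tampering mitigation can be sketched with Python's standard hashlib and hmac modules: a SHA-256 digest binds a plot to the exact CSV bytes, and a keyed HMAC adds tamper-evidence for internal review. The helper names, the filename pattern, the sample CSV content, and the key below are hypothetical; the engine's actual naming scheme is not specified here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Plain SHA-256 digest, as would be listed in sha256.txt."""
    return hashlib.sha256(data).hexdigest()


def hmac_hex(key: bytes, data: bytes) -> str:
    """Keyed digest: only holders of the key can produce a matching tag."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def plot_filename(csv_bytes: bytes, panel: str = "fidelity") -> str:
    """Hypothetical naming rule: embed a prefix of the CSV hash in the plot name."""
    return f"{panel}_{sha256_hex(csv_bytes)[:12]}.png"


csv_bytes = b"t,F\n0.0,0.500000\n0.1,0.612345\n"          # made-up sample rows
tampered = csv_bytes.replace(b"0.612345", b"0.912345")    # one edited value
key = b"internal-review-key"                              # hypothetical secret

digest = sha256_hex(csv_bytes)
tag = hmac_hex(key, csv_bytes)

print("plot file :", plot_filename(csv_bytes))
print("sha256 ok :", sha256_hex(tampered) == digest)      # False: the edit is detected
print("hmac ok   :", hmac.compare_digest(hmac_hex(key, tampered), tag))
```

A plain hash detects accidental or careless edits; the HMAC additionally prevents someone without the key from editing the CSV and regenerating a matching tag.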

Ethical use and disclosure
• State your scope: Always say “simulator results under a Lindblad model” when communicating outcomes. Do not imply device performance unless you’ve run devices.
• Release artifacts: If you publish a figure, publish its manifest and CSV hash. If you can’t, say why and expect skepticism.
• Negative results: If a policy fails under the engine’s gates (e.g., breaches positivity unless Δt is ridiculously small), report that. It saves others time and prevents selection bias.
• Credit and transparency: If you re-implement kernels for speed, keep this engine as a correctness oracle and show reproducibility within tolerance.

Societal and environmental notes
• Compute footprint: Few-qubit Lindblad is not energy-hungry. The heavier cost is people-time wasted on irreproducible claims; this engine reduces that cost.
• Dual-use: Open quantum simulation is broadly beneficial; there’s no credible misuse path here beyond the usual academic turf wars.

Blunt summary. DREAMi-QME doesn’t make results “good.” It makes them checkable. That alone improves the field. The only real ethical risk is pretending simulator wins are hardware wins. Don’t. Keep the separation clear, release your artifacts, and let the data be replayed.



  14. Conclusion

What this engine solves. DREAMi-QME addresses the failure mode that kills most “promising” control results: irreproducible physics generation. It gives you a single-file, Markovian Lindblad engine that is deterministic when seeded, explicit about frames and units, strict about physicality (trace, Hermiticity, positivity), and relentless about forensics (manifests, hashes, counters, logs). That is the substrate required before any A/B judgment or scale-law claim is worth reading.

Separation of concerns. The workflow is intentionally modular: DREAMi-QME generates trajectories; the Validator decides winners with label-neutral statistics and pre-registered metrics; ARLIT tests whether information structure survives rescaling or collapses to a one-off sweet spot. Blending these roles invites bias and makes failures untraceable. Keeping them separate makes cheating expensive and error surfaces obvious.

Evidence gates. A run only counts if (i) step-halving convergence is demonstrated, (ii) invariant breaches are absent or rare and logged, (iii) hashes match on replay, and (iv) thresholds and windows were declared up front. If a third party cannot regenerate your CSV from your manifest and seed on the same engine version, the result is not evidence—end of discussion.
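
The four gates can be read as a simple conjunction. The sketch below assumes a summary dictionary with hypothetical field names (they are not the engine's summary.json schema) and treats "rare" as a breach allowance declared before the run.

```python
def evidence_gates(summary, tol_step=1e-4, max_breaches=0):
    """Return (passed, per_gate) for the four evidence gates.

    All field names are illustrative assumptions, not the engine's schema.
    """
    gates = {
        # (i) step-halving convergence within the declared tolerance
        "step_halving": summary["step_halving_delta"] <= tol_step,
        # (ii) invariant breaches absent, or within a declared allowance and logged
        "invariants": bool(
            summary["invariant_breaches"] <= max_breaches
            and summary["breaches_logged"]
        ),
        # (iii) replayed artifacts hash to the same digest
        "hash_replay": summary["replay_sha256"] == summary["original_sha256"],
        # (iv) thresholds and windows were fixed before the run
        "preregistered": bool(summary["thresholds_declared_before_run"]),
    }
    return all(gates.values()), gates


run = {
    "step_halving_delta": 2.0e-6,
    "invariant_breaches": 0,
    "breaches_logged": True,
    "original_sha256": "ab" * 32,   # placeholder digests for the example
    "replay_sha256": "ab" * 32,
    "thresholds_declared_before_run": True,
}
passed, detail = evidence_gates(run)
print(passed, detail)
```

Returning the per-gate result alongside the overall verdict makes it obvious which gate failed, rather than reporting a bare rejection.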

Limits you must own. This paper stays in-bounds: it is a simulator under a Markovian model at small Hilbert-space dimension. That is not hardware truth. If you want device claims, you will need calibration, leakage modeling, readout models, and non-Markovian machinery where appropriate. None of that weakens the value of this engine; it clarifies what the numbers do and do not mean.

What success looks like. Success is not pretty curves; success is curves that replay bit-for-bit from a manifest, withstand tighter steps, and still support a Validator PASS under your pre-registered metrics—followed by ARLIT showing that the information structure persists when you change scale. Everything else is storytelling.

Bottom line. DREAMi-QME is a physics-first, audit-ready Lindblad engine. It generates the trajectories that matter; the Validator judges policies; ARLIT checks whether the information structure survives rescaling. If a result can’t be reproduced from the manifest, it doesn’t count. By design, this engine makes results count.

Author contributions

Conceptualization: J. Morgan-Griffiths.
Methodology (model specification, guardrails, protocols): J. Morgan-Griffiths.
Software (engine implementation; seed plumbing; logging/forensics; artifact export): J. Morgan-Griffiths.
Validation (step-halving studies; invariant tests; edge-case batteries): J. Morgan-Griffiths.
Formal analysis (observable definitions, QFI routes and regularization policy): J. Morgan-Griffiths.
Investigation (regime design; parameter sweeps; convergence tuning): J. Morgan-Griffiths.
Data curation (bundling, hashing, manifest schema): J. Morgan-Griffiths.
Visualization (fidelity/QFI panels; race panel design): J. Morgan-Griffiths.
Writing—original draft: J. Morgan-Griffiths.
Writing—review & editing: J. Morgan-Griffiths.
Project administration: J. Morgan-Griffiths.

(If collaborators contribute to Validator or ARLIT, credit them explicitly in those papers; keep authorship scoped to the engine here.)

Data and code availability

Distribution. The single-file DREAMi-QME build is bundled with the Validator package for convenience; it can also run standalone. Each session exports a complete, replayable bundle.

Required artifacts per run (all local/offline):
• manifest.json — engine version, units/frame, system operators, control parameterization and bounds, noise channels and rates/schedules, integrator and step sizes, tolerances and backoff settings, seed and substreams, outputs requested, thresholds/windows, provenance (timestamp; optional OS/BLAS), and SHA-256 digests for all artifacts.
• timeseries.csv — columns: t; F; optional QFI_θ (one per θ); optional purity/leakage/energy; optional control snapshots if “export_controls” is enabled; header with units and interpolation rule for event times.
• plots/*.png — fidelity panel with threshold lines; optional anonymized race panel; optional QFI panel. Filenames embed the CSV hash.
• sha256.txt — SHA-256 for CSV/plots/manifest; allows byte-level verification.
• HMAC.txt (optional) — keyed digests when tamper-evidence is required for internal review.
• summary.json (optional) — mean and final fidelity; T_hit at each τ; integrated QFI summaries; step-halving deltas; counters for renormalizations/backoffs; failure flags.

Reproducibility. Any third party can regenerate timeseries.csv and plots from manifest.json and seed using the same engine version; identical hashes are expected on the same platform and numerically equivalent values within stated tolerances across platforms. If that fails, the run is invalid.
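
The two-tier expectation (identical hashes on the same platform, numerical equivalence across platforms) could be checked along these lines. This is a hedged sketch: it assumes a CSV with a single header row and purely numeric cells, and the function name and default tolerances are placeholders for whatever the manifest declares.

```python
import csv
import hashlib
import io
import math


def compare_replay(original: bytes, replay: bytes, rel_tol=1e-9, abs_tol=1e-12) -> str:
    """Classify a replayed CSV as 'identical', 'equivalent', or 'mismatch'."""
    if hashlib.sha256(original).digest() == hashlib.sha256(replay).digest():
        return "identical"                      # byte-for-byte match
    rows_a = list(csv.reader(io.StringIO(original.decode("utf-8"))))
    rows_b = list(csv.reader(io.StringIO(replay.decode("utf-8"))))
    if not rows_a or len(rows_a) != len(rows_b) or rows_a[0] != rows_b[0]:
        return "mismatch"                       # different shape or header
    for row_a, row_b in zip(rows_a[1:], rows_b[1:]):
        if len(row_a) != len(row_b):
            return "mismatch"
        for a, b in zip(row_a, row_b):
            if not math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=abs_tol):
                return "mismatch"
    return "equivalent"                         # same values within tolerance


reference = b"t,F\n0.0,0.5\n0.1,0.6123450000\n"       # made-up sample rows
other_platform = b"t,F\n0.0,0.5\n0.1,0.6123450001\n"  # last-digit difference
edited = b"t,F\n0.0,0.5\n0.1,0.70\n"

print(compare_replay(reference, reference))
print(compare_replay(reference, other_platform))
print(compare_replay(reference, edited))
```

The tolerance has to come from the manifest rather than being picked after the comparison; otherwise "equivalent" becomes another tunable.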

Access. Provide the engine binary/script and example manifests with your submission; host bundles in a durable repository. If policy code is proprietary, you can still export the realized u_k(t) (or keep it disabled and rely on seeds and parameters); reproducibility of physics does not require revealing private source.

Competing interests

The author declares no competing interests.

(Optionally include the following if you need them in your venue’s format.)

Acknowledgments

If relevant, acknowledge funding, compute resources, and any colleagues who provided feedback on the manifest schema, the invariant gates, or the convergence protocol. Keep acknowledgments specific; do not imply endorsement by naming organizations that did not review the work.

Ethics statement

This work is a simulation study with no human or animal subjects. It does not involve sensitive data. All runs are reproducible from manifests; no personal information is stored or required.


References

[1] G. Lindblad (1976). “On the generators of quantum dynamical semigroups.” Communications in Mathematical Physics 48, 119–130. DOI: 10.1007/BF01608499.

[2] H.-P. Breuer, F. Petruccione (2002). The Theory of Open Quantum Systems. Oxford University Press. ISBN: 978-0-19-852063-4.

[3] M. A. Nielsen, I. L. Chuang (2010). Quantum Computation and Quantum Information (10th Anniversary ed.). Cambridge University Press. ISBN: 978-1-107-00217-3.

[4] M. G. A. Paris (2009). “Quantum estimation for quantum technology.” International Journal of Quantum Information 7, 125–137. DOI: 10.1142/S0219749909004839.

------------------------------------------------------------------------------------------------------------------------

Disclaimer: This summary presents findings from a numerical study. The specific threshold values are in the units of the described model and are expected to scale with the parameters of physical systems. The phenomena's universality is a core subject of ongoing investigation.


------------------------------------------------------------------------------------------------------------------------


[Disclaimer: This was written with AI by Jordon Morgan-Griffiths | Dakari Morgan-Griffiths] 

This paper was written by AI from notes and work by Jordon Morgan-Griffiths. If anything comes across as wrong, I ask that you blame OpenAI; I am not a PhD scientist. You can ask me directly for more detail, or take the formulae and the simulation and examine them yourself.

I hope to make more positive contributions ahead whether right or wrong. 



© 2025 Jordon Morgan-Griffiths UISH. All rights reserved. First published 24/10/2025.


