Appendix B. Numerical protocol details (gates, seeds, sampling)

B.0 Execution unit and logging unit (definitions)

Run

A Run is a single execution unit identified by the following tuple.

\begin{equation} \mathrm{run\_id} := \bigl(\mathrm{code\_version},\mathrm{registry\_snapshot\_id},\mathrm{protocol\_id},\mathrm{seed\_id},\mathrm{dataset\_id}\bigr). \end{equation}

The same run_id must reproduce the same outputs; this is judged by the Gate (G-REP).

Artifact

An Artifact is the file bundle produced by a Run.

\[ \mathrm{artifact} := \{\mathrm{logs},\mathrm{metrics},\mathrm{figures},\mathrm{tables},\mathrm{manifests},\mathrm{checksums}\}. \]

An Artifact must be sealed with manifest+checksums+registry_snapshot; if the seal is missing, the Gate revokes the Run's eligibility to support conclusions.
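
The sealing rule can be illustrated with a minimal sketch. The manifest layout, the function names `seal_artifact` and `is_sealed`, and the use of in-memory byte strings instead of files on disk are all assumptions made for illustration; the text only requires that manifest, checksums, and registry snapshot be present.

```python
import hashlib
from typing import Dict, Optional


def seal_artifact(files: Dict[str, bytes], registry_snapshot_id: str) -> dict:
    """Build a manifest holding one SHA-256 checksum per file (hypothetical layout)."""
    checksums = {name: hashlib.sha256(data).hexdigest()
                 for name, data in sorted(files.items())}
    return {"registry_snapshot_id": registry_snapshot_id, "checksums": checksums}


def is_sealed(files: Dict[str, bytes], manifest: Optional[dict]) -> bool:
    """Return True only if the manifest exists and every checksum matches."""
    if not manifest or "registry_snapshot_id" not in manifest:
        return False
    expected = manifest.get("checksums", {})
    if set(expected) != set(files):
        return False  # a file was added or removed after sealing
    return all(hashlib.sha256(files[name]).hexdigest() == expected[name]
               for name in files)
```

A Gate following this sketch would treat `is_sealed(...) == False` as grounds for revoking eligibility, whether the manifest is absent or a file has changed since sealing.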

SSOT

SSOT (Single Source of Truth) means that no constant/definition/threshold of the same meaning exists in more than one place. SSOT violations are classified as (i) duplicate definitions of the same key, (ii) defining the same concept with different values in different files, and (iii) “ghost constants” that exist only in logs but not in the registry.
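
As a rough illustration of the three violation classes, the following sketch scans a flat list of definitions and a set of keys observed in logs. The input shapes, the function name `ssot_violations`, and the violation labels are hypothetical, and mapping class (ii) to "same key, different values" is a simplifying assumption.

```python
from collections import defaultdict


def ssot_violations(definitions, logged_keys):
    """Classify SSOT violations.

    definitions: iterable of (file, key, value) tuples found in the registry files.
    logged_keys: keys that appear in logs.
    """
    by_key = defaultdict(list)
    for file, key, value in definitions:
        by_key[key].append((file, value))

    violations = []
    for key, entries in sorted(by_key.items()):
        if len(entries) > 1:
            distinct_values = {value for _, value in entries}
            # (ii) different values, otherwise (i) a plain duplicate definition
            kind = ("conflicting_values" if len(distinct_values) > 1
                    else "duplicate_definition")
            violations.append((kind, key))
    # (iii) ghost constants: present in logs but defined nowhere
    for key in sorted(set(logged_keys) - set(by_key)):
        violations.append(("ghost_constant", key))
    return violations
```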

B.1 Seed convention (determinism)

Definition B.1.1 (seed_id)

\begin{equation} \mathrm{seed\_id}:=\mathrm{SHA256}(\mathrm{seed\_namespace}\Vert \mathrm{seed\_payload}) \end{equation}

Here, \Vert denotes byte-string concatenation. seed_namespace is a fixed string (e.g., "v4.seed"), and seed_payload is the byte sequence formed by concatenating the following fields in order.

\[ \mathrm{seed\_payload}:= \mathrm{UTF8}(\mathrm{code\_version})\Vert \mathrm{UTF8}(\mathrm{registry\_snapshot\_id})\Vert \mathrm{UTF8}(\mathrm{protocol\_id})\Vert \mathrm{UTF8}(\mathrm{dataset\_id})\Vert \mathrm{UTF8}(\mathrm{user\_tag}). \]

user_tag is an experiment label. It must not be changed after results have been seen, except through a version bump; any change yields a new seed_id.
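
The seed_id derivation can be sketched with the standard-library `hashlib` module. The function name `make_seed_id` and the sample identifier strings in the usage note are hypothetical; the namespace value is the example quoted in the text.

```python
import hashlib

SEED_NAMESPACE = b"v4.seed"  # example namespace quoted in the text


def make_seed_id(code_version, registry_snapshot_id, protocol_id,
                 dataset_id, user_tag, namespace=SEED_NAMESPACE):
    """Return the 32-byte SHA-256 digest of namespace || seed_payload."""
    fields = (code_version, registry_snapshot_id, protocol_id,
              dataset_id, user_tag)
    # Plain concatenation of the UTF-8 encoded fields, in the order fixed above.
    payload = b"".join(field.encode("utf-8") for field in fields)
    return hashlib.sha256(namespace + payload).digest()
```

For example, `make_seed_id("1.0.0", "snap-7", "proto-A", "ds-3", "baseline")` returns the same 32 bytes on every call, and changing only `user_tag` yields a different digest. Because the fields are concatenated without delimiters, field boundaries are not encoded: the pairs ("ab", "c") and ("a", "bc") in adjacent positions produce the same payload.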

Definition B.1.2 (mapping to 32-bit / 64-bit seeds)

Take the first 16 bytes of the 32-byte hash digest and split them into two 8-byte words,

\[ s_{64}^{(0)}:=\mathrm{uint64}(\mathrm{digest}[0:8]),\quad s_{64}^{(1)}:=\mathrm{uint64}(\mathrm{digest}[8:16]) \]

and take these as the initial state, (s₀,s₁) = (s_{64}^{(0)}, s_{64}^{(1)}). Thereafter, the RNG uses only this state as input.
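
A minimal sketch of the digest-to-state mapping follows. The text does not state the byte order of the uint64 conversion, so big-endian is an assumption here, and the function name `initial_state` is hypothetical.

```python
def initial_state(digest: bytes):
    """Map a 32-byte digest to two 64-bit words (s0, s1).

    Assumption: bytes are interpreted as big-endian unsigned integers;
    the source does not fix the byte order.
    """
    if len(digest) != 32:
        raise ValueError("expected a 32-byte SHA-256 digest")
    s0 = int.from_bytes(digest[0:8], "big")
    s1 = int.from_bytes(digest[8:16], "big")
    return s0, s1
```

Whichever byte order is chosen, it has to be the same in every implementation, otherwise identical seed_id values would produce different RNG streams.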

Deterministic RNG (standard independent of external libraries)

To eliminate implementation differences across external RNG libraries, we lock the following xorshift128+ as the standard generator.

\begin{equation} \begin{aligned} &\texttt{uint64 next():}\\ &(s_1,\,s_0)\leftarrow(s_0,\,s_1)\quad\text{(simultaneous swap)}\\ &s_1 \leftarrow s_1 \oplus \bigl((s_1 \ll 23)\bmod 2^{64}\bigr)\\ &s_1 \leftarrow s_1 \oplus (s_1 \gg 17)\\ &s_1 \leftarrow s_1 \oplus s_0\\ &s_1 \leftarrow s_1 \oplus (s_0 \gg 26)\\ &\text{return } (s_1+s_0)\bmod 2^{64} \end{aligned} \end{equation}

The state (s₀,s₁) consists of two 64-bit integers, and the initial state is set only by the seed_id convention and the digest mapping of Definition B.1.2.
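
The update rule can be sketched in Python, where integers are unbounded and 64-bit wraparound must be applied explicitly. The class name `XorShift128Plus` and the mask constant are illustrative, and the first step of the rule is read as a simultaneous swap of the two state words. Rejecting the all-zero state is an addition of this sketch: a xorshift generator seeded with all zeros stays at zero.

```python
MASK64 = (1 << 64) - 1  # keeps values inside the 64-bit range


class XorShift128Plus:
    """xorshift128+ with shift constants (23, 17, 26)."""

    def __init__(self, s0: int, s1: int):
        if (s0 & MASK64) == 0 and (s1 & MASK64) == 0:
            raise ValueError("the all-zero state is not allowed")
        self.s0 = s0 & MASK64
        self.s1 = s1 & MASK64

    def next(self) -> int:
        # Simultaneous swap: the working word s1 takes the old s0,
        # and the old s1 becomes the new s0.
        s1, s0 = self.s0, self.s1
        self.s0 = s0
        s1 ^= (s1 << 23) & MASK64
        s1 ^= s1 >> 17
        s1 ^= s0
        s1 ^= s0 >> 26
        self.s1 = s1
        return (s1 + s0) & MASK64
```

Starting from the state (1, 2), the first call to `next()` returns 8388677 and leaves the state at (2, 8388675), which can serve as a small cross-implementation check.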

B.2 Sampling convention (windows / seeds / repetitions)

Definition B.2.1 (window partition)

The time axis or event axis is partitioned into half-open windows of length W, with the partition rule fixed as

\[ \mathrm{window}(j):=[jW,(j+1)W). \]

The unit of W (ticks/seconds/event-count) is locked in analysis_lock.
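
The half-open partition can be sketched as an index function and its inverse. The function names are hypothetical, and the sketch assumes W > 0.

```python
def window_index(t, W):
    """Return j such that t lies in the half-open window [j*W, (j+1)*W)."""
    if W <= 0:
        raise ValueError("W must be positive")
    return int(t // W)  # floor division, so negative t maps to negative j


def window_bounds(j, W):
    """Return the (inclusive lower, exclusive upper) bounds of window j."""
    return j * W, (j + 1) * W
```

With W = 10, the values t = 0 and t = 9 fall in window 0, while t = 10 opens window 1. With integer ticks or event counts the boundaries are exact; with floating-point seconds, values close to a boundary can land on either side because of rounding.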

Definition B.2.2 (replica index)

Lock the number of repetitions R∈ℕ and let the replica index be r∈{0,…,R-1}. Each replica shares the same run_id but branches the seed by appending replica=r to seed_payload.
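
Seed branching per replica can be sketched as follows. The exact byte encoding of the appended replica field is not given in the text; the UTF-8 string "replica=r" with a decimal index is an assumption, as is the function name `replica_seed_id`.

```python
import hashlib


def replica_seed_id(namespace: bytes, seed_payload: bytes, r: int, R: int) -> bytes:
    """Return the SHA-256 seed digest for replica r of R.

    Assumption: the replica field is appended as the UTF-8 bytes of
    "replica=<r>" with r written in decimal.
    """
    if not 0 <= r < R:
        raise ValueError("replica index out of range")
    suffix = f"replica={r}".encode("utf-8")
    return hashlib.sha256(namespace + seed_payload + suffix).digest()
```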

Definition B.2.3 (sample selection function)

For a finite population Ω={0,…,M-1}, define the function S that selects a sample of size K≤M by

\[ \mathcal{S}(\Omega,K; s_0,s_1):=\text{the first }K\text{ elements of the permutation generated by }\texttt{RNG}. \]

Specifically,

  1. Generate M 64-bit values via xorshift128+: u₀,…,u_(M-1).
  2. Sort key-value pairs (u_i,i) by increasing key.
  3. Take the first K indices of the sorted list as the sample.

The stability of sorting (tie handling) is fixed by the comparison rule locked in analysis_lock (e.g., lexicographic order on (u_i,i)).
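
The three steps above can be sketched as follows. The generator is inlined so that the block stands alone; function names are hypothetical, and ties are broken by lexicographic order on (u_i, i), the example rule mentioned in the text.

```python
MASK64 = (1 << 64) - 1


def xorshift128plus_stream(s0, s1):
    """Yield successive 64-bit outputs of xorshift128+ (constants 23, 17, 26)."""
    if s0 == 0 and s1 == 0:
        raise ValueError("the all-zero state is not allowed")
    while True:
        x, y = s0, s1
        x ^= (x << 23) & MASK64
        x ^= x >> 17
        x ^= y
        x ^= y >> 26
        s0, s1 = y, x
        yield (x + y) & MASK64


def select_sample(M, K, s0, s1):
    """Return K indices from range(M), chosen by sorting on random keys."""
    if not 0 <= K <= M:
        raise ValueError("K must satisfy 0 <= K <= M")
    stream = xorshift128plus_stream(s0, s1)
    keyed = [(next(stream), i) for i in range(M)]  # step 1: one key per element
    keyed.sort()                                   # step 2: lexicographic on (u_i, i)
    return [i for _, i in keyed[:K]]               # step 3: first K indices
```

Sorting tuples makes the tie rule explicit: equal keys fall back to the smaller index, so the result does not depend on whether the sort implementation is stable.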

B.3 Numerical operation conventions (tolerance / stability)

Definition B.3.1 (relative/absolute tolerance)

The closeness test for two reals x,y is fixed by the following convention.

\begin{equation} \mathrm{close}(x,y;\varepsilon_{\mathrm{abs}},\varepsilon_{\mathrm{rel}}) \;\Longleftrightarrow\; |x-y|\le \varepsilon_{\mathrm{abs}}+\varepsilon_{\mathrm{rel}}\max\{|x|,|y|\}. \end{equation}

ε_abs,ε_rel are locked in gate_lock.
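
The closeness test translates directly into code. The function below is a sketch of the convention as written; the tolerance values in the usage note are illustrative and are not the locked values.

```python
def close(x: float, y: float, eps_abs: float, eps_rel: float) -> bool:
    """Return True if |x - y| <= eps_abs + eps_rel * max(|x|, |y|)."""
    return abs(x - y) <= eps_abs + eps_rel * max(abs(x), abs(y))
```

For example, `close(1e6, 1e6 + 0.5, 1e-9, 1e-6)` is true because the relative term dominates, while `close(0.0, 1e-8, 1e-9, 1e-6)` is false because near zero only the absolute term matters. Note that this additive form differs from Python's `math.isclose`, which takes the maximum of the absolute and relative bounds, and from `numpy.isclose`, whose relative term scales with the second argument only, so those functions are not drop-in substitutes.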

Definition B.3.2 (summation stabilization: compensated sum)

For a long sum Σ_i a_i, Kahan compensated summation is adopted as the standard for numerical stability.

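def kahan_sum(a_list):
    """Kahan compensated summation of a sequence of floats."""
    total = 0.0  # running sum (named total to avoid shadowing the builtin sum)
    c = 0.0      # running compensation for lost low-order bits
    for a in a_list:
        y = a - c            # apply the correction carried over from the previous step
        t = total + y        # low-order bits of y may be lost in this addition
        c = (t - total) - y  # recover the rounding error of that addition
        total = t
    return total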

The scope of this procedure (e.g., event-rate estimation, frequency accumulation, energy/tension accumulation) is locked in analysis_lock.

B.4 Gate evaluation conventions (order / logging / sealing)

Definition B.4.1 (Gate DAG)

The Gate stack is a directed acyclic graph (DAG) whose nodes are the Gates; each Gate G_i has inputs (artifact, locks, metrics) and an output status. The Gate stack is evaluated only in topological order.
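
Topological evaluation order can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The dependency map and the gate identifiers other than G-REP are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(dependencies):
    """Return gate ids in an order where every gate follows its prerequisites.

    dependencies maps each gate id to the set of gate ids it depends on.
    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical stack: G-REP needs a sealed artifact, G-FINAL needs both.
EXAMPLE_GATES = {
    "G-SEAL": set(),
    "G-REP": {"G-SEAL"},
    "G-FINAL": {"G-SEAL", "G-REP"},
}
```

Here `evaluation_order(EXAMPLE_GATES)` yields G-SEAL, G-REP, G-FINAL. When several gates are mutually independent, more than one topological order is valid, so a reproducible pipeline would also need a fixed tie-break rule; the text does not specify one.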

Definition B.4.2 (Gate output)

The output of each Gate is restricted to

\[ \mathrm{status}\in\{\texttt{PASS},\texttt{FAIL},\texttt{INCONCLUSIVE}\}. \]

INCONCLUSIVE means deferral and cannot be used as evidence for conclusions.

Definition B.4.3 (Gate log)

Each Gate generates a log object containing the following keys.

\begin{equation} \mathrm{gate\_log}:=\{ \mathrm{gate\_id},\mathrm{inputs},\mathrm{thresholds},\mathrm{metrics},\mathrm{status},\mathrm{timestamp},\mathrm{hashes} \}. \end{equation}
inputs must include the lock_id used, the manifest hashes, and the schema version.
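
A structural check of a gate log can be sketched as follows. The top-level keys and status values come from the definitions above; the field names inside `inputs` (`lock_id`, `manifest_hashes`, `schema_version`) and the function name are assumptions, since the text lists what inputs must contain but not how the fields are named.

```python
REQUIRED_KEYS = frozenset({"gate_id", "inputs", "thresholds", "metrics",
                           "status", "timestamp", "hashes"})
ALLOWED_STATUS = frozenset({"PASS", "FAIL", "INCONCLUSIVE"})
# Hypothetical field names for the required contents of inputs.
REQUIRED_INPUTS = frozenset({"lock_id", "manifest_hashes", "schema_version"})


def gate_log_problems(log: dict) -> list:
    """Return a list of human-readable problems; an empty list means well-formed."""
    problems = []
    missing = REQUIRED_KEYS - set(log)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    if log.get("status") not in ALLOWED_STATUS:
        problems.append("invalid status")
    inputs = log.get("inputs")
    present = set(inputs) if isinstance(inputs, dict) else set()
    missing_inputs = REQUIRED_INPUTS - present
    if missing_inputs:
        problems.append("missing inputs: " + ", ".join(sorted(missing_inputs)))
    return problems
```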

B.5 FAIL triggers (falsification-trigger record format)

Definition B.5.1 (FAIL code)

FAIL is recorded as a single string code, and the code must exist in the pre-registered list in gate_lock. Example format:

\[ \texttt{F-LOCK-MIX},\ \texttt{F-NOTUNING},\ \texttt{F-REP-MISSING},\ \texttt{F-SYM-UNIT}. \]

The code list and meanings are maintained as SSOT; temporary codes that exist only in logs are forbidden.

Definition B.5.2 (falsification-trigger object)

When a FAIL occurs, record the following object.

\[ \mathrm{falsify\_trigger}:= \{\mathrm{fail\_code},\mathrm{fail\_evidence},\mathrm{scope},\mathrm{lock\_ids},\mathrm{artifacts}\}. \]
scope specifies the impact range (single section/chapter/global).
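
Recording a falsification trigger can be sketched as a constructor that rejects unregistered codes. The code set below contains only the example codes from Definition B.5.1 and stands in for the pre-registered list in gate_lock; the scope strings and the function name are hypothetical.

```python
# Stand-in for the pre-registered list in gate_lock (example codes from the text).
REGISTERED_FAIL_CODES = frozenset({"F-LOCK-MIX", "F-NOTUNING",
                                   "F-REP-MISSING", "F-SYM-UNIT"})
ALLOWED_SCOPES = ("section", "chapter", "global")  # hypothetical spellings


def make_falsify_trigger(fail_code, fail_evidence, scope, lock_ids, artifacts,
                         registered=REGISTERED_FAIL_CODES):
    """Build a falsify_trigger record, refusing codes outside the registered list."""
    if fail_code not in registered:
        raise ValueError(f"unregistered FAIL code: {fail_code!r}")
    if scope not in ALLOWED_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    return {
        "fail_code": fail_code,
        "fail_evidence": fail_evidence,
        "scope": scope,
        "lock_ids": list(lock_ids),
        "artifacts": list(artifacts),
    }
```

Raising on an unknown code, rather than logging it, is one way to enforce the rule that temporary codes existing only in logs are forbidden.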