K2 Pareto Lock, May 2026
This report locks the starting K2 nonlinear-filter frontier before the exchangeability, predictive-consistency, VSMC/FIVO, and flow experiments.
Runs
- Family: outputs/cloud_downloads/k2_pareto_lock_family_1000_2026-05-01
- Stressors: outputs/cloud_downloads/k2_pareto_lock_stressors_1000_2026-05-01
- Config families: active nonlinear family configs plus weak/intermittent/zero/random-normal stressors.
- Seeds: 321,322,323
- Steps: 1000
Family Summary
| model | state NLL | pred-y NLL | cov90 | state RMSE | var ratio |
|---|---|---|---|---|---|
| K2 IWAE h4 k32 + pre-update predictive scoring | 4.614 | 0.971 | 0.599 | 3.052 | 0.895 |
| K2 IWAE h4 k32 | 4.869 | 1.004 | 0.584 | 3.316 | 0.918 |
| K2 IWAE h4 k16 + local ADF projection w0.3 | 5.419 | 0.927 | 0.587 | 3.090 | 0.809 |
| K2 generic Power-EP alpha 0.5 | 6.764 | 0.841 | 0.640 | 2.838 | 0.507 |
| promoted strict Gaussian baseline | 4159.987 | 3217.708 | 0.479 | 3.337 | 0.401 |
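This report does not define the cov90 and var ratio columns. As a reading aid only, the sketch below assumes one common convention: cov90 is the fraction of true states inside the central 90% interval of a Gaussian predictive, and var ratio is mean predicted variance divided by mean squared error. Both definitions and the function names are illustrative assumptions; the actual evaluation code (in particular for K2 mixtures) may differ.

```python
from statistics import NormalDist, fmean

# ASSUMPTION: per-step scalar Gaussian predictive (mean, std) for the state.
# The real K2 posterior is a mixture; this is only an illustrative stand-in.

def cov90(truth, mean, std):
    """Fraction of true values inside the central 90% Gaussian interval."""
    z = NormalDist().inv_cdf(0.95)  # two-sided 90% interval half-width in stds
    hits = [abs(t - m) <= z * s for t, m, s in zip(truth, mean, std)]
    return sum(hits) / len(hits)

def var_ratio(truth, mean, std):
    """Mean predicted variance over mean squared error (assumed definition)."""
    predicted = fmean(s * s for s in std)
    empirical = fmean((t - m) ** 2 for t, m in zip(truth, mean))
    return predicted / empirical

if __name__ == "__main__":
    truth = [0.0, 1.0, 2.0, 10.0]
    mean = [0.0, 0.0, 0.0, 0.0]
    std = [1.0, 1.0, 1.0, 1.0]
    print(cov90(truth, mean, std), var_ratio(truth, mean, std))
```

Under these assumed definitions, a well-calibrated filter would show cov90 near 0.90 and var ratio near 1; values below those would suggest overconfident posteriors.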
Stressor Summary
| model | state NLL | pred-y NLL | cov90 | state RMSE | var ratio |
|---|---|---|---|---|---|
| K2 IWAE h4 k32 | 6.005 | 0.428 | 0.445 | 4.507 | 0.156 |
| K2 IWAE h4 k32 + pre-update predictive scoring | 6.049 | 0.427 | 0.444 | 4.503 | 0.156 |
| K2 generic Power-EP alpha 0.5 | 8.366 | 0.394 | 0.670 | 3.905 | 0.362 |
Interpretation
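The K2 IWAE h4 k32 row is the clean baseline to carry forward. It is reference-free, stable across the stressors, and far better than the promoted strict Gaussian baseline on nonlinear-family state density (state NLL 4.869 versus 4159.987). Power-EP remains a useful predictive and coverage comparator, with the lowest pred-y NLL and the highest cov90 in both tables, but it gives up too much state density (state NLL 6.764 versus 4.869 on the family runs, 8.366 versus 6.005 on the stressors). Pre-update predictive scoring is worth testing: it lowers family state NLL from 4.869 to 4.614 but is essentially tied with the baseline on the stressors (6.049 versus 6.005). At Step 0 it is not yet a separate promoted objective.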
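The comparisons in this paragraph can be checked directly against the two summary tables. The following is a minimal sketch that copies the state NLL and pred-y NLL columns by hand and reports each row's difference from the K2 IWAE h4 k32 baseline. The dictionary layout and the helper name `delta_vs_baseline` are illustrative and not part of the experiment code.

```python
# (state NLL, pred-y NLL) copied from the Family and Stressor Summary tables.
BASELINE = "K2 IWAE h4 k32"

FAMILY = {
    "K2 IWAE h4 k32 + pre-update predictive scoring": (4.614, 0.971),
    "K2 IWAE h4 k32": (4.869, 1.004),
    "K2 IWAE h4 k16 + local ADF projection w0.3": (5.419, 0.927),
    "K2 generic Power-EP alpha 0.5": (6.764, 0.841),
    "promoted strict Gaussian baseline": (4159.987, 3217.708),
}

STRESSORS = {
    "K2 IWAE h4 k32": (6.005, 0.428),
    "K2 IWAE h4 k32 + pre-update predictive scoring": (6.049, 0.427),
    "K2 generic Power-EP alpha 0.5": (8.366, 0.394),
}

def delta_vs_baseline(table, baseline=BASELINE):
    """Return {model: (d_state_nll, d_pred_y_nll)} relative to the baseline row.

    Negative values mean the model beats the baseline on that metric.
    """
    base_state, base_pred = table[baseline]
    return {
        name: (round(state - base_state, 3), round(pred - base_pred, 3))
        for name, (state, pred) in table.items()
        if name != baseline
    }

if __name__ == "__main__":
    for label, table in (("family", FAMILY), ("stressors", STRESSORS)):
        print(label)
        for name, (d_state, d_pred) in delta_vs_baseline(table).items():
            print(f"  {name}: state NLL {d_state:+.3f}, pred-y NLL {d_pred:+.3f}")
```

On these numbers, Power-EP improves pred-y NLL over the baseline in both tables while worsening state NLL by roughly 1.9 to 2.4 nats, which is the trade-off the paragraph describes.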
Decision
Carry direct_mixture_k2_joint_iwae_h4_k32 forward as the baseline. Keep
Power-EP as a comparator. Treat pre-update predictive scoring as an objective
variant, not as a locked default.