Repairing a Nonlinear Strict Filter Without Reference Targets

The nonlinear sine-observation benchmark exposed severe under-dispersion in fully unsupervised ELBO training; a combined joint-ELBO, predictive-y, and masked-y objective then partially repaired it.

Series: VBF Experiments, April 2026

After the scalar benchmark, the work moved to a nonlinear sine-observation model:

\[ z_t = z_{t-1} + w_t,\quad w_t \sim \mathcal{N}(0,Q) \]
\[ y_t = x_t \sin(z_t) + v_t,\quad v_t \sim \mathcal{N}(0,R) \]
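For concreteness, the model is easy to simulate. A minimal sketch; the noise levels, the input distribution for \(x_t\), and the initial state here are illustrative choices, not the experiment's settings:

```python
import numpy as np

def simulate(T, Q=0.1, R=0.05, seed=0):
    """Simulate z_t = z_{t-1} + w_t, y_t = x_t * sin(z_t) + v_t."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.5, 1.5, size=T)   # illustrative input sequence
    z = np.empty(T)
    y = np.empty(T)
    z_prev = 0.0                        # illustrative initial state
    for t in range(T):
        z[t] = z_prev + rng.normal(0.0, np.sqrt(Q))              # random walk
        y[t] = x[t] * np.sin(z[t]) + rng.normal(0.0, np.sqrt(R)) # sine readout
        z_prev = z[t]
    return x, z, y
```

The sine readout is what makes the problem hard: the likelihood in \(z_t\) is multimodal whenever \(|y_t| < |x_t|\), which is exactly where a single Gaussian belief is tempted to over-commit.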

The strict filtering contract stayed the same:

\[ q^F_t = \operatorname{update}(q^F_{t-1}, x_t, y_t) \]

No hidden sequence state was allowed in the headline rows. The filter had to export an explicit online filtering marginal at each time step.
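The contract amounts to a plain loop: each step consumes only the previously exported marginal and the current \((x_t, y_t)\), and must emit a new explicit Gaussian marginal before the next step runs. A schematic sketch, with a hypothetical EKF-style body standing in for the learned update:

```python
import math
from typing import NamedTuple

class Gaussian(NamedTuple):
    mean: float
    var: float

def update(q_prev: Gaussian, x_t: float, y_t: float,
           Q: float = 0.1, R: float = 0.05) -> Gaussian:
    """Hypothetical stand-in update: predict through the random walk,
    then one linearized correction of the sine observation."""
    m, v = q_prev.mean, q_prev.var + Q       # predict step
    H = x_t * math.cos(m)                    # d/dz of x_t*sin(z) at the mean
    S = H * H * v + R                        # innovation variance
    K = v * H / S                            # gain
    m = m + K * (y_t - x_t * math.sin(m))    # correct with the residual
    v = (1.0 - K * H) * v
    return Gaussian(m, v)

def run_filter(q0: Gaussian, xs, ys):
    """Strict contract: one exported marginal per step, no hidden state."""
    marginals, q = [], q0
    for x_t, y_t in zip(xs, ys):
        q = update(q, x_t, y_t)
        marginals.append(q)
    return marginals
```

The point of the sketch is the signature, not the body: anything that threads extra sequence state between calls would violate the headline-row contract.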

The Initial Failure

The first nonlinear branch established grid references, cached diagnostics, trained strict Gaussian filters, and built stressor configs for weak, intermittent, zero, random-normal, and clean sinusoidal observations. The early fully unsupervised ELBO rows failed in a consistent way: the filtering marginals became too narrow, then self-fed the next update from a bad, overconfident prior.

Reference-assisted rows showed that the architecture was not hopeless. Direct moment distillation from the grid reference reached state NLL near 2.77 with coverage near 0.84 across the robustness suite. A structured horizon-4 rollout distillation diagnostic also worked in weak and zero-observation settings. Those rows were useful controls, but they were not fully unsupervised.

The key diagnosis was narrower than “use a bigger model”:

A strict Gaussian target was not obviously doomed, because a moment-matched Gaussian projection of the grid posterior was much closer to the grid reference than the learned Gaussian.

That pointed at the objective before the posterior family.
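The projection behind that diagnosis is just moment matching. A sketch, assuming the grid posterior is given as node locations with unnormalized masses:

```python
import numpy as np

def moment_match(grid, weights):
    """Project a discrete (grid) posterior onto a Gaussian by matching
    its first two moments."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalize the grid masses
    mean = float(np.sum(w * grid))
    var = float(np.sum(w * (grid - mean) ** 2))  # second central moment
    return mean, var
```

If this moment-matched Gaussian sits close to the grid reference while the learned Gaussian does not, the gap is attributable to the training objective rather than to the Gaussian family itself.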

Objective Repair

The promoted unsupervised row combined three pieces:

structured_joint_elbo_h4_w005_predictive_y_masked_y_spans_h4

The pieces had different jobs:

| Component | Purpose |
| --- | --- |
| short-window joint ELBO | make neighboring edge factors jointly explain a coherent latent path |
| causal predictive-y score | score \(y_t\) under the pre-assimilation belief before using \(y_t\) to update |
| masked-y span training | force the carried belief to survive missing or withheld measurements |
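The second and third pieces are simple to sketch in isolation. The linearized predictive likelihood and the deterministic masking scheme below are illustrative assumptions, not the experiment's code:

```python
import math

def predictive_y_logpdf(m_pred, v_pred, x_t, y_t, R=0.05):
    """Score y_t under the pre-assimilation (predicted) belief, via a
    first-order linearization of x_t*sin(z) around the predicted mean."""
    mu_y = x_t * math.sin(m_pred)
    H = x_t * math.cos(m_pred)
    var_y = H * H * v_pred + R
    return -0.5 * (math.log(2 * math.pi * var_y) + (y_t - mu_y) ** 2 / var_y)

def masked_spans(T, span=4, period=10):
    """Illustrative deterministic mask: withhold y on one length-`span`
    stretch out of every `period` steps, forcing the belief to coast."""
    return [(t % period) < span for t in range(T)]
```

The causal ordering is the point of the first function: \(y_t\) is scored against the belief that exists *before* assimilating \(y_t\), so an overconfident carried prior pays for it immediately.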

The windowed ELBO used the carried filtering marginal and learned backward conditionals to score a latent path against the generative model. For a window ending at \(s+H\), the posterior shape was:

\[ \begin{aligned} q(z_{s-1:s+H}) &= q^F_{s+H}(z_{s+H}) \prod_{t=s}^{s+H} q^B_t(z_{t-1}\mid z_t) \end{aligned} \]

The important constraint was that the objective used only \(x\), \(y\), the known transition, the known observation model, and the prior. No grid moments or latent states were used for the headline row.
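Under that factorization, a single-sample Monte Carlo estimate of the window objective looks like the following sketch. The linear-Gaussian parameterization of the backward conditionals is hypothetical, and the factor tying \(z_{s-1}\) back to the carried prior is omitted for brevity:

```python
import math
import random

def log_normal(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def windowed_elbo_sample(qF_mean, qF_var, back, xs, ys,
                         Q=0.1, R=0.05, rng=random):
    """One-sample window ELBO estimate.

    back[t] = (a, b, s2) parameterizes a hypothetical backward conditional
    q^B(z_{t-1} | z_t) = N(a * z_t + b, s2).
    xs, ys cover the H+1 window steps s..s+H.
    """
    H1 = len(xs)
    z = [0.0] * (H1 + 1)            # z[0] is z_{s-1}, z[H1] is z_{s+H}
    # sample the window endpoint from the carried filtering marginal
    z[H1] = qF_mean + math.sqrt(qF_var) * rng.gauss(0.0, 1.0)
    log_q = log_normal(z[H1], qF_mean, qF_var)
    for i in range(H1, 0, -1):      # sample z_{t-1} | z_t backward
        a, b, s2 = back[i - 1]
        m = a * z[i] + b
        z[i - 1] = m + math.sqrt(s2) * rng.gauss(0.0, 1.0)
        log_q += log_normal(z[i - 1], m, s2)
    # score the sampled path under the known generative model
    log_p = 0.0
    for i in range(1, H1 + 1):
        log_p += log_normal(z[i], z[i - 1], Q)                        # transition
        log_p += log_normal(ys[i - 1], xs[i - 1] * math.sin(z[i]), R) # observation
    return log_p - log_q
```

Because the sampled path runs backward from the carried marginal, a collapsed \(q^F_{s+H}\) produces paths that explain the window's observations badly, which is exactly the pressure the direct per-step ELBO was missing.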

Figure: nonlinear robustness sweep.

Robustness Result

The final robustness run compared structured ELBO, direct ELBO, the promoted combined objective, and reference-distilled controls across five stressors with seeds 321, 322, and 323.

| Condition | structured ELBO NLL | promoted NLL | structured cov90 | promoted cov90 | promoted var ratio |
| --- | --- | --- | --- | --- | --- |
| sinusoidal | 52.989 | 54.930 | 0.347 | 0.342 | 0.083 |
| weak sinusoidal | 20.865 | 14.672 | 0.332 | 0.396 | 0.090 |
| intermittent sinusoidal | 37.853 | 22.992 | 0.327 | 0.371 | 0.060 |
| zero | 13.474 | 8.414 | 0.282 | 0.388 | 0.107 |
| random normal | 113.958 | 60.109 | 0.315 | 0.358 | 0.040 |

This supported a partial-success claim. The candidate materially improved weak, intermittent, zero, and random-normal stressors. It also improved variance ratio on every condition in the table. But it regressed slightly on clean sinusoidal state NLL and coverage, and absolute calibration remained poor. The best fully unsupervised variance ratios stayed below 0.11, far below the original 0.50 gate.
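For reference, the two calibration columns can be computed from the exported marginals along these lines. This is a sketch; in particular, the normalizer in the variance ratio (mean filter variance over mean squared error) is an assumed definition, not taken from the experiment code:

```python
import math

Z90 = 1.6448536269514722   # half-width of the central 90% interval, in std units

def calibration(means, vars_, z_true):
    """Coverage of the central 90% interval, and a variance ratio that
    compares the filter's reported variance against its actual error."""
    n = len(means)
    inside = sum(
        abs(z - m) <= Z90 * math.sqrt(v)
        for m, v, z in zip(means, vars_, z_true)
    )
    cov90 = inside / n
    mse = sum((z - m) ** 2 for m, z in zip(means, z_true)) / n
    var_ratio = (sum(vars_) / n) / mse
    return cov90, var_ratio
```

Under this reading, a variance ratio near 1 means the filter's reported uncertainty matches its realized error; the sub-0.11 values in the table say the filter claims roughly ten times more precision than it delivers.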

What Changed After This

The objective repair did enough to justify continuing, but not enough to call the nonlinear filter solved. It also sharpened the next questions:

  1. Was the remaining failure caused by the ELBO-style divergence?
  2. Was a single Gaussian posterior family too restrictive?
  3. Did objective and posterior family have to change together?

That became the next branch: IWAE/FIVO-style multi-sample objectives, alpha/power-EP style updates, and small strict mixtures.

Source artifacts: