4D-Var Adjoint Data Assimilation — PM2.5 Nowcast

The Question

Can we recover today's air-quality state from the past 12 hours of monitor data?

Air-quality nowcasting is a state-estimation problem: given a physics model, a first guess, and a stream of noisy ground-based monitor observations, what is the best estimate of the true PM2.5 field right now? 3D-Var uses a single snapshot of data. 4D-Var uses every observation in the time window and enforces the forward model as a hard constraint. Under Gaussian background and observation errors, its solution is the MAP estimate of the initial condition given all observations.
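For reference, the standard strong-constraint 4D-Var cost function can be written as below. The symbols are assumptions introduced here for illustration and do not come from the investigation itself: x_0 is the initial state, x_b the background (first guess), B and R_k the background and observation error covariances, H_k the observation operator, y_k the monitor observations at time k, and M_{0→k} the forward model from the start of the window to time k.

```latex
J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^{\mathsf T} B^{-1} (x_0 - x_b)
       + \tfrac{1}{2} \sum_{k=1}^{K} \bigl(H_k x_k - y_k\bigr)^{\mathsf T} R_k^{-1} \bigl(H_k x_k - y_k\bigr),
\qquad x_k = M_{0 \to k}(x_0)
```

In this notation, the end-of-window 3D-Var comparison corresponds to keeping only the k = K term of the sum.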

This investigation runs a twin experiment on a 1D upwind-advection + decay model with 6 monitor stations reporting every 2 hours over a 12-hour window. We compare background (no assimilation), 3D-Var (end-of-window data only), and 4D-Var (full trajectory).
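A forward model of this kind can be sketched in a few lines. This is a minimal illustration, not the investigation's code: the grid spacing, wind speed, decay rate, time step, peak amplitude, and periodic boundary are all hypothetical choices; only the plume centre (400 km) and width (120 km) are taken from the text.

```python
import numpy as np

def step(c, courant, decay_factor):
    """One explicit first-order upwind step of c_t + u c_x = -k c (u > 0, periodic domain)."""
    upwind = np.roll(c, 1)  # value in the neighbouring cell on the upwind side (i - 1)
    return decay_factor * (c - courant * (c - upwind))

def run(c0, n_steps, courant, decay_factor):
    """Integrate the model forward n_steps from the initial field c0."""
    c = c0.copy()
    for _ in range(n_steps):
        c = step(c, courant, decay_factor)
    return c

# Hypothetical settings (assumptions, not values from the investigation)
nx, dx_km = 100, 10.0           # 1000 km domain
u_km_per_h = 20.0               # constant wind speed
k_per_h = 0.05                  # first-order decay rate
dt_h = 0.25                     # time step
courant = u_km_per_h * dt_h / dx_km   # 0.5; must stay <= 1 for stability
decay_factor = 1.0 - k_per_h * dt_h   # explicit Euler decay per step

x_km = np.arange(nx) * dx_km
c0 = 30.0 * np.exp(-0.5 * ((x_km - 400.0) / 120.0) ** 2)  # plume at 400 km, sigma = 120 km
c12 = run(c0, int(12 / dt_h), courant, decay_factor)      # state at the end of the 12-hour window
print("peak location after 12 h (km):", x_km[np.argmax(c12)])
```

Because the scheme is linear in the concentration, its adjoint is simply the transpose of the one-step update, which is what makes a hand-written adjoint practical for a model this small.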

Assimilation ladder

From persistence to adjoint optimization

Persistence (t = now) No assimilation; forecast = observation at t=0.

n/a

baseline

Optimal interpolation Weight observations by distance to grid cell; static B matrix.

n/a

classical

3D-Var (end-of-window) Minimize J using only observations at t = 12h. RMSE_init = 4.92 µg/m³.

2 iters

4D-Var (this investigation) Minimize J over 6 time slices using adjoint gradient. RMSE_init = 2.07 µg/m³ — a 57.9% improvement over 3D-Var.

12 iters

Ensemble 4D-Var (hybrid) Flow-dependent B matrix from ensemble forecast; next-generation operational system (ECMWF, NOAA).

—

future

Initial condition recovery

Does the adjoint find the plume?

The true plume (white) is centered at 400 km with sigma = 120 km. The background (gray) is wrong in both location and width. 3D-Var (gold) pulls toward the observations at the end of the window but can't resolve the upwind shape. 4D-Var (green) uses the time evolution of the plume to infer its initial position, recovering the initial shape to about 2 µg/m³ RMSE (2.07 µg/m³).

Background RMSE: 6.05 µg/m³
3D-Var RMSE: 4.92 µg/m³
4D-Var RMSE: 2.07 µg/m³
4D-Var vs 3D-Var improvement: 57.9%
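The improvement figure follows directly from the reported RMSE values. A short check, using only the numbers quoted above (the helper name `relative_improvement` is introduced here for illustration):

```python
rmse_background = 6.05  # µg/m³, no assimilation
rmse_3dvar = 4.92       # µg/m³
rmse_4dvar = 2.07       # µg/m³

def relative_improvement(reference, candidate):
    """Percentage reduction in RMSE of candidate relative to reference."""
    return 100.0 * (reference - candidate) / reference

gain_4d_vs_3d = relative_improvement(rmse_3dvar, rmse_4dvar)
gain_3d_vs_bg = relative_improvement(rmse_background, rmse_3dvar)
gain_4d_vs_bg = relative_improvement(rmse_background, rmse_4dvar)

print(f"4D-Var vs 3D-Var: {gain_4d_vs_3d:.1f}%")  # 57.9%
```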

18-hour forecast

Better initial condition → better forecast

Running the forward model another 6 hours past the 12-hour assimilation window, 4D-Var's forecast RMSE is 0.13 µg/m³ vs 0.49 µg/m³ for 3D-Var. Decay damps the whole field, so absolute errors shrink for every method; even so, the 4D-Var forecast tracks the truth almost exactly while 3D-Var retains a visible offset.

Cost-function convergence

L-BFGS with adjoint gradient

4D-Var's J dropped 88.84% in 12 iterations; 3D-Var converged in 2 iterations because its cost surface is simpler (it fits a single observation time). Because the adjoint model supplies the exact gradient, L-BFGS builds a useful curvature estimate and converges rapidly after a few steps.
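The loop described here, a forward sweep to evaluate J followed by a backward adjoint sweep to get its gradient, both handed to L-BFGS, can be sketched as follows. This is a toy twin experiment under stated assumptions, not the investigation's code: the grid size, Courant number, decay factor, station indices, error standard deviations, and plume shapes are hypothetical. `scipy.optimize.minimize` with `method="L-BFGS-B"` and `jac=True` stands in for whichever L-BFGS implementation was actually used.

```python
import numpy as np
from scipy.optimize import minimize

NX, CR, DECAY = 50, 0.5, 0.99    # hypothetical grid size, Courant number, per-step decay
STEPS_PER_OBS, N_OBS = 4, 6      # six observation times across the window
STATIONS = np.arange(4, NX, 8)   # six hypothetical station indices
SIGMA_B, SIGMA_O = 5.0, 1.0      # assumed background / observation error std devs (diagonal B, R)

def step(c):
    """Forward model: one upwind advection + decay step (periodic)."""
    return DECAY * ((1.0 - CR) * c + CR * np.roll(c, 1))

def step_adj(lam):
    """Adjoint (transpose) of step: the shift direction is reversed."""
    return DECAY * ((1.0 - CR) * lam + CR * np.roll(lam, -1))

def cost_and_grad(x0, xb, obs):
    """Return J(x0) and its gradient computed with the adjoint model."""
    cost = 0.5 * np.sum((x0 - xb) ** 2) / SIGMA_B**2   # background term
    c, innovations = x0.copy(), []
    for k in range(N_OBS):                             # forward sweep
        for _ in range(STEPS_PER_OBS):
            c = step(c)
        d = c[STATIONS] - obs[k]                       # model minus observation
        innovations.append(d)
        cost += 0.5 * np.sum(d**2) / SIGMA_O**2
    lam = np.zeros(NX)
    for k in reversed(range(N_OBS)):                   # backward (adjoint) sweep
        lam[STATIONS] += innovations[k] / SIGMA_O**2   # inject weighted misfit at stations
        for _ in range(STEPS_PER_OBS):
            lam = step_adj(lam)
    return cost, (x0 - xb) / SIGMA_B**2 + lam

# Twin experiment: synthetic truth, deliberately wrong background, noisy observations
rng = np.random.default_rng(0)
x = np.arange(NX)
truth0 = 30.0 * np.exp(-0.5 * ((x - 20) / 6.0) ** 2)
xb = 20.0 * np.exp(-0.5 * ((x - 26) / 9.0) ** 2)
obs, c = [], truth0.copy()
for k in range(N_OBS):
    for _ in range(STEPS_PER_OBS):
        c = step(c)
    obs.append(c[STATIONS] + rng.normal(0.0, SIGMA_O, STATIONS.size))

result = minimize(cost_and_grad, xb, args=(xb, obs), jac=True, method="L-BFGS-B")

def rmse(a):
    return np.sqrt(np.mean((a - truth0) ** 2))

print("iterations:", result.nit)
print("initial-condition RMSE, background vs analysis:", rmse(xb), rmse(result.x))
```

Restricting the observation loop to the final time only would give the end-of-window 3D-Var variant compared in this section.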

Adjoint derivation is hand-written here. Production systems use automatic differentiation (TAPENADE, OpenAD) or pre-coded adjoints (WRF-DA, GEOS-Chem Adjoint).
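A hand-written adjoint is usually verified with the dot-product test: for a linear operator M and its adjoint M^T, the identity <Mx, y> = <x, M^T y> should hold to round-off for arbitrary x and y. A minimal sketch for a one-step upwind + decay operator, with hypothetical parameter values and function names:

```python
import numpy as np

CR, DECAY = 0.5, 0.99  # hypothetical Courant number and per-step decay factor

def step(c):
    """Forward operator M: one upwind advection + decay step (periodic)."""
    return DECAY * ((1.0 - CR) * c + CR * np.roll(c, 1))

def step_adj(lam):
    """Hand-written adjoint M^T: same weights, shift direction reversed."""
    return DECAY * ((1.0 - CR) * lam + CR * np.roll(lam, -1))

rng = np.random.default_rng(42)
x = rng.normal(size=64)   # arbitrary state perturbation
y = rng.normal(size=64)   # arbitrary adjoint variable

lhs = np.dot(step(x), y)        # <M x, y>
rhs = np.dot(x, step_adj(y))    # <x, M^T y>
mismatch = abs(lhs - rhs)
print("dot-product test mismatch:", mismatch)
```

A mismatch well above machine precision would point to a sign or indexing error in the adjoint, which is the usual failure mode when the adjoint is coded by hand.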

Decision implication

Assimilate the hourly stream, not the snapshot

Recommendation: CEC operational forecasting should assimilate the full hourly AQS monitor stream, not just the latest snapshot. The 4D-Var framework's cost (adjoint model maintenance + L-BFGS) is justified by the RMSE reduction relative to 3D-Var in this experiment: roughly 2.4× on the analysis (4.92 → 2.07 µg/m³) and 3.8× on the 6-hour forecast (0.49 → 0.13 µg/m³). This matters for exceedance nowcasting and for validating the Inv 18 MFMC uncertainty bounds.

Caveats

What this demo does not show

Synthetic twin experiment on a simplified 1D model; production 4D-Var requires 3D mesoscale adjoint (e.g., WRF-Chem 4D-Var or GEOS-Chem Adjoint).
Observation error assumed Gaussian and uncorrelated; real AQS data has correlated instrument/siting error.
Strong-constraint 4D-Var assumes a perfect model; weak-constraint 4D-Var, which adds a model-error term to the cost function and makes longer assimilation windows feasible, is the next step for WRF-Chem.
B matrix assumed diagonal; a flow-dependent ensemble B would further reduce analysis RMSE.

