
Phase 2 + 3 Methods Appendix

One row per investigation, side by side, so a reviewer can see what each one does without opening it. Phase 2 (Inv 17–28) lists the L1–L5 fidelity ladder, the decision it resolves, and the validation datum. Phase 3 (Inv 29–36) names the frontier method — NSGA-II, polynomial chaos, unified BED, BOCA-inspired multi-fidelity BO, strong-constraint 4D-Var, physics-informed neural networks, linear-operator GPs, Strong-Oakley-Brennan 2014 nonparametric EVPPI. All implemented from scratch in pure NumPy (EVPPI uses sklearn SplineTransformer + Ridge).

Phase 2 · Multi-Fidelity Ladders

Twelve investigations, each built as a fidelity ladder: start cheap, climb only where the decision demands it. Each one ends with a formal multi-fidelity fusion step — Kennedy–O’Hagan co-kriging, MFMC, POMDP, CVaR/DRO, BO.

Investigation 17 Sprint 1

Wildfire emissions-to-exposure

L1: linear
L2: empirical smoke-day × EF × Gaussian plume
L3: physics surrogate (Rothermel + HYSPLIT-style)
L4: WRF-Chem stub, fused via Kennedy–O’Hagan AR1 cokriging

The wildfire-dominance finding stands up to a proper fidelity ladder: L2/L3/L4 all agree within 20% on episode-mean PM2.5, while the previously used L1 linear model is systematically biased. The fused cokriging posterior R² vs AQS is 0.66, vs −1.66 for the legacy ISRM-based baseline.
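
The AR(1) structure behind the fusion step can be sketched in a few lines. This is a minimal illustration with synthetic two-fidelity outputs and a plain least-squares scale factor; the study's implementation places GP priors on both the low-fidelity term and the discrepancy.

```python
import numpy as np

# Kennedy-O'Hagan AR(1) structure: f_hi(x) = rho * f_lo(x) + delta(x).
# Synthetic data (illustrative, not the wildfire QoI): rho recovered by
# least squares at co-located design points, delta left as the residual
# discrepancy that the full implementation models with a GP.
x = np.linspace(0.0, 1.0, 12)
f_lo = np.sin(2 * np.pi * x)             # cheap-model output
f_hi = 0.8 * f_lo + 0.1 * x              # expensive model = rho*f_lo + trend

rho = float(f_lo @ f_hi / (f_lo @ f_lo))   # scale factor linking fidelities
delta = f_hi - rho * f_lo                  # additive discrepancy
```

With a GP on `delta`, predictions at new points combine the cheap model everywhere with the expensive model's corrections where high-fidelity runs exist.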

Investigation 18 Sprint 2

Atmospheric chemistry MFMC

L1: Seinfeld–Pandis photostationary 0D box
L2: ISRM linear
L3: InMAP surrogate (quadratic)
L4: CMAQ isopleth stub (regime-aware)
L5: WRF-Chem met-coupled stub
Fused via Kennedy–O’Hagan AR1 cokriging per basin.

LA Basin (VOC-limited, VOC/NOx = 3.2) shows a sign flip between the Phase 1 ISRM (L2: −1.84 ppb) and the Phase 2 regime-aware CMAQ (L4: +2.10 ppb). The San Joaquin Valley (NOx-limited) gets a 25% larger reduction than the linear model predicts. The portfolio's pop-weighted ozone co-benefit flips from −1.92 ppb to +0.82 ppb (a 57% Phase 1 overstatement in magnitude), and the monetized ozone co-benefit swings from $7,273.8M to −$3,132.8M.

Investigation 19 Sprint 1

Indoor air coupling

L1: outdoor-only
L2: single-zone mass balance
L3: multi-zone CONTAM-lite
L4: personal exposure via CHAD time-activity

Building electrification's health benefit is dominated by the previously unmodelled indoor pathway. Phase 1 scored B2 at 47 deaths avoided (Di); with L4 indoor coupling it reaches 341 deaths, i.e. about $5.9M per death avoided at the $2B price. The B2 'not worth $2B' verdict flips: B2 is now cost-competitive with Transport T2 per dollar.
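
The L2 rung is the standard well-mixed single-zone mass balance. A minimal sketch with illustrative parameters (penetration, air-exchange, deposition, and source terms here are not the study's calibrated values):

```python
import numpy as np

# Single-zone indoor mass balance (the L2 rung):
#   dC/dt = P*lam*C_out + E/V - (lam + k)*C
# P: penetration efficiency, lam: air-exchange rate (1/h),
# k: deposition rate (1/h), E: indoor source (ug/h), V: volume (m3).
# All parameter values are illustrative.
P, lam, k, E, V = 0.8, 0.5, 0.2, 100.0, 300.0
C_out = 12.0                              # outdoor PM2.5 (ug/m3)

def steady_state():
    # closed-form equilibrium of the ODE above
    return (P * lam * C_out + E / V) / (lam + k)

# forward-Euler integration converges to the same steady state
C, dt = 0.0, 0.01
for _ in range(5000):                     # ~50 h of simulated time
    C += dt * (P * lam * C_out + E / V - (lam + k) * C)
```

The gap between `C_out` and the indoor equilibrium is exactly the exposure pathway the L1 outdoor-only rung cannot see.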

Investigation 20 Sprint 2

Grid dispatch + EV charging

L1: eGRID annual average (237 gCO2/kWh)
L2: CARB hourly marginal EF weighted by charging profile
L3: reduced-form stochastic economic dispatch (N=200 MC)
L4: PLEXOS security-constrained UC/ED stub with peak congestion premium
L5: Bayesian-optimized charging schedule
Kennedy–O’Hagan AR1 cokriging posterior.

Phase 1 grid-average EF (237 gCO2/kWh) is a 40-50% overestimate for managed/midday charging and a ~25% *underestimate* for unmanaged fleets. L4 PLEXOS ladder shows the midday_managed schedule reaches 166 gCO2/kWh vs 367 for unmanaged (55% gap). Shifting CA's 1.8M-vehicle fleet to the optimal schedule avoids 1.32 Mt CO2/yr relative to unmanaged.
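
The L2 weighting step is just an inner product of the charging profile with the hourly marginal EF curve. A sketch with stylized numbers (the study uses CARB hourly marginal EFs; the Gaussian-shaped EF curve and profiles below are illustrative):

```python
import numpy as np

hours = np.arange(24)
# Stylized hourly marginal EF (gCO2/kWh): a midday solar dip plus an
# evening gas-peak bump. Illustrative, not CARB data.
ef = (350.0
      - 150.0 * np.exp(-((hours - 13) ** 2) / 8.0)    # solar dip
      + 80.0 * np.exp(-((hours - 19) ** 2) / 2.0))    # evening peak

def weighted_ef(profile):
    # normalize the charging profile to hourly energy shares, then weight
    profile = profile / profile.sum()
    return float(profile @ ef)

midday = np.exp(-((hours - 13) ** 2) / 4.0)    # managed midday charging
evening = np.exp(-((hours - 19) ** 2) / 4.0)   # unmanaged after-work charging
```

With these stylized curves the managed schedule lands well below a flat grid-average EF and the unmanaged schedule well above it, which is the qualitative pattern the ladder confirms.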

Investigation 21 Sprint 1

Hierarchical Bayesian CRF

L1: discrete Di/Krewski
L2: Bayesian model averaging
L3: spatial-hierarchical Gibbs (normal-normal conjugate)
L4: VanderWeele E-value sensitivity

The hierarchical Bayesian CRF shrinks 58 county-level estimates toward a posterior mean of HR 1.0671 per 10 µg/m³ (95% CI 1.0504–1.0840), which brackets both the Di and Krewski estimates. T2 deaths avoided: 373–612. Residual EVSI of a definitive CRF study against this posterior: $0.05B.
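
The normal-normal conjugate step that drives the shrinkage can be written directly. This sketch uses synthetic county data and fixed hyperparameters (the Gibbs sampler also updates mu and tau2; values here are illustrative):

```python
import numpy as np

# Normal-normal conjugate shrinkage, the core Gibbs conditional:
# county HR estimate y_i ~ N(theta_i, s2_i), theta_i ~ N(mu, tau2).
# Posterior mean pulls each y_i toward mu with weight B_i = s2_i/(s2_i+tau2).
# Synthetic data; mu and tau are fixed here for illustration.
rng = np.random.default_rng(1)
mu, tau2 = 1.067, 0.02 ** 2
s2 = rng.uniform(0.01, 0.05, size=58) ** 2        # county sampling variances
y = rng.normal(mu, np.sqrt(tau2 + s2))            # observed county HRs

B = s2 / (s2 + tau2)                              # shrinkage weights in (0, 1)
theta_post = B * mu + (1 - B) * y                 # shrunken county estimates
```

Noisier counties (large `s2`) are pulled hardest toward the pooled mean, which is why the county spread tightens without the aggregate moving.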

Investigation 22 Sprint 3

Sequential portfolio POMDP

L1: one-shot static schedule (Phase 1 pattern)
L2: two-stage stochastic program with 5-year observation break
L3: rolling-horizon annual re-optimization
L4: POMDP belief-state value iteration over {Di, Krewski, Inv 21 posterior} CRF regimes
L5: multi-fidelity Bayesian optimization over policy parameters
200 Monte Carlo trajectories per policy; true CRF drawn from prior mixture.

Sequential adaptive policy (L5_bo) outperforms Phase 1's one-shot approach by 19.2% on expected deaths avoided (777 vs 651). The value comes from learning the true CRF regime over the first 3–5 years and reallocating remaining budget accordingly. 10-year $4B portfolio delivers 777 discounted deaths-avoided under the best sequential policy vs 651 under commit-and-forget.
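
The belief-state update at the heart of the POMDP rung is a discrete Bayes step over the CRF regimes. A one-observation sketch with illustrative HR values and observation noise:

```python
import numpy as np

# Bayes update of the belief over discrete CRF regimes {Di, Krewski, Inv21}
# after one noisy annual observation of the effective HR.
# Regime HRs and the observation noise are illustrative.
betas = np.array([1.084, 1.056, 1.067])   # per-10-ug/m3 HR under each regime
belief = np.array([1 / 3, 1 / 3, 1 / 3])  # uniform prior over regimes
sigma_obs = 0.02                          # observation noise (illustrative)
y_obs = 1.080                             # one year's noisy HR estimate

lik = np.exp(-0.5 * ((y_obs - betas) / sigma_obs) ** 2)   # Gaussian likelihoods
belief = belief * lik
belief /= belief.sum()                    # renormalized posterior belief
```

Repeating this each year is what lets the sequential policy identify the regime within 3–5 years and reallocate the remaining budget.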

Investigation 23 Sprint 4

Robust portfolio (CVaR/DRO/info-gap)

L1: expected-value
L2: chance-constrained (P(NB >= 0) >= 0.90)
L3: CVaR_0.05 tail (Rockafellar–Uryasev 2000)
L4: parametric adversarial shift / DRO-lite (one fixed scenario: -20% mean + 1.3× sigma inflation; NOT canonical moment- or KL-ambiguity DRO)
L5: info-gap horizon (alpha at which mean NB drops below the $1B floor)
Each level ranks six candidate portfolios from Phase 1 Inv 12 plus the sequential BO-optimal policy from Inv 22.

Across five robustness criteria (expected value, chance-constrained, CVaR_0.05, parametric adversarial shift [DRO-lite], info-gap), the consensus best portfolio is F_maximum (3/5 votes). Phase 1's expected-value pick was F_maximum. The two agree — robust analysis confirms Phase 1 under the Phase 2 uncertainty envelope.
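
The L3 criterion is the empirical CVaR: the mean net benefit over the worst 5% of Monte Carlo draws. A sketch on an illustrative NB sample (the study scores each candidate portfolio's own NB distribution this way):

```python
import numpy as np

# Empirical CVaR_0.05 in the Rockafellar-Uryasev sense: average net benefit
# conditional on landing in the worst 5% of draws. NB sample illustrative ($B).
rng = np.random.default_rng(2)
nb = rng.normal(loc=3.0, scale=2.0, size=10_000)

def cvar(samples, alpha=0.05):
    var = np.quantile(samples, alpha)     # VaR: the alpha-quantile
    return float(samples[samples <= var].mean())

tail = cvar(nb)
```

Ranking portfolios by `cvar` rather than `nb.mean()` is what separates the tail-robust criterion from the plain expected-value pick.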

Investigation 24 Sprint 3

CRF research roadmap

L1: point-estimate residual EVSI
L2: pre-posterior EVSI per study design
L3: staged sequential pathway
L4: multi-arm information bandit
L5: POMDP adaptive research portfolio
Anchored on the Inv 21 hierarchical posterior sigma. Five candidate designs: meta-analysis, retrospective cohort, Di-Medicare extension, CA prospective, multi-cohort consortium.

Against the Inv 21 hierarchical posterior (residual EVSI $0.05B), the best single-design ROI is meta_analysis (40.7×) at $0.5M. The L5 POMDP adaptive portfolio averages 3.0 arms, $7.5M spend, and realizes $0.084B EVSI — net value $+0.077B. Staging matters: run the $0.5M meta-analysis first; escalate to retrospective only if the posterior remains ambiguous.

Investigation 25 Sprint 3

Geographic decomposition

L1: statewide aggregate (Phase 1)
L2: air-basin (8 basins)
L3: county (58 counties, Gini-weighted)
L4: census-tract with CalEnviroScreen 4.0 burden weighting
L5: block-group equity-weighted redistribution under simulated annealing + Gini penalty
All levels preserve the Phase 1 aggregate total (1,015 deaths) and compare DAC coverage at each resolution.

Phase 1 reported the free-lunch portfolio as 1,015 deaths at $0 with a 21% DAC share. Decomposing spatially, the same portfolio can shift from 21% DAC share (statewide) to 45% DAC share under CES-burden-weighted targeting (L4) or equity-optimized reallocation (L5) — a +24.6 percentage-point gain worth about +250 extra DAC deaths avoided. The free-lunch total stays at ~1,015; the distributional pattern is what improves.
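
The Gini penalty used in the L3 weighting and the L5 annealing objective is the standard inequality index over per-unit deaths-avoided shares. A sketch with an illustrative county allocation:

```python
import numpy as np

# Gini coefficient of a deaths-avoided allocation across counties, the
# penalty term in the L3/L5 equity weighting. Allocations illustrative.
def gini(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # mean-absolute-difference formulation via the sorted-rank identity
    return float((2 * np.arange(1, n + 1) - n - 1) @ x / (n * x.sum()))

equal = np.ones(58)                       # perfectly even allocation
skewed = np.r_[np.ones(50), 20 * np.ones(8)]   # benefits piled into 8 counties
```

A perfectly even allocation scores 0; piling the benefit into a few counties pushes the index toward 1, which the annealer penalizes while holding the 1,015-death total fixed.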

Investigation 26 Sprint 4

Climate-fire coupling

L1: stationary extrapolation
L2: NFDRS statistical fire model with VPD trend extrapolation
L3: 6-GCM CMIP6 ensemble (SSP2-4.5 + SSP5-8.5) driving Abatzoglou–Williams burned-area response
L4: WRF-Chem offset (+8% SOA uplift from Inv 17 L3–L4 bracketing)
L5: Earth-system reference (±7% CI widening from vegetation-climate and aerosol-cloud feedbacks)

Under the 6-GCM CMIP6 corridor, 2050 wildfire mortality in CA rises to 25,031 deaths/yr [p10 = 20,141, p90 = 31,969], 1.73× the Phase 1 stationary baseline. The climate uncertainty envelope (11,828 deaths/yr) is 8.2× the entire Phase 1 policy signal (1,447 deaths avoided). Climate dominates fuel-management noise, so any portfolio choice must be robust across the fan, not optimized to a single climate trajectory.

Investigation 27 Sprint 4

Adaptive monitor placement

L1: static top-N EVSI ranking (Phase 1 Inv 13)
L2: greedy sequential with haversine redundancy penalty (Nemhauser–Wolsey–Fisher 1 − 1/e bound)
L3: Gaussian-process Bayesian optimization with UCB acquisition
L4: POMDP-coupled placement with Inv 26 climate-signal bonus
L5: joint multi-network (PM2.5 + O3 + speciation at the L4 sequence)

Moving from static ranking (L1) to POMDP-coupled placement (L4) lifts total EVSI from $85M to $119M at the same $12.5M cost (ROI 6.8× to 9.5×). L3 ties L2 on total EVSI ($111M both) because the Gaussian-process domain-variance reduction it buys matches greedy's haversine penalty — so L3's value is not magnitude but DAC-equity reweighting (DAC share 40% vs L2 20%, same $111M pot). L4 reallocates away from DAC (0%) toward climate-signal corridors (coverage 0.84 vs 0.62); its $8M EVSI uplift over L3 is the climate-signal integral (L4_CLIMATE_EVSI_FRAC=0.25 of the pot, weighted by proxy). L5's $146M comes from explicit multi-network decomposition: +15% O3 EVSI + 8% NMVOC-speciation EVSI = +23% on the L4 sequence, at 1.6× co-location cost. No single level dominates on ROI + equity + climate together — CARB must pick which objective leads.
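
The L2 rung's greedy-with-redundancy-penalty logic is worth seeing in miniature. This sketch uses random site EVSIs and Euclidean distances with an exponential overlap penalty; the study uses haversine distances and its own EVSI estimates:

```python
import numpy as np

# Greedy sequential monitor placement: each step picks the site with the
# best marginal EVSI after discounting value already covered by nearby
# picks. Site EVSIs, coordinates, and the decay length are illustrative.
rng = np.random.default_rng(3)
n = 15
evsi = rng.uniform(2.0, 12.0, n)              # $M standalone EVSI per site
pts = rng.uniform(0.0, 500.0, (n, 2))         # site coordinates (km)
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

def greedy(budget_k=5, decay=100.0):
    chosen = []
    for _ in range(budget_k):
        best, best_gain = None, -np.inf
        for j in range(n):
            if j in chosen:
                continue
            # redundancy: fraction of site j's value shared with prior picks
            overlap = max((np.exp(-dist[j, c] / decay) for c in chosen),
                          default=0.0)
            gain = evsi[j] * (1.0 - overlap)
            if gain > best_gain:
                best, best_gain = j, gain
        chosen.append(best)
    return chosen

sites = greedy()
```

Because the discounted objective is submodular-like, the greedy sequence carries the Nemhauser–Wolsey–Fisher 1 − 1/e guarantee the ladder cites.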

Investigation 28 Sprint 2

Data assimilation

L1: IDW monitor interpolation
L2: model only
L3: scalar optimal interpolation
L4: 3D-Var with static Gaussian B and observation R (regulatory only vs regulatory + PurpleAir)
L5: Ensemble Kalman Filter with Schur-product localization, 40-member ensemble
Kennedy–O’Hagan AR1 cokriging posterior anchored by L5.

Climbing the DA ladder from Phase 1 model-only (RMSE 3.41 µg/m³) to Phase 2 EnKF with PurpleAir (RMSE 2.71 µg/m³) cuts exposure-estimation error by 21%, worth $487M in reduced mortality mis-attribution. PurpleAir sensors add $332M incremental value on top of the 40-station regulatory network.

Phase 3 · Advanced Methods Frontier

Eight investigations pushing into the research frontier. Each is a canonical algorithm implemented from scratch in pure NumPy: evolutionary multi-objective optimization, polynomial chaos, unified Bayesian experimental design, BOCA-inspired multi-fidelity BO (cost-weighted successive halving), strong-constraint 4D-Var with hand-derived adjoint, physics-informed neural networks with analytic derivatives, linear-operator GPs with the PDE baked into the kernel, Strong-Oakley-Brennan 2014 nonparametric EVPPI that decomposes group-level VOI into single-parameter VOI from the existing PSA sample.

Investigation 29 Sprint 5

Multi-objective Pareto frontier (NSGA-II)

NSGA-II (Deb et al. 2002) on 5-dim portfolio design space (wildfire reduction, transport spend, building spend, indoor AQ spend, DTE). Three objectives: maximize deaths avoided, maximize DAC-weighted deaths avoided, minimize cost. 100 pop x 80 generations with SBX crossover (eta=15) and polynomial mutation (eta=20, p=0.2). Objectives are evaluated through a deterministic LINEAR SURROGATE (rfaq/optimization/pareto_frontier.py; hardcoded deaths-per-$B coefficients per sector) calibrated to the Inv 23 MEAN deaths/cost of portfolios A, B, and C (exact match) and E (within ~45 deaths). This is NOT Inv 23's Monte Carlo uncertainty envelope and does NOT re-draw the CRF posterior or apply Inv 23's VSL scalarization. Dominance claims therefore apply under Inv 29's own 3-objective deterministic formulation, not under Inv 23's expected-value / CVaR / info-gap robust criteria.

Under Inv 29's 3-objective deterministic formulation (deaths, DAC-deaths, cost; no VSL, no MC), NSGA-II finds 100 Pareto-optimal portfolios. 4 of 6 candidate seeds (4 from Inv 23 plus 2 constructed for this study) are strictly dominated by a Pareto point. The 'indoor_focus' seed (Inv 19-weighted, $2B indoor AQ) reaches a DAC share of 0.23, higher than all 5 other seeds, demonstrating that Inv 19's 3× indoor benefit transfers to the equity objective. Caveat: dominance holds under Inv 29's deterministic linear surrogate; re-validating under Inv 23's MC net-benefit distribution is a natural next step.
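
The dominance test underlying both the non-dominated sort and the "4 of 6 seeds are strictly dominated" claim is a few lines. A sketch under Inv 29's sign conventions (maximize deaths and DAC-deaths, minimize cost), with made-up seed portfolios:

```python
# Pareto dominance under Inv 29's 3-objective formulation:
# maximize deaths avoided, maximize DAC-weighted deaths, minimize cost ($B).
# Seed portfolios below are illustrative, not the study's candidates.
def dominates(a, b):
    """a dominates b if a is no worse on every objective and strictly
    better on at least one. Tuples are (deaths, dac_deaths, cost)."""
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    strictly = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly

def pareto_front(points):
    # keep the points no other point dominates (NSGA-II's first front)
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

seeds = [(900, 190, 3.0), (850, 200, 3.0), (800, 150, 3.5), (900, 190, 2.5)]
```

NSGA-II repeats this sort on each population, then uses crowding distance within fronts to keep the surviving set spread along the frontier.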

Investigation 30 Sprint 5

Polynomial chaos expansion (Inv 17 QoI)

Order-3 Legendre PCE on standardized [-1,1]^6 input space, 120 uniform collocation samples, least-squares fit. Sobol indices computed algebraically from PCE coefficients. Compared against MC Saltelli/Jansen (4,096 evals) from Inv 17 Sobol study.

Order-3 PCE with only 120 model evaluations recovers the MC Sobol ranking to within |ST_pce - ST_mc| <= 0.026. Top driver under both methods is cross_section_km. This confirms the Inv 17 MC Sobol is not sampling-error-limited. PCE gives the same answer for 34.1× less model work and produces a surrogate usable for derivative-free optimization.
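
The coefficient-to-Sobol step is purely algebraic because Legendre terms are orthogonal with known norms. A 2D order-2 miniature (the test function is illustrative; the study fits the Inv 17 QoI at order 3 in 6 dimensions):

```python
import numpy as np
from numpy.polynomial import legendre

# Least-squares Legendre PCE on [-1,1]^2, Sobol indices read off the
# coefficients. f below is an illustrative polynomial, fit exactly.
rng = np.random.default_rng(5)
X = rng.uniform(-1.0, 1.0, (200, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2

def P(n, x):                                  # Legendre polynomial P_n(x)
    return legendre.legval(x, [0.0] * n + [1.0])

idx = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]   # total order <= 2
Phi = np.column_stack([P(i, X[:, 0]) * P(j, X[:, 1]) for i, j in idx])
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# each term contributes c^2 * prod_k 1/(2 n_k + 1) to the output variance
norms = np.array([np.prod([1.0 / (2 * n + 1) for n in ij]) for ij in idx])
var_terms = c ** 2 * norms
var_terms[0] = 0.0                            # the mean term carries no variance
total = var_terms.sum()
S1 = var_terms[[1, 3]].sum() / total          # Sobol index of x1: terms (1,0),(2,0)
S2 = var_terms[[2, 5]].sum() / total          # Sobol index of x2: terms (0,1),(0,2)
```

Summing `var_terms` over the multi-indices that touch a given input generalizes this to total-effect indices, which is the comparison run against the MC Saltelli/Jansen estimates.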

Investigation 31 Sprint 5

Closed-loop Bayesian experimental design

Closed-loop greedy BED over 10-year horizon and $50M budget. Unified belief state (sigma_CRF, sigma_monitor_by_region). Each year, pick action maximizing expected EVSI-proxy / cost. Actions: fund 1 of 4 CRF studies (Inv 24 designs) or deploy 1 of 15 monitors (Inv 27 sites). EVSI-proxy = delta(sigma^2) x value_at_stake x p_success. This is a Gaussian variance-reduction proxy, not canonical EVSI (no outer-y MC, no explicit utility u(a,theta)); it collapses to true EVSI only under linear-Gaussian utility.

Unified BED sequences 2 CRF studies + 8 monitors over 10 years ($6.5M/$50M spent), yielding $107.3M EVSI-proxy against the Inv 21 hierarchical posterior ($0.05B value-at-stake). With a 2× tighter CRF prior (σ₀=0.5), the same sequence yields $84.0M. The adaptive rule alternates CRF studies and monitor deployments rather than exhausting one track first.
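
The per-year greedy rule is a ratio test over the action set. A sketch with invented costs, posterior sigmas, and success probabilities, using the same delta(sigma^2) × value × p_success proxy the design defines:

```python
# One greedy BED step: rank candidate actions by EVSI-proxy per dollar.
# The proxy is the Gaussian variance-reduction form from the design above,
# not canonical EVSI. All numbers are illustrative.
sigma0, value_at_stake = 1.0, 50.0            # prior sd; $M at stake

def evsi_proxy(sigma_post, p_success):
    # delta(sigma^2) x value x p_success
    return (sigma0 ** 2 - sigma_post ** 2) / sigma0 ** 2 * value_at_stake * p_success

actions = {
    "meta_analysis": dict(cost=0.5, sigma_post=0.85, p=0.90),   # $M costs
    "monitor_site7": dict(cost=0.8, sigma_post=0.95, p=0.95),
    "cohort_study":  dict(cost=8.0, sigma_post=0.60, p=0.70),
}
best = max(actions,
           key=lambda a: evsi_proxy(actions[a]["sigma_post"], actions[a]["p"])
           / actions[a]["cost"])
```

After the chosen action resolves, the belief-state sigmas are updated and the ranking is recomputed, which is how the CRF-study and monitor tracks end up interleaved.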

Investigation 32 Sprint 6

BOCA-inspired fidelity-aware monitor placement

BOCA-inspired cost-weighted successive halving with UCB tie-breaking, over 15 candidate monitor sites (Inv 27). Inspired by Kandasamy et al. 2017 but NOT canonical BOCA: the acquisition is info_gain + UCB-weight, divided by sqrt(cost); bias is handled by hard fidelity-gating (a site must accumulate lower-fidelity evidence before a higher fidelity unlocks), not by BOCA's bias-budget term. Four fidelity levels: gap-score (cost 0.01), haversine+gap (0.05), climate-signal UCB (0.20), full simulation (1.00). Compared to single-fidelity UCB baseline that always evaluates at full simulation.

The BOCA-inspired cost-weighted successive-halving rule spent 1.50 cost units across 36 multi-fidelity evaluations (30 gap-only + 0 haversine + 6 climate-UCB + 0 full-sim) to recover 4/5 of the oracle-optimal sites. Single-fidelity UCB spent 5.00 cost units (3.3× more) for 5/5 oracle recovery. The cheap gap-score prunes obvious non-contenders so full-sim budget is spent only on the contested top-k.

Investigation 33 Sprint 6

Strong-constraint 4D-Var assimilation

Twin experiment: a 1D upwind-advection + chemical-decay model over a 12-hour window. Six ground-based monitors observe PM2.5 every 2 hours. Compared: (a) background (no assimilation), (b) 3D-Var using data only at t=12h, (c) 4D-Var using all 6 time slices. Gradient of 4D-Var cost computed via the tangent-linear adjoint (hand-derived). L-BFGS minimizer with 40 iteration budget.

4D-Var reduced initial-condition RMSE from 6.05 µg/m³ (background) to 2.07 µg/m³, a 66% improvement. 3D-Var using only end-of-window data achieved 4.92 µg/m³. The 18-hour forecast also improved: 0.13 (4D-Var) vs 0.49 (3D-Var) vs 0.60 (background) µg/m³. Driven by the adjoint gradient, L-BFGS reduced the cost J by 88.8% within 12 iterations.
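
Before trusting a hand-derived adjoint, two standard checks apply: the dot-product test and a finite-difference check on the gradient. A sketch on a single upwind-advection-plus-decay step with a periodic grid and illustrative parameters:

```python
import numpy as np

# (i) dot-product test <M x, y> = <x, M^T y> for one model step;
# (ii) adjoint gradient of J(x) = 0.5*||M x - obs||^2 vs finite differences.
# Grid size, Courant number, and decay are illustrative.
nx, c, k_dt = 40, 0.4, 0.05                   # cells, u*dt/dx, decay*dt
M = np.exp(-k_dt) * ((1 - c) * np.eye(nx)
                     + c * np.roll(np.eye(nx), 1, axis=0))   # periodic upwind

rng = np.random.default_rng(6)
x0, yv = rng.normal(size=nx), rng.normal(size=nx)
lhs, rhs = (M @ x0) @ yv, x0 @ (M.T @ yv)     # dot-product test

obs = rng.normal(size=nx)
grad_adj = M.T @ (M @ x0 - obs)               # adjoint gradient of J

def J(x):
    r = M @ x - obs
    return 0.5 * float(r @ r)

eps = 1e-6
e0 = np.zeros(nx); e0[0] = 1.0
fd = (J(x0 + eps * e0) - J(x0 - eps * e0)) / (2 * eps)   # central difference
```

For a multi-step window the adjoint applies the transposed steps in reverse order, which is exactly what the hand-derived 4D-Var gradient does.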

Investigation 34 Sprint 6

Physics-informed neural network surrogate

1D advection-diffusion-reaction PDE for PM2.5 transport:
  dc/dt + u dc/dx - D d2c/dx2 + k c = S(x, t)
Small MLP surrogate (1 hidden layer, 24 tanh units). Loss combines sparse data MSE (n_data_stations * n_data_times observations), PDE residual at 200 collocation points (lam_pde=2.0), and IC anchor (lam_ic=2.0). Adam optimizer, 250 iterations, learning rate 0.03. Spatial derivatives computed analytically through the tanh activation. Baseline: same architecture without the PDE residual (lam_pde=0).

PINN surrogate with PDE residual regularization achieved 5.177 µg/m³ full-domain RMSE vs 7.737 for data-only training on 9 observations. Physics regularization closes 33% of the gap on held-out times, demonstrating that knowing the governing PDE is worth roughly 3-4× more training data in this sparse regime.
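
The "analytic derivatives through the tanh activation" step can be isolated from the training loop. This sketch evaluates the PDE residual of a 1-hidden-layer tanh MLP at one collocation point, with random (untrained, illustrative) weights, and is checkable against finite differences:

```python
import numpy as np

# Analytic derivatives of c(x, t) = v . tanh(W [x, t] + b), the quantities
# needed for the PDE residual. Weights are random/illustrative, not trained.
rng = np.random.default_rng(8)
W = rng.normal(size=(24, 2)); b = rng.normal(size=24); v = rng.normal(size=24)

def net(x, t):
    return float(v @ np.tanh(W[:, 0] * x + W[:, 1] * t + b))

def derivs(x, t):
    th = np.tanh(W[:, 0] * x + W[:, 1] * t + b)
    s = 1.0 - th ** 2                          # tanh'(z)
    c_t = float(v @ (s * W[:, 1]))             # dc/dt
    c_x = float(v @ (s * W[:, 0]))             # dc/dx
    c_xx = float(v @ (-2.0 * th * s * W[:, 0] ** 2))   # tanh''(z) = -2 tanh tanh'
    return c_t, c_x, c_xx

u, D, k = 1.0, 0.1, 0.05                       # PDE coefficients (illustrative)
c_t, c_x, c_xx = derivs(0.3, 0.5)
residual = c_t + u * c_x - D * c_xx + k * net(0.3, 0.5)   # vs S(x, t)
```

During training, `residual` minus the source term `S` is squared and averaged over the 200 collocation points to form the `lam_pde` loss term.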

Investigation 35 Sprint 6

Physics-informed GP (linear-operator kernel)

Same 1D advection-diffusion-reaction PDE as Inv 34:
  dc/dt + u dc/dx - D d2c/dx2 + k c = S(x)
Latent-force / linear-PDE-operator GP (Sarkka 2011; Alvarez-Luengo-Lawrence 2013; Raissi, Perdikaris & Karniadakis 2017 'Numerical GPs'). A squared-exponential GP prior is placed on c(x, t); because L is linear, Lc ~ GP(0, L L' k_cc). Training data = sparse station observations of c + 'observations' of Lc=S at collocation points (the source term acts as known forcing). Hyperparameters (sigma^2, lx, lt) learned by marginal-likelihood maximization. All kernel derivatives are analytic. Baseline: plain SE-kernel GP trained only on the station observations + IC anchors (no physics, no collocation term).

Physics-informed GP achieves 0.123 µg/m³ held-out RMSE vs 4.262 µg/m³ for a plain SE-kernel GP on the same 9 noisy station observations. The gain (97%) comes from encoding the PDE L = d/dt + u d/dx - D d2/dx2 + k c directly into the GP kernel via linear-operator transport. The posterior is physics-consistent by construction — every sample from the posterior satisfies Lc = S to within noise.

Investigation 36 Sprint 6

EVPPI via Strong-Oakley-Brennan 2014 GAM regression

Strong, Oakley & Brennan (2014, Medical Decision Making) single-sample estimator: EVPPI(phi) = E_phi[max_d E[NB_d|phi]] - max_d E[NB_d]. The conditional expectation E[NB_d|phi=phi_i] is estimated by regressing NB_d on phi across the MC draws using an additive spline smoother (sklearn SplineTransformer + Ridge); predicted values plug directly into the EVPPI formula. Requires only the existing PSA sample — no nested MC, no new simulator runs. We re-use Inv 02's 10,000 shared draws across T1-T5.

At year 2035, the single most decision-relevant unknown is the ozone concentration-response function beta_o3 ($0.116B from a single scalar), followed by VSL ($0.092B). Together these two scalars account for 91% of the $0.229B EVPI — the remaining 19 parameters (14 emissions + beta_pm25 + beta_no2 + income-elast + GP-noise + Di/Krewski) contribute trivially. This reframes research priorities: a California-specific ozone-mortality cohort study (like MOSES+ at higher N) and a refreshed VSL literature review would move the decision more than any amount of emissions-inventory work.