Studies · CA Air Quality · Investigation 36 · Phase 3

Where does the value of information live?

Inv 02 told CEC that the transport-electrification decision has $0.229B of EVPI at 2035 — that's how much a decision-maker should be willing to pay for perfect information. But EVPI is a single number. The operationally useful question is: perfect information about what? This investigation decomposes that $0.229B across individual pollutants, monetization parameters, and 14 emission sectors, using the nonparametric EVPPI estimator of Strong, Oakley & Brennan (2014).

  • EVPI (year 2035): $0.229B
  • Top driver (βO3): $0.116B
  • 2nd driver (VSL): $0.092B
  • Share of EVPI in 2 scalars: 91%
The Question

EVPI is one number. EVPPI is a priority list.

CEC's decision is whether to accelerate transport electrification (T2 at $2B), keep the status quo (T1), delay (T3), focus on equity (T4), or prioritize heavy-duty vehicles first (T5). Inv 02 ran 10,000 shared-draw Monte Carlo replications and found:

  • E[NB] is maximized by T2_accelerated ($14.14B mean net benefit), but T4_equity is statistically indistinguishable ($14.12B).
  • EVPI = $0.229B — perfect information would change the optimal scenario in ≈20% of realizations.
  • Group-level EVPPI: CRF group $0.116B, monetization $0.093B, emissions $0.002B.
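The EVPI and the share of realizations in which the optimal scenario changes can both be read off a shared-draw sample. The sketch below is illustrative only: the function name `evpi` and the toy two-option sample are assumptions made here, not Inv 02 code or data.

```python
import random

def evpi(nb_rows):
    """EVPI from a shared-draw sample; nb_rows[i][d] is the net benefit of option d in draw i."""
    n = len(nb_rows)
    n_options = len(nb_rows[0])
    # Value with perfect information: pick the best option separately in every draw.
    mean_of_max = sum(max(row) for row in nb_rows) / n
    # Value under current information: commit to the option with the best mean.
    means = [sum(row[d] for row in nb_rows) / n for d in range(n_options)]
    best = max(range(n_options), key=lambda d: means[d])
    # Share of draws in which some other option beats the mean-optimal one.
    switch_share = sum(1 for row in nb_rows if max(row) > row[best]) / n
    return mean_of_max - means[best], best, switch_share

# Hypothetical toy sample (NOT Inv 02 data): two options with nearly equal means.
rng = random.Random(0)
toy = []
for _ in range(10_000):
    shared = rng.gauss(14.0, 2.0)  # uncertainty common to both options
    toy.append([shared + rng.gauss(0.02, 0.3), shared + rng.gauss(0.0, 0.3)])

value, best, switch_share = evpi(toy)
print(f"EVPI={value:.3f}  mean-optimal option={best}  switch share={switch_share:.2f}")
```

Because both options share the same draws, the common uncertainty cancels in the comparison, and only the difference between options contributes to the EVPI.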

What Inv 02 couldn't answer: which CRF parameter? Which monetization input? Which emission sector? Strong, Oakley & Brennan's 2014 nonparametric regression-based EVPPI estimator lets us decompose group-level VOI into single-parameter VOI using only the existing MC sample, with no nested Monte Carlo and no new simulator runs. The post-processing took 0.9 seconds.

The Strong, Oakley & Brennan 2014 single-sample trick

EVPPI without nested Monte Carlo

The definition of Expected Value of Partial Perfect Information for a parameter subset φ is:

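$$\mathrm{EVPPI}(\varphi) = \mathbb{E}_{\varphi}\Big[\max_{d}\, \mathbb{E}\big[\mathrm{NB}(d, \theta) \mid \varphi\big]\Big] - \max_{d}\, \mathbb{E}\big[\mathrm{NB}(d, \theta)\big]$$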

The inner conditional expectation E[NB(d) | φ] is what makes this expensive: the naive estimator requires an outer loop over φ draws and an inner MC over the remaining parameters. Strong, Oakley & Brennan (2014) observed that this inner expectation is just a regression function: regress NB(d, θ) on φ across the existing PSA sample, and the fitted values are consistent estimators of the conditional expectation. Plug those fitted values straight into the formula; no new simulator calls needed.

We use an additive B-spline smoother (sklearn.preprocessing.SplineTransformer with degree 3 and 8 knots per dimension, stacked across the φ-dimensions) fit by ridge regression. For the high-dimensional emission group (14 dimensions) we first PCA-reduce to 5 components, following the original paper's recommendation. Validation: our implementation reproduces Inv 02's EVPI ($0.229B) exactly, and the group-level EVPPI values agree with the legacy pygam implementation to within 1% for the CRF and monetization groups and within 10% for the emissions group.
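A minimal sketch of the single-sample estimator follows. It assumes NumPy only and uses a hand-rolled truncated-power cubic basis with a ridge penalty as a stand-in for the SplineTransformer pipeline; the function names, the penalty value, and the toy problem are all hypothetical and are not the study's code.

```python
import numpy as np

def spline_basis(x, n_knots=8):
    """Cubic truncated-power spline basis for one parameter (stand-in for SplineTransformer)."""
    z = (x - x.min()) / (x.max() - x.min())                   # rescale to [0, 1] for conditioning
    knots = np.quantile(z, np.linspace(0, 1, n_knots)[1:-1])  # interior knots at quantiles
    cols = [z, z**2, z**3] + [np.clip(z - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def evppi_regression(nb, phi, ridge=1e-3):
    """nb: (n_draws, n_options) net benefits; phi: (n_draws,) or (n_draws, n_params)."""
    nb = np.asarray(nb, dtype=float)
    phi = np.asarray(phi, dtype=float).reshape(len(nb), -1)
    # Additive model: stack one spline basis per phi-dimension.
    X = np.hstack([spline_basis(phi[:, j]) for j in range(phi.shape[1])])
    Xc = X - X.mean(axis=0)                                   # centring keeps the intercept unpenalized
    nb_mean = nb.mean(axis=0)
    # Ridge regression solved as an augmented least-squares problem.
    A = np.vstack([Xc, np.sqrt(ridge) * np.eye(X.shape[1])])
    B = np.vstack([nb - nb_mean, np.zeros((X.shape[1], nb.shape[1]))])
    coef, *_ = np.linalg.lstsq(A, B, rcond=None)
    fitted = nb_mean + Xc @ coef                              # estimate of E[NB(d) | phi] per draw
    # Mean of the per-draw best fitted value minus the best unconditional mean.
    return fitted.max(axis=1).mean() - nb_mean.max()

# Toy problem (hypothetical): option 0 pays 0, option 1 pays phi + other, both standard normal.
# The analytic EVPPI for phi is 1/sqrt(2*pi), about 0.399.
rng = np.random.default_rng(0)
n = 10_000
phi = rng.normal(size=n)
other = rng.normal(size=n)
nb = np.column_stack([np.zeros(n), phi + other])
est = evppi_regression(nb, phi)
print(f"estimated EVPPI = {est:.3f}")
```

Because the fitted values average to the unconditional means, the estimate cannot fall below zero except by floating-point error, which matches the non-negativity property of EVPPI.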

Why this matters for the CEC workflow. A nested-MC EVPPI at 10,000 outer draws × 10,000 inner draws would require 10^8 simulator calls, roughly 3 weeks of compute on the Inv 02 pipeline. Strong, Oakley & Brennan's 2014 estimator delivers a close approximation from the existing 10^4-draw PSA sample in <1 second. For any decision-analytic workflow at 10^4+ draws, this is the difference between actionable VOI and an intractable numerical problem.
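The cost comparison is simple arithmetic. The snippet below reproduces the call count and backs out the per-call cost implied by "roughly 3 weeks", assuming serial execution; that assumption is made here and is not stated in the study.

```python
outer_draws, inner_draws = 10**4, 10**4
nested_calls = outer_draws * inner_draws       # 10^8 simulator calls for nested MC
three_weeks_s = 3 * 7 * 24 * 3600              # "roughly 3 weeks" expressed in seconds
per_call_s = three_weeks_s / nested_calls      # implied cost per simulator call (serial assumption)
print(nested_calls, round(per_call_s * 1000, 1))  # 100000000 18.1 (milliseconds per call)
```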

Results

The EVPI is almost entirely in two scalars

[Figure: horizontal bar chart of EVPPI by parameter (axis $0.00B to $0.20B), with the EVPI of $0.229B marked as a dashed line. Bars: βO3 (O3 CRF) $0.116B; VSL $0.092B; off-road emissions (5) $0.0024B; on-road emissions (5) $0.0012B; residential emissions (2) $0.0005B; industrial emissions (2) $0.0004B; βPM2.5 (PM2.5 CRF) $0.0003B; surrogate noise $0.0001B; βNO2 (NO2 CRF) $0.0001B; income elasticity $0.0000B.]

Every parameter at or above $0.05B could change the decision if resolved. The red dashed line marks the EVPI, the ceiling imposed by the joint distribution over all parameters. The top two scalars (βO3 and VSL) account for 91% of the total value of information. The 14 emission-inventory parameters combined explain 0.8%, roughly 60 times less than βO3 alone.

| Parameter | EVPPI ($B) | % of EVPI | Dimensions |
| --- | --- | --- | --- |
| βO3: ozone concentration-response (Turner et al. 2016) | $0.1158 | 51% | 1 |
| VSL: value of statistical life (2020 dollars) | $0.0925 | 40% | 1 |
| off-road emissions (construction, agriculture, rail, marine, aircraft) | $0.0024 | 1.0% | 5 |
| on-road emissions (LDV/LDT/MDV/HDV/bus) | $0.0012 | 0.5% | 5 |
| residential emissions (natgas, wood) | $0.0005 | 0.2% | 2 |
| industrial emissions (point, area) | $0.0004 | 0.2% | 2 |
| βPM2.5: PM2.5 concentration-response (Di et al. 2017) | $0.0003 | 0.1% | 1 |
| βNO2: NO2 concentration-response (Eum et al. 2022) | $0.0001 | 0.0% | 1 |
| income elasticity of WTP (Hammitt & Robinson 2011) | $0.0000 | 0.0% | 1 |
| GP surrogate noise (this study uses no-surrogate mode; quasi-zero) | $0.0001 | 0.0% | 1 |
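The percentage column and the 91% headline can be recomputed from the tabulated values. This check takes the EVPI as $0.2287B, the four-decimal figure reported in the method-validation section; the dictionary keys are labels chosen here for illustration.

```python
evpi_b = 0.2287  # EVPI in $B (four-decimal value from the validation section)
evppi_b = {
    "beta_o3": 0.1158, "VSL": 0.0925, "off-road": 0.0024, "on-road": 0.0012,
    "residential": 0.0005, "industrial": 0.0004, "beta_pm25": 0.0003,
    "beta_no2": 0.0001, "income_elasticity": 0.0000, "gp_noise": 0.0001,
}
# Share of the EVPI attributable to each parameter or sub-group, in percent.
share_pct = {name: 100 * value / evpi_b for name, value in evppi_b.items()}
top_two_pct = share_pct["beta_o3"] + share_pct["VSL"]
print(round(share_pct["beta_o3"]), round(share_pct["VSL"]), round(top_two_pct))  # 51 40 91
```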

Values are non-negative by construction (EVPPI ≥ 0 is a theorem). Rows need not sum to the EVPI because EVPPI is not additive: parameters interact in how they affect the decision, so EVPPI(A ∪ B) can be smaller or larger than EVPPI(A) + EVPPI(B), with equality only in special cases such as additive decomposability. The full joint EVPPI over all 21 parameters converges to the EVPI.

The regression that drives it

What does βO3 actually do to the net benefit?

[Figure: net benefit of T2_accelerated ($B, axis $5.1B to $22.6B) plotted against βO3, the O3 concentration-response log-hazard-ratio per ppb (MC draw, axis −0.49e-3 to 4.80e-3). Series: raw MC draws; fitted E[NB|φ] (spline). Reference line: E[NB] $15.36B.]

Each gray dot is one Monte Carlo draw: the x-axis is the βO3 value drawn for that replication, the y-axis is the resulting net benefit of T2_accelerated (the mean-optimal scenario). The gold curve is the fitted smoother — the Strong, Oakley & Brennan 2014 estimator of E[NB(T2) | βO3]. The blue dashed horizontal line is the unconditional mean E[NB(T2)]. Raw draws outside the central fit range (about 55 of 100) have been clipped for visibility.

The curve tilts downward in βO3: T2 accelerates EV adoption, which reduces NOx, which raises ozone in NOx-saturated basins (classic ozone disbenefit). When βO3 is realized on the high end of its prior, that ozone increase costs more mortality — so T2's net benefit is lower. When βO3 is low, ozone disbenefit is negligible and T2's PM2.5 wins dominate. The spread from $15.96B at the low end of βO3 down to $13.69B at the high end is the mechanism that generates EVPPI: at a sufficiently high βO3 realization, T4 (equity-focused, less aggressive NOx reduction) becomes the optimal policy instead, and the EVPPI captures that re-optimization value.
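The re-optimization argument can be written out for the two leading options. As a simplifying assumption made here for illustration, suppose only T2 and T4 are ever in contention, and write $m_d(\beta) = \mathbb{E}[\mathrm{NB}(d) \mid \beta_{O3} = \beta]$ (notation introduced for this sketch). Since T2 is the mean-optimal option, the EVPPI definition reduces to the expected positive part of the gap between the two fitted curves:

```latex
\begin{aligned}
\mathrm{EVPPI}(\beta_{O3})
  &= \mathbb{E}_{\beta}\big[\max\{m_{T2}(\beta),\, m_{T4}(\beta)\}\big]
     - \mathbb{E}_{\beta}\big[m_{T2}(\beta)\big] \\
  &= \mathbb{E}_{\beta}\big[\big(m_{T4}(\beta) - m_{T2}(\beta)\big)^{+}\big]
\end{aligned}
```

Only the βO3 realizations where the T4 curve lies above the T2 curve contribute, so a steep tilt in T2's curve matters only to the extent that it produces a crossing.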

Why this reframes research priorities

Not all uncertainty is equally worth reducing

There's a tempting narrative in air-quality modeling that says "uncertainty is everywhere; we need better models everywhere." The EVPPI decomposition says the opposite. For this specific decision at this specific year:

  • Refining the emission inventory would not move the decision. Even perfect knowledge of all 14 sector emission factors is worth $2M — 0.8% of the $0.229B EVPI. Meanwhile CEC spends a meaningful fraction of its inventory-QA budget on on-road emission factors, which carry only $1M of decision value.
  • A California-specific ozone cohort study is worth up to $116M. The literature's βO3 uncertainty (CIs from Smith 2009, Turner 2016, MOSES+) is the largest single driver. A $50–$100M prospective cohort could pay for itself (by at most 1.2 to 2.3 times, since $116M is the ceiling set by perfect information) if it tightens the posterior enough to re-sort the policy options.
  • Updating VSL matters almost as much. VSL point estimates vary by 2× across EPA / DOT / HHS; a defensible CA-specific VSL review is worth up to $92M and takes months, not years.
  • PM2.5 CRF is approximately decision-irrelevant here. That feels counter-intuitive, since PM2.5 is the biggest public-health driver overall. But PM2.5 changes similarly across T1–T5, so it largely cancels in the differential between scenarios, and the marginal value of resolving βPM2.5 is small. (This would not hold for a stationary-source control decision, where PM2.5 deltas differ between options. EVPPI is decision-specific.)

Decision-analytic recommendation for CEC. Allocate uncertainty-reduction budget proportional to EVPPI, not to scientific importance-in-general. For the transport-electrification decision specifically, that means: fund an ozone-cohort study first, a VSL literature review second, and defer inventory refinements until after the policy decision is made. Revisiting this decomposition after T2 deploys (re-running EVPPI on a 2040 scenario) will reveal the next priorities.
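The proportional rule in the recommendation can be sketched in a few lines. The $10M budget and the activity labels below are hypothetical; the EVPPI figures are the tabulated values converted to $M, with the joint 14-sector emissions value taken from the validation table.

```python
def allocate(budget, evppi_by_activity):
    """Split a budget across activities in proportion to their EVPPI."""
    total = sum(evppi_by_activity.values())
    return {name: budget * value / total for name, value in evppi_by_activity.items()}

# EVPPI in $M from the results above; the activity names are illustrative labels.
evppi_musd = {
    "ozone cohort study (beta_o3)": 115.8,
    "VSL review": 92.5,
    "emission inventory (all 14 sectors)": 1.9,
}
plan = allocate(10.0, evppi_musd)  # hypothetical $10M uncertainty-reduction budget
for name, amount in plan.items():
    print(f"{name}: ${amount:.2f}M")
```

Under this rule the inventory work receives under 1% of the budget, which is the quantitative version of "defer inventory refinements".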

Method validation

Agreement with the pygam reference implementation

Our sklearn-based Strong, Oakley & Brennan 2014 estimator reproduces Inv 02's EVPI exactly and its group-level EVPPI to within 1% for the CRF and monetization groups and within 10% for the 14-dimensional emissions group (pygam uses the same underlying algorithm but a different spline basis and penalty form):

| Group | Inv 02 (pygam) | Inv 36 (sklearn) | Agreement |
| --- | --- | --- | --- |
| EVPI | $0.2287B | $0.2287B | exact |
| CRF joint | $0.1163B | $0.1160B | <1% |
| Monetization joint | $0.0928B | $0.0926B | <1% |
| Emissions (14) | $0.0022B | $0.0019B | <10% |

The agreement is intentional: it is a correctness check, not a rediscovery. What Inv 36 adds is the fine-grained decomposition inside each group (beta_o3 vs beta_pm25 vs beta_no2; VSL vs income-elast; on-road vs off-road vs residential vs industrial). Inv 02 could not answer those questions because it reported only group-level values.

Caveats

What this decomposition does not prove

  • The Strong, Oakley & Brennan (2014) estimator is biased in small samples; Heath et al. 2018 moment-matching would correct this at 10^3 draws, but with 10^4 draws the bias is ~O(1/sqrt(n)) of the group-level EVPI (negligible here).
  • Additive GAM assumes no strong interactions between parameters within a group — verified visually via 2D partial-dependence plots for beta_pm25 x VSL (not shown on page).
  • Emission-vector PCA reduction (for the all-14 test) discards ~10% variance; per-sector-group results are more reliable.
  • Inv 02 uses a *model-free* MC (no surrogate); GP surrogate noise EVPPI here is gp_noise_draws, which is a proxy for surrogate uncertainty, not a full OOD-safe VOI.
  • use_di is binary; spline EVPPI on binary covariates collapses to a 2-level step — we include it for completeness but the number should be read as indicative.
  • All EVPPI values are in year-2035 present value; discounting to 2025 would reduce them by roughly 25% at r=3%.
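The discounting caveat can be checked directly. This sketch assumes annual compounding over the 10 years from 2025 to 2035, which is an assumption made here; the helper name is hypothetical.

```python
def discount_factor(rate, years):
    """Present-value factor under annual compounding."""
    return (1 + rate) ** -years

factor = discount_factor(0.03, 2035 - 2025)  # about 0.744
reduction = 1 - factor                       # fraction lost by discounting to 2025
evpi_2025_b = 0.229 * factor                 # EVPI restated in 2025 terms, $B
print(round(reduction, 3), round(evpi_2025_b, 3))  # 0.256 0.17
```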