MMM.compute_counterfactual_contributions_dataset

MMM.compute_counterfactual_contributions_dataset(central_tendency='median')

Full-posterior counterfactual contributions as an xr.Dataset.

For each component \(j\) with value \(v_j(t)\) in the linear predictor, the per-draw contribution is:

\[\text{contribution}_j^{(d)}(t) = \text{inv}\bigl(\mu^{(d)}(t)\bigr) \cdot s - \text{inv}\bigl(\mu^{(d)}(t) - v_j^{(d)}(t)\bigr) \cdot s\]

where \(\text{inv}\) is the inverse link function, \(s\) is target_scale, and \(d\) indexes a posterior draw. The difference is taken inside each draw and only afterwards averaged, i.e. the estimand is \(\mathbb{E}\bigl[\text{inv}(\mu) \cdot s - \text{inv}(\mu - v_j) \cdot s\bigr]\), not \(\mathbb{E}[\text{inv}(\mu)] \cdot s - \mathbb{E}[\text{inv}(\mu - v_j)] \cdot s\). By linearity of expectation the two posterior means coincide, but only the per-draw form yields correct credible intervals.
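The distinction between per-draw differences and differences of summaries can be checked numerically. The following is a minimal sketch on synthetic draws, not the library's implementation: the distributions of `mu` and `v_j`, the value of `s`, and the use of a 3%-97% quantile interval in place of an HDI are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 4000
s = 100.0  # hypothetical target_scale

# Hypothetical posterior draws at a single time point, log link (inv = exp).
mu = rng.normal(1.0, 0.3, size=n_draws)   # linear predictor mu^(d)
v_j = rng.normal(0.4, 0.1, size=n_draws)  # component value v_j^(d)

with_j = np.exp(mu) * s
without_j = np.exp(mu - v_j) * s
per_draw = with_j - without_j  # difference taken inside each draw

# Posterior means agree by linearity of expectation.
mean_of_diff = per_draw.mean()
diff_of_means = with_j.mean() - without_j.mean()

# Interval endpoints do not: quantiles are not linear.
lo, hi = np.quantile(per_draw, [0.03, 0.97])
naive_lo = np.quantile(with_j, 0.03) - np.quantile(without_j, 0.03)
naive_hi = np.quantile(with_j, 0.97) - np.quantile(without_j, 0.97)

print(f"mean of differences: {mean_of_diff:.3f}")
print(f"difference of means: {diff_of_means:.3f}")
print(f"per-draw interval:   ({lo:.2f}, {hi:.2f})")
print(f"naive interval:      ({naive_lo:.2f}, {naive_hi:.2f})")
```

The two means match up to floating-point error, while the naive interval built from separately summarised terms differs from the interval of the per-draw differences.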

Identity link (\(\text{inv} = \text{id}\)):

\[\text{contribution}_j^{(d)}(t) = v_j^{(d)}(t) \cdot s\]

Log link (\(\text{inv} = \exp\)):

\[\text{contribution}_j^{(d)}(t) = \bigl[\exp\!\bigl(\mu^{(d)}(t)\bigr) - \exp\!\bigl(\mu^{(d)}(t) - v_j^{(d)}(t)\bigr)\bigr] \cdot s\]

Here \(\exp(\mu)\) is the conditional median of the LogNormal response, not its mean. With central_tendency="mean" each per-draw contribution is multiplied by \(\exp(\sigma^2 / 2)\) (the LogNormal mean/median ratio), giving a counterfactual on the conditional-mean scale \(\mathbb{E}[y \mid \mu, \sigma]\). The factor cancels in component shares and does not change the budget-optimisation optimum, but it does shift absolute contributions.
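The effect of the \(\exp(\sigma^2 / 2)\) factor can be illustrated on synthetic draws. This is a sketch under stated assumptions rather than the library's code: the channel names `tv` and `radio`, the draw distributions, and the `share` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 1000
s = 1.0  # hypothetical target_scale

# Hypothetical per-draw component values on the linear-predictor scale.
v = {
    "tv": rng.normal(0.30, 0.05, size=n_draws),
    "radio": rng.normal(0.15, 0.03, size=n_draws),
}
baseline = rng.normal(0.5, 0.2, size=n_draws)      # everything else in mu
sigma = np.abs(rng.normal(0.4, 0.05, size=n_draws))  # likelihood sd, log scale
mu = baseline + v["tv"] + v["radio"]

# central_tendency="median": counterfactual on the exp(mu) scale.
median_scale = {k: (np.exp(mu) - np.exp(mu - vk)) * s for k, vk in v.items()}

# central_tendency="mean": multiply each draw by exp(sigma^2 / 2).
factor = np.exp(sigma**2 / 2)
mean_scale = {k: c * factor for k, c in median_scale.items()}


def share(contribs, name):
    """Per-draw share of one component among the listed components."""
    total = sum(contribs.values())
    return contribs[name] / total


tv_share_median = share(median_scale, "tv")
tv_share_mean = share(mean_scale, "tv")
```

Because the same per-draw factor multiplies every component, `tv_share_median` and `tv_share_mean` are equal draw by draw, while the absolute values in `mean_scale` are larger than those in `median_scale` whenever the contribution is positive.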

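Note also that \(\mathbb{E}[\cdot]\) in the estimand above denotes averaging over posterior draws, not the likelihood expectation. The only likelihood expectation in this section is the conditional mean \(\mathbb{E}[y \mid \mu, \sigma]\).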

The returned dataset retains the full (chain, draw) dimensions so that downstream code can compute arbitrary summaries (HDI, quantiles, etc.).

This is the counterfactual decomposition: per-component what-if-removed lifts that, under the log link, do not sum to \(\hat y\) (interactions are counted by every component they touch). For a conserving decomposition whose components sum exactly to \(\hat y\), see MMMIDataWrapper.get_contributions().
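A small numeric check makes the non-additivity concrete. The component values below are hypothetical single-draw numbers chosen for illustration, with \(s = 1\).

```python
import math

# Hypothetical single-draw values on the linear-predictor scale.
components = {"intercept": 1.0, "tv": 0.3, "radio": 0.2}
mu = sum(components.values())
s = 1.0  # hypothetical target_scale

# Identity link: each what-if-removed lift is v_j * s, so the lifts sum to y_hat.
identity = {k: (mu - (mu - v)) * s for k, v in components.items()}
identity_total = sum(identity.values())
y_hat_identity = mu * s

# Log link: each lift is [exp(mu) - exp(mu - v_j)] * s.
log_link = {k: (math.exp(mu) - math.exp(mu - v)) * s for k, v in components.items()}
log_total = sum(log_link.values())
y_hat_log = math.exp(mu) * s

print(f"identity: sum of lifts = {identity_total:.4f}, y_hat = {y_hat_identity:.4f}")
print(f"log:      sum of lifts = {log_total:.4f}, y_hat = {y_hat_log:.4f}")
```

Under the identity link the lifts add up to the prediction; under the log link the sum of lifts and the prediction differ, which is why a separate conserving decomposition exists.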

Parameters:
central_tendency : {"median", "mean"}, default "median"

Response summary the counterfactual is expressed on. For the identity link the two are identical (Normal mean == median). For the log link, "median" uses \(\exp(\mu)\) and "mean" applies the \(\exp(\sigma^2 / 2)\) correction.

Returns:
xr.Dataset

One data variable per component (channels, controls, yearly_seasonality, any mu_effects, intercept). Dimensions are (chain, draw, date, ...) where ... are any extra model dimensions (e.g. geo).

Raises:
ValueError

If the model has not been fitted (no idata).

See also

compute_mean_contributions_over_time

Convenience wrapper returning the posterior mean as a pd.DataFrame.

Examples

mmm.fit(X, y)
ds = mmm.compute_counterfactual_contributions_dataset()

# Posterior mean (same as compute_mean_contributions_over_time)
ds.mean(("chain", "draw"))

# 94 % HDI per component
import arviz as az

az.hdi(ds)
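Because the (chain, draw) dimensions are retained, other summaries follow the same pattern. The sketch below runs on a synthetic stand-in for the returned dataset, so the variable names `tv` and `radio`, the shapes, and the gamma-distributed values are assumptions; only standard xarray calls are used.

```python
import numpy as np
import pandas as pd
import xarray as xr

rng = np.random.default_rng(2)
dims = ("chain", "draw", "date")
shape = (2, 50, 6)
coords = {
    "chain": np.arange(shape[0]),
    "draw": np.arange(shape[1]),
    "date": pd.date_range("2024-01-01", periods=shape[2], freq="W-MON"),
}

# Synthetic stand-in for the returned dataset: one variable per component.
ds = xr.Dataset(
    {
        "tv": (dims, rng.gamma(2.0, 5.0, size=shape)),
        "radio": (dims, rng.gamma(2.0, 2.0, size=shape)),
    },
    coords=coords,
)

# Equal-tailed 94% interval per component and date.
bounds = ds.quantile([0.03, 0.97], dim=("chain", "draw"))

# Aggregate inside each draw first, then summarise across draws.
total_tv = ds["tv"].sum("date")  # dims: (chain, draw)
total_tv_bounds = total_tv.quantile([0.03, 0.97], dim=("chain", "draw"))
```

Summing over `date` before taking quantiles keeps the interval for the total consistent with the joint posterior; summing per-date interval endpoints would not give the same result.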