PIEModel
- class pymc_marketing.pie.model.PIEModel(*, pre_determined_features, post_determined_features, target_column='y', model_config=None, sampler_config=None)
Predicted Incrementality by Experimentation model.
Trains a Bayesian BART regression on a corpus of past RCTs mapping campaign features to measured incrementality, then predicts incrementality for non-experimental campaigns.
- Parameters:
  - pre_determined_features (list[str]): Feature columns known before the campaign runs (e.g. objective, vertical, budget, audience_type). In the current alpha implementation this list is concatenated with post_determined_features and fed identically into BART; the distinction is recorded for future versions that gate prediction on feature availability but has no effect on the model graph today.
  - post_determined_features (list[str]): Feature columns known only after the campaign runs (e.g. exposure_rate, ctr, last_click_conversions_per_dollar, avg_treated_outcome). See the note above: treated identically to pre_determined_features in this release.
  - target_column (str): Name used for the target variable in the PyMC graph and in saved idata groups (posterior_predictive[target_column], fit_data[target_column]). Does not select a column from X; X and y are always passed separately. Defaults to "y".
  - model_config (dict, optional): Override default priors / BART settings. Top-level keys merge with default_model_config(); nested dicts (e.g. "bart") are replaced wholesale, so a partial "bart" override must restate every required key (m, alpha, beta). Keys:
    - "bart": dict with m (int), alpha (float), beta (float), and optional response: "constant" (default, piecewise-constant leaves), "linear", or "mix" (the latter two fit linear models in the leaves, which can help on smooth response surfaces).
    - "sigma": pymc_extras.prior.Prior for the noise std.
    - "categorical_split": "onehot" (default) or "continuous". Controls how label-encoded categorical columns are split by BART; see Notes.
  - sampler_config (dict, optional): Passed to pymc.sample(). Defaults to {}.
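The merge rule described for model_config can be sketched in plain Python. This is a minimal illustration, assuming the merge behaves like a shallow dict update; the default values and the shallow_merge helper are hypothetical placeholders, not the library's actual defaults or internals.

```python
# Hypothetical defaults, for illustration only. The real values come from
# PIEModel.default_model_config and may differ.
defaults = {
    "bart": {"m": 50, "alpha": 0.95, "beta": 2.0},
    "sigma": "<noise prior placeholder>",
    "categorical_split": "onehot",
}


def shallow_merge(defaults, overrides):
    """Merge top-level keys only; nested dicts are replaced, not merged."""
    return {**defaults, **(overrides or {})}


# A partial "bart" override replaces the whole nested dict,
# so alpha and beta are lost.
partial = shallow_merge(defaults, {"bart": {"m": 100}})

# Restating every required key keeps the nested dict complete.
full = shallow_merge(defaults, {"bart": {**defaults["bart"], "m": 100}})

print(partial["bart"])  # only "m" survives
print(full["bart"])     # m, alpha, and beta are all present
```

Under this assumption, the safe pattern is to copy the default "bart" dict and change only the keys of interest, as in the second call.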
Notes
This module is alpha; the API and defaults may change. Tracked deviations from the paper [1]:
- The paper uses a random forest fit to 2,226 RCTs; this implementation uses Bayesian Additive Regression Trees (PyMC-BART) for native posterior uncertainty.
- The paper’s decision-theoretic framework (Type I/II error rates, disagreement vs RCT-based go/no-go decisions; paper §6) is not implemented.
- Within-campaign sample splitting (paper §4.2), which breaks the mechanical correlation between post-determined features and the target, is not implemented.
- Extrapolation / cold-start diagnostics across advertiser segments (paper §5.3) are not implemented.
- The footnote-2 measurement-error layer y_observed ~ Normal(y_true, se_rct) for per-RCT standard errors is not implemented.
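To make the measurement-error relation y_observed ~ Normal(y_true, se_rct) concrete, the following NumPy sketch simulates it outside the model. It is not part of PIEModel; the se_rct values and the simulate_observed helper are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus: true incrementality and per-RCT standard errors.
y_true = np.array([0.81, 0.34, 1.12, 0.49])
se_rct = np.array([0.05, 0.20, 0.10, 0.40])


def simulate_observed(y_true, se_rct, rng):
    """Draw y_observed ~ Normal(y_true, se_rct), one draw per RCT."""
    return rng.normal(loc=y_true, scale=se_rct)


y_observed = simulate_observed(y_true, se_rct, rng)

# Over many repeated draws, the spread of each RCT's measurement
# approaches its standard error.
draws = rng.normal(loc=y_true, scale=se_rct, size=(20_000, y_true.size))
empirical_se = draws.std(axis=0)
```

RCTs with a larger se_rct give noisier labels. A model that ignores this layer treats every measured incrementality as equally precise.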
Categorical columns (object or category dtype) are label-encoded in build_model. With categorical_split="onehot" (default), BART uses pymc_bart.split_rules.OneHotSplitRule for those columns so that splits are “level X vs not-X” rather than “encoded value < c”; this avoids imposing the encoder’s alphabetical ordering on unordered categories. Set categorical_split="continuous" to fall back to ordered splits.
References
[1] Gordon, B. R., Moakler, R., & Zettelmeyer, F. (2026). Predicted Incrementality by Experimentation (PIE) for Ad Measurement. NBER Working Paper No. 35044.
Examples
    import pandas as pd

    from pymc_marketing.pie import PIEModel

    # Corpus of past campaigns, each labelled with the incrementality
    # measured by its RCT.
    X = pd.DataFrame(
        {
            "objective": ["conversions", "traffic", "awareness", "traffic"],
            "vertical": ["retail", "travel", "finance", "retail"],
            "budget": [50_000, 12_000, 80_000, 30_000],
            "exposure_rate": [0.42, 0.71, 0.33, 0.55],
        }
    )
    y = pd.Series([0.81, 0.34, 1.12, 0.49])  # incrementality per dollar

    model = PIEModel(
        pre_determined_features=["objective", "vertical", "budget"],
        post_determined_features=["exposure_rate"],
    )
    model.fit(X, y, random_seed=42)
    preds = model.sample_posterior_predictive(X)
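The difference between the two categorical_split modes described in the Notes can be illustrated without fitting a model. The sketch below uses pandas category codes as a stand-in for the label encoder (an assumption; the encoder used in build_model may differ) and shows that an ordered "encoded value < c" split cannot isolate a level that falls in the middle of the alphabetical ordering, while a "level X vs not-X" split can.

```python
import numpy as np
import pandas as pd

objective = pd.Series(["conversions", "traffic", "awareness", "traffic"])

# Stand-in for the label encoder: pandas sorts categories alphabetically,
# so awareness -> 0, conversions -> 1, traffic -> 2.
codes = objective.astype("category").cat.codes.to_numpy()

# "Level X vs not-X" split: isolates the middle level, conversions.
onehot_left = codes == 1

# Ordered splits "encoded value < c" at thresholds between observed codes.
thresholds = [0.5, 1.5]
ordered_lefts = [codes < c for c in thresholds]

# Check whether any single ordered split (or its complement) separates
# conversions from both other levels.
can_isolate = any(
    np.array_equal(left, onehot_left) or np.array_equal(~left, onehot_left)
    for left in ordered_lefts
)
print(can_isolate)
```

With ordered splits, a tree needs two cuts to separate the middle level, and which level counts as "middle" depends only on the alphabetical order of the labels.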
Methods
PIEModel.__init__(*[, ...]): Initialize model configuration and sampler configuration for the model.
PIEModel.approximate_fit(X[, y, ...]): Fit a model using Variational Inference and return InferenceData.
Reconstruct constructor kwargs from saved idata attrs.
PIEModel.build_from_idata(idata): Rebuild the model from saved inference data.
PIEModel.build_model(X, y, **kwargs): Build the PyMC model graph.
PIEModel.create_fit_data(X, y): Create the fit_data group based on the input data.
Extend the base idata attrs with PIEModel-specific fields.
PIEModel.fit(X[, y, progressbar, random_seed]): Fit a model using the data passed as a parameter.
PIEModel.graphviz(**kwargs): Get the graphviz representation of the model.
Create the model configuration and sampler configuration from the InferenceData to keyword arguments.
PIEModel.load(fname[, check]): Create a ModelBuilder instance from a file.
PIEModel.load_from_idata(idata[, check]): Create a ModelBuilder instance from an InferenceData object.
Perform transformation on the model after sampling.
PIEModel.predict(X[, extend_idata]): Use a model to predict on unseen data and return point prediction of all the samples.
PIEModel.predict_posterior(X[, ...]): Posterior predictive draws for X as a single DataArray.
PIEModel.predict_proba(X[, extend_idata, ...]): Alias for predict_posterior, for consistency with scikit-learn probabilistic estimators.
PIEModel.sample_posterior_predictive(X[, ...]): Sample posterior predictive draws and return in the original target scale.
PIEModel.sample_prior_predictive(X[, y, ...]): Sample from the model's prior predictive distribution.
PIEModel.save(fname, **kwargs): Save the model's inference data to a file.
PIEModel.set_idata_attrs([idata]): Set attributes on an InferenceData object.
PIEModel.table(**model_table_kwargs): Get the summary table of the model.
Attributes
default_model_config: Default BART hyperparameters, noise prior, and categorical split mode.
default_sampler_config: Default sampler configuration (empty; PyMC auto-assigns PGBART + NUTS).
fit_result: Get the posterior fit_result.
id: Generate a unique hash value for the model.
output_var: Name of the target variable in the PyMC graph and saved idata.
posterior: Access the 'posterior' attribute of the InferenceData object.
posterior_predictive: Access the 'posterior_predictive' attribute of the InferenceData object.
predictions: Access the 'predictions' attribute of the InferenceData object.
prior: Access the 'prior' attribute of the InferenceData object.
prior_predictive: Access the 'prior_predictive' attribute of the InferenceData object.
version
idata
sampler_config
model_config