Meridian Exploratory Data Analysis Report
Summary
Category | Finding | Recommended Next Step
Summary | 2 review(s) | Review the health of your dataset below. Resolve all FAILS and investigate REVIEW flags in the detailed sections to ensure your data is ready for modeling.
Spend and Media Unit | 1 review(s) | See Spend and Media Units. Where applicable, verify that spend and media units align across channels, and review outliers in cost per media unit.
Individual Explanatory/Response Variables | 1 review(s) | See Individual Explanatory/Response Variables. Where applicable, review any variables with low signal or with outliers.
Population Scaling of Explanatory Variables | Info | No automated issues detected. See Population Scaling for more details.
Relationship Among the Variables | Info | No automated issues detected. See Relationship Among the Variables for more details.
Prior Specifications | Info | No automated issues detected. See Prior Specifications for more details. Assess the likelihood of a negative baseline occurring.
Spend and Media Unit

Please review each channel's share of spend. Channels with a very small share of spend might be difficult to estimate. You might want to combine them with other channels.

As rough guidance, please review the ratio of data points to parameters, where
  • the number of data points = n_geos * n_times,
  • the number of parameters = (n_geos - 1) + n_knots + n_controls + n_treatments.

A very small ratio indicates insufficient data for estimation. In that case, consider dropping or combining channels, or reducing the number of knots with the `knots` argument in `ModelSpec`.
For more details, please refer to this documentation page:
https://developers.google.com/meridian/docs/pre-modeling/amount-data-needed

Metric Value
Ratio 55.71
n_geos 40
n_times 156
n_knots 64
n_controls 2
n_treatments 7
n_parameters 112
n_data_points 6240
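
The figures in the table above follow directly from the two formulas given earlier. As a quick sanity check, the arithmetic can be reproduced with a few lines of Python; the variable names simply mirror the metric names in the table and are not part of any Meridian API.

```python
# Values copied from the Metric/Value table in this report.
n_geos = 40
n_times = 156
n_knots = 64
n_controls = 2
n_treatments = 7

# Formulas as stated in the report's guidance text.
n_data_points = n_geos * n_times
n_parameters = (n_geos - 1) + n_knots + n_controls + n_treatments
ratio = n_data_points / n_parameters

print(n_data_points, n_parameters, round(ratio, 2))  # 6240 112 55.71
```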

Please review the patterns for spend, media units, and cost per media unit. Any erratic or unexpected patterns warrant a data review.

Warning: There are outliers in cost per media unit across time. Please check for any possible data input error.
(Due to space constraints, this table only displays the 5 most severe cases. Please use EDAEngine.check_cost_per_media_unit() to review outliers for 5 channels in 115 times and 39 geos.)

Geo | Time | Channel | Spend | Media Units | Cost Per Media Unit | Abs Cost Per Media Unit
Geo0 | 2023-11-20 | Channel1 | 1238.571 | 128466 | 0.010 | 0.010
Geo26 | 2023-01-09 | Channel1 | 1061.191 | 110068 | 0.010 | 0.010
Geo22 | 2021-06-21 | Channel1 | 1267.967 | 131515 | 0.010 | 0.010
Geo25 | 2022-04-11 | Channel1 | 1313.493 | 136237 | 0.010 | 0.010
Geo4 | 2023-03-27 | Channel1 | 2338.818 | 242585 | 0.010 | 0.010
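
When following up on the rows above, it can help to recompute cost per media unit at full precision, since the table rounds to three decimals. The sketch below assumes cost per media unit is simply spend divided by media units; the helper name `cost_per_media_unit` is hypothetical, and the code does not reproduce how `EDAEngine.check_cost_per_media_unit()` decides what counts as an outlier.

```python
# Rows copied from the table above: (geo, time, spend, media_units).
rows = [
    ("Geo0", "2023-11-20", 1238.571, 128466),
    ("Geo26", "2023-01-09", 1061.191, 110068),
    ("Geo22", "2021-06-21", 1267.967, 131515),
    ("Geo25", "2022-04-11", 1313.493, 136237),
    ("Geo4", "2023-03-27", 2338.818, 242585),
]


def cost_per_media_unit(spend, media_units):
    """Return spend / media_units, or None when there are no media units."""
    if media_units == 0:
        return None
    return spend / media_units


# Unrounded cost per media unit for each displayed row.
cpmu = [cost_per_media_unit(spend, units) for _, _, spend, units in rows]

for (geo, time, _, _), value in zip(rows, cpmu):
    print(f"{geo} {time} {value:.6f}")
```

At six decimals, each of the five rows comes out at roughly 0.00964, which the report displays as 0.010.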
Individual Explanatory/Response Variables

Please review the variability of the explanatory and response variables illustrated by the boxplots. Note that variables with very low variability could be difficult to estimate and could hurt model convergence. Consider merging or replacing them with other variables, dropping them if they are negligibly small, or using a custom prior if you have relevant information. If outliers exist, check your data input to determine if they are genuine or erroneous.

Warning: There are outliers in the scaled treatment or control variables in certain geos. Please check for any possible data errors.
(Due to space constraints, this table only displays the 5 most severe cases. Please use EDAEngine.check_std() to review outliers for 9 channels in 153 times and 40 geos.)

Geo | Time | Var | Outliers | Abs Outliers
Geo19 | 2021-10-18 | Promo | 6.206 | 6.206
Geo16 | 2021-07-26 | Channel2 | 5.569 | 5.569
Geo23 | 2021-10-18 | Channel2 | 5.367 | 5.367
Geo28 | 2021-10-11 | Channel2 | 5.236 | 5.236
Geo18 | 2021-11-22 | Organic_channel0 | 5.220 | 5.220
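
To judge whether a flagged value is genuine, one option is to standardize the variable's time series within the affected geo and look at the extreme points. The sketch below is a generic z-score filter, not the rule used by `EDAEngine.check_std()`; the function name `flag_outliers`, the threshold, and the example series are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=3.0):
    """Return (index, z_score) pairs whose absolute z-score exceeds threshold.

    Assumption: outliers are defined by a plain z-score within one series.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no spread, so nothing can be flagged.
        return []
    flagged = []
    for i, x in enumerate(values):
        z = (x - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, z))
    return flagged


# Hypothetical series for one variable in one geo: flat with a single spike.
series = [1.0] * 50 + [100.0]
print(flag_outliers(series, threshold=5.0))
```

Only the final point (index 50) is flagged here. A real spike tied to a known event, such as a promotion, would be kept, while a data entry error would be corrected.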
Population Scaling of Explanatory Variables

Please review the Spearman correlation between population and raw paid and organic media variables. These raw media variables are expected to have positive correlation with population. If there is low or negative correlation, please check your data input.

Please review the Spearman correlation between population and scaled treatment units or scaled controls.

For controls and non-media channels: Meridian doesn't population-scale these variables by default. High correlation indicates that users should population-scale these variables using the `control_population_scaling_id` or `non_media_population_scaling_id` argument in `ModelSpec`.

For paid and organic media channels: Meridian automatically population-scales these media channels by default. High correlation indicates that the variable may have been population-scaled before being passed to Meridian. Please check your data input.
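
The Spearman correlation used in these checks is the Pearson correlation of the ranks of the two variables. A minimal pure-Python sketch follows, assuming one value per geo for each variable; the function names and the example numbers are hypothetical, and this is not the code Meridian runs.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-geo population and raw impressions.
population = [100, 250, 400, 900]
impressions = [12, 30, 41, 95]
print(spearman(population, impressions))
```

In this made-up example the raw media variable increases with population in every geo, so the correlation is at its maximum, which is the pattern the first check expects for raw media variables.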

Relationship Among the Variables

Please review the computed pairwise correlations. Note that high pairwise correlation may cause model identifiability and convergence issues. Consider combining the variables if high correlation exists.

This check regresses each variable against time as a categorical variable. In this case, high R-squared indicates low geo variation of a variable. This could lead to a weakly identifiable and non-converging model if a large number of knots is used. Consider dropping the variable with very high R-squared or reducing the `knots` argument in `ModelSpec`.
(Due to space constraints, this table only displays the 5 most severe cases. Please use EDAEngine.check_variable_geo_time_collinearity() to review R-squared time for 9 channels.)

Channel R Squared
Channel3 0.330
competitor_sales_control 0.315
Channel4 0.272
sentiment_score_control 0.255
Organic_channel0 0.216

This check regresses each variable against geo as a categorical variable. In this case, high R-squared indicates low time variation of a variable. This could lead to a weakly identifiable and non-converging model due to geo main effects. Consider dropping the variable with very high R-squared.
(Due to space constraints, this table only displays the 5 most severe cases. Please use EDAEngine.check_variable_geo_time_collinearity() to review R-squared geo for 9 channels.)

Channel R Squared
Channel0 -0.000
competitor_sales_control -0.001
Promo -0.001
Organic_channel0 -0.001
Channel3 -0.001
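
Regressing a variable on a single categorical variable amounts to fitting one mean per category, so the plain R-squared is the between-category sum of squares divided by the total sum of squares. The sketch below illustrates both checks on a tiny hypothetical panel; the function name `r_squared_on_category` and the data are assumptions. It computes the unadjusted R-squared, which cannot be negative, so the figures in the tables above may be computed with a different variant.

```python
from collections import defaultdict


def r_squared_on_category(values, categories):
    """Unadjusted R-squared from regressing values on one categorical variable."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return float("nan")  # constant variable: R-squared is undefined
    groups = defaultdict(list)
    for v, c in zip(values, categories):
        groups[c].append(v)
    # Between-category sum of squares, using each category's mean as the fit.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Hypothetical panel: the variable changes over time but is identical across geos.
geos = ["GeoA", "GeoA", "GeoB", "GeoB"]
times = ["t1", "t2", "t1", "t2"]
values = [1.0, 3.0, 1.0, 3.0]

r2_time = r_squared_on_category(values, times)  # high: no geo variation
r2_geo = r_squared_on_category(values, geos)    # low: plenty of time variation
print(r2_time, r2_geo)  # 1.0 0.0
```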
Prior Specifications

A negative baseline is equivalent to the treatment effects getting too much credit. Please review the prior probability of a negative baseline together with the bar chart for channel-level prior mean of contribution. If the prior probability of a negative baseline is high, consider custom treatment priors. In particular, a custom `contribution prior` type may be appropriate.

Prior probability of negative baseline: 0.006
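
One way to read this number is as the share of prior draws in which the treatments together are credited with more than the whole outcome. The sketch below assumes each prior draw gives every channel's contribution as a fraction of total outcome, and that the baseline share is one minus their sum. That framing paraphrases the explanation above and is not Meridian's implementation; the function name and the draws are hypothetical.

```python
def prob_negative_baseline(contribution_draws):
    """Fraction of draws whose summed contribution shares exceed 1.

    Assumption: baseline share = 1 - sum(channel contribution shares).
    """
    if not contribution_draws:
        raise ValueError("need at least one draw")
    negative = sum(1 for draw in contribution_draws if sum(draw) > 1.0)
    return negative / len(contribution_draws)


# Hypothetical prior draws for two channels (one inner list per draw).
draws = [
    [0.2, 0.3],  # treatments explain 50%, baseline positive
    [0.6, 0.5],  # treatments explain 110%, baseline negative
    [0.1, 0.1],
    [0.4, 0.4],
]
print(prob_negative_baseline(draws))  # 0.25
```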