| Category | Finding | Recommended Next Step |
|---|---|---|
| Summary | | Review the health of your dataset below. Resolve all FAILS and investigate REVIEW flags in the detailed sections to ensure your data is ready for modeling. |
| Spend and Media Unit | | See Spend and Media Units. Where applicable, verify that spend and media units align across channels, and review outliers in cost per media unit. |
| Individual Explanatory/Response Variables | | See Individual Explanatory/Response Variables. Where applicable, review any variables with low signal or with outliers. |
| Population Scaling of Explanatory Variables | | No automated issues detected. See Population Scaling for more details. |
| Relationship Among the Variables | | No automated issues detected. See Relationship Among the Variables for more details. |
| Prior Specifications | | No automated issues detected. See Prior Specifications for more details. Assess the likelihood of a negative baseline occurring. |
Please review each channel's share of spend. Channels with a very small share of spend might be difficult to estimate. You might want to combine them with other channels.
As rough guidance, please review the ratio of data points to parameters, where
• the number of data points = n_geos * n_times,
• the number of parameters = (n_geos - 1) + n_knots + n_controls + n_treatments.
A very small ratio indicates insufficient data for estimation. In that case, consider dropping or combining channels, or reducing the number of knots with the `knots` argument in `ModelSpec`. For more details, please refer to this documentation page: https://developers.google.com/meridian/docs/pre-modeling/amount-data-needed.
| Metric | Value |
|---|---|
| Ratio | 55.71 |
| n_geos | 40 |
| n_times | 156 |
| n_knots | 64 |
| n_controls | 2 |
| n_treatments | 7 |
| n_parameters | 112 |
| n_data_points | 6240 |
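The figures in the table can be reproduced directly from the two formulas stated above. The following is a minimal sketch in plain Python; the function name `data_to_parameter_ratio` is a hypothetical helper introduced here for illustration and is not part of Meridian.

```python
def data_to_parameter_ratio(n_geos, n_times, n_knots, n_controls, n_treatments):
    """Return (n_data_points, n_parameters, ratio) using the formulas above."""
    n_data_points = n_geos * n_times
    n_parameters = (n_geos - 1) + n_knots + n_controls + n_treatments
    return n_data_points, n_parameters, n_data_points / n_parameters


# Inputs copied from the table in this report.
n_data_points, n_parameters, ratio = data_to_parameter_ratio(
    n_geos=40, n_times=156, n_knots=64, n_controls=2, n_treatments=7
)
print(n_data_points, n_parameters, round(ratio, 2))  # 6240 112 55.71
```

Re-running the helper with a smaller `n_knots` or fewer treatments shows how much the ratio would improve before changing the model specification.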
Please review the patterns for spend, media units, and cost per media unit. Any erratic or unexpected patterns warrant a data review.
There are outliers in cost per media unit across time. Please check for any possible data input error.
(Due to space constraints, this table only displays the 5 most severe cases. Please use `EDAEngine.check_cost_per_media_unit()` to review outliers for 5 channels across 115 time periods and 39 geos.)
| Geo | Time | Channel | Spend | Media Units | Cost Per Media Unit | Abs Cost Per Media Unit |
|---|---|---|---|---|---|---|
| Geo0 | 2023-11-20 | Channel1 | 1238.571 | 128466 | 0.010 | 0.010 |
| Geo26 | 2023-01-09 | Channel1 | 1061.191 | 110068 | 0.010 | 0.010 |
| Geo22 | 2021-06-21 | Channel1 | 1267.967 | 131515 | 0.010 | 0.010 |
| Geo25 | 2022-04-11 | Channel1 | 1313.493 | 136237 | 0.010 | 0.010 |
| Geo4 | 2023-03-27 | Channel1 | 2338.818 | 242585 | 0.010 | 0.010 |
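As a quick cross-check of the table, cost per media unit is spend divided by media units. The sketch below recomputes it for the five listed rows in plain Python. The helper `cost_per_media_unit` is a hypothetical name, and the sketch does not reproduce the rule that `EDAEngine.check_cost_per_media_unit()` uses to decide what counts as an outlier.

```python
# Rows copied from the table above: (geo, time, channel, spend, media_units).
rows = [
    ("Geo0", "2023-11-20", "Channel1", 1238.571, 128466),
    ("Geo26", "2023-01-09", "Channel1", 1061.191, 110068),
    ("Geo22", "2021-06-21", "Channel1", 1267.967, 131515),
    ("Geo25", "2022-04-11", "Channel1", 1313.493, 136237),
    ("Geo4", "2023-03-27", "Channel1", 2338.818, 242585),
]


def cost_per_media_unit(spend, media_units):
    """Spend divided by media units; None when no units were delivered."""
    if media_units == 0:
        return None
    return spend / media_units


costs = [cost_per_media_unit(spend, units) for _, _, _, spend, units in rows]
for (geo, time, channel, _, _), cost in zip(rows, costs):
    # Three decimals, matching the precision shown in the table.
    print(f"{geo} {time} {channel} {cost:.3f}")
```

All five rows round to 0.010, so judging whether they are genuine requires comparing them against the typical cost per media unit for Channel1 in the full dataset.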
Please review the variability of the explanatory and response variables illustrated by the boxplots. Note that variables with very low variability could be difficult to estimate and could hurt model convergence. Consider merging or replacing them with other variables, dropping them if they are negligibly small, or using a custom prior if you have relevant information. If outliers exist, check your data input to determine if they are genuine or erroneous.
There are outliers in the scaled treatment or control variables in certain geos. Please check for any possible data errors.
(Due to space constraints, this table only displays the 5 most severe cases. Please use `EDAEngine.check_std()` to review outliers for 9 channels across 153 time periods and 40 geos.)
| Geo | Time | Var | Outliers | Abs Outliers |
|---|---|---|---|---|
| Geo19 | 2021-10-18 | Promo | 6.206 | 6.206 |
| Geo16 | 2021-07-26 | Channel2 | 5.569 | 5.569 |
| Geo23 | 2021-10-18 | Channel2 | 5.367 | 5.367 |
| Geo28 | 2021-10-11 | Channel2 | 5.236 | 5.236 |
| Geo18 | 2021-11-22 | Organic_channel0 | 5.220 | 5.220 |
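To make the idea of an outlier in a scaled variable concrete, here is a minimal sketch that standardizes one series and flags values far from the mean. It is an illustration under stated assumptions, not the rule used by `EDAEngine.check_std()`: the z-score scaling, the threshold of 3.0, and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # A constant series has no outliers.
    return [(v - mu) / sigma for v in values]


def flag_outliers(values, threshold=3.0):
    """Return (index, z_score) pairs whose absolute z-score exceeds threshold."""
    z_scores = standardize(values)
    return [(i, z) for i, z in enumerate(z_scores) if abs(z) > threshold]


# Hypothetical weekly series for one geo: 29 ordinary weeks and one spike.
weekly_units = [100.0] * 29 + [400.0]
outliers = flag_outliers(weekly_units)
print(outliers)
```

A flagged value such as a promotion spike can be genuine, so the flag is a prompt to inspect the input rather than a reason to drop the observation.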
Please review the Spearman correlation between population and raw paid and organic media variables. These raw media variables are expected to have positive correlation with population. If there is low or negative correlation, please check your data input.
Please review the Spearman correlation between population and scaled treatment units or scaled controls.
For controls and non-media channels: Meridian doesn't population-scale these variables by default. High correlation indicates that users should population-scale these variables using the `control_population_scaling_id` or `non_media_population_scaling_id` argument in `ModelSpec`.
For paid and organic media channels: Meridian automatically population-scales these media channels by default. High correlation indicates that the variable may have been population-scaled before being passed to Meridian. Please check your data input.
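The population checks above rely on Spearman correlation, which is the Pearson correlation of the ranks. The sketch below implements it in plain Python and applies it to a raw and a population-scaled media variable. The geo populations and impression counts are hypothetical, and the helper names are introduced here only for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-geo totals.
population = [1.2e6, 3.4e5, 8.9e5, 2.1e6, 5.0e5]
raw_impressions = [130000, 41000, 95000, 250000, 47000]
scaled_impressions = [imp / pop for imp, pop in zip(raw_impressions, population)]

rho_raw = spearman(population, raw_impressions)
rho_scaled = spearman(population, scaled_impressions)
print(rho_raw, rho_scaled)
```

In this made-up example the raw variable tracks population closely, while the population-scaled version does not, which is the pattern the two checks above expect for media channels.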
Please review the computed pairwise correlations. Note that high pairwise correlation may cause model identifiability and convergence issues. Consider combining the variables if high correlation exists.
This check regresses each variable against time as a categorical variable. In this case, high R-squared indicates low geo variation of a variable. This could lead to a weakly identifiable and non-converging model if a large number of knots is used. Consider dropping the variable with very high R-squared or reducing the `knots` argument in `ModelSpec`.
(Due to space constraints, this table only displays the 5 most severe cases. Please use `EDAEngine.check_variable_geo_time_collinearity()` to review the R-squared against time for 9 channels.)
| Channel | R Squared |
|---|---|
| Channel3 | 0.330 |
| competitor_sales_control | 0.315 |
| Channel4 | 0.272 |
| sentiment_score_control | 0.255 |
| Organic_channel0 | 0.216 |
This check regresses each variable against geo as a categorical variable. In this case, high R-squared indicates low time variation of a variable. This could lead to a weakly identifiable and non-converging model due to geo main effects. Consider dropping the variable with very high R-squared.
(Due to space constraints, this table only displays the 5 most severe cases. Please use `EDAEngine.check_variable_geo_time_collinearity()` to review the R-squared against geo for 9 channels.)
| Channel | R Squared |
|---|---|
| Channel0 | -0.000 |
| competitor_sales_control | -0.001 |
| Promo | -0.001 |
| Organic_channel0 | -0.001 |
| Channel3 | -0.001 |
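Regressing a variable on a single categorical variable is equivalent to predicting each observation with its category mean, so the two checks above can be sketched without a regression library. The code below computes the plain, unadjusted R-squared on hypothetical data; the helper name is an assumption made here. The slightly negative values in the geo table suggest the report may apply some adjustment, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict


def r_squared_on_category(values, categories):
    """Unadjusted R-squared when each value is predicted by its category mean."""
    groups = defaultdict(list)
    for value, category in zip(values, categories):
        groups[category].append(value)
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # A constant variable has nothing to explain.
    group_means = {c: sum(vs) / len(vs) for c, vs in groups.items()}
    ss_resid = sum((v - group_means[c]) ** 2 for v, c in zip(values, categories))
    return 1 - ss_resid / ss_total


# Hypothetical long-format data: both geos follow the same path over time.
records = [
    ("GeoA", "t1", 1.0), ("GeoA", "t2", 2.0), ("GeoA", "t3", 3.0),
    ("GeoB", "t1", 1.0), ("GeoB", "t2", 2.0), ("GeoB", "t3", 3.0),
]
geos = [geo for geo, _, _ in records]
times = [time for _, time, _ in records]
values = [value for _, _, value in records]

r2_time = r_squared_on_category(values, times)  # Time explains everything.
r2_geo = r_squared_on_category(values, geos)    # Geo explains nothing.
print(r2_time, r2_geo)
```

Here the variable has no geo variation at all, so the time check returns 1.0 and the geo check returns 0.0, the extreme version of the situation the time check warns about.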
A negative baseline means the treatment effects are getting too much credit. Please review the prior probability of a negative baseline together with the bar chart of the channel-level prior mean of contribution. If the prior probability of a negative baseline is high, consider custom treatment priors. In particular, a custom `contribution prior` type may be appropriate.
Prior probability of negative baseline: 0.006
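The intuition behind this probability can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, that each treatment's share of the total outcome is drawn independently from a Beta prior and that the baseline share is one minus the sum of those shares, so the baseline is negative whenever the shares sum to more than one. This is not Meridian's calculation; the Beta parameters, the function name, and the independence assumption are all hypothetical.

```python
import random


def prior_prob_negative_baseline(n_treatments, alpha, beta, n_draws=10000, seed=0):
    """Estimate P(sum of treatment contribution shares > 1) under Beta priors."""
    rng = random.Random(seed)  # Seeded so repeated runs give the same estimate.
    negative = 0
    for _ in range(n_draws):
        total_share = sum(rng.betavariate(alpha, beta) for _ in range(n_treatments))
        if total_share > 1.0:  # Baseline share = 1 - total_share < 0.
            negative += 1
    return negative / n_draws


# Hypothetical priors for 7 treatments: mean share 0.10 vs. mean share 0.01.
p_generous = prior_prob_negative_baseline(n_treatments=7, alpha=1.0, beta=9.0)
p_tight = prior_prob_negative_baseline(n_treatments=7, alpha=1.0, beta=99.0)
print(p_generous, p_tight)
```

Priors that give each treatment a larger expected share push more prior mass toward a negative baseline, which is why tightening the treatment priors is the suggested remedy when the reported probability is high.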