Causal graph
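The Required assumptions section states that the conditional exchangeability assumption holds if you assume a causal graph that meets the backdoor criterion.
A causal graph shows the relationships between variables. Variables are grouped into collections (nodes), and an arrow between two nodes indicates that a causal effect might exist in the direction of the arrow. An arrow does not necessarily mean that a causal relationship exists between every pair of variables in the connected nodes, but it does mean that no causal relationship can exist in the reverse direction for any such pair.
The backdoor criterion (Pearl, J., 2009) states that, given a causal graph, a set of variables \(Z\) satisfies the backdoor criterion relative to a treatment variable \(X\) and a response variable \(Y\) if both of the following are true:
- No node in \(Z\) is a descendant of \(X\), and
- \(Z\) blocks every path between \(X\) and \(Y\) that contains an arrow into \(X\)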
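For reference, the reason this criterion matters can be written as a formula. When \(Z\) satisfies the backdoor criterion and is observed, Pearl's backdoor adjustment expresses the causal effect in terms of observational quantities only. The statement below assumes a discrete \(Z\) and uses generic notation (the \(do(\cdot)\) operator and the probabilities \(P\)) that does not appear elsewhere on this page:

\[
P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)
\]

The right-hand side contains no interventions, which is why a regression that conditions on a suitable \(Z\) can target a causal effect.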
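Marketing mix modeling (MMM) is used to estimate the causal effect of paid media, organic media, and non-media variables on a KPI (such as sales). Paid media, organic media, and non-media variables are therefore the treatment variables (\(X\)), and the KPI is the response variable (\(Y\)). To estimate this causal effect from an MMM regression, the MMM must condition on a carefully selected set of control variables that meets the backdoor criterion. To paraphrase the backdoor criterion conditions:
- You must not control for any mediators. Mediators are variables that lie on the causal pathway between \(X\) and \(Y\).
- You must control for all confounders. Confounders are variables that have a causal effect on both \(X\) and \(Y\).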
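The two rules above can be checked mechanically on a small graph. The following is a minimal sketch, not part of Meridian: the four-node graph (a confounder `C`, a mediator `M`, a treatment `X`, and an outcome `Y`) and all helper names are hypothetical, and the brute-force path enumeration is only practical for small graphs.

```python
def descendants(edges, node):
    """Return every node reachable from `node` by following arrows."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def simple_paths(edges, start, end, path=None):
    """Yield every simple path from start to end, ignoring arrow direction."""
    path = path or [start]
    if path[-1] == end:
        yield path
        return
    for a, b in edges:
        if path[-1] in (a, b):
            nxt = b if a == path[-1] else a
            if nxt not in path:
                yield from simple_paths(edges, start, end, path + [nxt])


def is_blocked(edges, path, z):
    """True if some interior node of `path` blocks it given the set z."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edges and (nxt, mid) in edges
        # A collider blocks unless it, or one of its descendants, is in z.
        if collider and mid not in z and not (descendants(edges, mid) & z):
            return True
        # A non-collider blocks when it is in z.
        if not collider and mid in z:
            return True
    return False


def satisfies_backdoor(edges, treatments, outcome, z):
    """Check both backdoor conditions for every (treatment, outcome) pair."""
    z = set(z)
    for x in treatments:
        if z & descendants(edges, x):  # condition 1: no descendants of x
            return False
        for path in simple_paths(edges, x, outcome):
            starts_into_x = (path[1], x) in edges
            if starts_into_x and not is_blocked(edges, path, z):
                return False  # condition 2: an open backdoor path remains
    return True


# Hypothetical graph: C confounds X and Y, and M mediates X -> Y.
EDGES = [("C", "X"), ("C", "Y"), ("X", "M"), ("M", "Y")]

print(satisfies_backdoor(EDGES, ["X"], "Y", {"C"}))       # control the confounder
print(satisfies_backdoor(EDGES, ["X"], "Y", set()))       # control nothing
print(satisfies_backdoor(EDGES, ["X"], "Y", {"C", "M"}))  # also control the mediator
```

In this sketch only the first call succeeds: leaving out `C` keeps the path through the confounder open, and adding `M` violates the no-descendants condition.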
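The MMM treatment variable is a collection of any combination of paid media, organic media, and non-media treatment variables, indexed over both geo and time. It is unwieldy to represent the entire treatment in one graph, so consider a simplified graph that represents only two time periods within a single geo. Geos are assumed to be independent, so the same graph can represent any geo, and there are no arrows or relationships between geos. Two time periods are enough to describe the pattern of lagged treatment effects, which you can assume repeats indefinitely into the future (or up to some maximum lag duration).
In the diagram for this simplified setup, \(T\) denotes the paid media, organic media, and non-media treatment variables, \(C\) denotes the control variables, and \(K\) denotes the KPI. The number following each variable denotes the time period. Within each time period, assume that treatment affects sales, and that controls affect both treatment and sales. The diagram also shows \(T\) from the previous time period affecting sales in the current time period (the "lagged effect"). The Meridian regression model applies Adstock to paid and organic media, but not to non-media treatments. This effectively assumes that non-media treatments have no lagged effects. Including non-media treatments in node \(T\) is still valid, because an arrow only indicates that a causal effect may exist between any pair of variables in the connected nodes. Including non-media treatments in node \(T\) keeps the DAG presentation cleaner, and the DAG can still be used to determine which variables satisfy the backdoor criterion.
Consider the task of estimating the causal effect of treatment (\(T1\) and \(T2\)) on the KPI in time period 2 (\(K2\)). From the graph, you can see that the time period 2 controls (\(C2\)) satisfy the backdoor criterion.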
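The main conclusion is that, for each time period, the MMM regression should condition on:
- Paid and organic media from the current time period and all preceding time periods, up to an assumed maximum lag duration.
- Non-media treatment variables of the current time period only.
- Control variables of the current time period only.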
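The three rules above can be sketched as a small helper that lists which (variable, time period) pairs enter the regression for a given period. This is an illustration only: the function `regression_inputs`, the variable names `tv`, `search`, `price`, and `gqv`, and the convention that periods start at 1 are assumptions for this example, not Meridian's API.

```python
def regression_inputs(t, max_lag, media, non_media, controls):
    """List the (variable, period) pairs to condition on for the KPI at period t.

    Periods are assumed to be numbered from 1, matching T1, T2 on this page.
    """
    first_period = max(1, t - max_lag)
    # Paid and organic media: current period plus lags up to max_lag.
    inputs = [(m, s) for m in media for s in range(first_period, t + 1)]
    # Non-media treatments: current period only.
    inputs += [(n, t) for n in non_media]
    # Controls: current period only.
    inputs += [(c, t) for c in controls]
    return inputs


# Hypothetical variables, evaluated for period 3 with a maximum lag of 2.
for variable, period in regression_inputs(
    t=3, max_lag=2, media=["tv", "search"], non_media=["price"], controls=["gqv"]
):
    print(variable, period)
```

With these inputs the media variables appear for periods 1 through 3, while `price` and `gqv` appear for period 3 only.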
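A few details are worth noting:
- An arrow from \(C1\) to \(C2\) has no implication for which variables to include in the regression.
- An arrow from \(C1\) to \(K2\) requires the regression to include lagged control variables. In practice, it is best to avoid this if at all possible, because it could significantly increase the number of regression parameters.
- An arrow from \(T1\) to \(C2\) is problematic. In this case, \(C2\) is both a mediator (for \(T1\)) and a confounder (for \(T2\)): controlling for it removes the part of the effect of \(T1\) that flows through \(C2\), while omitting it leaves \(T2\) confounded. A single MMM regression model cannot be used to recover the joint causal effect of treatment.
- Adding the path \(T2 \leftarrow K1 \rightarrow K2\) is problematic for the same reason. In this case, \(K1\) acts as both a mediator and a confounder.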
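These four cases can be checked with a brute-force search for a valid adjustment set. The sketch below is an assumption-laden illustration: the edge list `BASE` is reconstructed from the textual description of the two-period graph on this page, and the helper names are hypothetical rather than part of Meridian. The search looks for the smallest subset of the non-treatment, non-outcome nodes that satisfies the backdoor criterion for both \(T1\) and \(T2\) relative to \(K2\).

```python
from itertools import combinations


def descendants(edges, node):
    """Return every node reachable from `node` by following arrows."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def simple_paths(edges, start, end, path=None):
    """Yield every simple path from start to end, ignoring arrow direction."""
    path = path or [start]
    if path[-1] == end:
        yield path
        return
    for a, b in edges:
        if path[-1] in (a, b):
            nxt = b if a == path[-1] else a
            if nxt not in path:
                yield from simple_paths(edges, start, end, path + [nxt])


def is_blocked(edges, path, z):
    """True if some interior node of `path` blocks it given the set z."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edges and (nxt, mid) in edges
        if collider and mid not in z and not (descendants(edges, mid) & z):
            return True
        if not collider and mid in z:
            return True
    return False


def satisfies_backdoor(edges, treatments, outcome, z):
    """Check both backdoor conditions for every (treatment, outcome) pair."""
    z = set(z)
    for x in treatments:
        if z & descendants(edges, x):
            return False
        for path in simple_paths(edges, x, outcome):
            if (path[1], x) in edges and not is_blocked(edges, path, z):
                return False
    return True


def smallest_adjustment_set(edges, treatments, outcome):
    """Return the smallest valid adjustment set, or None if none exists."""
    nodes = {n for edge in edges for n in edge}
    candidates = sorted(nodes - set(treatments) - {outcome})
    for size in range(len(candidates) + 1):
        for z in combinations(candidates, size):
            if satisfies_backdoor(edges, treatments, outcome, z):
                return set(z)
    return None


# Assumed base graph: controls affect treatment and KPI within each period,
# treatment affects the KPI, and T1 has a lagged effect on K2.
BASE = [("C1", "T1"), ("C1", "K1"), ("T1", "K1"),
        ("C2", "T2"), ("C2", "K2"), ("T2", "K2"), ("T1", "K2")]

VARIANTS = {
    "base": BASE,
    "C1 -> C2": BASE + [("C1", "C2")],
    "C1 -> K2": BASE + [("C1", "K2")],
    "T1 -> C2": BASE + [("T1", "C2")],
    "T2 <- K1 -> K2": BASE + [("K1", "T2"), ("K1", "K2")],
}

for name, edges in VARIANTS.items():
    print(name, smallest_adjustment_set(edges, ["T1", "T2"], "K2"))
```

Under these assumptions, the base graph and the \(C1 \rightarrow C2\) variant both need only \(C2\), the \(C1 \rightarrow K2\) variant needs \(C1\) and \(C2\), and the last two variants have no valid adjustment set among the observed nodes, which matches the notes above.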
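Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-04 UTC.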