Translating Marketing Science Terms Into Causal Inference Lingo
Stats people love to give things names, but given our track record, we ought to consider whether we should be in the business of naming things at all. That’s not a hot take… it’s something we recognize ourselves. Look no further than the dozen different usages of “marginal”, the not-very-intuitive use of the term “random”, “fixed effects” meaning practically opposite things in different contexts, having something called “Granger causality” that isn’t causality, etc.
And then there’s the whole other problem of the same term evolving to mean different things depending on the industry it’s applied to. You could honestly find 100 terms that mean basically the same thing between statistics and machine learning, for example. This makes learning all the more challenging and, while I wish we could just homogenize the way we speak about things, the truth is that the terminology evolved the way it did for a reason.
I think this is less forgivable in academia but, in business contexts, you have to speak to your audience in terms that sound familiar. I mean, if I worked in a field where I was regularly using marginal structural models with inverse probability of treatment weighting (phew) and I had to explain my methodology to non-technical stakeholders, I would probably re-brand that too.
Unsurprisingly, marketing science has developed its own terminology that differs from what academics or folks from other industries might use. Initially, I (coming from a causal inference/political science background) didn’t pay much attention to what the marketing world was doing… until I figured out that it was doing some exciting stuff directly related to my methodological interests. The language barrier kept me from realizing this sooner, until I read some materials online that made me think, “wait a minute… this really sounds like causal inference”. And it was! Like I said, it just sounded different.
So that’s why this blog exists. I’ll update it as more terms come to mind, but I wanted to put a resource out there to help folks coming from a causal inference background understand both how causal inference is being applied in marketing research and, more importantly, what it’s being called.
Incrementality: You’ll see this term a lot and it really just means “causality”. “Incremental testing” is the same thing as doing causal inference (although it sometimes refers specifically to experiments). An “incremental impact” is a causal effect. The “incrementality of a channel” refers to the causal effect of spending on a given channel. It’s not the most intuitive term because the root “incremental” kind of makes an effect sound small/trivial, but this is not something unique to marketing science. After all, we use the term “marginal effect” (which sounds tiny) to describe the effect of a 1-unit change in an exposure on the outcome, even if that change is going from zero to one for a dummy variable.
Attribution: This one is trickier because, at a broad level, it’s not really anything different from incrementality/causality. Which channel(s) do we attribute the sales, conversions, etc. to? That sounds like more causal inference. However, you’ll often see a sharp difference in what incrementality and attribution refer to in practice. In particular, attribution is often tied to a very simplistic, correlational strategy for assigning credit to channels. The strategy comes in different flavors (last-click attribution, multi-touch attribution, etc.), but the core idea is to assign credit for a conversion to a channel if a user interacted with that channel before converting. There are obvious limits here. It doesn’t adjust for any confounding. It doesn’t estimate a counterfactual. It doesn’t model out (if that’s even possible) the potentially post-treatment nature of a touchpoint (such as seeing an ad on TV and then using branded paid search to purchase a product). These differences are why you’ll see folks refer to incrementality and attribution as separate things. Attribution “modeling” would never pass muster in academic causal research, but it is appealing for many reasons: it’s simple and intuitive and, if you’re a small company that can’t afford a huge marketing budget, you can do attribution modeling pretty cheaply. So, while the term itself sounds a lot like causal inference, “attribution” is most likely just going to be referring to poor attempts at causal inference.
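To make that concrete, here’s a minimal sketch of what last-click attribution boils down to (the journeys and channel names are made up for illustration):

```python
from collections import Counter

# Hypothetical user journeys: an ordered list of channel touchpoints,
# plus whether the user ultimately converted.
journeys = [
    (["tv", "paid_search"], True),
    (["social", "email", "paid_search"], True),
    (["tv"], False),
    (["email"], True),
]

# Last-click attribution: the final touchpoint before a conversion gets
# 100% of the credit; every earlier touchpoint (like TV) gets none.
credit = Counter()
for touchpoints, converted in journeys:
    if converted and touchpoints:
        credit[touchpoints[-1]] += 1

print(credit)  # Counter({'paid_search': 2, 'email': 1})
```

No counterfactual, no confounding adjustment; just bookkeeping over observed paths.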
Marketing Measurement: This one is my least favorite! When I hear the term “measurement”, I think of a purely descriptive task like, “how do we measure democracy?” or “does this survey item accurately capture what we want it to?”. Measurement, at least in my head, lies in a totally different category of science than causality. And yet, “marketing measurement” kind of just refers to any attempt to estimate the effect of a media channel on some outcome of interest. I think it would be fair to say that “marketing measurement” is the same thing as “causal inference in a marketing context”.
Lift: This is yet another term for a causal effect. In a sentence, we could say that our experiment found positive lift for a given ad campaign on some outcome of interest. I’m not a fan of this term because it feels directionally loaded (“lift” kind of always sounds like a good thing, but null and negative effects are a thing). Sometimes you’ll see “incremental lift” too, but it all boils down to a treatment effect.
Lift Test: This is another way to say an experiment or, in some cases, an experiment with some assumptions relaxed. Unsurprisingly, I don’t love this term either because, within marketing jargon, you can estimate lift via things that aren’t considered lift tests (such as MMMs… see below).
A/B Test: This one isn’t necessarily claimed by marketing science, but marketing scientists do run A/B tests. It’s a pretty common term that I think a lot of folks are familiar with nowadays, but in case you are not, no worries! An A/B test is just an online RCT. Some users are randomly exposed to ad A and others to ad B; compare the difference in the outcome between the two groups and you’ve got an A/B test. You could literally just call it a “control/treatment” test, but that doesn’t roll off the tongue as nicely.
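If it helps, here’s what that whole analysis amounts to in code, a simulated sketch where the conversion rates are invented:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated binary conversion outcomes: ad A converts at ~5%,
# ad B at ~6% (both rates are made up).
a = rng.binomial(1, 0.05, size=10_000)
b = rng.binomial(1, 0.06, size=10_000)

# The estimated effect is just the difference in conversion rates,
# with a normal-approximation 95% confidence interval.
diff = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
print(f"effect: {diff:.4f}, 95% CI: ({diff - 1.96*se:.4f}, {diff + 1.96*se:.4f})")
```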
Geo-Lift Test: This is a type of experiment with relaxed assumptions regarding treatment assignment. Rather than randomizing at the individual level, you aggregate and randomize at the level of some geographic entity (county, state, etc.). Some geographic units are randomly exposed to the treatment while others are not. Things are obviously a bit less precise this way, and you now have to defend more difficult identification assumptions, but this is a viable alternative for situations in which you don’t have access to individual-level data but can observe higher-level, more anonymized differences between geo-units. In addition, because the baseline differences between treatment/control units can be quite large, you’re going to have to use some modeling to account for that, as in the sketch below.
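Here’s one simple way that baseline adjustment might look, an ANCOVA-style regression on simulated geo-level data (all numbers invented; real geo-lift tooling often uses fancier approaches like synthetic controls):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_geos = 60

# Simulated geo-level data: baselines vary a lot across geos, treatment
# is randomized at the geo level, and the true lift is 5 units of sales.
baseline = rng.normal(100, 25, n_geos)   # pre-period sales
treated = rng.binomial(1, 0.5, n_geos)   # geo-level randomization
post = baseline + 5 * treated + rng.normal(0, 10, n_geos)

df = pd.DataFrame({"post": post, "baseline": baseline, "treated": treated})

# ANCOVA-style adjustment: regress post-period sales on treatment,
# controlling for each geo's pre-period baseline.
fit = smf.ols("post ~ treated + baseline", data=df).fit()
print(fit.params["treated"])          # estimated lift
print(fit.conf_int().loc["treated"])  # its 95% CI
```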
Signal: Signal is a term used in the context of statistical power. If your confidence/credible interval is too wide to act on (say, it covers both meaningful positive and negative effects), you’d want to look at ways to increase the statistical power and reduce the noise to get a more precise estimate. With enough precision, you’d have a clear “signal” of what to do with the channel of interest. But you can also use “signal” as a synonym for the causal effect you want to estimate in the first place. Basically, it’s a term used to describe the effect of interest specifically within the context of statistical power and uncertainty.
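As a quick illustration of the power side of this, here’s a standard pre-experiment sample size calculation using statsmodels (the baseline rate and target bump are hypothetical):

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical question: how many users per arm do we need to detect a
# bump from a 5% to a 5.5% conversion rate with 80% power at alpha = .05?
effect = proportion_effectsize(0.055, 0.05)  # Cohen's h
n = NormalIndPower().solve_power(effect_size=effect, alpha=0.05, power=0.8)
print(round(n))  # required sample size per group
```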
MMM: MMM stands for media mix model or marketing mix model; it doesn’t really matter which term you use. The idea is to understand the causal effect of a given mix of media spending (like “what is the effect of spending 15% of the media budget on social media, 20% on TV…?”) on your outcome of interest (sales, ROI, etc.). Unlike the aforementioned experiments, it’s a tool rooted in complex modeling decisions that draw on time series analysis, causal inference, machine learning and, depending on your preferences, Bayesian statistics (which might also include experimental design if you want to build the results of lift tests into your priors). MMMs are complicated, but I think they’re really cool precisely because they bring so much of the data science toolkit into one product. To de-mystify MMMs a bit: they are fundamentally a time series model where causal inference is the goal, they are validated using machine learning techniques, and their parameters can be regularized using Bayesian priors informed by prior experimental results. Assuming your identification assumptions are met (as always, big assumptions), MMMs give you retrospective information (causal inference: what effect did this media mix have on the outcome?) and proactive information (causal decision making: if we go with this media mix, what will the effect on the outcome be?).
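To ground all of that, here’s a deliberately tiny, toy version of an MMM: plain OLS on adstocked, saturated spend rather than the full Bayesian machinery real MMMs use, with every number simulated:

```python
import numpy as np

rng = np.random.default_rng(1)
weeks = 104

def adstock(spend, decay=0.5):
    """Geometric adstock: each week carries over a share of last week's stock."""
    out = np.zeros(len(spend))
    for t in range(len(spend)):
        out[t] = spend[t] + (decay * out[t - 1] if t > 0 else 0.0)
    return out

def saturate(x, half_sat=50.0):
    """Diminishing returns via a simple Hill-style saturation curve."""
    return x / (x + half_sat)

# Simulated weekly spend for two channels (all numbers invented).
tv = rng.gamma(2.0, 20.0, weeks)
social = rng.gamma(2.0, 10.0, weeks)

# Design matrix: intercept, a crude trend term, and transformed media.
X = np.column_stack([
    np.ones(weeks),
    np.arange(weeks),
    saturate(adstock(tv)),
    saturate(adstock(social)),
])
true_betas = np.array([100.0, 0.1, 40.0, 25.0])
sales = X @ true_betas + rng.normal(0, 5, weeks)

# Plain OLS here; real MMMs typically fit this with Bayesian priors.
coefs, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(coefs)  # should land near [100, 0.1, 40, 25]
```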
Upper/Lower Funnel Channel Effects: This concept is more rooted in marketing, but it maps easily onto causal inference. In marketing, you can view your efforts as a funnel where broad, far-reaching marketing efforts sit at the top and narrow, close-to-sale efforts sit towards the bottom. At each level of the funnel you won’t keep everybody you advertise to, but you will keep some. The idea is to preserve what you can as you move down the funnel and successfully guide customers to the desired outcome. An upper-funnel channel might be a TV advertisement: a lot of people will see it and many won’t ever think twice about the product. Some, however, will, and you can deliver more tailored advertisements to these folks, who are lower down the funnel and more likely to convert. This concept has clear causal implications, specifically for direct/total effects and causal mediation. Thinking about the funnel causally is actually a big reason people are skeptical of attribution modeling. Sure, somebody may click on your branded paid search ad to purchase the product, but should branded paid search be awarded the credit for the sale? What if that person saw the TV advertisement, went to Google to purchase the product, and used the branded paid search result as the quickest way to get there? With attribution modeling, branded paid search gets the credit, while the TV advertisement is de facto assumed to have had no effect. In this scenario, branded paid search is a mechanism of the conversion, but not necessarily the cause.
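A small simulation makes the point sharper. Under made-up funnel parameters where TV drives branded search clicks, which in turn drive conversions, last-click attribution hands all the credit to search even though the TV randomization shows a clear incremental effect:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Made-up funnel: TV exposure (randomized) raises the chance of a branded
# search click, and search clicks are the only path to conversion here.
saw_tv = rng.binomial(1, 0.5, n)
clicked_search = rng.binomial(1, 0.02 + 0.10 * saw_tv)
converted = rng.binomial(1, 0.30 * clicked_search)

# Last-click attribution: every conversion is credited to search,
# because search is always the final touchpoint in this setup.
print("credited to search:", converted.sum())

# But TV has a clear incremental effect, visible from the randomization.
lift = converted[saw_tv == 1].mean() - converted[saw_tv == 0].mean()
print("incremental effect of TV:", round(lift, 4))  # ~0.10 * 0.30 = 0.03
```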
Adstock: We might describe adstock as the lagged, decaying effect of media. The ad we aired on TV will have an immediate effect during the period it aired, but consumers don’t forget about an ad the moment it stops running. Marketing science folks model adstock in different ways, but it’s basically just a lagged effect with decay.
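Geometric decay is the classic formulation, the same idea as the helper in the MMM sketch above. Here’s a minimal, self-contained version (the decay rate is arbitrary):

```python
import numpy as np

def geometric_adstock(spend, decay=0.6):
    """Each period retains a `decay` share of the previous period's adstock."""
    out = np.zeros(len(spend))
    for t in range(len(spend)):
        out[t] = spend[t] + (decay * out[t - 1] if t > 0 else 0.0)
    return out

# One burst of TV spend in week 0; its effect lingers and decays afterward.
spend = np.array([100.0, 0, 0, 0, 0, 0])
print(geometric_adstock(spend))  # [100.  60.  36.  21.6  12.96  7.776]
```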
And that’s all I have for now, but, as suggestions come in, I’ll keep adding to this. I hope this was helpful for you!