Everyone seems to be talking about marketing attribution, but does anyone really know how to do it? I discuss why I think it can be problematic and pose the question: is this really the best use of our time, attention and resource in 2017?
Today, I’m going to be having a little bit of a rant about marketing attribution. So, very very quickly, in a nutshell, marketing attribution is the process of assigning value to the different touchpoints that a user might encounter in their journey. Very simply, that’s all we’re trying to do.
I don’t know how many of you have read a lot on the subject, how many of you feel like you have a really good grasp on attribution, you understand it? Oliver, naturally, of course. I still get a little bit unsure of what we’re trying to do from time-to-time, and I’ll tell you why. It’s because we have lots of information out there, people talking about this. I saw this brilliant tweet the other day, saying that: “Attribution is all about unlocking insights that are actionable and making better data-driven decisions.” You’re clear on that? That’s attribution.
And it gets worse. These are all literally within the last 48 hours, all of these tweets. There’s some brilliant ones. “What good is big data without big attribution?” I read that and I still don’t know what it means. “The internet of things is disrupting marketing attribution, and marketing ROI will never be the same again.” Still not quite sure. I read every single one of these, and all of them have very nice explanations of what attribution is. They have…some of them go as far as maybe going into Google Analytics and explaining all of the different attribution models in there.
A lot of them kind of end with the conclusion that the attribution models in Google Analytics maybe aren’t the best way of looking at things, and nowadays, in 2017, there’s much, much better ways that us marketers should be looking at it. A lot of you may have heard of data-driven attribution, they’re all saying: “Yeah, data-driven attribution is the way we need to go.” So just when it’s getting exciting, you’re thinking: “Great, how are they going to tell me how to do data-driven attribution? Where do I go, what tool do I use, who are the big players in the market?” And the articles just end there.
I don’t know whether you’ve come across this, but I find it frustrating. There’s nobody really out there who’s shouting about how you do data-driven attribution, giving you really tangible examples of how to get started on the topic, the first piece of analysis you can do to get you started down that road. That’s kind of unique for pretty much every topic in marketing analysis. You can Google a blog out there and there’ll be someone giving you a really detailed guide of do this, do that, and here’s how to begin with this type of analysis. Attribution to me feels like a little bit of an exception.
But anyway, after reading through all of these different articles about attribution recently, these are the promises that I kind of extracted from all the information:
They’re all admirable goals, why wouldn’t you want to do these things? It all makes sense. So how do we do it? I want to start off by saying there’s some good things, there’s some brilliant things, and Google Analytics is always a nice place to start. You’ve got the multi-channel reports, the conversion reports, which look at a user’s interaction, so not just one single visit, but the combination of multiple visits to a website, sort of looking at users, which was a great way of looking at things. I really, really like this last column here, which is the ratio of assisted to last click conversions in Google Analytics, so this can be broken down by channel here. I think this is really overlooked sometimes.
In its current form, it’s not very nice to look at, so I quite often like to lay it out like this. You can put it horizontally left to right. If you annotate that, you’ve got the number one here. A ratio of one means the channel assisted just as many conversions as it closed as the last point in a user’s path. Further to the left of this means that it’s direct response. It’s very bottom of the funnel, it’s the last click. Then the further right we get, the more supportive that channel is.
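As a rough sketch of that layout, assuming you have exported assisted and last click conversions per channel, the ratio and the left-to-right ordering could be worked out like this. The channel figures and the function names are made up for illustration; they are not from the report shown in the talk.

```python
# Hypothetical channel figures, invented purely for illustration.
channels = {
    "direct": {"assisted": 40, "last_click": 100},
    "organic search": {"assisted": 150, "last_click": 100},
    "referral": {"assisted": 90, "last_click": 90},
}


def assist_ratio(assisted, last_click):
    """Assisted conversions divided by last click conversions."""
    if last_click == 0:
        return float("inf")  # the channel only ever assisted
    return assisted / last_click


def describe(ratio):
    """Label a channel by which side of 1 its ratio falls on."""
    if ratio < 1:
        return "closer"    # mostly the last click, bottom of the funnel
    if ratio > 1:
        return "assister"  # mostly supportive, higher up the funnel
    return "balanced"


def rank_channels(data):
    """Return (channel, ratio) pairs ordered left to right by ratio."""
    rows = [
        (name, assist_ratio(v["assisted"], v["last_click"]))
        for name, v in data.items()
    ]
    return sorted(rows, key=lambda row: row[1])


for name, ratio in rank_channels(channels):
    print(f"{name:15s} {ratio:5.2f}  {describe(ratio)}")
```

Sorting by the ratio gives the same left-to-right reading as the chart: closers first, assisters last.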
When you lay it out like this, it suddenly gets quite interesting. You maybe have some ideas, some hypotheses. So organic search, I’m starting to think: “Okay, well if that’s a consumer’s first touch point, they’re seeing us in organic search for the first time, maybe that’s where we need to make a big brand impression.” We need to get them remembering the brand name so that they come back further down the funnel and we can get them with our brand campaigns and so on.
Further down here, there’s referral, which is more of a mid-funnel type source in this particular instance. You might be thinking: “Well, referrals, potentially that’s people reading up information about the product.” Let’s find out more about who these referrers are. You can intuitively look at this, it takes maybe ten minutes to do this, and you can start to get some ideas. This is great. Even better, one thing that we often overlook, when we talk about attribution, we’re often thinking marketing channels. But you can attribute… I prefer to think of it as touch points. You don’t necessarily need to look at the whole website.
This particular example is looking at AdWords campaigns. You’re just looking at paid media interaction. You can do the same thing, you can get assisted and last click conversions, the ratio between the two. And you can start to identify upper and lower funnel campaigns. That might give you some ideas around how you might adjust the ad copy, whether one is more appropriate for building retargeting lists. You can start to build a sequence of events that you might expect the consumer to go through.
The other thing that I’m sure you’ve heard a lot about attribution is that last click is bad. You can solve this, obviously, by going into Google Analytics and just picking a different model. Position-based, why not. Wait for it to load. Problem solved.
But not quite. The problem with the different attribution models provided in Google Analytics and similar tools is that they’re static. They apply a rule, or a set of rules, to all of your traffic. It’s basically saying that every single customer thinks and performs and consumes in the very same way. Last click makes the assumption that every single interaction a consumer had with your brand before they purchased had no impact whatsoever. They made no impression on that person. That’s obviously not true. That’s the bigger argument against last click. Makes complete sense. But then when you start looking at other models, first click for example assumes that the first touchpoint that a consumer had is the only important event, because none of the others would have happened if the first touchpoint wasn’t there. We don’t know whether that’s true, because they might have come in through one of the other channels. Linear, likewise, assumes that they were all equally important, which could be complete nonsense. We’re always applying an assumption about how consumers convert the minute we start using one of these models.
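To make the "static rule" point concrete, here is a minimal sketch of four rule-based models applied to one converting path. The function names and the example path are hypothetical, and the 40/20/40 split for position-based is an assumption about the usual default rather than something stated in the talk.

```python
def _add(credit, touchpoint, share):
    """Accumulate credit, so a channel appearing twice is summed."""
    credit[touchpoint] = credit.get(touchpoint, 0.0) + share


def last_click(path):
    # All credit goes to the final touchpoint.
    return {path[-1]: 1.0}


def first_click(path):
    # All credit goes to the first touchpoint.
    return {path[0]: 1.0}


def linear(path):
    # Every touchpoint gets an equal share.
    credit = {}
    for touchpoint in path:
        _add(credit, touchpoint, 1.0 / len(path))
    return credit


def position_based(path, end_share=0.4):
    # Assumed split: end_share each to first and last, the rest
    # spread evenly over the touchpoints in the middle.
    if len(path) == 1:
        return {path[0]: 1.0}
    credit = {}
    if len(path) == 2:
        _add(credit, path[0], 0.5)
        _add(credit, path[1], 0.5)
        return credit
    middle = path[1:-1]
    _add(credit, path[0], end_share)
    _add(credit, path[-1], end_share)
    for touchpoint in middle:
        _add(credit, touchpoint, (1.0 - 2 * end_share) / len(middle))
    return credit


# Hypothetical converting path, oldest touchpoint first.
path = ["social", "organic", "email", "direct"]
for model in (last_click, first_click, linear, position_based):
    print(model.__name__, model(path))
```

Whatever the path looks like, each function hands out credit by the same fixed rule, which is exactly the assumption being questioned here.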
Inevitably, when you start looking at this more, you start to realise that none of the models is any better than any other. Last click is what we use, typically, because it’s intuitive. People understand last click. It’s very simple. Attribution promises a way of diving deeper into that, and better understanding the consumer.
One of the reasons, as you mentioned, that social and display, for example, come up so often when you’re talking about attribution models is that they’re not favoured very much by last-click modelling: they’re typically higher up the funnel, and then other channels might steal the credit further down the line. But attribution modelling is not about trying to make your channel look more favourable, trying to make sure that it looks like a good investment to the client so they’ll increase their display budget a little bit more, or pay for a social media campaign. That’s not the aim here. The aim, as we saw in the points earlier, is to better understand your consumer, better understand how they’re consuming, so in turn we can improve our marketing and create greater efficiencies.
We mentioned data-driven attribution before. The idea behind this is really sound. The idea is to try and tackle the flaws of these static models that we just spoke about: we use known data, the observed data that we have, to build statistical models that accurately attribute credit based on the observed merit of each touchpoint in a path.
This is great, but there’s a few flaws. When we’re talking about the observed merit of each touchpoint, how many touchpoints do you think we’ve got? We need to observe the consumer buying process to attribute credit to all the different things that they do within that. The reality is that we can only observe a tiny little bit of the buying process.
If we think: “How do you shop online?” You might go on Facebook. You see a cool new, something on Kickstarter, maybe. You think: “That’s a great idea.” You go and Google a bit. You find there’s already a product that’s already reached the market. So, you click on the first three, do a little bit of research, think: “Okay, cool.” You check some reviews out on a review website, then you pick the brand that seems to have the best reviews, you go straight to the site, probably from the link that’s on the review site, they come through as referral traffic, and then you might buy.
We only literally understand the end little bit of that journey. There’s this enormous part that we’re missing, that we really, really can’t understand very much. But you might say that that’s a cop-out, so work with what we’ve got. What can we do?
This is something really interesting that I’ve been working on in the last couple of weeks. I wanted to see how far we could take this with what we’ve got. So, a very, very quick case study, to try and do data-driven attribution on a shoestring budget. With Google Analytics Premium you’ve got data-driven attribution. You click a button, it’s done for you. Lovely. Not everybody’s in that luxury position.
The steps I went through: I set up a custom dimension to record the cookie IDs, the bit that identifies you as a user to Google Analytics. I set up user ID tracking as a custom dimension, so if the customer logs in on the website, I can see that they’re a particular user. If they log in on a different device, they’ll have the same ID, which means that I can then connect those two cookies together. That’s effectively allowing cookie consolidation, cross-device tracking. There’s lots of names for it. Then, using the API, we can export this. Because we have the ID, we can now see each individual user, with every session they had as one row of data. This is very important. We need this level of data to be able to do data-driven attribution.
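The cookie consolidation step can be sketched roughly as follows, assuming the export gives one row per session with the cookie ID, the user ID (empty when not logged in), a session start and the source/medium. The row layout, the sample rows and the function names are all assumptions for illustration, not the actual export used in the case study.

```python
from collections import defaultdict

# Hypothetical export: one row per session.
# (cookie_id, user_id or None, session_start, source_medium, converted)
sessions = [
    ("cookie-A", None, 1, "google / organic", False),
    ("cookie-A", "user-1", 2, "google / cpc", False),
    ("cookie-B", "user-1", 3, "(direct) / (none)", True),
    ("cookie-C", None, 1, "google / cpc", True),
]


def build_identity_map(rows):
    """Map each cookie to a user ID if that cookie was ever logged in."""
    mapping = {}
    for cookie_id, user_id, *_ in rows:
        if user_id is not None:
            # Simplification: a cookie shared by two logins keeps the last one.
            mapping[cookie_id] = user_id
    return mapping


def build_paths(rows):
    """Group sessions per person and order them into a touchpoint path."""
    identity = build_identity_map(rows)
    grouped = defaultdict(list)
    for cookie_id, _user_id, start, source, _converted in rows:
        # Fall back to the cookie itself when no login was ever seen.
        person = identity.get(cookie_id, cookie_id)
        grouped[person].append((start, source))
    return {
        person: [source for _, source in sorted(visits, key=lambda v: v[0])]
        for person, visits in grouped.items()
    }


for person, path in build_paths(sessions).items():
    print(person, " > ".join(path))
```

In this sketch cookie-A and cookie-B collapse into one person because both were seen with the same login, while cookie-C stays on its own.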
That’s quite big, if you think about all the visitors to your website over, say, a 30-day period or a 90-day period. Every single session they had is a new row. That’s inevitably quite a lot of data, so we put it into BigQuery. BigQuery is very convenient. It’s kind of like a database which is very easy to put data into and then you can query it very, very quickly. It performs well with very large data sets. Then we had to write some quite complex queries to get it back out again, but in the right format. There’s lots of aggregations going on, you’ve got joins trying to join the cookies and the user IDs together. Then finally we had to learn a bit of statistics about Markov chains to build probabilistic models to estimate the uplift, or rather, the detrimental impact of removing a channel from a particular path. There’s lots of simulations. It got quite complicated. This is a big, big leap from going from the Google Analytics interface to trying to do data-driven attribution and really understand the process and what’s going on.
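A minimal sketch of the removal-effect idea, assuming a first-order Markov chain built from user paths. It swaps the simulations mentioned above for a simple iterative calculation, and the paths, state names and function names are hypothetical. It also assumes at least one path converts and at least one channel has a non-zero effect, otherwise the divisions fail.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """paths: list of (touchpoints, converted). Returns state -> {next: prob}."""
    counts = defaultdict(lambda: defaultdict(int))
    for touchpoints, converted in paths:
        states = [START] + list(touchpoints) + [CONV if converted else NULL]
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def conversion_prob(probs, removed=None, sweeps=200):
    """Chance of reaching CONV from START; `removed` acts as a dead end."""
    value = defaultdict(float)  # unknown states, including NULL, count as 0
    value[CONV] = 1.0
    for _ in range(sweeps):  # repeated sweeps until the values settle
        for state, nexts in probs.items():
            if state == removed:
                continue
            value[state] = sum(
                p * value[nxt] for nxt, p in nexts.items() if nxt != removed
            )
    return value[START]


def removal_effects(paths):
    """Share of conversion probability lost when each channel is removed."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = {tp for touchpoints, _ in paths for tp in touchpoints}
    return {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}


def attribute(paths):
    """Split the observed conversions in proportion to removal effects."""
    effects = removal_effects(paths)
    total_effect = sum(effects.values())
    conversions = sum(1 for _, converted in paths if converted)
    return {c: conversions * e / total_effect for c, e in effects.items()}


# Hypothetical user paths: (touchpoints in order, did the user convert?)
example_paths = [
    (["organic", "cpc"], True),
    (["organic"], False),
    (["cpc"], True),
    (["direct"], False),
]
print(attribute(example_paths))
```

The fixed number of sweeps is a shortcut that is fine for a toy example; on real data you would want to check that the values have actually stopped changing.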
Here’s the end result. Amazing. I’ve highlighted direct traffic, Google CPC, and Google organic. Our touchpoint in this case is the source and medium dimensions in Google Analytics. The writing’s very, very small, but the red bars here are the first touch, the purple is the data-driven model, and the green is the last click. The arrows have shifted a little bit, but you can see the movement isn’t particularly seismic. Google CPC is perhaps a little bit undervalued, based on the last click. Google organic is near enough the same. Direct is being overvalued based on the last click model.
Cool. This is really exciting. We could go one step further if we knew how much we were spending on each of these, we could work out CPA and efficiency, and that then starts to get a bit useful, because we might then say: “Oh, actually, paid media is performing a little bit harder than we thought it was, a little bit more efficient. We can maybe move some budget from one of these other channels into this. But we can’t take budget from direct. No one spends marketing budget on direct traffic.”
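As a small worked example of that CPA step, assuming you know the spend per channel and have attributed conversions from two models to compare. All of the figures and names here are invented for illustration; none come from the case study.

```python
def cpa(spend, conversions):
    """Cost per acquisition; None when there is nothing to divide by."""
    if conversions == 0:
        return None
    return spend / conversions


# Hypothetical spend and attributed conversions for one paid channel.
spend = {"google / cpc": 5000.0}
last_click_conversions = {"google / cpc": 100.0}
data_driven_conversions = {"google / cpc": 125.0}

for channel, cost in spend.items():
    before = cpa(cost, last_click_conversions[channel])
    after = cpa(cost, data_driven_conversions[channel])
    print(f"{channel}: last click CPA {before:.2f}, data-driven CPA {after:.2f}")
```

With these made-up numbers the same spend looks cheaper per conversion under the data-driven model, which is the kind of shift that might justify moving budget.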
So, what else do we really do with this information? We could spend a long time going further into detail, doing more analysis, to better understand which orders work better. We could maybe go back. Ultimately, we can’t really influence the order in which consumers consume our media. All we need to do is make sure there’s this bubble that is out there.
Data-driven attribution can have some powerful results. There’s some really interesting things you could learn from it. But it’s a big step, going from Google Analytics last click, where you’ve got all these reports that are very, very detailed, very well laid out for you, to essentially having to do all of the processing yourself in an external tool outside of Google Analytics, because only a very small part of the tool is built to do attribution modelling for you. You need to really start looking to things like CRM, where you can consistently identify users, and you can start looking at lifetime value and user level conversion rates and potential lifetime value. That’s exciting stuff, but I consider that sitting outside of attribution.
The other argument is that it’s not necessarily uncommon. There’s a particular case we had here, less than 10% of users were converting on a last click, in this case. In other cases that I’ve seen, we see 60%, 70% of users easily converting on a last click within a single session. This means one of two things: either attribution modelling’s really only relevant for the minority, say the 30% that are converting on two or more sessions; or we have a technology problem and the cookies just aren’t good at identifying people. At the moment we can’t identify someone who browses on their mobile and then jumps to the desktop to convert, although Google are launching a cross-device tracking feature in the coming months, so keep your eyes peeled for that.
So, my question is, my challenge is, is data-driven attribution a must-have for everybody? Should everybody be doing it as the market seems to be screaming? You look at all the media out there, all the publications talking about attribution. Are they really right? Or is data-driven attribution just nice to have for some? Is it the best use of our time and money?
I couldn’t help but put this quote in. It’s usually used in the context of big data. “Attribution is like teenage sex. Everybody talks about it, nobody really knows how to do it. Everybody thinks everybody else is doing it, so everybody claims that they are doing it.” That’s kind of how I feel about attribution. It has its uses.
There was also a survey that was launched, literally in just the last few weeks. This was a panel of 84 very senior, very experienced, very influential analysts within marketing. These are experts, they know their stuff, and they’re in charge of big budgets too. And they said that the number one thing that would occupy their time, attention, and resources during 2017 was cross-channel measurement and attribution. My challenge is, is that really the best use of time, attention, and resource in 2017? Or are there better ways that we can spend our time and money? It’s an open question.