How to improve the accuracy of your Google Analytics data

Today, I’m going to be showing you five ways to improve the accuracy of your Google Analytics data, and we’re going to start with the users metric.

1. Improve the measurement of users

Now, the user metric actually counts the number of user cookies, and it doesn’t count the number of people. Let’s take this scenario where I visited the RocketMill website yesterday on three separate occasions.

First of all, I accessed through mobile on Safari. I then visited a bit later through desktop and Internet Explorer, and then finally I visited through laptop and Google Chrome.

In each case, I visited the site via a different browser and device combination. As such, whilst I’m one person visiting the site, I’m counted as three unique users in Google Analytics. The user metric in aggregate isn’t really that meaningful. It’s not telling us the number of people. It’s not telling us the number of sessions. In fact, it’s not really telling us a great deal at all.

That raises the question: how do we improve the measurement of users within Google Analytics? We can start by passing the client ID as a custom dimension in Google Analytics. The client ID sits within the user cookie, and it’s what uniquely identifies each user in Google Analytics. We can set up a user-scoped custom dimension within Google Analytics, and then code that up on our site to pass the client ID from the user cookie.

Doing that, in effect, gives you a row for every unique user in Google Analytics. Having that data at the granular level is fantastic, because it means you can cluster the data in many different ways. You can create lots of useful segments, analyse the users based on those smaller segments, and have more meaningful aggregates, and just generally get more from the Users metric overall.
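To make the client ID step a little more concrete, here is a minimal sketch in TypeScript. It assumes the usual `_ga` cookie layout, where the client ID is the last two dot-separated parts of the value, and it assumes the custom dimension was created at index 1. The function names and the sample cookie are made up for illustration.

```typescript
// Sketch only: derive the client ID from a "_ga" cookie value and prepare it
// as a custom dimension field. The dimension index is an assumption; use the
// index of the user-scoped custom dimension created in your own property.

// A "_ga" cookie value looks like "GA1.2.1234567890.1609459200";
// the client ID is the last two dot-separated parts.
function clientIdFromGaCookie(cookieValue: string): string | null {
  const parts = cookieValue.trim().split(".");
  if (parts.length < 4) {
    return null; // not a recognisable _ga cookie value
  }
  return parts.slice(-2).join(".");
}

// Pull the "_ga" value out of a document.cookie-style string.
function readGaCookie(cookieHeader: string): string | null {
  for (const pair of cookieHeader.split(";")) {
    const [key, ...rest] = pair.trim().split("=");
    if (key === "_ga") {
      return rest.join("=");
    }
  }
  return null;
}

// Build the fields that would be set on the tracker before the page view is sent.
function clientIdDimension(cookieHeader: string, dimensionIndex: number): Record<string, string> {
  const raw = readGaCookie(cookieHeader);
  const clientId = raw === null ? null : clientIdFromGaCookie(raw);
  return clientId === null ? {} : { [`dimension${dimensionIndex}`]: clientId };
}

const exampleFields = clientIdDimension("_ga=GA1.2.1234567890.1609459200; other=1", 1);
// exampleFields is { dimension1: "1234567890.1609459200" }
```

On a live page, the value would then be applied to the tracker before the page view fires, for example with analytics.js's `ga('set', 'dimension1', …)`. Reading the ID from the tracker itself rather than parsing the cookie is another option.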

Once you’ve got the client ID set up, and you’re using this data efficiently, the next step is to enable the user ID feature in Google Analytics to stitch different user cookies together. In our example, that’s bringing the three unique users back to one common user ID.

In order to achieve this, we first need to enable this setting within the Google Analytics interface. Then, once that’s happened, we need to amend our tracking code on the website to pass the user ID, so that the user ID is associated with each of these user cookies. Once you’ve got that set up and running in Google Analytics, it enables you to track cross-device usage a lot more effectively. It also unlocks a couple of additional reports in Google Analytics, which means you can get more insight from your data and understand your users in a lot more detail.
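The effect of stitching can be sketched with the three visits from the example. This is a simplified model in TypeScript, not how Google Analytics processes data internally. The field names, the client IDs and the "member-42" user ID are all hypothetical.

```typescript
// One recorded visit: the cookie-based client ID plus, when the visitor is
// logged in, the site's own user ID (hypothetical field names).
interface RecordedVisit {
  clientId: string;
  userId?: string;
  device: string;
}

const exampleVisits: RecordedVisit[] = [
  { clientId: "111.111", userId: "member-42", device: "mobile / Safari" },
  { clientId: "222.222", userId: "member-42", device: "desktop / Internet Explorer" },
  { clientId: "333.333", userId: "member-42", device: "laptop / Chrome" },
];

// Count distinct strings without relying on newer library features.
function distinctCount(keys: string[]): number {
  const seen: Record<string, true> = {};
  for (const key of keys) {
    seen[key] = true;
  }
  return Object.keys(seen).length;
}

// Cookie-based count: every distinct client ID is a "user".
function usersByCookie(rows: RecordedVisit[]): number {
  return distinctCount(rows.map((row) => "cookie:" + row.clientId));
}

// Stitched count: prefer the user ID when present, else fall back to the cookie.
function usersStitched(rows: RecordedVisit[]): number {
  return distinctCount(
    rows.map((row) => (row.userId !== undefined ? "user:" + row.userId : "cookie:" + row.clientId))
  );
}

const cookieUsers = usersByCookie(exampleVisits);   // 3
const stitchedUsers = usersStitched(exampleVisits); // 1
```

On the site itself, the ID would typically be supplied through the tracker's `userId` field once the visitor has logged in. Visits without a user ID still fall back to the cookie, which is why the sketch keeps both.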

2. Ensure your bounce rate data is accurate

I’ve talked about a few ways of improving the accuracy of the users metric. I’m now going to move on to the bounce rate metric.

In terms of the bounce rate metric, incorrect tracking code can artificially skew your bounce rate. Let’s take this example here, where we’ve got the standard Google Analytics page tracking code at the top of the slide. What you’ll see, a little bit further down the page, is that the very last line of that code has been duplicated.

From the user’s point of view, they’re not going to notice any difference. The website’s going to work fine. They’re going to be able to use the website absolutely fine, but in terms of the Google Analytics data, that additional line can cause havoc with your data.

Firstly, it’s going to duplicate the number of page views that land on this page, so for everybody that lands on this page, we’re going to be double counting the page views. Already that is going to be inaccurate. Not only that, it’s also going to have an impact on your bounce rate metric as well.

Why? Because a bounce is defined as a session with a single interaction. Somebody lands on this particular page, Google Analytics is going to see two interactions, because it’s going to see two separate page views. Therefore, since that additional line of code was added, this page is going to have a 0% bounce rate, which isn’t right, and it’s artificially skewed just because of that one piece of extra code on the page.
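Here is a small TypeScript sketch of that arithmetic. It models a bounce exactly as defined above, a session with a single interaction, and compares four made-up sessions before and after the send line is duplicated. The types and sample data are illustrative only.

```typescript
// A hit sent to Google Analytics, reduced to the one property that matters here.
interface PageHit {
  hitType: "pageview" | "event";
}

// A session bounces when it contains exactly one interaction hit.
function isBounce(hits: PageHit[]): boolean {
  return hits.length === 1;
}

// Bounce rate as a percentage of all sessions.
function bounceRate(sessions: PageHit[][]): number {
  if (sessions.length === 0) {
    return 0;
  }
  const bounces = sessions.filter(isBounce).length;
  return (bounces / sessions.length) * 100;
}

// Four visits tracked correctly: three single-page visits and one two-page visit.
const cleanSessions: PageHit[][] = [
  [{ hitType: "pageview" }],
  [{ hitType: "pageview" }],
  [{ hitType: "pageview" }],
  [{ hitType: "pageview" }, { hitType: "pageview" }],
];

// The same visits with the send line duplicated: every page view fires twice.
const duplicatedSessions: PageHit[][] = cleanSessions.map((hits) => hits.concat(hits));

const cleanRate = bounceRate(cleanSessions);           // 75
const duplicatedRate = bounceRate(duplicatedSessions); // 0
```

Nothing about visitor behaviour changed between the two data sets. Only the tracking did, and the bounce rate went from 75% to 0%.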

Beware as well, if you’re migrating over to Google Tag Manager. When you migrate over to Google Tag Manager ensure you remove all instances of the hardcoded Google Analytics tags on your site. If you don’t then a similar sort of thing is going to happen as to what we’ve seen here, duplication of page views, artificial lowering of bounce rate. So be really, really careful when you’re implementing tracking code on your site. It needs to be clean. It needs to be right, for your bounce rate to be accurate.
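One rough way to check for leftovers after a migration is to scan the page source. The TypeScript sketch below is a heuristic, not an official audit tool. It assumes hardcoded tags use the standard analytics.js `ga('send', 'pageview')` call, that the Tag Manager container loads from `googletagmanager.com/gtm.js`, and that the container itself fires a page view tag. The sample HTML is a simplified stand-in with placeholder IDs.

```typescript
interface TagAudit {
  hardcodedPageviews: number;    // how many ga('send', 'pageview') calls were found
  hasTagManager: boolean;        // whether a Tag Manager container is loaded
  likelyDoubleCounting: boolean; // heuristic verdict, see assumptions above
}

function auditPageSource(html: string): TagAudit {
  // Match ga('send', 'pageview' with either quote style and flexible spacing.
  const pageviewCalls = html.match(/ga\(\s*['"]send['"]\s*,\s*['"]pageview['"]/g);
  const hardcodedPageviews = pageviewCalls === null ? 0 : pageviewCalls.length;
  const hasTagManager = html.indexOf("googletagmanager.com/gtm.js") !== -1;
  return {
    hardcodedPageviews,
    hasTagManager,
    // Either the send line is duplicated, or a hardcoded tag sits alongside
    // a container that is assumed to fire its own page view.
    likelyDoubleCounting: hardcodedPageviews > 1 || (hasTagManager && hardcodedPageviews > 0),
  };
}

// Simplified stand-in for a page that was migrated but not cleaned up.
const migratedPage = `
<script async src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXX"></script>
<script>
  ga('create', 'UA-XXXXX-Y', 'auto');
  ga('send', 'pageview');
</script>`;

const migrationAudit = auditPageSource(migratedPage);
// migrationAudit.likelyDoubleCounting is true
```

A check like this only reads the static source, so tags injected by other scripts would be missed. Treat a clean result as a prompt to verify in a debugger rather than as proof.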

It’s not only page tracking code that has an impact on bounce rate. Event tracking can skew your bounce rate as well. Now, let’s take this very simple example of the landing page that we’ve got here. It’s got a single call to action where we’re going to download an ebook. Let’s assume this landing page has the standard Google Analytics page tracking code on it, but the button has no event tracking on it whatsoever.

Let’s take the user journey, where somebody lands on this page. They click on the call to action to download an ebook, and then they leave the site. That user would be classed as a bounce session, because Google Analytics is only seeing one interaction, and that’s the page view that’s fired when you land on the page.

Now, let’s assume, pretty much everything’s exactly the same, but the difference here is we’re going to event track the button, which makes sense because one key question we’d want to know is: “how many downloads are we getting on our ebook?” Now, by adding that event tracking code onto the page, what’s going to happen is, that very same user journey, Google Analytics is now going to see two interactions. It’s going to see the land on the page, and the page view being generated, but it’s also going to see the click on the button from the event tracking.

They’re going to be classed as a non-bounce session there. That’s another way that bounce rate can be artificially skewed. It’s worth noting at this point that within the event tracking there is a non-interaction parameter, which you can set up. It defaults to ‘false’, but if we were to change that to ‘true’ we’d still get the download information from the event tracking, but we’d exclude those button clicks from the bounce rate calculation, so we could have a like-for-like comparison. But because that’s not the default, quite often that’s missed, and then the event tracking’s implemented, and you see the bounce rate artificially lowered as per this example.
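The three versions of the ebook journey can be compared in a short TypeScript sketch. It is a simplified model of the bounce rule, with a `nonInteraction` flag that defaults to false as described above. The type and variable names are made up for illustration.

```typescript
interface TrackedHit {
  hitType: "pageview" | "event";
  nonInteraction?: boolean; // treated as false when omitted
}

// A session bounces when exactly one hit counts as an interaction.
// Hits flagged nonInteraction are recorded but ignored for this purpose.
function isBounceSession(hits: TrackedHit[]): boolean {
  const interactions = hits.filter((hit) => hit.nonInteraction !== true);
  return interactions.length === 1;
}

const landingPageview: TrackedHit = { hitType: "pageview" };

// 1. Button not tracked: only the page view is seen.
const untrackedJourney: TrackedHit[] = [landingPageview];

// 2. Button tracked with the default setting: two interactions.
const defaultEventJourney: TrackedHit[] = [landingPageview, { hitType: "event" }];

// 3. Button tracked as a non-interaction event: the download is still recorded.
const nonInteractionJourney: TrackedHit[] = [
  landingPageview,
  { hitType: "event", nonInteraction: true },
];

const untrackedBounces = isBounceSession(untrackedJourney);           // true
const defaultEventBounces = isBounceSession(defaultEventJourney);     // false
const nonInteractionBounces = isBounceSession(nonInteractionJourney); // true
```

With analytics.js, the flag is passed in the fields object of the event call, along the lines of `ga('send', 'event', 'Ebook', 'download', { nonInteraction: true })`, where the category and action names here are placeholders.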

Now, you might be thinking: “Why is this important?” Sometimes in Google Analytics, you might see a bounce rate chart that looks a little bit like this. You see the bounce rate was trending fairly consistently, and then all of a sudden there’s a massive drop off in bounce rate, and it’s continuing again along a very sort of straight trend.

You may see this initially, and think you’ve done something really great from a UX point of view to improve your website. What I would urge you to do in this situation, before you draw that conclusion, is look at whether any tracking code has changed, or whether you’ve implemented event tracking, because more often than not, it’s going to be those things that result in a graph that looks a little bit like this.

3. Tidy up your (none) traffic data

Now I’m going to move onto the (none) traffic channel. You may have noticed I’ve worded that quite carefully. I didn’t just say direct traffic, and that’s because the (none) traffic channel is more than just direct traffic. It actually has a lot of noise around it.

For example, any traffic that’s referred from mobile apps; any traffic that arrives at the website from non-web-based email, so Outlook installed on your PC is an example; any bot or spider traffic can work its way into the (none) traffic channel. In fact, any traffic where Google Analytics doesn’t know the referrer, it puts into the (none) traffic channel. Effectively it becomes a bit of a catch-all.

This is important because it can lead to your (none) traffic channel being artificially skewed and making your data difficult to analyse. You can see here there’s a clear baseline where the (none) traffic channel should be, but in this instance, there were four months of data where the (none) traffic channel was vastly overinflated because of some of these noise issues that we’ve talked about. The good news is, with a little bit of housekeeping, you can tidy up the (none) traffic channel. And a lot of this focuses around UTM tracking.

So, make sure you’re UTM tracking links in all of your email marketing. That way you’re pulling that traffic out of the (none) traffic channel and into a separate medium, where you can analyse it in a lot more detail. Similarly, UTM track links in any offline material you’ve got as well, so any PDF documents that have a link through to your website. Use the campaign builder tool to ensure UTM tracking is correctly tagged. That will ensure that you’re using the right syntax and that no rogue traffic will be coming into your (none) traffic channel.
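If links are generated in bulk rather than one at a time in the campaign builder, the same tagging can be sketched in TypeScript. The `utm_source`, `utm_medium` and `utm_campaign` parameters are the standard ones. The helper name, the example URL and the choice to lowercase values (so that "Email" and "email" don't end up as separate rows) are assumptions of this sketch.

```typescript
interface UtmTags {
  source: string;   // where the link lives, e.g. "newsletter"
  medium: string;   // the channel, e.g. "email"
  campaign: string; // the campaign name, e.g. "spring sale"
}

function addUtmTags(url: string, tags: UtmTags): string {
  // Keep any #fragment at the very end of the tagged URL.
  const hashIndex = url.indexOf("#");
  const base = hashIndex === -1 ? url : url.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : url.slice(hashIndex);

  // Trim, lowercase and URL-encode each value.
  const query = [
    "utm_source=" + encodeURIComponent(tags.source.trim().toLowerCase()),
    "utm_medium=" + encodeURIComponent(tags.medium.trim().toLowerCase()),
    "utm_campaign=" + encodeURIComponent(tags.campaign.trim().toLowerCase()),
  ].join("&");

  // Append to an existing query string if there is one.
  const separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + query + fragment;
}

const emailLink = addUtmTags("https://www.example.com/ebook", {
  source: "Newsletter",
  medium: "email",
  campaign: "Spring Sale",
});
// "https://www.example.com/ebook?utm_source=newsletter&utm_medium=email&utm_campaign=spring%20sale"
```

The same helper would work for links inside PDFs or other offline material, with a medium such as "pdf" chosen to suit your own reporting.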

Finally, and most importantly, ensure that within your Google Analytics admin settings you’ve got the ‘exclude all hits from known bots and spiders’ box checked. This is really, really important. If that’s not checked, any bot or spider traffic that’s arriving at your website will filter through into your Google Analytics data. Now, that bot and spider traffic has very, very different properties to what a normal human user would have, and it has the potential to massively change a lot of your metrics in Google Analytics. It’s really, really important that this setting is checked, so you’ve got clean data moving forward.

If you apply all of these housekeeping techniques, then you should have a (none) traffic channel that is a lot more robust. It’s easier to analyse, and it’s going to be a little bit more representative of what direct traffic should actually be.

4. Make sure all goal completions are tracked

Now, we’re going to move on to number four, which is the goal completions metric. The goal completions metric will not measure every single conversion, and the reason for that is Google Analytics goals trigger once per session only.

Let’s take this example where a session arrives at the site. There’s a goal, which we’ll call Goal A, and that goal is triggered three times within the session, and then the user exits the site. It’s clear there are three unique conversions that have occurred here, but Google Analytics is only going to record one goal completion for that session. In effect, the goal completions metric is actually understating the performance of your conversions.

A work around for this would be to also track this goal as an event within Google Analytics. We’ve got that very same example, where we had one goal completion. If we were tracking that as an event in Google Analytics, it would measure as three individual events.

So, we’ve covered all bases here. Bear in mind that you can actually set up an event as a goal in Google Analytics, but it will still dedupe the goal completions metric. Probably the best way to explain this is that if you’re analysing any data within the goal section in Google Analytics, be mindful that it is potentially deduped, as in this example. If you want to measure the number of conversions in full, use the event tracking data and set up event tracking on your goals in order to do that.
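The difference between the two counts is easy to see in a small TypeScript sketch. It models the rule described above, one goal completion per goal per session, against a plain event total. The session ID and data are invented to mirror the Goal A example.

```typescript
interface ConversionHit {
  sessionId: string;
  goalName: string;
}

// One session in which Goal A is triggered three times.
const conversionHits: ConversionHit[] = [
  { sessionId: "s1", goalName: "Goal A" },
  { sessionId: "s1", goalName: "Goal A" },
  { sessionId: "s1", goalName: "Goal A" },
];

// Goal completions: at most one per session for a given goal.
function goalCompletions(hits: ConversionHit[], goalName: string): number {
  const sessionsWithGoal: Record<string, true> = {};
  for (const hit of hits) {
    if (hit.goalName === goalName) {
      sessionsWithGoal[hit.sessionId] = true;
    }
  }
  return Object.keys(sessionsWithGoal).length;
}

// Event total: every trigger counts.
function totalEvents(hits: ConversionHit[], goalName: string): number {
  return hits.filter((hit) => hit.goalName === goalName).length;
}

const dedupedCount = goalCompletions(conversionHits, "Goal A"); // 1
const fullCount = totalEvents(conversionHits, "Goal A");        // 3
```

Keeping both numbers side by side is the point: the deduped figure answers "how many sessions converted?", and the event total answers "how many conversions happened?".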

5. Enhance your page load speed data

Finally, I’m going to move on to number five, which is the page load speed metric. Page load speed is only collected for 1% of your audience by default. This results in a volatile, low-sample data set, even for very, very large websites, and you’ll get a chart that won’t look too dissimilar to this, where there are lots of peaks and troughs, and it’s very, very difficult to analyse the data. The good news is that this can be changed through some very simple changes to your Google Analytics tracking code.

If you don’t have Google Tag Manager enabled, then you can ask your developer to add the site speed sample rate call to the create line of your Google Analytics tracking code. You can set that number anywhere between 1 and 100. I’d recommend setting it to 100, which means that it’ll collect page load speed data for all of your audience. I will caveat that for particularly large websites, it will limit that data collection at 10,000 data points a day, but that’s still a huge amount of data – a lot more than you would have had before – and it gives you the ability to segment that data, and slice and dice that data in a lot more detail.
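As a rough illustration, the TypeScript sketch below builds the arguments for the create line with a `siteSpeedSampleRate` field and estimates how many timing samples a day that would yield. The 1 to 100 range and the 10,000-a-day cap are taken from the description above. The helper names, the 'auto' cookie domain and the placeholder tracking ID are assumptions.

```typescript
type CreateCommand = ["create", string, "auto", { siteSpeedSampleRate: number }];

// Build the arguments for the analytics.js create call.
function createWithSampleRate(trackingId: string, sampleRatePercent: number): CreateCommand {
  if (!(sampleRatePercent >= 1 && sampleRatePercent <= 100)) {
    throw new RangeError("sample rate must be between 1 and 100");
  }
  return ["create", trackingId, "auto", { siteSpeedSampleRate: sampleRatePercent }];
}

// Estimate daily timing samples, capped at the daily limit mentioned above.
function expectedTimingSamples(
  dailyPageviews: number,
  sampleRatePercent: number,
  dailyCap: number = 10000
): number {
  const sampled = Math.round((dailyPageviews * sampleRatePercent) / 100);
  return Math.min(sampled, dailyCap);
}

const createArgs = createWithSampleRate("UA-XXXXX-Y", 100);
// In the browser these arguments would be passed to ga(), i.e.
// ga('create', 'UA-XXXXX-Y', 'auto', { siteSpeedSampleRate: 100 });

const samplesAtDefault = expectedTimingSamples(50000, 1);   // 500
const samplesAtFull = expectedTimingSamples(50000, 100);    // 10000 (capped)
const samplesSmallSite = expectedTimingSamples(4000, 100);  // 4000
```

For a site with 50,000 page views a day, that is the difference between roughly 500 data points and 10,000, which is what makes the segmented charts so much less volatile.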

If you’ve got Google Tag Manager enabled, it’s even easier to do. You just go into Google Tag Manager; find your page view tag; go down to the more settings, fields to set; you set the site speed sample rate; you assign it the value of 100. You do your usual quality checks, preview, publish, and you’re all good to go.

This, then, results in the page load speed metric having a lot more data behind it. It’s a lot less volatile, and you can just do loads more with it. You can create lots of different segments. You can analyse it in many, many different ways, and get those unique insights, which is going to help grow your website and push your business forward.

Actions to improve the accuracy of your Google Analytics data

So, in summary, these are the top five actions to improve your data accuracy in Google Analytics.

Firstly, in terms of the users metric, ensure you’re collecting the client ID as a custom dimension. That allows you to create meaningful segments of your users and analyse those aggregates in a lot more detail. Once you’ve got that set up, enable the user ID feature to get more insight into your cross-device tracking.

In terms of bounce rate, ensure that you’ve got robust page tracking on your site, and that you’re event tracking very, very carefully. Be mindful that there are two areas which could artificially skew your bounce rate. If you see a chart like we saw earlier, always think about whether page tracking, or event tracking has changed in any way, because that could be driving the trends that you see.

In terms of the (none) traffic channel, ensure that you’re using UTM tracking effectively, and really, really importantly, ensure that you’re filtering out that bot and spider traffic. By doing that you’re going to have (none) traffic that is a lot more accurate, and you can just do more with it.

In terms of the goal completions metric, we know that it dedupes based on the session, so also event track your goals, so you’ve got that flexibility. You’ve got that deduped data, but you’ve also got the full data as well, which means you can do loads more analysis around your conversion rate, and around your goals.

Finally, in terms of the page load speed metric, implement the code changes to increase the data sample. More data’s great. More data means you can do more segmentation, and if you do all of these five points combined, you’re just going to have lots more information in Google Analytics. You’re going to be able to unearth more unique insights, and it’s really going to help push your website and business forward.

Thank you very much.