Forecasting marketing performance isn’t for the faint-hearted: it’s time-consuming, complicated and expensive.
Forecaster is a free tool, developed by RocketMill, that crunches your Google Analytics marketing data and provides accurate marketing forecasts in minutes.
Using cloud infrastructure and R – a fast, open-source statistical programming language – Forecaster can construct accurate forecasts in a fraction of the time it would take you in Excel.
Forecaster takes your current Google Analytics data and forecasts based on statistical models, removing any guesswork. There's even an accuracy checker that can validate your forecasts against historical data.
Authenticate your Google account to grant temporary read-only access to your Google Analytics data. Your authentication will last approximately one hour before expiring. We never request or store any credentials that would allow access to your data after you’ve finished using the tool.
Wait a moment while Forecaster loads all of your Google Analytics accounts. If you have a large number of properties, this could take a few seconds.
When finished, you’ll see your accounts appear in the dropdown menus. Use these menus to select the Account, Property and View that contains the data you wish to forecast with.
In the left-hand menu, select your chosen date range, metric, traffic segment and forecasting model (more info on these below).
Date Range: This is the start and end point for the date range used to create your forecast – more is often better. You will need at least two years of historic data to use the tool. If you have less, you’re unlikely to benefit from statistical forecasting for the time being.
Metric: This is the value that will be forecast. ‘Sessions’ is the default metric, but options are also provided for revenue, goal completions, unique events, etc.
Channel: This option lets you forecast for a specific segment of traffic. The selections provided correspond to Google Analytics’ default channel grouping definitions.
Forecast Type: This determines the algorithm used to create your forecast. An explanation of each of these algorithms can be found below.
By now you should have a fully configured forecast appearing in the main area of your screen. The graph shows existing Google Analytics data as a black line and your 12-month forecast as a dark blue line. The two blue bands either side of the line represent the 80% (dark blue) and 95% (light blue) confidence intervals.
We all know an exact forecast is impossible; however, confidence intervals show us the range of values either side of a forecast that we could reasonably expect. We can use this information to help judge the accuracy of the forecast. A large confidence interval would suggest there is either a lack of historic data or an irregular element that is difficult to accurately predict – not an uncommon problem!
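To make the bands concrete, here is a minimal Python sketch of how such intervals are typically derived, assuming normally distributed forecast errors. The tool itself does this in R, and the point forecast and standard error below are invented for illustration:

```python
# Sketch: deriving 80% and 95% confidence bands from a point forecast,
# assuming forecast errors are normally distributed. All numbers are
# illustrative, not output from Forecaster.

def confidence_interval(point_forecast, std_error, z):
    """Return (lower, upper) bounds for a given standard normal quantile z."""
    return (point_forecast - z * std_error, point_forecast + z * std_error)

point, se = 10_000, 800      # e.g. 10,000 forecast sessions, standard error 800
z80, z95 = 1.2816, 1.9600    # standard normal quantiles for 80% and 95%

lo80, hi80 = confidence_interval(point, se, z80)
lo95, hi95 = confidence_interval(point, se, z95)

print(f"80% interval: {lo80:.0f} to {hi80:.0f}")   # the narrower, dark blue band
print(f"95% interval: {lo95:.0f} to {hi95:.0f}")   # the wider, light blue band
```

A noisier history means a larger standard error, which widens both bands – exactly the warning sign described above.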
The next graph, labelled ‘trend’, shows a 12-month simple moving average. This clearly exposes the underlying direction of the forecast metric without the interference of seasonality.
While this doesn’t give us any additional information about the forecast, it can be a clear visual representation of whether you should intuitively expect a metric to sink or swim over the next 12 months.
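The moving-average idea behind the trend line is simple enough to sketch in a few lines of Python (the tool is written in R; the monthly figures below are invented, with a seasonal bump each July):

```python
# Sketch of a 12-month simple moving average: averaging over a full year
# cancels out seasonality, leaving only the underlying direction.
# The data is made up for illustration.

def simple_moving_average(values, window=12):
    """Rolling mean over `window` periods, one value per complete window."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]

# 24 months of sessions with a spike every July (indexes 6 and 18)
monthly_sessions = [100, 102, 104, 106, 108, 110, 160, 114, 116, 118, 120, 122,
                    124, 126, 128, 130, 132, 134, 184, 138, 140, 142, 144, 146]

trend = simple_moving_average(monthly_sessions)
print(trend)   # rises steadily; the July spikes no longer dominate
```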
The final graph shows a comparison between the actual data from Google Analytics and the predictions made by the forecasting model. As a rule of thumb, the closer the match between the two lines, the more accurate the model’s predictions.
However, be aware there is such a thing as being too close; this is called ‘overfitting’. It occurs when the model technically fits well to existing data, but fails to generalise enough to accurately predict future events.
You may also notice with ARIMA models that the fitted values for the first 12 months (the first complete period) exactly match the actual data from Google Analytics. This is because ARIMA predictions are made progressively, and the model requires a full period of values before it can make its first prediction.
We believe Forecaster meets our aim of providing sophisticated statistical processes to a non-technical audience. However, the power of the tool ultimately lies in the hands of the user.
You may have noticed by now that the tool does not provide an estimate for a given scenario, such as launching a new paid search or SEO campaign.
What the tool does do is provide a baseline – a reasonable prediction of future performance if all variables were to remain on their current course. Only by measuring actual campaign performance against a pre-forecast baseline can we begin to more accurately understand the true impact of marketing activity.
R is an open source programming language that lends itself extremely well to statistical analysis. Although R has been around for over 20 years, it has been rapidly gaining favour in digital marketing as data analysis continues to grow as a driver of success.
Being an open source language makes it possible for the community to drive progress and innovation by releasing ‘libraries’ that extend the functionality of the core programming language. It’s these community-generated packages that made our forecasting tool possible.
The key technical components we used were Mark Edmondson’s shinyga for authentication with Google Analytics, and ShinyApps/Shiny Server from RStudio, which make it possible to build a basic web interface on top of the R programming language.
What is R doing under the hood?
Our forecasting tool uses the shinyga library to authenticate with Google Analytics. The interface can then be used to select your view, audience segment and metric.
We use this information to build an API request to fetch the right data from Google. Once received, the Google Analytics data is sent through one of three forecasting algorithms and results are visualised for the end-user to interpret or download via the interface.
What do the forecasting models mean?
We offer three different forecasting algorithms, each with their own way of building a model to predict future values based on a series of historic values.
#1 Holt-Winters

Holt-Winters, or ‘triple exponential smoothing’, is a technique that takes into account the seasonality, trend and level of a time series. Each of these three components is weighted by a smoothing factor that changes the emphasis given to historic events: the algorithm gives more importance to recent behaviour than to older behaviour, and exactly how much more can be controlled through the smoothing factors.
Furthermore, we can treat the seasonality as additive or multiplicative. Additive will treat the peaks and troughs of seasonality in absolute terms. Multiplicative, however, will treat seasonality in relative terms.
Additive = “Every July, there’s a peak which brings in an additional 2,000 sessions.”
Multiplicative = “Every July, there’s a peak which brings in 20% more sessions than the ‘average’ month.”
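To put the two quotes into arithmetic (illustrative numbers only, assuming a hypothetical deseasonalised baseline of 10,000 sessions per month):

```python
# Additive vs multiplicative seasonality, using the July examples above.
# The baseline figure is hypothetical.

baseline = 10_000               # deseasonalised 'average' month, in sessions

july_additive = baseline + 2_000        # "an additional 2,000 sessions"
july_multiplicative = baseline * 1.20   # "20% more sessions than average"

print(july_additive, july_multiplicative)

# The distinction matters as the site grows: if the baseline doubles to
# 20,000, the additive peak is still +2,000 sessions, while the
# multiplicative peak grows to +4,000.
```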
To keep the tool simple for the end-user, our implementation automatically attempts to find optimal smoothing parameters and decides whether additive or multiplicative seasonality is more appropriate. ‘Optimal’, in this context, is defined as minimal mean squared error for the in-sample prediction.
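The optimisation itself amounts to a search over smoothing parameters. For brevity, this Python sketch grid-searches a single smoothing factor (alpha) for plain exponential smoothing on made-up data; the tool’s actual fit optimises all three Holt-Winters factors, in R, in the same spirit:

```python
# Sketch of 'optimal' smoothing: try candidate alphas and keep the one
# with the lowest in-sample one-step-ahead mean squared error.
# Data and method are illustrative, not the tool's implementation.

def one_step_mse(values, alpha):
    """MSE of one-step-ahead predictions from single exponential smoothing."""
    level = values[0]                # initialise the level at the first value
    errors = []
    for v in values[1:]:
        errors.append((v - level) ** 2)       # level is the prediction for v
        level = alpha * v + (1 - alpha) * level
    return sum(errors) / len(errors)

data = [120, 132, 101, 134, 190, 170, 140, 142, 155, 160, 148, 152]

# coarse grid search over alpha in (0, 1)
best_alpha = min((a / 100 for a in range(1, 100)),
                 key=lambda a: one_step_mse(data, a))
print(round(best_alpha, 2))
```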
#2 State Space Model
Similar to the Holt-Winters model explained above, a state space model can take into account trend and seasonality.
Unlike Holt-Winters, however, it can also treat the trend component as additive or multiplicative, and can even decide whether a trend or seasonal component is worth including in the forecast at all.
We can also apply damping to the trend component (but not the seasonality). This gives us 15 possible combinations of forecasting methods, each labelled as a pair of letters:
| Trend Component | None (N) | Additive (A) | Multiplicative (M) |
|---|---|---|---|
| None (N) | (N,N) | (N,A) | (N,M) |
| Additive (A) | (A,N) | (A,A) | (A,M) |
| Additive Damped (Ad) | (Ad,N) | (Ad,A) | (Ad,M) |
| Multiplicative (M) | (M,N) | (M,A) | (M,M) |
| Multiplicative Damped (Md) | (Md,N) | (Md,A) | (Md,M) |
In addition to the 15 combinations above, the state space model also uses an error parameter in the algorithm, which (like the trend and seasonality) can be additive or multiplicative, resulting in a total of 30 different available models.
The error parameter does not change the mean point estimates for the forecast; it just adjusts the confidence intervals either side of the mean. For simplicity, our tool will automatically find which of the 30 different models, and which combination of smoothing parameters, best fits your historic data.
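As a quick sanity check on the arithmetic, the 30 candidate models can be enumerated; this Python sketch generates the labels only and doesn’t fit anything:

```python
# Enumerate the ETS model space:
# 2 error types x 5 trend types x 3 seasonality types = 30 candidate models.

errors = ["A", "M"]                     # additive, multiplicative
trends = ["N", "A", "Ad", "M", "Md"]    # none, additive, additive damped,
                                        # multiplicative, multiplicative damped
seasonals = ["N", "A", "M"]             # none, additive, multiplicative

models = [f"ETS({e},{t},{s})" for e in errors for t in trends for s in seasonals]

print(len(models))       # 30
print(models[0])         # ETS(A,N,N)
```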
You can identify which model the tool has chosen to use by the title in the graph:
We can see in this case the algorithm has used ETS(M,A,M). ‘ETS’ stands for Error, Trend and Seasonality. ‘M,A,M’ indicates that the Error is Multiplicative, the Trend is Additive and the Seasonality is also Multiplicative.
#3 ARIMA

Auto Regressive Integrated Moving Average (or ARIMA) is a little different from the other two algorithms we’ve talked about so far. ARIMA combines two key components – an autoregressive process and a moving average process – to model a time series and make predictions, with the ‘integrated’ part referring to differencing the data to remove trend before modelling.
The AR in ARIMA is an autoregressive process that uses a finite number of historic values to make a prediction for a future value. The historic values form a rolling time-frame in which each value is assigned a weight.
After each prediction, the time-frame then moves on one step to include the new prediction value and exclude the oldest value in the time-frame. We then use our new time-frame to make the next prediction.
Extrapolate this over several steps and you’ll start to notice that fluctuations can have a lasting effect on the forecast. For example, a huge but short-lived spike in traffic would likely leave a long, slowly decaying trace in the predictions.
The number of periods we include in our rolling window is known as the ‘order’. An AR(2) model would be a second order autoregressive model which would use a rolling window of two periods in its prediction.
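The rolling mechanism can be sketched in a few lines. The weights and intercept below are purely illustrative (a real ARIMA fit estimates them from your data); notice how the spike in the final period keeps echoing through the predictions, decaying back towards the baseline:

```python
# AR(2) sketch: each prediction is a weighted sum of the previous two
# values (plus an intercept), and each new prediction joins the window
# used for the next one. Weights and intercept are illustrative only.

def ar2_forecast(history, c, w1, w2, steps):
    """Roll an AR(2) prediction forward `steps` periods."""
    values = list(history)
    for _ in range(steps):
        values.append(c + w1 * values[-1] + w2 * values[-2])
    return values[len(history):]

# flat traffic with a short-lived spike in the most recent period
history = [100, 100, 100, 100, 300]
preds = ar2_forecast(history, c=25, w1=0.5, w2=0.25, steps=6)
print(preds)   # the spike's influence lingers, decaying towards 100
```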
The MA in ARIMA is a moving average function. Much like the autoregressive process, the moving average is calculated from a rolling window of finite historic periods. The key difference to be aware of is that a moving average process has no long-lasting effect on future predictions beyond the order of the MA process. For example, a value entering an MA(2) process has no effect on any prediction made more than two periods after it.
Side note: For any music-production geeks out there, the MA process is essentially an FIR (finite impulse response) filter, as used in a convolution reverb, but applied to white noise instead of a pre-recorded impulse response.
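To make the contrast with the autoregressive process concrete, here is a Python sketch of an MA(2) series (illustrative coefficients, assuming a mean of 100): a single shock influences at most the next two periods and then disappears entirely, which is exactly the finite-memory behaviour described above.

```python
# MA(2) sketch: the value at time t is the mean plus the current shock
# and weighted copies of the previous two shocks. Coefficients are
# illustrative only.

def ma2_value(mean, shocks, t, theta1, theta2):
    """Value at time t of an MA(2) process for a given list of shocks."""
    def e(i):
        return shocks[i] if 0 <= i < len(shocks) else 0.0
    return mean + e(t) + theta1 * e(t - 1) + theta2 * e(t - 2)

# one shock of +50 at t=0, then nothing
shocks = [50.0, 0.0, 0.0, 0.0, 0.0]
series = [ma2_value(100.0, shocks, t, theta1=0.5, theta2=0.25)
          for t in range(5)]
print(series)   # [150.0, 125.0, 112.5, 100.0, 100.0] – back to the mean by t=3
```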