By Ole Bossdorf | Co-author: Philipp Werner
As a founder, frontend developer, product manager or BI analyst you might wonder: Should we invest time and money to make our website load faster? And if so, how much?
Or maybe you have a lightning fast website already (congrats!) and want to understand how much effort you should put into maintaining its incredible speed.
At Project A we have encountered these situations quite often. To help our frontend team decide what to focus on, we set out to determine the value of site speed and thus understand the incremental upside and downside in terms of revenue.
The key take-away? The revenue impact of changes to your site’s speed is measurable. With this article you will learn to measure the value of your own business’s site speed. So let’s go.
Step 1: Understanding the Critical Rendering Path
Let’s start off with a quick recap of the ‘Critical Rendering Path’, which covers all the steps needed to load your website after a user has entered your URL. One could write a book about the entire process (and this GitHub repo has actually set out to do just that), but to understand the revenue impact of your site speed you only need to take a few important milestones within this process into account:
While a domain name server lookup, a TCP handshake and a server connection are essential for a successful page load, for this project we will only focus on a timestamp happening in the processing section of the Critical Rendering Path called domContentLoaded.
The cool thing is that you can actually see all events within the Critical Rendering Path through the Navigation Timing API. Simply right-click this page, hit inspect to open the Developer Tools and enter ‘window.performance.timing’ in the Console tab:
You will receive a complete list of timestamps (in Unix time, expressed in milliseconds) that mark every milestone from the moment you pulled up this blog post until it was loaded completely within your browser. Pretty neat, huh?
If you want to measure your site’s speed you can simply calculate intervals between different Unix timestamps. To find out how long it took until the browser had parsed the HTML document and was ready to fire the DOMContentLoaded event, this will do the trick: window.performance.timing.domContentLoadedEventStart - window.performance.timing.navigationStart
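The subtraction above can be wrapped in a small helper. This is only a minimal sketch: the function name domContentLoadedTime and the sample values are made up for illustration, and the function accepts any object shaped like window.performance.timing so that it also runs outside a browser. Keep in mind that browsers now mark window.performance.timing as deprecated in favour of PerformanceNavigationTiming, so treat this as an illustration of the idea rather than a recommendation.

```javascript
// Milliseconds between the start of navigation and the DOMContentLoaded event.
// `timing` is any object shaped like window.performance.timing.
function domContentLoadedTime(timing) {
  const { navigationStart, domContentLoadedEventStart } = timing;
  // A value of 0 means the milestone has not been reached (yet),
  // so no meaningful interval can be calculated.
  if (!navigationStart || !domContentLoadedEventStart) {
    return null;
  }
  return domContentLoadedEventStart - navigationStart;
}

// Made-up sample values; in a browser console you would call
// domContentLoadedTime(window.performance.timing) instead.
const sampleTiming = {
  navigationStart: 1500000000000,
  domContentLoadedEventStart: 1500000001250,
};
console.log(domContentLoadedTime(sampleTiming)); // 1250
```

Returning null for missing milestones keeps half-loaded pages out of your averages instead of producing huge negative numbers.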
Luckily there are many tools out there which do this first step for you automatically while also providing suggestions for improvement. We’re not going to list them all, as you can check out this comprehensive overview by Google. They even recommend the right service based on your background and interest.
While these free resources are a great way to get quick insights, they are unable to link two pieces of information which are essential for the topic of this post: site speed data and transactional data.
By transactional data we are basically referring to data which tells us if a user performed a desired action on your site (purchase, sign-up, download etc.). In the next section we’ll show you how we linked these two data sources to evaluate the business value of your site’s speed.
Step 2: How to create the necessary data set
To be able to determine the influence of your site speed on your business you need a dataset on website session level (each row represents one website session or visit) which tells you not only how fast the site loaded during the respective session but also whether the user took your desired action (= conversion) during that session. Ideally this data set contains upwards of 100,000 sessions with site speed data to derive significant results. We found out that there is a shortcut as well as a detour to create this dataset.
Shortcut (using BigQuery):
Maybe your company is already using Google Analytics’ connection to Google BigQuery to extract website sessions. That’s great because the BigQuery export schema also includes something called Latency Tracking KPIs. Essentially, the Google Analytics tracking script embedded into your page already tracks the different milestones within the Critical Rendering Path which we discussed in the previous section. Here’s what they reflect:
Latency KPIs in Google
domainLookupTime = domainLookupEnd - domainLookupStart;
serverConnectionTime = connectEnd - connectStart;
serverResponseTime = responseStart - requestStart;
pageDownloadTime = responseEnd - responseStart;
domInteractiveTime = domInteractive - navStart;
domContentLoadedTime = domContentLoadedEventStart - navStart;
pageLoadTime = domComplete - navStart;
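If you want to reproduce these KPIs yourself, the definitions listed above translate almost one to one into code. The following is a rough sketch, not Google's implementation: the function name latencyKpis and the sample timestamps are assumptions for illustration, and navStart from the list is read as the navigationStart field of the Navigation Timing API.

```javascript
// Derives the latency KPIs listed above from a Navigation Timing style object.
// All results are in milliseconds.
function latencyKpis(t) {
  return {
    domainLookupTime: t.domainLookupEnd - t.domainLookupStart,
    serverConnectionTime: t.connectEnd - t.connectStart,
    serverResponseTime: t.responseStart - t.requestStart,
    pageDownloadTime: t.responseEnd - t.responseStart,
    domInteractiveTime: t.domInteractive - t.navigationStart,
    domContentLoadedTime: t.domContentLoadedEventStart - t.navigationStart,
    pageLoadTime: t.domComplete - t.navigationStart,
  };
}

// Made-up timestamps, relative to navigationStart = 0 for readability.
const exampleTiming = {
  navigationStart: 0,
  domainLookupStart: 5,
  domainLookupEnd: 25,
  connectStart: 25,
  connectEnd: 60,
  requestStart: 60,
  responseStart: 180,
  responseEnd: 230,
  domInteractive: 700,
  domContentLoadedEventStart: 750,
  domComplete: 1400,
};
console.log(latencyKpis(exampleTiming));
```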
Long story short: This Google feature will save you a lot of time as you can download the mentioned dataset (combining site speed data and transactional data) using a SQL script. You can use ours to get you started.
Detour (using custom tracking):
If your company is using a different web tracking tool which does not capture site speed metrics by default, you have to take a little detour. First, you need to set up custom events or server logs which capture the site speed metrics you want to look at (here’s a GitHub gist to get you started). These events should include a session ID that stays stable throughout a user’s session. Additionally, you will want to fire an event that includes the session ID whenever a user takes the desired action on your website.
Once set up, you can join both pieces of information on session ID and create the aforementioned dataset. The only disadvantages of this approach compared to the shortcut are a more complex setup as well as missing historical data.
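To make the join more concrete, here is a minimal sketch of how the two event streams could be combined into one row per session. Everything in it is an assumption for illustration: the function name buildSessionDataset, the field names sessionId, domContentLoadedTime and converted, and the idea that speed events arrive in chronological order so that the first one per session belongs to the first page view.

```javascript
// Joins site speed events and conversion events on the session ID.
// Returns one row per session that has at least one speed event.
function buildSessionDataset(speedEvents, conversionEvents) {
  const convertedIds = new Set(conversionEvents.map((e) => e.sessionId));
  const rows = new Map();
  for (const event of speedEvents) {
    // Keep only the first speed event per session (assumed: first page view).
    if (!rows.has(event.sessionId)) {
      rows.set(event.sessionId, {
        sessionId: event.sessionId,
        domContentLoadedTime: event.domContentLoadedTime,
        converted: convertedIds.has(event.sessionId),
      });
    }
  }
  return Array.from(rows.values());
}

// Made-up events for illustration.
const speedEvents = [
  { sessionId: 's1', domContentLoadedTime: 800 },
  { sessionId: 's2', domContentLoadedTime: 2400 },
  { sessionId: 's1', domContentLoadedTime: 600 }, // second page view, ignored
];
const conversionEvents = [{ sessionId: 's1' }];
console.log(buildSessionDataset(speedEvents, conversionEvents));
```

In practice you would probably do this join in SQL in your data warehouse; the logic stays the same.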
Step 3: Analyzing correlation and simulating potential revenue uplifts
Once you have extracted the relevant information from BigQuery or from your custom tracking setup you can group sessions into intervals by load time (PostgreSQL’s FLOOR function comes in handy here). The more data you have, the smaller your intervals can be. The next step is to calculate the conversion rate for each interval by dividing the number of converted sessions by the number of all sessions in that interval.
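The grouping step can be sketched in a few lines. This is an illustrative example rather than the query we ran: the function name conversionRateByInterval and the session fields domContentLoadedTime (in milliseconds) and converted are assumptions, and Math.floor plays the role that FLOOR plays in SQL.

```javascript
// Groups sessions into load time intervals and calculates the
// conversion rate (converted sessions / all sessions) per interval.
function conversionRateByInterval(sessions, intervalMs) {
  const bins = new Map();
  for (const s of sessions) {
    // Lower bound of the interval this session falls into.
    const start = Math.floor(s.domContentLoadedTime / intervalMs) * intervalMs;
    const bin = bins.get(start) || { intervalStart: start, sessions: 0, conversions: 0 };
    bin.sessions += 1;
    if (s.converted) bin.conversions += 1;
    bins.set(start, bin);
  }
  return Array.from(bins.values())
    .sort((a, b) => a.intervalStart - b.intervalStart)
    .map((b) => ({ ...b, conversionRate: b.conversions / b.sessions }));
}

// Made-up sessions for illustration.
const exampleSessions = [
  { domContentLoadedTime: 400, converted: true },
  { domContentLoadedTime: 900, converted: false },
  { domContentLoadedTime: 1200, converted: true },
  { domContentLoadedTime: 1800, converted: false },
  { domContentLoadedTime: 1900, converted: false },
  { domContentLoadedTime: 2500, converted: false },
];
console.log(conversionRateByInterval(exampleSessions, 1000));
```

Note that intervals without any sessions simply do not show up in the result, so check for gaps before plotting.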
For our analysis we focused on the time range between navigationStart and domContentLoaded (domContentLoadedTime in BigQuery) for both the first page view of a session as well as an average over all sessions.
This is how sessions were distributed in terms of domContentLoadedTime for 0.1 second intervals:
This serves as a nice reminder that even though your page has an average loading time of x seconds, the individual loading time will differ as it depends on the user’s connection, device, browser etc.
Here’s how visitors converted in each 1-second interval:
This is how it looks for another portfolio company of Project A:
Different business models and websites, very similar results. There seems to be a relationship between the loading time of the website (expressed as domContentLoadedTime) and the conversion rate: users appear to be considerably less likely to convert if the page takes longer to load. But more on that later.
Next, we simulated how much additional revenue this portfolio company could claim by making their site x seconds faster (i.e. moving users from one interval up to a faster, better-converting interval).
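One simple way to run such a simulation is sketched below. It is not necessarily the exact model we used: the function name simulateUplift, the revenuePerConversion parameter and all numbers are hypothetical, the intervals are assumed to be contiguous and sorted from fastest to slowest, and the sketch naively assumes that sessions moved to a faster interval convert at that interval's observed rate.

```javascript
// Estimates extra conversions and revenue if every session became
// `shiftIntervals` intervals faster. `bins` must be sorted from fastest
// to slowest interval, without gaps.
function simulateUplift(bins, shiftIntervals, revenuePerConversion) {
  let extraConversions = 0;
  bins.forEach((bin, i) => {
    // Sessions already in the fastest intervals cannot move any further.
    const target = bins[Math.max(i - shiftIntervals, 0)];
    extraConversions += bin.sessions * (target.conversionRate - bin.conversionRate);
  });
  return { extraConversions, extraRevenue: extraConversions * revenuePerConversion };
}

// Made-up intervals: 1-second steps with falling conversion rates.
const exampleBins = [
  { sessions: 1000, conversionRate: 0.05 },
  { sessions: 2000, conversionRate: 0.04 },
  { sessions: 1000, conversionRate: 0.02 },
];
// Roughly 40 extra conversions when everyone gets one interval faster.
console.log(simulateUplift(exampleBins, 1, 50));
```

Running the function for several values of shiftIntervals gives you the curve of potential uplift per second saved.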
Based on this analysis, improving the site would yield increasing returns until a certain threshold where the marginal utility starts to decrease. You might be able to detect a similar effect for your website when using Google’s Revenue Impact Calculator, which takes a similar approach but is solely focused on mobile.
That’s what we wanted to know, so we’re done, right? Not entirely.
Excursus: Why you are not done just yet — or why sleeping with your shoes on might be okay
So far we have analyzed correlation but left causation aside: You still don’t know whether a faster site will actually result in higher revenues, as it could very well be that both variables are influenced by a third factor.
Let’s make this more tangible with an everyday example: Sleeping with your shoes on might correlate strongly with waking up with a headache. Concluding that going to bed with your shoes on will likely give you a headache the next day would be wrong, though: the fact that you went to bed drunk as a fish last night is what caused you both to leave your shoes on and to wake up with that headache.
In our case, it could very well be that site speed correlates strongly with revenue because both are being influenced by a third factor such as the users’ socio-economic background. They might happen to have faster internet connections or newer devices and also a tendency to spend more on your products, which could be why you see a strong correlation between your site speed and revenue.
Step 4: Determining causation by A/B/C/D-testing
We have clarified why we cannot yet use the relationship we found to justify investments in improving our site speed. Next, we need to test this relationship for causation.
One way to do this is to design a linear regression model that takes into account all variables that potentially affect the relationship between our two main variables (e.g. device, browser, gender etc.). There are two main problems with this approach: complexity grows with the number of included variables, and the model is hard to defend against sceptics, who can easily point out other variables you didn’t take into account.
Thus, we recommend following the standard approach for testing causation: a split test, which ensures a controlled environment in which only the site speed is altered. Basically you will slow down x% of your website traffic by a fixed delay. This will move the distribution curve we showed you earlier slightly to the right. Preferably you would of course speed up your page by that interval instead, but if your website is lightning fast already, that’s not really an option. If you opt for an A/B/C/D split test with delays of 0.5 / 1 / 2 seconds, the distribution of website sessions by site speed would look similar to this:
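How sessions end up in a test group depends on your setup. As a minimal sketch, the following assigns each session to one of four equally sized groups based on a hash of its session ID, so that the same session always gets the same delay. The names VARIANTS, hashString and assignVariant are made up for illustration, the equal split is an assumption (you may want to slow down a smaller share of traffic), and the hash is a simple FNV-1a style function over the characters of the ID.

```javascript
// Test groups: A is the control, B/C/D are slowed down artificially.
const VARIANTS = [
  { name: 'A', delayMs: 0 },
  { name: 'B', delayMs: 500 },
  { name: 'C', delayMs: 1000 },
  { name: 'D', delayMs: 2000 },
];

// Simple FNV-1a style 32-bit hash over the characters of a string.
function hashString(value) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < value.length; i += 1) {
    hash ^= value.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Deterministically maps a session ID to one of the test groups.
function assignVariant(sessionId) {
  return VARIANTS[hashString(sessionId) % VARIANTS.length];
}

const variant = assignVariant('session-12345');
console.log(variant.name, variant.delayMs);
```

Because the assignment depends only on the session ID, you can recompute the group later in your analysis even if the tracking of the group itself failed.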
Whenever a user experienced a slowed-down page, we made that information available in the data layer to be picked up by the Google Analytics tracking script through a custom dimension. Of course you can also attach that information in your custom tracking setup. The dataset we discussed earlier thus has to be expanded by one additional column indicating which test scenario (A/B/C/D) the session belongs to.
If your results indicate a similar relationship for different treatment groups you can safely assume that your website’s performance affects your business significantly.
However, if your treatments lead to a flatter trend, there might be an overarching variable which influences both site speed and conversion rate.
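A first look at the test results could be as simple as comparing conversion rates per test group. The sketch below is one possible way to do that and not a full statistical analysis: the function names conversionByVariant and twoProportionZ as well as the fields variant and converted are assumptions, and the z-score comes from a standard pooled two-proportion z-test, which we are adding here for illustration only.

```javascript
// Counts sessions and conversions per test group and adds the conversion rate.
function conversionByVariant(sessions) {
  const stats = {};
  for (const s of sessions) {
    const group = stats[s.variant] || (stats[s.variant] = { sessions: 0, conversions: 0 });
    group.sessions += 1;
    if (s.converted) group.conversions += 1;
  }
  for (const group of Object.values(stats)) {
    group.conversionRate = group.conversions / group.sessions;
  }
  return stats;
}

// Pooled two-proportion z-score of a treatment group against the control.
// Negative values mean the treatment converts worse than the control.
function twoProportionZ(control, treatment) {
  const pooled =
    (control.conversions + treatment.conversions) / (control.sessions + treatment.sessions);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / control.sessions + 1 / treatment.sessions)
  );
  return (treatment.conversionRate - control.conversionRate) / standardError;
}

// Made-up aggregates for illustration.
const control = { sessions: 1000, conversions: 50, conversionRate: 0.05 };
const slowed = { sessions: 1000, conversions: 30, conversionRate: 0.03 };
console.log(twoProportionZ(control, slowed));
```

As a rule of thumb, an absolute z-score above roughly 1.96 corresponds to the usual 5% significance level for a two-sided test.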
We hope this post can serve as a guideline for you to better understand how your website’s performance impacts your business. We looked at the Critical Rendering Path and how you can analyse its timestamps through different tracking concepts. Based on a data set which includes website session and transactional data we detected a correlation between site speed (expressed as domContentLoadedTime) and conversion rate.
Once you’re able to prove causality for this correlation through our suggested split test you can use this analysis to quantify the potential revenue uplift from investments in your website performance.
One thing we still want to look into is something called User Perceived Waiting Time (UPWT). This metric would allow us to track more accurately at what point the user saw something happening on the page. While domContentLoadedTime seems to be a reliable and readily available proxy, we believe that looking at UPWT instead would make this analysis even more meaningful.
*Footnote: Due to something called a Processing Limit you might still not be able to track site speed metrics for 100% of your website sessions. Sessions with multiple page views can cause further problems.