A few years ago, we decided that as a sustainable business we must include the carbon footprint of our digital products in our overall carbon footprint, but at that time there was no known way to do this.
We set out to create a method for quantifying the carbon emissions of websites and eventually created the first public carbon calculator for websites at WebsiteCarbon.com. Now in its second iteration, this tool has come close to completing a million tests, helped engage people in the topic of website emissions and inspired digital teams to pursue higher levels of efficiency.
However, figures for the amount of energy used per gigabyte of data transferred vary enormously. Now that digital sustainability is becoming a more mainstream topic, I’m seeing an increasing number of articles and media reports using vastly different figures in reference to the energy consumption of digital services, which I feel will inevitably lead to confusion if the differences are not clearly explained.
In this post, I hope to shed some light on why the data varies so much, how to interpret numbers quoted in articles, and how to make an informed decision about the data you use in calculating your own emissions for digital projects.
Orders of magnitude
Reviewing the academic literature on the energy consumption of internet data, we found figures ranging from 0.004 kWh/GB to 136 kWh/GB. In other words, the estimates varied by more than four orders of magnitude. What on earth was going on?
In 2017, a meta-analysis titled Electricity Intensity of Internet Data Transmission: Untangling the Estimates was published with the aim of making sense of these huge differences. Having reviewed 14 existing studies on this topic, the authors concluded that an accurate estimate for electricity used to transmit data through the internet was 0.06 kWh/GB for 2015.
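To make the size of that gap concrete, here is a minimal Python sketch that computes the ratio between the two figures quoted above and expresses it as orders of magnitude. The function and constant names are my own, chosen purely for illustration.

```python
import math

# Lowest and highest estimates quoted above, in kWh per gigabyte.
LOWEST_KWH_PER_GB = 0.004
HIGHEST_KWH_PER_GB = 136.0


def orders_of_magnitude(low: float, high: float) -> float:
    """Return the base-10 logarithm of the ratio high / low."""
    if low <= 0 or high <= 0:
        raise ValueError("estimates must be positive")
    return math.log10(high / low)


ratio = HIGHEST_KWH_PER_GB / LOWEST_KWH_PER_GB
spread = orders_of_magnitude(LOWEST_KWH_PER_GB, HIGHEST_KWH_PER_GB)
print(f"ratio: {ratio:,.0f}x, about {spread:.1f} orders of magnitude")
```

The highest estimate is roughly 34,000 times the lowest, which is about four and a half orders of magnitude.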
Case closed, right?
Well, not so fast.
The importance of system boundaries
One of the key variables in these studies is the system boundary, or put simply, which parts of the total system are actually being studied. The meta-analysis filtered the data to only look at the smallest possible sub-system, representing the network equipment used for data transmission and access at a national level. In other words, they adjusted all of the studies to only look at the energy used to make a gigabyte of data travel through a national telecom cable network.
This is a useful figure to have, especially as a component of larger life cycle analyses, but it inherently gives an incomplete picture. It entirely ignores important parts of the overall system including data centers, international infrastructure, on-site networking equipment and end user devices, not to mention the differences between cable and mobile networks.
And there is more. The system boundaries in the diagram above, which appear to be the whole picture, do not show the embodied energy of building the data centers, manufacturing the servers, constructing the cable and wireless network infrastructure and manufacturing end user devices. Some argue that this is not relevant, while others argue that this is inherently part of the total emissions, especially as servers get replaced every 3-4 years and we are building and upgrading infrastructure rapidly to feed our hunger for data. When this is factored in, the full picture of energy and carbon emissions from digital services looks far bigger.
System boundaries are also important when looking at a project’s carbon footprint for reporting and offsetting purposes. Some organisations have a policy to report all carbon emissions on scopes 1, 2 and 3, in which case they would want to use wide system boundaries. On the other hand, some take a more limited view of where their responsibility for emissions ends and would therefore choose more limited system boundaries.
So what is the correct set of system boundaries?
All of them. Or none of them!
There is no de facto set of system boundaries that we should use for every scenario, and that’s part of the reason that none of this is as simple as we all wish it to be. The appropriate boundaries are entirely dependent on what it is we are trying to learn. Whether we are doing the reporting or reading others’ reports, it is important to know where these boundaries are drawn, and perhaps more importantly, why.
The meta-analysis also highlighted that the date of the studies has an impact on the estimates produced, with more recent studies tending to estimate lower energy per gigabyte as technology becomes more efficient. It is therefore important to know the year of the data that was used to do the calculations. For example, I might be estimating emissions for a web project now in 2020, but the most recent reliable study might only include data for 2017. I could choose to use data from a past year, or I may choose to adjust the data myself to factor in gains in efficiency. What is important is that these details are stated so that the numbers are transparent to the people viewing them.
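As a rough illustration of what adjusting the data myself could look like, here is a minimal Python sketch that scales a published figure to a different year. It assumes that energy intensity halves over a fixed number of years. That halving period is a parameter you have to choose and state yourself, and the two-year value used in the example is purely illustrative, not a figure taken from this post. The function name is hypothetical too.

```python
def adjust_for_year(
    kwh_per_gb: float,
    data_year: int,
    target_year: int,
    halving_years: float,
) -> float:
    """Scale an energy factor from data_year to target_year.

    Assumes intensity halves every `halving_years` years. This is a
    simplifying assumption, so state it alongside any adjusted figure.
    """
    if halving_years <= 0:
        raise ValueError("halving_years must be positive")
    elapsed = target_year - data_year
    return kwh_per_gb * 0.5 ** (elapsed / halving_years)


# The 0.06 kWh/GB figure for 2015 quoted earlier, moved forward two years
# under an illustrative (assumed) two-year halving period.
adjusted = adjust_for_year(0.06, data_year=2015, target_year=2017, halving_years=2.0)
print(f"adjusted factor: {adjusted:.3f} kWh/GB")
```

Whatever rate you pick, publishing the original figure, its year and the adjustment together keeps the number transparent.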
Interestingly, the meta-analysis also reported that study methodology does not make a huge difference to the overall estimates of internet energy use, confirming that system boundaries and date are the major factors.
How does this relate to website emissions?
If we want to know the energy required to make data flow through a cable then, as the meta-analysis did, we need to use narrow system boundaries.
However, if we want to understand the bigger picture of the total emissions associated with a website or web service, then we need to be using the widest possible system boundaries. It’s horses for courses, and we need to set our system boundaries appropriate to the application we are studying.
In the case of our work at Wholegrain, we want to understand the total impact of websites. For this reason, we look to use studies with wide system boundaries, with WebsiteCarbon.com currently based on the study On Global Electricity Usage of Communication Technology: Trends to 2030 and an energy factor of 1.8 kWh/GB for 2017. This is a lot higher than the narrow system boundary estimate of 0.06 kWh/GB, but is actually at the lower end of estimates that include complete system boundaries.
On this basis, we feel fairly comfortable that the figure we are using is a reasonably accurate figure for what we are trying to represent. We hope to soon update this to use a figure for 2020/21, but ideally would like to see a new study covering the full system boundaries to provide this updated data.
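To show how much the choice of system boundary changes the result, here is a minimal Python sketch that applies the two energy factors mentioned above to the same amount of data. The page weight and number of page views are hypothetical values chosen for round numbers, the class and function names are my own, and I have assumed decimal gigabytes (one billion bytes).

```python
from dataclasses import dataclass

BYTES_PER_GB = 1_000_000_000  # assumption: decimal gigabytes


@dataclass(frozen=True)
class EnergyFactor:
    """An energy factor together with the details needed to interpret it."""

    kwh_per_gb: float
    data_year: int
    boundary: str


# The two figures discussed in this post.
NARROW = EnergyFactor(0.06, 2015, "national network transmission and access only")
WIDE = EnergyFactor(1.8, 2017, "wide system boundaries")


def transfer_energy_kwh(bytes_transferred: int, factor: EnergyFactor) -> float:
    """Estimate energy use as data transferred (GB) times kWh per GB."""
    return bytes_transferred / BYTES_PER_GB * factor.kwh_per_gb


# Hypothetical example: a 2 MB page viewed 10,000 times.
PAGE_BYTES = 2_000_000
PAGE_VIEWS = 10_000
total_bytes = PAGE_BYTES * PAGE_VIEWS

for factor in (NARROW, WIDE):
    kwh = transfer_energy_kwh(total_bytes, factor)
    print(f"{factor.boundary} ({factor.data_year}): {kwh:.1f} kWh")
```

For these 20 GB of transfer, the narrow boundary gives 1.2 kWh and the wide boundary gives 36 kWh, a thirtyfold difference from the same traffic. Keeping the year and boundary attached to each factor is one simple way to make the assumptions visible wherever the number is used.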
Is kWh/GB a suitable metric?
There is debate over whether kWh/GB is a suitable metric for estimating the energy use of anything beyond making data travel through a cable. For example, the energy used in the data center is not necessarily directly proportional to the amount of data transferred, because it depends on how much processing the servers need to do to handle each request.
Likewise, there are arguments that energy for onsite networking equipment and end user devices should be measured per hour and not per GB. To complicate things further, the distance between the data center and end user can make a big difference, and yet isn’t represented as a variable when using a standard figure for energy per gigabyte.
These are all fair comments and in an ideal world, we would factor in every possible variable. However, the internet by nature is a hugely complex system in which it is impossible to measure a lot of the individual variables accurately. Even in cases where we could, it would require a complex, time-consuming and expensive study to do so. This is simply not practical in most real applications, where time is short and there is no budget for calculating digital carbon emissions.
In order for us to take practical action to reduce website carbon emissions, we need to have a simple, standardised method of quantifying impact on a like-for-like basis. Energy per gigabyte is the simplest way to do that, and we have the benefit that several studies have quantified the full system on that basis.
Of course, we can pursue even greater levels of accuracy by making the effort to calculate some of the other variables, which I have been doing on the side, but the method then becomes far more complex and difficult to use in real web projects. And that is what we need: practical tools and methods to drive improvement in real-life web projects.
We must accept that no estimate of internet or website carbon emissions will ever be perfect – that’s why they’re called estimates. What matters is that we have a methodology that helps drive improvement, based on data from credible sources, with system boundaries that are appropriate to our needs, and that we clearly state our own assumptions.
Conflicts of interest
I mentioned that we should take our data from credible sources. In practice, most studies on this topic are going to be objective and trustworthy, but it’s worth being aware of potential conflicts of interest, even if genuinely unintentional. With industries like energy, transport, and food coming under mounting pressure to reduce carbon emissions, it seems reasonable to assume that the big tech and telecoms companies would be keen to avoid their own industries coming into the spotlight in a similar way. There is a natural incentive for the tech industry to want to play down its own energy consumption and carbon emissions, to give the impression that “there’s nothing to see here”.
Although most studies on this topic officially state that they have no conflicts of interest, many of those same studies are funded by tech companies or conducted by research teams at big tech companies. That is not inherently a problem, but I have been witness to conversations in private that have led me to be cautious here, and I believe it’s important that we keep our eyes open.
Erring on the side of caution
As with anything in life, we ultimately have to ask ourselves why we even care about quantifying the energy consumption (and carbon emissions) of websites.
Surely there is one reason above all others – we want to minimise the impact of our web projects in contributing to climate change.
With this in mind, it is helpful to contemplate the worst case scenarios if we as an industry underestimate or overestimate our emissions.
If we underestimate our emissions, we might conclude that there is no problem to be solved, ignore the issue entirely, and continue to build a web that threatens our chances of keeping global warming under 2°C.
If we overestimate our emissions, the worst case scenario is that we build even faster, more efficient web services, save resources, and accelerate the transition to a zero carbon future.
It’s worth keeping this in mind when selecting the data that we use to calculate energy consumption and carbon emissions of the web services that we create and use.