by Huseyn Panahov and Ryan Powers
1. Introduction
Carbon emissions are usually associated with the fossil fuel and transportation industries, yet our online activities also have a significant carbon footprint. It may seem counterintuitive, but data centers account for around 2% of all global greenhouse gas emissions. That is roughly in line with the global airline industry, and not far behind the chemical and petrochemical industry. In parallel with the digital revolution, the demand for data centers continues to increase. While many industry leaders in the data center business have pledged to reach zero carbon emissions by 2030, these server farms still need enormous amounts of energy to operate. In this research we collected data on 41 data centers owned by Google and Oracle. We looked primarily at the power efficiency of the data centers and the climate indicators of their local geography. Our findings show that every 10-degree Fahrenheit drop in the temperature for the cold half of the year translates to a 0.06-point improvement in the Power Usage Effectiveness (PUE) of the data centers. (A PUE of 1.0 is ideal, whereas globally most data centers have a PUE around 1.8.)
2. Background
There are 2,749 data centers from nearly 3,000 service providers in the United States, and about 5,000 more around the rest of the world. With no alternative technology on the horizon, data centers are here to stay and will continue to grow in number. The three most important factors affecting data center energy efficiency are design, power source, and climate. Data center design and, more importantly, equipment age affect power consumption, as older servers and cooling systems operate at lower efficiency. Power typically comes from a combination of renewable and non-renewable energy sources, and facilities that derive a greater share of power from renewable sources are more efficient. Most state-of-the-art facilities built by the largest providers (Google, Oracle, Facebook, etc.) run up to 100% on renewable energy. This is not the case for the data center population as a whole. Finally, climate affects energy efficiency predominantly because cooler, more temperate climates place less demand on a data center's cooling system.
We sought to measure the energy efficiency of data centers while accounting for external climate factors such as wind, temperature, and precipitation. Cooling processes that regulate server temperature are the most energy intensive, and our hypothesis was that data centers in colder climates consume energy more efficiently than those in hotter climates. Next, we present our methodology, data analysis, results, and areas for further research.
3. Methodology
While there are thousands of data centers around the world, most of them do not share information about their energy consumption. We were fortunate to find open information about 22 data centers operated by Google and 19 operated by Oracle. These two tech industry leaders operate very energy-efficient data centers. This means that the impact of local climate factors is likely to be even more significant for an average data center than for those in our study.
Data centers require large amounts of energy and electricity to power and cool the servers. Consequently, choosing the right location for a data center is a complex task, which requires consideration of local temperatures, power infrastructure, environmental architecture, in addition to business factors such as land price, legal environment, and skilled workforce.
A number of factors influence a company’s choice of location for a data center. Below are some of the most important factors:
Table 1: Decision-making factors for choosing a data center location
| Non-environmental factors | Description |
| 1. Availability of trained workforce | On average, a large data center employs between 50 and 500 people. Operators need a trained workforce that can run the technology and respond to emergencies. |
| 2. Proximity to the customer base | The shorter the distance between the data center and the main customer base, the lower the chance of incidents along the route. |
| 3. Availability and price of land | Large data centers usually require anywhere between 100,000 and 5,000,000 square feet of land. |
| 4. Tax privileges | On average, tech companies invest between $300 million and $3 billion to construct a large data center. These projects provide both short-term employment opportunities during the construction phase and long-term jobs after the launch. |
| 5. Security | Are there conflicts or other security vulnerabilities in the area? |
| 6. Rule of law | Can tech companies rely on fair judicial procedures? |
| 7. Energy infrastructure | This can be both environmental and non-environmental, but data centers need large amounts of electric power to remain operational 24/7. |
| Environmental factors | Description |
| 1. Energy infrastructure | Does the existing energy infrastructure rely on renewable power sources or fossil fuels? |
| 2. Potential for producing renewable energy | Wind speed, sunny days, precipitation |
| 3. Water resources | Besides energy, operating a large data center also requires access to large amounts of water. Water Usage Effectiveness (WUE) is the industry metric to measure the efficiency of data centers in utilizing the water resources |
| 4. Average Temperature | Average temperature |
| 5. Temperature variance | How much temperature varies in various time intervals |
Our study focuses solely on environmental factors, specifically how local climate conditions affect a data center’s power efficiency. We are looking for empirical evidence that data centers located in colder climates have higher power efficiency. Building on this analysis, we then consider which climate zones would be optimal locations for large data centers.
Every year an increasing number of tech companies release sustainability reports, which analyze and summarize the environmental impact of their business operations. However, most companies offer only aggregate numbers and do not make publicly available the datasets that shape those analyses. Some big tech companies, such as Amazon and Microsoft, do not share even the locations of their data centers due to safety considerations. Consequently, the availability of data was one of the main factors that shaped this research.
In our project we look at the data centers of two multinational tech companies, Google and Oracle. They have made publicly available both the locations of their data centers and the Power Usage Effectiveness (PUE) indicator for each data center. Power Usage Effectiveness is the industry metric used to estimate the power efficiency of a data center. A lower PUE means better power efficiency. The lowest possible PUE is 1.0, which means 100% power efficiency. For most data centers the PUE varies between 1.2 and 3.0, and the industry average is 1.8.
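For readers unfamiliar with the metric, PUE is conventionally defined as the total energy used by a facility divided by the energy delivered to its IT equipment. The short sketch below illustrates that ratio only; the function name and the energy figures are hypothetical and are not taken from our dataset.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical facility: 1,000 kWh reaches the servers, and a further
# 260 kWh goes to cooling, lighting, and power distribution losses.
example = pue(total_facility_kwh=1260.0, it_equipment_kwh=1000.0)
print(round(example, 2))
```

A facility in which every kilowatt-hour reached the servers would score exactly 1.0; any overhead pushes the ratio above 1.0.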
We collected PUE indicators for 41 data centers owned and operated by Google and Oracle and spread across 16 countries and 14 US states. Next, we looked up various climate indicators for each location at the county or city level. The result is a dataset with 17 numeric data points for each location, covering local temperature variance, seasonal temperatures, average temperatures, precipitation, wind speed, cloudiness, and solar power potential. The full list of variables is given below:
Table 2: List of variables
| # | Variable | Description |
| 1 | State | Country or US state where the data center is located |
| 2 | Database location | Location of the data center |
| 3 | Company | Company that owns the data center |
| 4 | PUE | Power Usage Effectiveness |
| 5 | Temp_variance | The difference between the highest and lowest temperatures (max of high monthly average – min of low monthly average) in a given location * |
| 6 | Temp_annual | Average annual temperature ** |
| 7 | Temp_halfyear_warm | Average temperature Apr – Sep (6 months) |
| 8 | Temp_halfyear_cold | Average temperature Oct – Mar (6 months) |
| 9 | Temp_winter | Average temperature for Dec – Jan – Feb |
| 10 | Temp_spring | Average temperature for Mar – Apr – May |
| 11 | Temp_summer | Average temperature for Jun – Jul – Aug |
| 12 | Temp_fall | Average temperature for Sep – Oct – Nov |
| 13 | Rain_annual | Sum of monthly rain averages. Measured in inches |
| 14 | WindSpeed_annual | Average of monthly wind speeds. Measured in mph |
| 15 | SolarPower_annual | Average Daily Incident Shortwave Solar Energy for the whole year. Measured in kWh |
| 16 | SolarPower_summer | Average Daily Incident Shortwave Solar Energy for Apr – Sep |
| 17 | SolarPower_winter | Average Daily Incident Shortwave Solar Energy for Oct – Mar |
| 18 | Cloudy_annual | % of the time the weather is cloudy in a year |
| 19 | Cloudy_summer | % of the time the weather is cloudy in warmer months: Apr – Sep |
| 20 | Cloudy_winter | % of the time the weather is cloudy in colder months: Oct – Mar |
| * All temperatures are measured in degrees Fahrenheit. ** For Australia and Chile, the data points were flipped. | | |
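As an illustration of how the temperature variables in Table 2 can be derived, the sketch below computes several of them from twelve monthly values. It rests on assumptions that are ours for this example: the function name and input layout are hypothetical, and "flipped" is read as shifting the calendar by six months for Southern Hemisphere locations.

```python
from statistics import mean


def seasonal_features(monthly_high, monthly_low, monthly_mean, southern=False):
    """Derive Table 2 style temperature variables from 12 values (Jan..Dec, °F)."""
    if not (len(monthly_high) == len(monthly_low) == len(monthly_mean) == 12):
        raise ValueError("expected 12 monthly values for each series")
    m = list(monthly_mean)
    if southern:
        # Assumption: flipping means a six-month shift, so that the
        # Apr-Sep slot again holds the warm half of the year.
        m = m[6:] + m[:6]
    return {
        "Temp_variance": max(monthly_high) - min(monthly_low),
        "Temp_annual": mean(m),
        "Temp_halfyear_warm": mean(m[3:9]),          # Apr-Sep
        "Temp_halfyear_cold": mean(m[9:] + m[:3]),   # Oct-Mar
        "Temp_winter": mean([m[11], m[0], m[1]]),    # Dec, Jan, Feb
        "Temp_summer": mean(m[5:8]),                 # Jun, Jul, Aug
    }


# Hypothetical Northern Hemisphere location (monthly means in °F).
means = [30, 34, 42, 52, 62, 71, 75, 73, 65, 54, 43, 33]
highs = [t + 9 for t in means]
lows = [t - 9 for t in means]
print(seasonal_features(highs, lows, means))
```

Spring and fall averages follow the same slicing pattern and are omitted to keep the example short.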
4. Data Analysis
4.1 Statistical descriptions
In our dataset we have 20 variables, of which 17 are numeric and 3 are strings. We do not have any missing values, because we constructed this dataset by hand. Let us look at basic statistical descriptions of our numeric variables.
Table 3: Statistical description of numeric variables

Based on these descriptions we can tell that climate conditions across the data centers in our dataset are quite diverse. For example, annual rainfall varies between 8 and 73 inches depending on the location. The annual average wind speed in these locations varies between 5 mph and 14 mph. The annual average temperature varies between 42 and 82 degrees Fahrenheit.
The average temperature across the locations in our dataset is 59 degrees Fahrenheit. However, an average data center houses about 100,000 servers, each emitting about 1,200 BTU of heat per hour; without a proper Heating, Ventilation and Air Conditioning (HVAC) system, this would increase the indoor temperature by 213 degrees Fahrenheit. The optimal ambient temperature for most technologies, including servers in data centers, has generally been considered to be 68-75 degrees Fahrenheit.[1] More recently, some companies have introduced new servers with a higher heat tolerance of 81 degrees Fahrenheit.[2] All things considered, it is reasonable to assume that the optimal indoor temperature for an average data center today is 72 degrees Fahrenheit. So, even in the coldest locations, electric power is needed to cool the internal temperature, as well as to power the technology.
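To give a sense of scale for the heat figures above, the sketch below converts the combined server heat output into kilowatts, using the standard conversion of roughly 3,412 BTU per hour per kilowatt. It does not attempt to reproduce the 213-degree figure, which depends on building volume and airflow; the function and constant names are hypothetical.

```python
BTU_PER_HOUR_PER_KW = 3412.142  # 1 kW of heat is about 3,412 BTU per hour


def heat_load_kw(servers: int, btu_per_hour_per_server: float) -> float:
    """Total heat output of a server fleet, expressed in kilowatts."""
    total_btu_per_hour = servers * btu_per_hour_per_server
    return total_btu_per_hour / BTU_PER_HOUR_PER_KW


# Figures from the paragraph above: 100,000 servers at 1,200 BTU per hour.
load = heat_load_kw(100_000, 1_200)
print(f"{load:,.0f} kW of heat to remove")
```

Under these figures the cooling system has to remove heat on the order of 35 MW continuously.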
The PUE values in our dataset vary from 1.06 to 1.78, with an average of 1.26. The average PUE in our dataset is therefore about 30% lower than the industry average of 1.8. This means that the dependence on climate factors is likely to be higher for a typical data center than for those in our dataset, because higher power efficiency (lower PUE) also means relatively less dependence on climate.
Picture 1: PUE distribution

4.2 Correlations
Now, let us look at the correlations between our numeric variables using a heatmap. There is a strong negative correlation of -0.73 between SolarPower_annual and Cloudy_annual, which supports the credibility of our dataset: naturally, cloudy weather should be negatively correlated with the potential for solar power. Most importantly, however, we want to check the correlations between PUE and the other variables.
We want to identify which variables have the strongest correlation with PUE. There is a positive correlation between PUE and the annual temperature average (Temp_annual). There is an even stronger correlation, at the level of 0.4, between PUE and Temp_halfyear_cold (the average temperature for October through March).
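The correlation values discussed here are Pearson coefficients. As a minimal sketch of how one such coefficient is computed, the example below implements the formula in plain Python; the function name and the five sample points are hypothetical and do not come from our dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical cold-half-year temperatures (°F) and PUE values.
temps = [28, 35, 41, 50, 63]
pues = [1.10, 1.21, 1.15, 1.30, 1.28]
print(round(pearson_r(temps, pues), 2))
```

Applying the same function to every pair of numeric columns yields the matrix that a heatmap visualizes.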
Picture 2: Heat-map of numeric variables

There is also a moderately strong relationship, at the level of 0.4, between Temp_winter (the average temperature for December, January, and February) and PUE. Looking at this relationship separately, we notice that if the average winter temperature in a given location is 30 degrees Fahrenheit or below, the PUE is most likely to be below 1.2.
Picture 3: PUE vs Winter temperature

4.3 Building the model
For our final model, we picked only one independent variable: the temperature for the cold half of the year. We could have used more variables, but doing so could introduce multicollinearity and undermine the reliability of our analysis. We built two models: a linear regression and a decision tree.
Picture 4: Linear and Decision Tree models

Picture 4 above shows the outputs of our models. The orange line represents the linear regression model, while the red dots represent the outputs of the decision tree model. The performance indicators of our models are given below. Generally, we want to pick the model with the lower Root Mean Square Error (RMSE) and the higher R-squared. In this regard the decision tree model performs much better than the linear model. However, decision tree models tend to overfit, so its better fit should be read with caution. (Note: because we have a very small dataset, we did not split it into training and test data.)

R-squared is less important in this case because we are not building a predictive model, and the difference between the RMSE values is not very significant, so either model would serve our purpose.
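The two performance indicators used in this section can be sketched for the single-variable linear model as follows. This is a plain-Python illustration of ordinary least squares, RMSE, and R-squared, not the code used for our analysis; the function names and sample points are hypothetical, and the metrics are computed in-sample because no test split is made.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def rmse_and_r2(xs, ys, intercept, slope):
    """In-sample RMSE and R-squared of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return sqrt(ss_res / len(ys)), 1 - ss_res / ss_tot


# Hypothetical cold-half-year temperatures (°F) and PUE values.
temps = [28, 35, 41, 50, 63]
pues = [1.10, 1.21, 1.15, 1.30, 1.28]
intercept, slope = fit_line(temps, pues)
rmse, r2 = rmse_and_r2(temps, pues, intercept, slope)
print(f"PUE = {intercept:.3f} + {slope:.4f} * temp, RMSE={rmse:.3f}, R2={r2:.2f}")
```

A decision tree evaluated the same way will usually score better in-sample, which is one reason its advantage is hard to interpret without held-out data.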
5. Conclusions

If we look up the coefficients of the linear regression model, we get the following numbers:
This means that the relationship between PUE and Temp_halfyear_cold is as follows:
PUE = 0.943 + 0.006 × Temp_halfyear_cold
Based on this formula, every 10-degree Fahrenheit decrease in the temperature for the cold half of the year corresponds to a 0.06 decrease in PUE. For example, an average cold-half-year temperature of 10 degrees Fahrenheit would imply PUE = 1.003 (0.943 + 0.006 × 10), a near-perfect level of power efficiency. However, there are few places on earth with such low temperatures, and they might not be the best locations for data centers for the other reasons discussed above.
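The arithmetic behind this formula can be checked with a few lines of code. The sketch below simply evaluates the fitted line using the two coefficients reported above; the function name is hypothetical, and values far outside the temperature range of our dataset are linear extrapolations.

```python
INTERCEPT = 0.943  # fitted intercept reported above
SLOPE = 0.006      # PUE change per degree Fahrenheit of Temp_halfyear_cold


def predicted_pue(temp_halfyear_cold_f: float) -> float:
    """Evaluate the fitted line PUE = 0.943 + 0.006 * Temp_halfyear_cold."""
    return INTERCEPT + SLOPE * temp_halfyear_cold_f


# Prediction at 10 °F, and the effect of a 10-degree difference.
print(round(predicted_pue(10), 3))                      # 1.003
print(round(predicted_pue(50) - predicted_pue(40), 3))  # 0.06
```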
This analysis provides empirical evidence that data centers have better power efficiency, and therefore a lower carbon footprint, in climates with lower temperatures. It shows that there is a moderately strong relationship between cold-season temperatures and PUE: every 10-degree Fahrenheit increase in temperature is associated with an increase of about 0.06 in PUE, that is, lower power efficiency.
6. Limitations and future research
Our research was limited by the data we could access. Oracle and Google are two large companies that happen to report PUE metrics uniformly; most companies do not. This limitation prevented us from comparing them with peers such as Facebook, Equinix, and Microsoft. Furthermore, Oracle and Google already have a commitment to sustainable data centers, and we were unable to incorporate companies with perhaps less sustainable practices into our dataset.
The PUE metric could also be considered a limitation. It is a metric designed for easy reporting and industry comparison, rather than true efficiency measurement. The input data for the calculation can and does vary company to company, given that no industry regulation mandates how it is measured and reported.
Future research could explore many different avenues. First, our model was not predictive because our dataset was so small. With a larger dataset, one could predict the optimal climate for a data center. From there, one could measure the differences in PUE and carbon emissions that would result from relocating a data center to a more suitable location. Additionally, with more companies represented in the data, one could control for variables such as market share, capital expenditures, and investments in renewable energy. Finally, a more robust analysis could identify a metric superior to PUE for measuring and comparing data center efficiency.
7. Sources
Ambient Temperature and Why it Matters for Data Centers. (2022, December 1). History-Computer. https://history-computer.com/ambient-temperature-and-why-it-matters-for-data-centers/
Benoit, R. (2022, February 9). An Updated Look at Data Center Temperature and Humidity. AVTECH. https://avtech.com/articles/4957/updated-look-recommended-data-center-temperature-humidity/
Google. (n.d.). Data Centers. Google. Retrieved December 14, 2022, from https://www.google.com/about/datacenters/
Oracle. (n.d.). Oracle Cloud Data Center regions and locations. Oracle. Retrieved December 14, 2022, from https://www.oracle.com/cloud/cloud-regions/data-regions/
Siddik, M. A., Shehabi, A., & Marston, L. (2021). The environmental footprint of data centers in the United States. Environmental Research Letters, 16(6), 064017. https://doi.org/10.1088/1748-9326/abfba1
The Weather Year Round Anywhere on Earth – Weather Spark. (n.d.). https://weatherspark.com
Cloudscene. (n.d.). United States of America: Data center market overview. Cloudscene. Retrieved December 14, 2022, from https://cloudscene.com/market/data-centers-in-united-states/all
[1] Ambient Temperature and Why it Matters for Data Centers. (2022, December 1). History-Computer.
[2] Benoit, R. (2022, February 9). An Updated Look at Data Center Temperature and Humidity. AVTECH.