The Differential Impacts of Human Capital and Infrastructure on the Sustainable Development Goals

June 5, 2023 | May 26, 2025 | hpanahov

ABSTRACT:

This study looks at country-level data to explore the dynamics among human capital, infrastructure, and a country’s progress toward the United Nations Sustainable Development Goals (SDGs). Utilizing the confirmatory factor analysis method, I develop a new Infrastructure Index and combine it with the World Bank’s dataset on Human Capital Index to evaluate the relative impact of these factors on a country’s SDG scores. My findings affirm the integral roles of both human capital and infrastructure in the sustainable development context. However, a stronger correlation between human capital and the SDG Index suggests that policymakers seeking to advance the sustainability agenda should prioritize investments in human capital over infrastructure. Moreover, the study uncovers nuanced relationships between these indicators and specific SDGs. Human capital has a significant association with SDG 5 (Gender Equality), whereas infrastructure does not. Both human capital and infrastructure affect SDG 1 (No Poverty), with no statistical difference between their effects. Interestingly, while human capital correlates more strongly with SDG 13 (Climate Action), this relationship is negative due to the larger carbon footprint of more developed economies. These findings can inform policy decisions for goal-specific sustainable development strategies.

I. INTRODUCTION:

The central framework in the global development agenda is the 2030 Agenda for Sustainable Development, which “provides a shared blueprint for peace and prosperity for people and the planet, now and into the future.” It was adopted by all 193 UN Member States, which committed to achieving measurable progress on its goals by 2030. The Agenda comprises seventeen interlinked Sustainable Development Goals (SDGs) that encompass a wide variety of objectives. The seventeen SDGs are broken down into 169 targets and 232 indicators to measure progress.

Measuring progress

One of the challenges in the SDG framework is measuring progress in order to inform policy. The SDGs are successors to the Millennium Development Goals (MDGs), which consisted of 8 goals and 18 targets, 14 of which could be assessed quantitatively. The MDGs were adopted in 2000, and countries around the world committed to achieving them within 15 years. By the end of 2015, only three and a half of the 14 measurable targets had been achieved. In 2023, we are at the halfway mark of the 2030 Agenda. According to the latest reports, the international community is behind schedule in achieving the SDGs, partly due to the impact of COVID-19.[1] In this context, one of the most important questions is which policy interventions would be most effective in advancing progress toward the SDGs.

What interventions are most effective?

Investments in both human capital and infrastructure are critical for achieving the Sustainable Development Goals. The two are interdependent and complementary domains in the international development space. However, policymakers working on specific developmental objectives are often forced to prioritize one over the other because resources are limited. This research analyzes country-level data from the United Nations and the World Bank to estimate the relationship between the overall SDG Index of a country and its performance on the Human Capital Index and Infrastructure Index. I also examine the impact of human capital and infrastructure on SDG 1 (No Poverty), SDG 5 (Gender Equality), and SDG 13 (Climate Action). Below I provide more information about each of the concepts analyzed in this research.

SDG Index

The SDG Index is a composite indicator developed by the United Nations that aggregates a country’s performance across all the SDG metrics. It scores countries on a scale from 0 to 100; Nordic countries, such as Finland, Denmark, Sweden, and Norway, usually achieve the highest rankings, with scores above 80.[2] The 2022 Report includes SDG Index scores for 163 countries, among which the Central African Republic and South Sudan have the lowest scores, below 40.

SDG 1: No Poverty

The first goal in the UN SDG framework calls to “end poverty in all its forms everywhere.” SDG 1 aims to ensure that everyone, regardless of their circumstances, has equal access to opportunities and resources for a quality life. It calls for comprehensive strategies to end poverty that include social protection systems and measures to build the resilience of the poor and those in vulnerable situations. The three main metrics of SDG 1 are: poverty headcount ratio at $1.90/day (%), poverty headcount ratio at $3.20/day (%), and poverty rate after taxes and transfers (%).

SDG 5: Gender Equality

Gender equality is fundamentally important for achieving the Sustainable Development Goals for several reasons. First, it is a matter of human rights. Everyone, regardless of gender, should have equal access to health, education, economic opportunities, and political representation. Second, gender equality is pivotal for economic growth, as women constitute half of the world’s potential human capital, and studies consistently show that societies that discriminate by gender tend to experience less economic growth and slower poverty reduction. The SDG 5: Achieve Gender Equality and Empower all Women and Girls incorporates the following metrics: the ratio of female-to-male mean years of education received (%), the ratio of female-to-male labor force participation rate (%), seats held by women in national parliament (%), gender wage gap (% of male median wage).[3]

SDG 13: Climate Action

SDG 13 calls for immediate action to combat climate change and its impacts. The Goal underscores the critical need for the global community to address the pressing issue of climate change. Recognizing that climate change is not just an environmental issue but also a significant threat to social and economic development, this goal calls for urgent action to reduce greenhouse gas emissions, build resilience, and improve adaptive capacity to climate-induced impacts. The metrics of SDG 13 include CO₂ emissions from fossil fuel combustion and cement production (tCO₂/capita), CO₂ emissions embodied in imports (tCO₂/capita), CO₂ emissions embodied in fossil fuel exports (kg/capita), and Carbon Pricing Score at EUR60/tCO₂ (%, worst 0-100 best).[4]

Statistical Performance Index

The Statistical Performance Index (SPI) evaluates the performance of national statistical systems based on the aggregate of five pillars of statistical capacity: data use, data services, data products, data sources, and data infrastructure. The SPI is a weighted average of the statistical performance indicators.

Human Capital Index

Human capital is sometimes referred to as soft infrastructure.[5] Without thriving human capital, nations cannot achieve their development goals, highlighting its central role in international development. It is widely acknowledged that improvements in human capital lead to increased productivity, which in turn spurs economic growth. Education and health, the two main components of human capital, have a direct impact on a country’s development trajectory. In 2018, the World Bank developed the Human Capital Index as a metric to measure and evaluate the quality and potential of human capital in a country. The HCI enables policymakers to identify strengths, weaknesses, and areas for improvement in human capital development. The HCI is based primarily on three components:

Child survival: This component considers that not all children survive to start formal education and looks at the under-5 mortality rate.
Education: This section combines information on the quality and quantity of education. The number of years a child is expected to complete school by age 18, considering current enrollment rates, measures the quantity of education. The quality is assessed using harmonized test scores from international student achievement testing programs.
Public health: This component uses two proxies for the overall health environment – adult survival rates (the percentage of 15-year-olds who will survive until age 60) and healthy growth among children under 5, measured by stunting rates.[6]

Infrastructure Index

According to the Merriam-Webster dictionary, “Infra- means ‘below;’ so the infrastructure is the ‘underlying structure’ of a country and its economy, the fixed installations that it needs in order to function.”[7] Public infrastructure provides the basic physical systems and structures, such as water supply, sewers, electrical grids, roads, bridges, and telecommunications, among others. High-quality infrastructure ensures the provision of fundamental necessities, advances safety, and enhances the quality of life. Infrastructure also facilitates the exchange of reliable information, increases productivity, creates more job opportunities, and fosters overall economic growth.

Unlike the Human Capital Index, there is no internationally recognized index that indicates the level of public infrastructure in a given country. The objective of UN SDG 9 is to “Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation.”[8] For the purposes of this research, however, SDG 9 is not the best proxy because it includes indicators such as expenditure on research and development and the female share of graduates from Science, Technology, Engineering, and Mathematics (STEM) programs, but does not include indicators for access to electricity, water supplies, etc. There are, however, six SDG indicators across five different Sustainable Development Goals that relate directly to public infrastructure:

Indicator	Description	SDG
1. Access to basic water services	The percentage of the population using at least a basic drinking water service, such as drinking water from an improved source, provided that the collection time is not more than 30 minutes for a round trip, including queuing.	SDG 6: Ensure availability and sustainable management of water and sanitation for all
2. Access to basic sanitation services	The percentage of the population using at least a basic sanitation service, such as an improved sanitation facility that is not shared with other households.	SDG 6: Ensure availability and sustainable management of water and sanitation for all
3. Access to electricity	The percentage of the population who has access to electricity.	SDG 7: Ensure access to affordable, reliable, sustainable and modern energy for all
4. Adult population with bank accounts	The percentage of adults, 15 years and older, who report having an account (by themselves or with someone else) at a bank or another type of financial institution, or who have personally used a mobile money service within the past 12 months.	SDG 8: Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all
5. Internet penetration	The percentage of the population who used the Internet from any location in the last three months. Access could be via a fixed or mobile network.	SDG 9: Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation
6. Transportation systems	The percentage of the surveyed population that responded “satisfied” to the question “In the city or area where you live, are you satisfied or dissatisfied with the public transportation systems?”.	SDG 11: Make cities and human settlements inclusive, safe, resilient and sustainable

Hypotheses:

The question driving this research is to find the differences in the effects of human capital and infrastructure on SDG scores. So, I have constructed the following hypotheses:

H₀:	There is no statistical difference in the effects of Human Capital and Infrastructure on SDG Index
H₁:	There is a statistical difference in the effects of Human Capital and Infrastructure on SDG Index
H₂:	There is a statistical difference in the effects of Human Capital and Infrastructure on SDG 1: No Poverty
H₃:	There is a statistical difference in the effects of Human Capital and Infrastructure on SDG 5: Gender Equality
H₄:	There is a statistical difference in the effects of Human Capital and Infrastructure on SDG 13: Climate Action

II. METHODS

Merging the data sets

I merge the World Bank Human Capital Index and the UN Sustainable Development 2022 datasets with the Country name as the unique identifier. When I drop the rows with missing HCI Index or the SDG Index values, the number of entries in my data frame reduces from 201 to 141. Part of the reason is that UN SDG data also includes geographic Regions (such as “East and South Asia” or “Latin America and the Caribbean”) and Income categories (such as “Low-income Countries” or “Upper-middle-income Countries”) under the Country variable. With that being said, there are also missing values in both data sets. Nonetheless, we still have 141 complete data rows, which is sufficient for us to proceed with our analysis.
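The merge-and-filter step can be sketched as follows. This is a minimal illustration in plain Python, not the author's code (the analysis itself appears to have been done in R); the row values, the placeholder country names, and the helper `merge_complete` are hypothetical, while the column names `hci_ind` and `sdg_ind` follow the variable names used later in the text.

```python
# Hypothetical miniature versions of the two datasets, keyed by country name.
# All values are made up for illustration only.
hci_rows = [
    {"country": "Country A", "hci_ind": 0.80},
    {"country": "Country B", "hci_ind": 0.30},
    {"country": "Country C", "hci_ind": None},  # missing HCI value
]
sdg_rows = [
    {"country": "Country A", "sdg_ind": 85.0},
    {"country": "Country B", "sdg_ind": 45.0},
    {"country": "Country C", "sdg_ind": 60.0},
    {"country": "Low-income Countries", "sdg_ind": 52.0},  # aggregate row, not a country
]


def merge_complete(hci, sdg):
    """Outer-join on country name, then keep rows where both indices are present."""
    merged = {}
    for row in hci:
        merged.setdefault(row["country"], {}).update(row)
    for row in sdg:
        merged.setdefault(row["country"], {}).update(row)
    complete = [
        r for r in merged.values()
        if r.get("hci_ind") is not None and r.get("sdg_ind") is not None
    ]
    return len(merged), complete


n_before, rows = merge_complete(hci_rows, sdg_rows)
print(n_before, len(rows))
```

Aggregate rows such as income groups drop out automatically here because they have no matching HCI entry, which mirrors the explanation given above for part of the reduction from 201 to 141 rows.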

Factor Analysis

Public infrastructure is a broad concept which we cannot easily observe and measure. In statistical terms, it is a latent variable, which refers to “concepts that cannot be measured directly but can be assumed to relate to a number of measurable manifest variables.”[9] I use the factor analysis technique, which allows me to account for various dimensions of the public infrastructure (such as water, electricity, internet, etc.) and output one variable. Factor Analysis is often used for constructing a new index, as it explores and uncovers the underlying relationships between observed manifest variables and unobserved latent variables.

KMO Test

The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (MSA) is a statistic that indicates the proportion of variance among the variables that may be common variance, that is, variance attributable to underlying factors. KMO values range from 0 to 1, with higher values indicating that the data are better suited to factor analysis. The individual KMO values for each variable tell us how well each variable fits with all the others. Variables with a KMO below 0.5 might not be suited for factor analysis, as they do not correlate well with the other variables. As the output below shows, the MSA values of all my variables are 0.8 or above, and the overall MSA score is 0.87, which is a positive sign.

Kaiser-Meyer-Olkin (KMO) Test results
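For reference, the standard definition of the statistic can be written out as below. The symbols are notation introduced here, not taken from the original output: r_ij is the Pearson correlation between manifest variables i and j, and p_ij is their partial correlation after controlling for all other variables.

```latex
% Overall KMO: share of squared correlations relative to
% squared correlations plus squared partial correlations.
\[
\mathrm{KMO} \;=\;
\frac{\sum_{i \neq j} r_{ij}^{2}}
     {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}
\]

% Per-variable measure of sampling adequacy for variable j:
% the same ratio, summing only over the pairs that involve j.
\[
\mathrm{MSA}_{j} \;=\;
\frac{\sum_{i \neq j} r_{ij}^{2}}
     {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}
\]
```

When the partial correlations are small relative to the raw correlations, most of the shared variance runs through common factors and the ratio approaches 1.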

Model 1

So, I keep all six manifest variables to construct a model that will estimate the infrastructure index. In the first model, the Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) are both above the 0.95 threshold, which indicates an appropriate fit. However, the Root Mean Square Error of Approximation (RMSEA) is 0.099, above the maximum threshold of 0.08.

Infrastructure Index Model 1: fit estimates
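As a reminder of what the RMSEA measures, a common textbook form of the point estimate is given below. The model chi-square and degrees of freedom are not reported in the text, so no values are filled in; the symbols (chi-square statistic, degrees of freedom df, sample size N) are generic notation, and some software uses N in place of N − 1 in the denominator.

```latex
\[
\mathrm{RMSEA} \;=\;
\sqrt{\frac{\max\!\left(\chi^{2} - df,\; 0\right)}{df\,(N - 1)}}
\]
```

The statistic penalizes misfit per degree of freedom, so freeing a parameter improves it only if the chi-square falls by more than enough to offset the lost degree of freedom.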

Changing model specifications

Since the fit of the first model is not satisfactory, I change the specifications of the model based on the modification indices and theoretical considerations. A modification index measures how much the overall model chi-square would be expected to decrease if a particular parameter were freely estimated in the model. In other words, it suggests how to improve the model fit. If we look at the modification indices between our variables, we notice that the relationship between sdg8_bank and sdg9_internet is disproportionately stronger than any other in our data. This is most likely due to a strong correlation between the indicator for internet penetration and the percentage of the adult population who have bank accounts.

Modification Index table (descending order)

Model 2

So, I add a special path to my model, which accounts for the dependency between ‘SDG 8 bank accounts’ and ‘SDG 9 Internet’. I also create a path between ‘SDG 6 Water’ and ‘SDG 6 Sanitation’ because the academic literature suggests that there is usually a strong dependency between these two variables. When I check the fit indices, the new model with two special paths performs much better than the previous one. Both the CFI and TLI values are above 0.99, and the RMSEA has decreased to 0.056, which is below the 0.08 threshold.

Infrastructure Index Model 2: fit estimates

The Infrastructure Index

I use the model to estimate the infrastructure index for all 141 countries in our dataset. I also use the min-max normalization technique to transform the index to a scale from 0 to 1. This method rescales each value by subtracting the minimum value and dividing by the range of the original values (i.e., the difference between the maximum and minimum values). The scores therefore keep their rank order and relative spacing, although the variance is rescaled along with the range.
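A minimal sketch of the min-max step in plain Python is shown below. The function name `min_max_scale` and the input scores are hypothetical; the raw factor scores from the model are not listed in the text.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


# Made-up factor scores for four countries (illustration only).
raw_scores = [-1.5, -0.5, 0.0, 2.5]
scaled = min_max_scale(raw_scores)
print(scaled)  # [0.0, 0.25, 0.375, 1.0]
```

Because the transformation is linear, the ordering of countries and the ratios between gaps are unchanged; only the units change.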

Estimating the Impact on the SDG Index

Both the Human Capital Index and the Infrastructure Index are ratio-level measures. When both independent variables are ratio-level measures, “regression and correlation analysis are the standard techniques for measuring relationships and testing hypotheses.”[10] My main aim is to test whether human capital or infrastructure has a bigger impact on the overall SDG Score of a country. So, I construct a multivariate regression model with HCI and the Infrastructure Index as independent variables and then explore the beta coefficients of the model to understand which index has a stronger effect on the SDG Score.

III. RESULTS

Summary of the Multivariate Regression Model

Multivariate Regression Model

Besides the Human Capital Index and Infrastructure Index, I also include the Statistical Performance Index as an explanatory variable in my model. It helps to account for some of the possible shortcomings in the data. All three variables and the model as a whole have a very high level of statistical significance, with reported p-values that round to 0. The R-squared value is not very important here because this is a descriptive rather than a predictive model. In any case, the multiple R-squared value is 0.91, which means that approximately 91% of the variability in the outcome variable can be explained by the predictor variables.

Model assessment: regression diagnostics

1. Test for Linearity

Before I proceed further with my analysis of the findings, I need to test the assumptions to validate that a linear regression model was a suitable approach. First, I look for linearity and equal variance in the below Residuals vs Fitted plot. Upon visual examination, there are no substantial deviations in the red line, which confirms that the relationship is linear between our explanatory and response variables.

2. Test for homoscedasticity

In the below plot, we can also observe that the vertical spread of the residuals is equally distributed, which means the error term does not vary much as values of the outcome variable change. So, our model passes the test for homoscedasticity as well.

3. Testing for Independence of residuals

Based on the observations from the “Residuals vs Leverage” plot below, our model passes the test of independence of residuals as well. Large residual values on this plot would suggest that the model is not explaining some aspects of the data. Our model does not have any standardized residual values above 1. In the R programming language, I double-checked and confirmed that no observations have a Cook’s distance above 1.

4. Testing for Normality of the error distribution

We can tell whether the error terms are normally distributed based on the observations from the below Q-Q plot. We want the residuals to be as close to the diagonal line as possible. However, generally, we rarely have real data where errors are perfectly normally distributed. So, some deviations are expected, and overall, it seems like our model passes the normality test. However, to double-check, I also apply the Shapiro-Wilk test.

The null hypothesis for the Shapiro-Wilk test is that the data are normally distributed. In this case, the p-value for the Shapiro-Wilk test is well above the significance level, which means that we cannot reject the null hypothesis; the residuals are consistent with a normal distribution.

5. VIF Score

Last but not least, since we are dealing with a multiple linear regression model, we need to make sure there is no problematic multicollinearity. So, we apply the VIF Score test. “A rough rule of thumb is that variance inflation factors [VIF] greater than 10 give some cause for concern.” (Vehkalahti & Everitt, 2020, p. 93) As we can see from the table below, the VIF scores for all three of our independent variables are below 5. These scores indicate some multicollinearity but are safely within an acceptable range.

VIF Scores:
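For context, the variance inflation factor of predictor j is defined from the R-squared of an auxiliary regression of that predictor on all the other predictors. The notation below is generic rather than taken from the original output.

```latex
\[
\mathrm{VIF}_{j} \;=\; \frac{1}{1 - R_{j}^{2}}
\]

% Worked thresholds:
%   VIF = 5   corresponds to   R_j^2 = 1 - 1/5  = 0.80
%   VIF = 10  corresponds to   R_j^2 = 1 - 1/10 = 0.90
```

A VIF below 5 therefore means that less than 80% of the variance in each predictor is explained by the other two, which is why the scores point to some overlap between the indices without reaching the rule-of-thumb level of concern.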

Beta coefficient analysis

After we have confirmed that the model meets all 5 assumptions of a multivariate regression model, we can proceed with the analysis of the model. In order to estimate the impact of each individual variable on the SDG Index, we can look at the beta coefficients. The standardized beta coefficients allow us to compare the effects of the variables on the same scale, regardless of the units of measurement. Below are the beta coefficients of our linear multivariate regression model. We notice that the beta coefficient for hci_ind (Human Capital) is larger than the coefficient for infr_ind (Infrastructure). This suggests that Human Capital has a stronger impact on the output variable, the SDG Index.

Beta coefficients of the Multivariate Regression model
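To illustrate how standardized betas put predictors on a common scale, here is a small sketch in plain Python. It is a simplification under stated assumptions: it uses only two predictors (the model in the text has three), relies on the closed-form two-predictor solution expressed through Pearson correlations, and runs on made-up numbers. The function names `pearson` and `standardized_betas` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_betas(y, x1, x2):
    """Standardized OLS coefficients for a regression of y on x1 and x2."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Made-up data: the outcome is built as 2*x1 + x2, with x1 and x2 uncorrelated.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
y = [2 * a + b for a, b in zip(x1, x2)]

beta1, beta2 = standardized_betas(y, x1, x2)
print(round(beta1, 3), round(beta2, 3))
```

In this toy case the first predictor gets the larger standardized beta because it accounts for more of the outcome's standard deviation, which is the kind of comparison made between hci_ind and infr_ind above.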

However, we also need to make sure the difference between the two beta coefficients is statistically significant. I run the below linear hypothesis test, which is based on the null hypothesis that there is no difference between the effects of the two indices: hci_ind and infr_ind.

Linear hypothesis test

The associated p-value (Pr(>F) = 0.0001545) is far below 0.05, indicating strong evidence to reject the null hypothesis that the coefficients for hci_ind (Human Capital Index) and infr_ind (Infrastructure Index) are the same. So, the data provides strong evidence that the effect of Human Capital on the sdg_ind (SDG Index) is different from the effect of Infrastructure (infr_ind) on sdg_ind.
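The text does not show the routine behind this test. One common form for a single equality restriction is the Wald-type F statistic sketched below; the notation (estimated coefficients with hats, their estimated variances and covariance, n observations, k predictors) is generic and is offered as an assumption about the kind of test used, not as the author's exact computation.

```latex
% H_0: \beta_{\mathrm{hci}} = \beta_{\mathrm{infr}}
\[
F \;=\;
\frac{\left(\hat{\beta}_{\mathrm{hci}} - \hat{\beta}_{\mathrm{infr}}\right)^{2}}
     {\widehat{\mathrm{Var}}\!\left(\hat{\beta}_{\mathrm{hci}}\right)
      + \widehat{\mathrm{Var}}\!\left(\hat{\beta}_{\mathrm{infr}}\right)
      - 2\,\widehat{\mathrm{Cov}}\!\left(\hat{\beta}_{\mathrm{hci}},
                                          \hat{\beta}_{\mathrm{infr}}\right)}
\;\sim\; F_{1,\; n-k-1} \quad \text{under } H_0
\]
```

The covariance term matters here because the two indices are correlated, so their coefficient estimates are not independent.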

Next, I explore the relationship between Human Capital Index, Infrastructure Index and specific Sustainable Development Goals: SDG 1: No Poverty; SDG 5: Gender Equality; and SDG 13: Climate Action. I construct a multivariate multiple regression model with three left-hand variables, indicators for SDG 1, SDG 5, and SDG 13.

Response SDG 1:

Based on the initial observation of the model summary, we can conclude that both human capital and infrastructure have a significant effect on poverty. However, we will need to explore further if there is a statistical difference between the effects of the two variables. Upon closer examination of the two beta coefficients, we find no statistically significant difference between the effects of the two explanatory variables.

Linear hypothesis test

Response SDG 5:

When we look at the response for SDG 5, we notice that the Human Capital Index has a statistically significant impact on SDG 5, whereas the Infrastructure Index does not. The magnitude of the coefficient for hci_ind (52.15) is also larger than that of the coefficient for infr_ind (-8.30). Based on these observations, we can conclude that there is a statistical difference in the effects of human capital and infrastructure on SDG 5.

Response SDG 13:

The summary of the response for SDG 13 suggests that, once again, infrastructure does not have a statistically significant effect, whereas the impact of human capital is significant. So, we can claim that human capital has a stronger statistical association with SDG 13: Climate Action than infrastructure does. However, we should also note that the coefficients are negative, which means there is a negative association between human capital and SDG 13. This is consistent with the basic correlations of the indicators in our dataset (please see the Correlation Matrix table). I discuss these findings further in the conclusion.

Correlation Matrix

IV. CONCLUSION

Our findings confirm once again that both human capital and infrastructure are essential for the sustainable development of countries. Both are fundamentally important indicators of a country’s level of development. That said, based on our results, we can reject the null hypothesis that there is no statistical difference in the effects of human capital and infrastructure on a country’s SDG Index. The statistical analysis suggests a stronger interdependency between human capital and the SDG Index than between infrastructure and the SDG Index. So, policymakers facing the dilemma of choosing between investments in human capital and infrastructure should prioritize human capital if their goal is to advance the overall sustainable development agenda in the country.

However, we also found that the Human Capital Index and the Infrastructure Index may have different levels of impact on specific objectives within the UN SDG framework. We found that human capital is a statistically significant indicator of a country’s performance on SDG 5: Gender Equality, whereas infrastructure is not. We also established that while both indicators have a significant impact on a country’s performance on SDG 1: No Poverty, there is no statistically significant difference between the effects of human capital and infrastructure on the poverty levels of a country. Last but not least, we found that, compared to infrastructure, there is a stronger interdependency between human capital and SDG 13: Climate Action. However, the correlation between human capital and a country’s performance on climate indicators is negative. This should not come as a surprise because developed countries with higher Human Capital Index scores have a far larger carbon footprint than developing countries.[11] It is another reminder that developed countries should transition to more sustainable solutions.

V. WORKS CITED

Guterres urges countries to recommit to achieving SDGs by 2030 deadline. (2023, April 25). UN News. https://news.un.org/en/story/2023/04/1136017

Johnson, J. B., & Joslyn, R. A. (1991). Political Science Research Methods: Second Edition. Congressional Quarterly Inc.

Merriam-Webster. (n.d.). Infrastructure. In Merriam-Webster.com dictionary. https://www.merriam-webster.com/dictionary/infrastructure

Sachs, J.D., Lafortune, G., Kroll, C., Fuller, G., Woelm, F. (2022). From Crisis to Sustainable Development: the SDGs as Roadmap to 2030 and Beyond. Sustainable Development Report 2022. https://dashboards.sdgindex.org/downloads

The Investopedia Team. (2023, February 7). Infrastructure: Definition, Meaning, and Examples. Investopedia. https://www.investopedia.com/terms/i/infrastructure.asp

World Bank Group. (2023). The Human Capital Project: Frequently Asked Questions. In World Bank. https://www.worldbank.org/en/publication/human-capital/brief/the-human-capital-project-frequently-asked-questions

The World Bank Group. (2020, September 23). Data Catalog. Human Capital Index. https://datacatalog.worldbank.org/search/dataset/0038030 

The world’s top 1% of emitters produce over 1000 times more CO2 than the bottom 1% – Analysis – IEA. (n.d.). International Energy Agency. https://www.iea.org/commentaries/the-world-s-top-1-of-emitters-produce-over-1000-times-more-co2-than-the-bottom-1

United Nations. (n.d.). The 17 goals: Sustainable Development. United Nations. https://sdgs.un.org/goals

Vehkalahti, K., & Everitt, B. S. (2020). Multivariate Analysis for the Behavioral Sciences: Second Edition. CRC Press.

[1] Guterres urges countries to recommit to achieving SDGs by 2030 deadline. (2023, April 25). UN News.

[2] Sachs, J.D., Lafortune, G., Kroll, C., Fuller, G., Woelm, F. (2022). From Crisis to Sustainable Development: the SDGs as Roadmap to 2030 and Beyond. Sustainable Development Report 2022.

[3] United Nations. (n.d.). The 17 goals: Sustainable Development. United Nations. https://sdgs.un.org/goals

[4] Ibid

[5] The Investopedia Team. (2023, February 7). Infrastructure: Definition, Meaning, and Examples. Investopedia.

[6] World Bank Group. (2023). The Human Capital Project: Frequently Asked Questions.

[7] Merriam-Webster. (n.d.). Infrastructure. In Merriam-Webster.com dictionary.

[8] United Nations. (n.d.). The 17 goals: Sustainable Development. United Nations.

[9] Vehkalahti, K., & Everitt, B. S. (2020), p. 295

[10] Johnson, J. B., & Joslyn, R. A. (1991), p. 319.

[11] The world’s top 1% of emitters produce over 1000 times more CO2 than the bottom 1% – Analysis – IEA. (n.d.). International Energy Agency

Impact of climate indicators on the carbon footprint of data centers

December 14, 2022 | June 7, 2024 | hpanahov

by Huseyn Panahov and Ryan Powers

1. Introduction

Carbon emissions are usually associated with the fossil fuel and transportation industries, yet our online activities also have a significant carbon footprint. It may seem counterintuitive, but data centers account for around 2% of all global greenhouse gas emissions. This is roughly in line with the global airline industry, and not far behind the chemical and petrochemical industry. In parallel with the digital revolution, the demand for data centers continues to increase. While many industry leaders in the data center business have pledged to reach zero carbon emissions by 2030, these server farms still need gigantic amounts of energy to operate. In this research we collected data on 41 data centers owned by Google and Oracle. We looked primarily at the power efficiency of the data centers and the climate indicators of their local geography. Our findings show that every 10-degree Fahrenheit drop in temperature translates to a 0.006-point improvement in the Power Usage Effectiveness (PUE) of the data centers. (A PUE of 1.0 is ideal, whereas globally most data centers have a PUE around 1.8.)
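To make the headline number concrete, the short sketch below computes PUE from its usual definition (total facility energy divided by IT equipment energy) and applies the 0.006-per-10°F slope quoted above. The energy figures and function names are hypothetical, and extending the slope linearly to other temperature differences is an assumption made for illustration, not a result reported here.

```python
# Slope quoted in the text: PUE improves by 0.006 for every 10 °F drop.
PUE_CHANGE_PER_10F = 0.006


def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy over IT equipment energy."""
    return total_facility_kwh / it_equipment_kwh


def pue_after_cooling(baseline_pue, temp_drop_f):
    """Estimated PUE after a temperature drop, assuming the slope is linear."""
    return baseline_pue - PUE_CHANGE_PER_10F * (temp_drop_f / 10)


# Made-up example: 1,100 kWh drawn by the facility for 1,000 kWh of IT load.
baseline = pue(1100.0, 1000.0)
estimate = pue_after_cooling(baseline, temp_drop_f=20)
print(round(baseline, 3), round(estimate, 3))
```

A PUE of 1.1 means that 10% of the IT load is spent on overhead such as cooling; under the stated assumption, a site 20°F cooler would trim that overhead by roughly 1.2 percentage points.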

2. Background

There are 2,749 data centers from nearly 3,000 service providers in the United States, and about 5,000 more around the rest of the world. With no alternative technology on the horizon, data centers are here to stay and will continue to grow in number. The three most important factors affecting data center energy efficiency are design, power source, and climate. Data center design and, more importantly, equipment age affect power consumption, as older servers and cooling systems operate at lower efficiency. Power typically comes from a combination of renewable and non-renewable energy sources, and facilities that derive a greater share of power from renewable sources are more efficient. Most state-of-the-art facilities built by the largest providers (Google, Oracle, Facebook, etc.) run up to 100% on renewable energy. This is not the case for the data center population as a whole. Finally, climate affects energy efficiency predominantly because cooler, more temperate climates place less demand on a data center’s cooling system.

We sought to measure the energy efficiency of data centers while accounting for external climate factors like wind, temperature, and precipitation. Cooling processes that regulate server temperature are the most energy intensive, and our hypothesis was that data centers in colder climates would show more efficient energy consumption than those in hotter climates. Next, we present our methodology, data analysis, results, and areas for further research.

3. Methodology

While there are thousands of data centers around the world, most of them do not share information about their energy consumption. We were fortunate to find open information about 22 data centers operated by Google and 19 operated by Oracle. These two companies are tech industry leaders, and they operate very energy-efficient data centers. This suggests that the impact of local climate factors is likely even more significant for an average data center than for those in our study.

Data centers require large amounts of energy and electricity to power and cool the servers. Consequently, choosing the right location for a data center is a complex task, which requires consideration of local temperatures, power infrastructure, environmental architecture, in addition to business factors such as land price, legal environment, and skilled workforce.

A number of factors influence a company’s choice of location for a data center. Below are some of the most important ones:

Table 1: Decision-making factors for choosing a data center location

Non-environmental factors	Description
1. Availability of trained workforce	On average, a large data center employs between 50 and 500 people. It usually needs a trained workforce that can operate the technology and respond to emergencies.
2. Proximity to the customer base	The shorter the distance between the data center and the main customer base, the lower the chance of incidents along the route.
3. Availability and price of land	Large data centers usually require between 100,000 and 5,000,000 square feet of land.
4. Tax privileges	On average, tech companies invest between $300 million and $3 billion to construct a large data center. These projects provide both short-term employment during the construction phase and long-term jobs after the launch.
5. Security	Are there conflicts or other security vulnerabilities in the area?
6. Rule of law	Can tech companies rely on fair judicial procedures?
7. Energy infrastructure	This can be both environmental and non-environmental, but data centers need large amounts of electric power to remain operational 24/7.

Environmental factors	Description
1. Energy infrastructure	Does the existing energy infrastructure rely on renewable power sources or fossil fuels?
2. Potential for producing renewable energy	Wind speed, sunny days, precipitation
3. Water resources	Besides energy, operating a large data center also requires access to large amounts of water. Water Usage Effectiveness (WUE) is the industry metric for how efficiently a data center uses water resources.
4. Average Temperature	Average temperature at the location
5. Temperature variance	How much the temperature varies over different time intervals

Our study focuses solely on environmental factors, specifically how local climate conditions affect a data center’s power efficiency. We are looking for empirical evidence that data centers located in colder climates have higher power efficiency. Building on this analysis, we then recommend which climate zones would be optimal locations for large data centers.

Every year, an increasing number of tech companies release sustainability reports that analyze and summarize the environmental impact of their business operations. However, most companies offer only aggregate numbers and do not make publicly available the datasets that shape those analyses. Some big tech companies, such as Amazon and Microsoft, do not even share the locations of their data centers due to safety considerations. Consequently, data availability was one of the main factors that shaped this research.

In our project we look at the data centers of two multinational tech companies, Google and Oracle. They have made publicly available both the locations of their data centers and the Power Usage Effectiveness (PUE) indicator for each one. PUE is the industry metric for estimating the power efficiency of a data center. A lower PUE means better power efficiency. The lowest possible PUE is 1.0, which means 100% power efficiency. For most data centers the PUE varies between 1.2 and 3.0, and the industry average is 1.8.
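As a minimal sketch of how the metric works, assume the standard definition of PUE as total facility energy divided by the energy delivered to IT equipment. The function names and the sample figures below are illustrative and are not taken from the Google or Oracle reports.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be below IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of total energy spent on cooling, power delivery, and other overhead."""
    return 1.0 - 1.0 / pue_value


# Hypothetical facility: 1,800 kWh drawn in total, 1,000 kWh reaching the servers.
example = pue(1800.0, 1000.0)
print(round(example, 2))                  # 1.8
print(round(overhead_share(example), 3))  # 0.444
```

Read this way, a PUE of 1.8 means that a little under half of the energy drawn by the facility never reaches the servers, while a PUE of 1.0 would mean no overhead at all.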

We collected PUE indicators for 38 data centers owned and operated by Google and Oracle and spread across 16 countries and 14 US states. Next, we looked up various climate indicators for each location at the county or city level. The result is a dataset with 17 data points for each location, which account for local temperature variance, seasonal temperatures, average temperatures, precipitation, wind speed, cloudiness, and solar power potential. Table 2 lists our variables:

Table 2: List of variables

#	Variable	Description
1	State	Country or US state where the data center is located
2	Database location	Location of the data center
3	Company	Company that owns the data center
4	PUE	Power Usage Effectiveness
5	Temp_variance	The difference between highest and lowest temperatures (max of high monthly average – min of low monthly average) in a given location * *
6	Temp_annual	Average annual temperature **
7	Temp_halfyear_warm	Average temperature Apr – Sep (6 months)
8	Temp_halfyear_cold	Average temperature Oct – Mar (6 months)
9	Temp_winter	Average temperature for Dec – Jan – Feb
10	Temp_spring	Average temperature for Mar – Apr – May
11	Temp_summer	Average temperature for Jun – Jul – Aug
12	Temp_fall	Average temperature for Sep – Oct – Nov
13	Rain_annual	Sum of monthly rain averages. Measured in inches
14	WindSpeed_annual	Average of monthly wind speeds. Measured in mph
15	SolarPower_annual	Average Daily Incident Shortwave Solar Energy for the whole year. Measured in kWh
16	SolarPower_summer	Average Daily Incident Shortwave Solar Energy for Apr – Sep
17	SolarPower_winter	Average Daily Incident Shortwave Solar Energy for Oct – Mar
18	Cloudy_annual	% of the time the weather is cloudy in a year
19	Cloudy_summer	% of the time the weather is cloudy in warmer months: Apr – Sep
20	Cloudy_winter	% of the time the weather is cloudy in colder months: Oct – Mar
* All temperatures are measured in Fahrenheit.
** For Australia and Chile, the data points were flipped.
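The temperature variables in Table 2 can be derived from twelve monthly values per location. The sketch below is illustrative only: the monthly figures are made up, the function names are hypothetical, and it assumes that "flipped" means shifting the series by six months so that the warm and cold half-years line up across hemispheres.

```python
from statistics import mean

# Month indices (0 = January) for each aggregate, Northern Hemisphere.
WARM_HALF = [3, 4, 5, 6, 7, 8]      # Apr - Sep
COLD_HALF = [9, 10, 11, 0, 1, 2]    # Oct - Mar
WINTER = [11, 0, 1]                 # Dec - Feb


def flip_hemisphere(monthly):
    """Shift a 12-month series by six months (assumed handling of southern sites)."""
    return monthly[6:] + monthly[:6]


def temperature_features(avg, high, low, southern=False):
    """Derive a few Table 2 temperature variables from monthly averages (Fahrenheit)."""
    if southern:
        avg = flip_hemisphere(avg)
    return {
        "Temp_variance": max(high) - min(low),  # unaffected by the six-month shift
        "Temp_annual": mean(avg),
        "Temp_halfyear_warm": mean(avg[m] for m in WARM_HALF),
        "Temp_halfyear_cold": mean(avg[m] for m in COLD_HALF),
        "Temp_winter": mean(avg[m] for m in WINTER),
    }


# Made-up monthly values for one hypothetical northern site.
avg = [30, 34, 42, 52, 62, 71, 76, 74, 66, 54, 43, 33]
high = [38, 42, 51, 62, 72, 81, 85, 83, 75, 63, 51, 41]
low = [22, 25, 33, 42, 52, 61, 66, 64, 56, 45, 35, 26]
features = temperature_features(avg, high, low)
print(features["Temp_variance"])  # 63
```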

4. Data Analysis

4.1 Statistical descriptions

In our dataset we have 20 variables, of which 17 are numeric and 3 are strings. We do not have any missing values, because we constructed this dataset by hand. Let us look at basic statistical descriptions of our numeric variables.

Table 3: Statistical description of numeric variables

Based on these descriptions we can tell that climate conditions across the data centers in our dataset are quite diverse. For example, annual rainfall varies between 8 and 73 inches depending on the location. The annual wind speed in these locations varies between 5 mph and 14 mph. The annual temperature varies between 42 and 82 degrees Fahrenheit.

The average temperature across our locations is 59 degrees Fahrenheit. However, an average data center houses about 100,000 servers, each emitting 1,200 BTU of heat per hour, which would raise the indoor temperature by 213 degrees Fahrenheit without a proper Heating, Ventilation and Air Conditioning (HVAC) system. The optimal ambient temperature for most technologies, including servers in data centers, has generally been considered to be 68-75 degrees Fahrenheit.^[1] More recently, some companies have introduced new servers with a higher heat tolerance of 81 degrees Fahrenheit.^[2] All things considered, it is reasonable to assume that the optimal indoor temperature for an average data center today is 72 degrees Fahrenheit. So, even in the coldest locations, electric power is needed to cool the interior as well as to power the technology.
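To give a sense of scale, the server count and per-server heat output quoted above can be combined into a total heat load. The short calculation below only restates that load in more familiar units. It does not attempt to reproduce the 213-degree figure, which depends on building volume and airflow that are not given here. The only outside value is the standard conversion of 1 BTU per hour to roughly 0.293 watts.

```python
BTU_PER_HOUR_TO_WATTS = 0.29307107  # 1 BTU/h expressed in watts

servers = 100_000            # servers in an average data center (from the text)
btu_per_server_hour = 1_200  # heat emitted per server per hour (from the text)

total_btu_per_hour = servers * btu_per_server_hour
total_megawatts = total_btu_per_hour * BTU_PER_HOUR_TO_WATTS / 1e6

print(f"{total_btu_per_hour:,} BTU/h")  # 120,000,000 BTU/h
print(f"{total_megawatts:.1f} MW")      # 35.2 MW
```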

As we can see, the PUE values in our dataset vary from 1.06 to 1.78, while the average PUE is 1.26. So, the average PUE in our dataset is about 30% lower than the industry average of 1.8. This means that the dependence on climate factors is likely to be higher for data centers in general than for those in our dataset, because higher power efficiency (lower PUE) also means relatively less dependence on climate.
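The 30% figure follows directly from the two averages. The sketch below also compares overhead energy, assuming the standard reading that PUE minus 1 is the overhead spent per unit of IT energy. That second comparison is our illustration and not a result reported in the study.

```python
sample_pue = 1.26    # average PUE in the dataset (from the text)
industry_pue = 1.80  # industry average PUE (from the text)

# Relative gap between the dataset average and the industry average.
relative_gap = (industry_pue - sample_pue) / industry_pue

# Overhead energy per unit of IT energy is PUE - 1 (assumed standard reading).
overhead_ratio = (industry_pue - 1.0) / (sample_pue - 1.0)

print(f"{relative_gap:.0%}")     # 30%
print(f"{overhead_ratio:.1f}x")  # 3.1x
```

In other words, an average facility spends roughly three times as much overhead energy per unit of computing as the facilities in our sample, which is where climate-driven cooling costs would show up.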

Picture 1: PUE distribution

4.2 Correlations

Now, let us look at the correlations between our numeric variables, which we can display as a heatmap. There is a strong negative correlation of -0.73 between SolarPower_annual and Cloudy_annual, which supports the credibility of our dataset: naturally, cloudy weather and the potential for solar power should be negatively correlated. Most importantly, however, we want to check the correlations between PUE and the other variables.

We want to identify which variables correlate most strongly with PUE. We notice that there is a positive correlation between PUE and the annual average temperature (Temp_annual). There is an even stronger correlation, at the level of 0.4, between PUE and Temp_halfyear_cold (the average temperature for October through March).
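The ranking step described here can be sketched with a plain Pearson correlation. The values below are made up for five hypothetical sites and are not the study's data. The helper names are ours, and the study itself may have used a statistics library to do the same thing.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_with(target, columns):
    """Correlate every column with the target, strongest absolute value first."""
    pairs = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up values for five sites, NOT the study's data.
pue_values = [1.10, 1.12, 1.20, 1.25, 1.40]
columns = {
    "Temp_halfyear_cold": [35, 40, 48, 55, 70],
    "WindSpeed_annual": [9, 12, 8, 11, 10],
}
for name, r in correlations_with(pue_values, columns):
    print(f"{name}: {r:+.2f}")
```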

Picture 2: Heat-map of numeric variables

There is also a moderately strong relationship at the level of 0.4 between Temp_winter (average temperature for December, January, February) and PUE. If we look at this relationship separately, we notice that if the average winter temperature in a given location is 30 degrees Fahrenheit or below, then the PUE is most likely to be below 1.2.

Picture 3: PUE vs Winter temperature

4.3 Building the model

For our final model, we picked only one independent variable: the temperature for the cold half of the year. We could have used more variables, but because the temperature variables are closely related to one another, this could introduce multicollinearity and undermine the reliability of our analysis. We built two models: a linear regression and a decision tree model.

Picture 4: Linear and Decision Tree models

Picture 4 above shows the outputs of our models. The orange line represents the linear regression model, while the red dots represent the outputs of the decision tree model. Below are the performance indicators of our models. Generally, we want to pick the model with the lower Root Mean Square Error (RMSE) and the higher r-squared. In this regard, the decision tree model performs much better than the linear model. However, decision tree models tend to overfit, and we could not check for this on held-out data: because we have a very small dataset, we did not split it into training and test sets.

R-squared is not as important in this case, because we are not building a predictive model, and the difference between the RMSE values is not very significant, so we could choose either one of the models.
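The comparison above can be sketched without any external library. The code below fits an ordinary least squares line and, as a deliberately simplified stand-in for a decision tree, a one-split regression tree (a "stump"), then scores both with RMSE and r-squared on the same data they were fitted to. The data points and function names are hypothetical. A single split is far less flexible than a full decision tree, so the numbers will not reproduce the comparison reported here.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_stump(xs, ys):
    """One-split regression tree: pick the threshold minimizing squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def rmse(model, xs, ys):
    """Root mean square error of the model's predictions."""
    return sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def r_squared(model, xs, ys):
    """Share of the variance in ys explained by the model."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up (temperature, PUE) pairs for illustration, NOT the study's data.
temps = [28, 33, 41, 47, 52, 58, 63, 70]
pues = [1.09, 1.11, 1.12, 1.18, 1.25, 1.22, 1.30, 1.41]
for name, model in (("linear", fit_line(temps, pues)), ("stump", fit_stump(temps, pues))):
    print(name, round(rmse(model, temps, pues), 3), round(r_squared(model, temps, pues), 3))
```

Because both scores are computed on the training data, a more flexible model will always look at least as good. That is the overfitting concern raised above.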

5. Conclusions

If we look up the coefficients of the linear regression model, we get the following numbers:

This means that the relationship between PUE and Temp_halfyear_cold is as follows:

PUE = 0.943 + 0.006 x [Temp_halfyear_cold]

Based on this formula, every 10-degree Fahrenheit decrease in the temperature for the cold half of the year corresponds to a 0.06 decrease in PUE. An average cold half-year temperature of 10 degrees Fahrenheit would mean PUE = 1.003 (0.943 + 0.006*10), a near-perfect level of power efficiency. However, we understand that there are few places on earth with such low temperatures, and they might not be the best locations for data centers for a number of other reasons, discussed above.
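As a quick check of the arithmetic, the fitted equation can be evaluated at a few temperatures. The coefficients are the rounded values given above, so the outputs are approximate. The function name and the chosen temperatures are ours.

```python
INTERCEPT = 0.943  # from the fitted equation in the text
SLOPE = 0.006      # PUE change per degree Fahrenheit, from the text


def predicted_pue(temp_halfyear_cold_f: float) -> float:
    """Evaluate PUE = 0.943 + 0.006 * Temp_halfyear_cold."""
    return INTERCEPT + SLOPE * temp_halfyear_cold_f


for temp in (10, 30, 50, 70):
    print(temp, round(predicted_pue(temp), 3))
# 10 1.003
# 30 1.123
# 50 1.243
# 70 1.363
```

Each 10-degree step moves the predicted PUE by 0.06, which is the slope multiplied by ten.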

This analysis provides empirical evidence that data centers have better power efficiency and a lower carbon footprint in climates with lower temperatures. It shows that there is a moderately strong relationship between winter temperatures and PUE, and that every 10-degree increase in temperature corresponds to an increase of about 0.06 in PUE, that is, to lower power efficiency.

6. Limitations and future research

Our research was limited by the data we could access. Oracle and Google are two large companies that happen to report PUE metrics uniformly; most companies do not. Because of this, we could not compare them to peers such as Facebook, Equinix, and Microsoft. Furthermore, Oracle and Google are already committed to sustainable data centers, and we were unable to incorporate companies with perhaps less sustainable practices into our dataset.

The PUE metric could also be considered a limitation. It is a metric designed for easy reporting and industry comparison, rather than true efficiency measurement. The input data for the calculation can and does vary company to company, given that no industry regulation mandates how it is measured and reported.

Future research could explore many different avenues. First, our model was not predictive because our dataset was so small. With a larger dataset, one could predict the optimal climate for a data center. From there, one could measure the differences in PUE and carbon emissions that would result from relocating a data center to a more optimal location. Additionally, with more companies represented in the data, one could control for variables like market share, capital expenditures, and investments in renewable energy. Finally, a more robust analysis could identify a metric superior to PUE for measuring and comparing data center efficiency.

7. Sources

Ambient Temperature and Why it Matters for Data Centers. (2022, December 1). History-Computer. https://history-computer.com/ambient-temperature-and-why-it-matters-for-data-centers/

Benoit, R. (2022, February 9). An Updated Look at Data Center Temperature and Humidity. AVTECH. https://avtech.com/articles/4957/updated-look-recommended-data-center-temperature-humidity/

Google. (n.d.). Data Centers. Google. Retrieved December 14, 2022, from https://www.google.com/about/datacenters/

Oracle Cloud Data Center regions and locations. Oracle. (n.d.). Retrieved December 14, 2022, from https://www.oracle.com/cloud/cloud-regions/data-regions/

Siddik, M. A., Shehabi, A., & Marston, L. (2021). The environmental footprint of data centers in the United States. Environmental Research Letters, 16(6), 064017. https://doi.org/10.1088/1748-9326/abfba1

The Weather Year Round Anywhere on Earth – Weather Spark. (n.d.). https://weatherspark.com

United States of America: Data center market overview. Cloudscene. (n.d.). Retrieved December 14, 2022, from https://cloudscene.com/market/data-centers-in-united-states/all

[1] Ambient Temperature and Why it Matters for Data Centers. (2022, December 1). History-Computer.

[2] Benoit, R. (2022, February 9). An Updated Look at Data Center Temperature and Humidity. AVTECH.

The Evidence-Based Policymaking Act and Privacy

June 1, 2022 / May 26, 2025 / hpanahov

Abstract:

The legislative act on The Foundations of Evidence-Based Policymaking created a framework for the centralization of statistical information collected by dozens of US federal agencies across the country and imposed responsibilities for sharing that data within the government, as well as with researchers and private entities. One of the main outcomes of the act is expected to be a National Secure Data Service, which will promote collaboration, help to avoid duplication, and minimize public expenditure on data collection and processing. Most importantly, it will improve government efficiency by restructuring the national statistical ecosystem to better inform policy decisions. However, the centralization of federal data envisioned by EBPA creates new privacy risks and vulnerabilities, which is why a similar idea of a National Data Center was rejected in Congress in the 1960s. Back then, the debate around data centralization ended with the passing of the Privacy Act of 1974. Half a century later, the data centralization idea has been approved, but no changes were made to the privacy legislation. This paper argues that, while EBPA is a positive step forward, it needs additional privacy safeguards that could be provided by revising the Privacy Act of 1974, which was last updated in 1988.

1. Background

This section looks at why the government collects data, how its institutional and technical capacity to process data has changed over time, and its consequential impact on the public debate around privacy.

1.1 Purpose

The core idea behind the Foundations of Evidence-Based Policymaking Act of 2018 (EBPA) is to create metrics for analyzing the government’s policy decisions and thus improve the federal government’s effectiveness. According to title 44 of the U.S. Code, the term evidence means “information produced as a result of statistical activities conducted for a statistical purpose” (44 USC 3561: Definitions). However, not all statistics are the same, and relying on bad data can do more harm than good. So, EBPA intends to increase not only the quantity of data supplied for informing policy decisions but also the quality.

1.2 Why do governments collect data?

The collection of certain data types is essential for a government to carry out its basic functions. As far back as five to six thousand years ago, ancient governments in Babylonia and Egypt collected some primitive forms of census data. Early governments needed the census data mainly for taxation and military recruitment. However, with the emergence of democratic states, census data became a crucial element of political representation. In the United States, holding a decennial census is embedded in the Constitution. Article 1, Section 2 of the Constitution mentions that “the actual Enumeration shall be made within three Years after the first Meeting of the Congress of the United States and within every subsequent Term of ten Years” (The National Constitution Center). The Nation’s Founders intended to equally divide the seats in Congress among the States and their populations. The initial benchmark was one Congress representative for every 30 thousand residents (Gauthier). (Today, that number hovers around 700,000.) Consequently, census data was necessary to advance democratic governance.

1.3 Changing capacity to process data

However, producing quality data did not come easily, as it requires institutional capacity building, training of professional staff, a certain level of public awareness, and resources to provide for all this. The first census in the U.S. took place in 1790 and counted the total population as 3,929,214 (A Timeline of Census History). At the time, both President George Washington and Secretary of State Thomas Jefferson expressed skepticism and thought the population had been undercounted. Until 1840, Secretaries of State were put in charge of organizing the decennial census, which was a temporary assignment. In 1849, Congress established a census board to oversee data collection, and the responsibility for census data shifted from the Department of State to the Department of Interior (DOI). Only in 1902 did the Census Bureau become a permanent agency under the DOI (A Timeline of Census History).

This gradual shift from a temporary ad-hoc group of amateurs to a permanent government bureaucracy happened in parallel with the increasing complexity of census operations and the government’s growing demand for quality data. It is noteworthy that two other federal statistical agencies were established before the Census Bureau. One is the National Center for Education Statistics, founded in 1867, and the other is the U.S. Bureau of Labor Statistics, founded in 1884. Today, the U.S. has thirteen principal Federal Statistical Agencies and more than 90 federal organizations that engage in statistical activities. See Table 1 for the full list of the thirteen principal statistical agencies in the U.S.

Table 1: 13 Principal Federal Statistical Agencies in the U.S.:

Agency	Governing body	Founded
Bureau of Economic Analysis	Department of Commerce	1972
Bureau of Justice Statistics	Department of Justice	1979
Bureau of Labor Statistics	Department of Labor	1884
Bureau of Transportation Statistics	Department of Transportation	1992
Census Bureau	Department of Commerce	1903
Economic Research Service	Department of Agriculture	1961
Energy Information Administration	Department of Energy	1977
National Agricultural Statistics Service	Department of Agriculture	1961
National Center for Education Statistics	Department of Education	1867
National Center for Health Statistics	Department of Health and Human Services	1960
National Center for Science and Engineering Statistics	Independent	1950
Office of Research, Evaluation and Statistics	Social Security Administration	1935
Statistics of Income	Department of Treasury	1916

One of the forces driving the increasing demand for quality data was the transition of the U.S. to a welfare state. By definition, a welfare state means “a state that is committed to providing basic economic security for its citizens by protecting them from market risks associated with old age, unemployment, accidents, and sickness” (Weir). In order to efficiently allocate resources and provide targeted assistance, the state needed more complex and accurate databases on individual citizens. The turning point was the Social Security Act of 1935, which was part of the New Deal, a series of government programs in response to the Great Depression. At the time, the United States was the only modern industrial country that did not have a social security system.

One of the main provisions of the Social Security Act was the creation of the Social Security Number (SSN), which assigned a unique 9-digit number to every U.S. citizen, as well as to permanent and temporary residents. Over time the social weight and public perception of the SSN changed. Carolyn Puckett, working for the Office of Research, Evaluation, and Statistics at the Social Security Administration, wrote in 2009 that “created merely to keep track of the earnings history of U.S. workers for Social Security entitlement and benefit computation purposes, it [SSN] has come to be used as a nearly universal identifier” (Puckett). It turned into the primary method for public services to identify citizens and organize individual records.

1.4 Changing perceptions

As the number of entries about citizens started going up in the following years, various concerns about its privacy implications started emerging. In her 2018 book “The Known Citizen: A History of Privacy in Modern America,” Sarah E. Igo, Professor of History and the Dean of Strategic Initiatives for the College of Arts and Science at Vanderbilt University, writes that with the passage of the Social Security Act of 1935, “questions about how thoroughly the state ought to know its own people became less theoretical” (Igo, p. 57). Professor Igo writes that until the 1930s, public perception was that the government tracked only the troubled citizens and marginal communities to maintain public order. However, in the New Deal era, the government’s administrative tracking captured even more privileged citizens, and “being known to the government” became “increasingly constitutive of citizenship itself: a necessary exchange for steady employment, increased economic security, and free movement across borders” (Igo, p. 56).

Initial public reactions to the newly instituted Social Security programs were largely positive, especially during the years of World War II, when it enabled the government to efficiently identify and provide assistance for war veterans and wounded warriors. Some people went as far as tattooing their social security numbers on their bodies to make sure they would not forget their nine digits. However, in the following decades, especially as the economic crisis and war faded into history, public debate about government databases shifted into a new phase. On the one hand, some government bureaucrats and social scientists believed that increasing the quantity and quality of public data records would lead to more efficient social and economic policies. On the other hand, many civil society activists and legal scholars voiced concerns that the swelling volume of databases on citizens was an invasion of privacy.

1.5. The National Data Center

In this context, the story of the failed National Data Center in the 1960s is especially noteworthy and extremely relevant to the debate around the Evidence-Based Policymaking Act adopted in 2018. It started with a request from a group of social scientists, who in 1965 “recommended that the federal government develop a national data center that would store and make available to researchers the data collected by various statistical agencies” (Kraus, p. 1). The ensuing political turmoil is captured very eloquently in the paper “Statistical Déjà vu: The National Data Center Proposal of 1965 and Its Descendants,” which Rebecca Kraus wrote in 2013. On one side, some social scientists believed that “government programs designed to address social issues, such as civil rights, housing, employment, welfare, education, and poverty” could be improved if the academic community had access to the public data generated by the federal government (Kraus, p. 4). On the other side, privacy advocates were concerned about the potential risks and vulnerabilities such a center would create. The proposal for the National Data Center lost steam in 1970, when the Bureau of the Budget, which led the research behind it, was reorganized into the Office of Management and Budget.[1]

2. The Commission

2.1 Formation of the Commission

In March 2016, Speaker of the House Paul Ryan and Senator Patty Murray put forward the bipartisan Evidence-Based Policymaking Commission Act of 2016, which President Barack Obama signed within the same month. It laid the foundation for the establishment of the U.S. Commission on Evidence-Based Policymaking (CEP), directed to “consider how to strengthen government’s evidence-building and policymaking efforts,” as well as “study how the data that government already collects can be used to improve government programs and policies,” and present its findings and recommendations to the Congress and the President.

2.2 Bipartisan initiative

It is worth underlining the bipartisan nature of this initiative. Two congressional leaders, Democratic Senator from Washington State Patty Murray and Republican Speaker of the House of Representatives from Wisconsin Paul Ryan, had established good relations back in 2013 when they achieved breakthrough success with the Bipartisan Budget Act of 2013. The bill allowed Congress to avert a government shutdown and, in the long run, to save close to $23 billion. Patty Murray and Paul Ryan had made only small compromises to achieve the breakthrough, and both were applauded for the ensuing agreement. Three years later, they built on this success and initiated the CEP. During the introduction of the commission’s findings, Senator Murray said that “No matter what side of the aisle you’re on, we should all agree that government should work as efficiently as possible for the people it serves” (U. S. Senator Patty Murray). At the same time, Paul Ryan remarked that “Patty and I have long advocated for a way to better measure the federal government’s effectiveness—and this bill puts those efforts into action” (U. S. Senator Patty Murray).

2.3 Composition of the Commission

Consequently, the Commission was composed of individuals who did not have strong political affiliations. They were mostly academics, some with prior experience in the federal government, one current employee of the U.S. Office of Management and Budget, and three from the private sector. Two of the commission’s fifteen members are well-known privacy advocates: Paul Ohm, a Professor of Law at the Georgetown University Law Center, and Latanya Sweeney, Professor of the Practice of Government and Technology at the Harvard Kennedy School. They are both well recognized for their research and publications on privacy law and policy. Paul Ohm’s position is that “data can be either useful or perfectly anonymous but never both.” Latanya Sweeney was a graduate student at the Massachusetts Institute of Technology in 1997 when she reidentified Massachusetts Governor Bill Weld by connecting his publicly accessible records to his anonymized medical records (Meyer). This had a major public impact and led to new legal restrictions on the disclosure of protected health information under the Health Insurance Portability and Accountability Act, known as HIPAA. So, within the CEP, Ohm and Sweeney advocated for additional friction in accessing government databases and for added layers of privacy protection.

Table 2: Members of the U.S. Commission on Evidence-Based Policymaking

	Name	Affiliation
1	Commissioner and Chair Katharine G. Abraham	University of Maryland
2	Commissioner and Co-Chair Ron Haskins	Brookings Institution
	Commissioners:
3	Sherry Glied	New York University
4	Robert M. Groves	Georgetown University
5	Robert Hahn	University of Oxford
6	Hilary Hoynes	University of California, Berkeley
7	Jeffrey Liebman	Harvard University
8	Bruce D. Meyer	University of Chicago
9	Paul Ohm	Georgetown University
10	Nancy Potok	U.S. Office of Management and Budget
11	Kathleen Rice Mosier	Faegre Baker Daniels, LLP
12	Robert Shea	Grant Thornton, LLP
13	Latanya Sweeney	Harvard University
14	Kenneth R. Troske	University of Kentucky
15	Kim R. Wallin	D.K. Wallin, Ltd.

However, on the other side of the debate were social scientists who believed that access to more data would improve both the quality of academic research and the efficiency of the government’s public policy. Consequently, there were many heated debates within the commission. The CEP held its first meeting in July 2016 and presented its final report in September 2017. During this period, they surveyed 209 Federal offices that work with evidence (data), invited 49 witnesses, held meetings with 40 organizations, hosted three public hearings, and reviewed comments from 350 respondents in the Federal Register (Bipartisan Policy Center). When the time came, they were able to present a final document that was undersigned unanimously by all commission members.

2.4 Recommendations

The final report of the commission, titled The Promise Of Evidence-Based Policymaking, was presented to the public on September 7, 2017. It included 22 specific recommendations that fell under four categories: 1. Improving Secure, Private, and Confidential Data Access; 2. Modernizing Privacy Protections for Evidence Building; 3. Implementing the National Secure Data Service; 4. Strengthening Federal Evidence-Building Capacity. In the 138-page document, the word “privacy” is used 390 times, “secure” 183 times, and “confidential” 12 times. Overall, the report recognizes that “the country’s laws and practices are not currently optimized to support the use of data for evidence building, nor in a manner that best protects privacy” and suggests several measures to address this issue (Commission on Evidence-Based Policymaking).

2.5 The National Secure Data Service

One of the central ideas in the report is the establishment of a National Secure Data Service (NSDS), a successor of sorts to the idea of the National Data Center from the 1960s. Back then, during one of the Congressional hearings, economist Richard Ruggles had remarked that “although the emphasis in the privacy hearings was mainly on the possible danger of centralizing records, they also brought out that in some instances, the centralization of files can result in increasing the protection of individual privacy in situations where there have been flagrant abuses” (Kraus, p. 21). Building on this premise, members of the CEP believed that creating a centralized data center could enhance both the quality of data and privacy standards. The report suggested that the NSDS could learn from the expertise and institutional knowledge of the Center for Administrative Records Research and Applications (CARRA) and the Center for Economic Studies (CES) under the Census Bureau, which have been carrying out similar functions.

3. The Legislation

3.1 Passing into law

The Foundations for Evidence-Based Policymaking Act passed the House of Representatives on November 15, 2017. About eleven months later, the Senate approved the bill, as amended, by a unanimous vote. In January 2019, the President signed the “Foundations for Evidence-Based Policymaking Act of 2018” into law (Legislative Bulleting). The final act, which is about thirty pages long, makes only seven references to privacy, but it creates clear boundaries for the use of public data, assigns responsible parties for handling and protection of databases, and establishes legal penalties for violations of the act’s provisions. Overall, the Act presents several progressive and innovative approaches to handling public data, but whether a sufficient level of privacy protection supplements these new practices requires a closer examination.

3.2 Legal Amendments

The act was not written in a vacuum; it supplements a complex system of pre-existing rules and regulations. The full title of the EBPA is: “to amend titles 5 and 44, United States Code, to require Federal evaluation activities, improve Federal data management, and for other purposes.” Title 5 of the U.S. Code is about “Government Organization And Employees,” and it contains laws such as the Freedom of Information Act (FOIA), adopted in 1967, and the Privacy Act of 1974. FOIA gives American citizens the right to request access to records from any federal agency, provided the request does not violate certain privacy and confidentiality rules (Branscomb).[2] The Privacy Act of 1974 established “a code of fair information practices that govern the collection, maintenance, use, and dissemination of information about individuals that is maintained in systems of records by federal agencies” (Privacy Act of 1974). EBPA did not make any changes to either FOIA or the Privacy Act but complemented title 5 of the U.S. Code with additional provisions about federal government data handling practices.

Title 44 of the U.S. Code is about “Public Printing and Documents” and covers all the archives, registries, and records managed by the federal government. Most provisions of the Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA) also fall under title 44. CIPSEA was part of the broader E-Government Act of 2002, and it established uniform confidentiality standards to protect the data collected by federal statistical agencies. The purpose was to avoid opportunities for triangulating data points and reidentifying respondents based on data shared by various statistical agencies. The EBPA repealed CIPSEA 2002 and instead reauthorized it as CIPSEA 2018, with the overall intention of providing more opportunities to use public data for statistical purposes and imposing more responsibilities for managing risk (Ruyle).

The EBPA also passed into law the “Open, Public, Electronic, and Necessary Government Data Act,” also known as the OPEN Government Data Act. Since 2009, the U.S. General Services Administration has been running the website Data.gov, which publishes machine-readable datasets produced by the executive branch of the national government for public access (Data.Gov). In March 2017, Democratic Representative Derek Kilmer of Washington State proposed the OPEN Government Data Act, which would expand the coverage of Data.gov and require “open government data assets made available by federal agencies (excluding the Government Accountability Office, the Federal Election Commission, and certain other government entities) to be published as machine-readable data… when not otherwise prohibited by law” (H.R.1770 – 115th Congress). All in all, EBPA was not an out-of-the-blue, disruptive piece of legislation, but rather another step towards open data and evidence-based policymaking that was plugged into the pre-existing legal infrastructure.

3.3 Statistical purpose

A top priority in the text of the EBPA is ensuring that only anonymized aggregate data will be shared to protect the confidentiality of respondents. One of the most frequently used terms is “statistical purpose” (mentioned 35 times), which, according to title 44 of the U.S. Code, means “the description, estimation, or analysis of the characteristics of groups, without identifying the individuals or organizations that comprise such groups” (44 USC 3561: Definitions). For example, collecting and processing data on the overall number of traffic incidents in Washington, DC falls under statistical purposes. However, if the data is used to calculate car insurance rates adjusted for individual drivers in Washington, DC, that would be a non-statistical use. For most social research and public policy purposes, aggregate data is sufficient. For example, if the unemployment rate among the Hispanic population is higher than among other groups, then the government can initiate a tailored policy approach targeting specifically that group. However, when very large quantities of data are centralized in one place and various pieces are shared on public platforms, it creates opportunities for reverse tracking the data points, making meaningful connections, and reconstructing certain parts of the database not meant for public disclosure.
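The reverse-tracking risk described above can be illustrated with a minimal Python sketch. It assumes two invented files: an "anonymized" health release with names removed, and a separate public list that carries names. Every record, field name, and the `link_records` helper are hypothetical and serve only to show how shared attributes (ZIP code, birth year, sex) can connect the two files.

```python
# Hypothetical toy data: every name, ZIP code, and value below is invented.

# "Anonymized" release: names removed, but quasi-identifiers kept.
health_release = [
    {"zip": "20001", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "20001", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "20002", "birth_year": 1980, "sex": "F", "diagnosis": "flu"},
]

# Separate public list that carries names plus the same quasi-identifiers.
public_roll = [
    {"name": "Resident A", "zip": "20001", "birth_year": 1980, "sex": "F"},
    {"name": "Resident B", "zip": "20001", "birth_year": 1975, "sex": "M"},
    {"name": "Resident C", "zip": "20003", "birth_year": 1990, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(release, roll, keys=QUASI_IDENTIFIERS):
    """Pair a name with a release record when the key combination
    is unique in both files (a simple linkage by exact match)."""

    def index(rows):
        # Group rows by their combination of quasi-identifier values.
        buckets = {}
        for row in rows:
            buckets.setdefault(tuple(row[k] for k in keys), []).append(row)
        return buckets

    release_index = index(release)
    roll_index = index(roll)

    matches = []
    for key, rows in release_index.items():
        people = roll_index.get(key, [])
        # Only an unambiguous one-to-one match counts as a reidentification.
        if len(rows) == 1 and len(people) == 1:
            matches.append((people[0]["name"], rows[0]))
    return matches


reidentified = link_records(health_release, public_roll)
for name, record in reidentified:
    print(name, "->", record["diagnosis"])
```

Neither file is sensitive on its own in this toy setting: the release has no names, and the public list has no diagnoses. The sensitive pairing only appears once the two are joined, which is the combination risk that grows as more datasets are centralized and published.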

3.4 Risks and Responsibilities

From this standpoint, EBPA puts a big responsibility on the heads of federal agencies and will hold them accountable for determining “risks and restrictions related to the disclosure of personally identifiable information, including the risk that an individual data asset in isolation does not pose a privacy or confidentiality risk but when combined with other available information may pose such a risk.” Additionally, the law establishes the position of Evaluation Officer in each agency, whom the head of the agency will designate without regard to political affiliation. The main function of the Evaluation Officer will be to “continually assess the coverage, quality, methods, consistency, effectiveness, independence, and balance of the portfolio of evaluations, policy research, and ongoing evaluation activities of the agency.”

However, EBPA centralizes the data generated by all federal agencies. So, to minimize the risks of reidentification, there is a need for interagency coordination. For this purpose, the law expands the functions and the institutional scope of the Interagency Council on Statistical Policy (ICSP), established under section 3504 (e)(8) of title 44, which designates the head of the Office of Management and Budget as head of the Council. In the 1980s, ICSP was an informal group that brought together representatives from federal statistical agencies to coordinate their activities, but it was authorized by statute as a formal council in 1995 (The Structure of the Federal Statistical System). The Paperwork Reduction Act of 1995 put the OMB, namely its Office of Information and Regulatory Affairs (OIRA) division, in charge of coordinating the U.S. Federal statistical system (Statistical Programs & Standards). The head of OIRA’s Statistical and Science Policy Office is also the Chief Statistician of the U.S.,[3] who hosts the meetings of the ICSP on a monthly basis. Under the EBPA, heads of statistical units or other officials with appropriate expertise from other federal agencies will also join the ICSP, which will have more responsibilities.

3.5 Upcoming assessment

The new law also establishes the position of Chief Data Officer (CDO) in each agency, who is “responsible for lifecycle data management,” as well as managing “data assets of the agency, including the standardization of data format, sharing of data assets in accordance with applicable law,” among fourteen other duties outlined in the law. Furthermore, § 3520A, added by the EBPA, provides for the establishment of the Chief Data Officer Council, which also falls under the OMB but is separate from the ICSP. It is a temporary council that brings together representatives from 39 federal agencies. The CDO Council is assigned a number of tasks to complete before January 2025, when it is scheduled to dissolve (About Us. Federal CDO Council): “1. establish Governmentwide best practices for the use, protection, dissemination, and generation of data; 2. promote and encourage data sharing agreements between agencies; 3. identify ways in which agencies can improve upon the production of evidence for use in policymaking,” etc. So, certain provisions of EBPA are still in the assessment phase, and it will take a couple more years for EBPA to take full effect.

4. Privacy

4.1 What is privacy?

The word “privacy” traces its roots to the Latin word privus, which means separate or single. The Merriam-Webster dictionary offers two definitions for privacy: 1. the quality or state of being apart from company or observation; 2. freedom from unauthorized intrusion. However, there are different approaches to privacy in the scholarly community and, consequently, different definitions. Generally, the significance and value of privacy may change depending on the social, political, and cultural circumstances, which has made it an elusive concept for a consensus definition. Nonetheless, the debate around privacy has been trending since the mid-twentieth century. It is not likely to end anytime soon, as modern technologies move us into uncharted territories with new friction points.

4.2 Privacy as a human right

A popular perspective views privacy as a fundamental human right, or “the right to be left alone,” protected by law (MacCarthy, 2017). This approach recognizes an individual’s right to personal physical and informational space protected from external intrusion. The United States Constitution provides certain privacy protections. The Constitution’s Fourth Amendment states: “The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated.” The Fifth Amendment provides conditional protections for private information by creating the right against self-incrimination. In the U.S. Common Law system, there are a number of court cases, such as Griswold v. Connecticut and Lawrence v. Texas, among others, that broaden the scope of these constitutional privacy protections. Additionally, the United States has dozens of laws providing sectoral privacy protections. For example, the aforementioned HIPAA provides privacy protections for medical records, and the Family Educational Rights and Privacy Act legally restricts access to student records.

4.3 Privacy as a harm

Another privacy perspective is to view it as a right to be protected from harm. This approach shifts the focus of the privacy debate from the individual to the level of society and asks what the implications of data collection are for society overall. This leaves a smaller space for personal information protection, which applies only when there is direct and tangible harm to the individual. The privacy-as-harm framework emerged in the 1970s, when one of its pioneers, Richard Posner, wrote that we have two economic goods, “privacy” and “prying,” and that expanding the privacy protections of individuals while contracting the rights of organizations collecting data is against our common interests (Posner, 1978). More recently, Howard Beales and Timothy Muris made a case for privacy in the harm framework by highlighting the example of credit score reporting since “collecting financial information about individuals has made loans more accessible to general public” (Beales & Muris, 2008). So, the social benefit of more accessible loans trumps the individual’s right to withhold financial information. This approach also prioritizes data protection over data collection and emphasizes the right to be protected from harmful externalities of data rather than from the data collection itself.

4.4 Privacy in social context

The most recent addition to the privacy debate was made by Helen Nissenbaum, Professor at Cornell University. In her 2007 book, “Privacy in Context,” Nissenbaum laid out a new privacy framework, which integrates elements from both the human rights and harm frameworks. Nissenbaum builds on the premise that privacy is a social construct, so its interpretation and application may vary depending on the social circumstances. From this vantage point, structured social factors such as canonical activities, roles, norms, and values define the optimal degree of data access and visibility (Nissenbaum, p. 17). For example, the doctor you are visiting may have access to your medical records, but an insurance company may not. A basic quality of the social context framework is that privacy does not stop the flow of information but facilitates the information flow to some stakeholders while restricting it for others (MacCarthy, 2017). It is hard to disagree with Nissenbaum that privacy is a social construct, the value of which tends to change across geographic space, time, and other conditional factors. Especially now that data has become omnipresent, a uniform, rigid approach to all privacy issues cannot be the solution moving forward. Sometimes privacy is an inalienable human right. Other times there is a common public interest in sharing certain pieces of information that would otherwise be considered private. So, the social context framework offers a matrix that is broad, structured, and flexible enough to be applied across the privacy landscape.

4.5 Reasonable expectation of privacy

One of the most common reference points in the debate about privacy is the notion of “reasonable expectation of privacy.” It traces back to the seminal Supreme Court case Katz v. United States, which took place in the 1960s when the debate around the first National Data Center was ongoing. The Court’s decision expanded the Fourth Amendment privacy protections to include “what [a person] seeks to preserve as private, even in an area accessible to the public.” In his concurrence with the final decision, Justice John Harlan established a two-part privacy test, which relies on the subjective expectations of the individual under query and the objective expectations of privacy by society as a whole. However, Helen Nissenbaum, along with other contemporary privacy scholars, believes that due to the impact of modern disruptive technologies, the binary approach to privacy of inside/outside, secret/not secret, or expected/not expected is somewhat outdated. Nissenbaum writes that previously “people could count on going unnoticed and unknown in public arenas; they could count on disinterest in the myriad scattered details about them” (Nissenbaum, p. 116), but now it has become far more complicated. New technologies allow capturing myriad details or data points about us into centralized databases, which adds new layers to the privacy debate where little details make big differences.

5. Analysis

This section takes a closer look at the implications of EBPA by asking what the risks and opportunities are in the centralization of federal data from a privacy standpoint.

5.1 Data Centralization

Contrary to one of the top recommendations in the CEP report, the EBPA did not establish a National Secure Data Service, but it did create frameworks for interagency coordination and data centralization. We discussed earlier the expanded role of the ICSP and the temporary Chief Data Officers Council. EBPA also established another temporary body, the Advisory Committee on Data for Evidence Building, which brings together Evaluation Officers, Chief Data Officers, and other managers responsible for data handling across the federal statistical system. Currently, the Advisory Committee is administered by the Census Bureau and the Bureau of Economic Analysis (BEA) under the Department of Commerce and works closely with the Office of Management and Budget (Advisory Committee on Data for Evidence Building). In its Year 1 report, published in October 2021, the Advisory Committee affirmed the need for the establishment of the National Secure Data Service, as proposed by the CEP (Advisory Committee on Data for Evidence Building: Year 1 Report).

5.2 Advantages of the NSDS

This is hardly surprising because, from the beginning, one of the top priorities behind the EBPA was the creation of a centralized command and control mechanism over all the data the federal government generates. When CEP held its first meetings for the research that led to EBPA, it had sixteen talking points, four of which were about the NSDS. These included points such as “tiered access with a NSDS” and the role of the NSDS in the federal evidence ecosystem (CEP report, p. 123). To follow up on an earlier discussion, in its final recommendations, CEP argued that the NSDS would enable the OMB to create higher standards for data collection and protection, which could be applied across the country. So, the same national database protection principles would be applied to data from either small rural communities or large metropolitan areas. The NSDS would also help curtail duplicative efforts and improve the efficiency of the statistical agencies. Consequently, it would also decrease spending on federal data and reduce the burden on the public.

5.3 Risks and vulnerabilities

However, the federal government is handling very large volumes of data on a routine basis, and the centralization of so much statistical information within the hands of one center creates new privacy risks. First, it may change the public perception of federal data and potentially create a burden on civic life. Second, increasing public access to federal statistics increases the risk to data confidentiality, and EBPA creates obligations for making significant amounts of datasets publicly available.

5.4 Panopticon view

During the first round of privacy debates in the 1960s, Cornelius Gallagher, a Democratic Congressman from New Jersey, said that the improved government efficiency promised by the idea of a National Data Center “would be paid for at the far greater expense of weakening the right to privacy of all American citizens” (Kraus, p. 11). The privacy scholar Vance Packard concluded his Congressional testimony by noting that “my own hunch is that Big Brother, if he ever comes to these United States, may turn out to be not a greedy power seeker, but rather a relentless bureaucrat obsessed with efficiency” (Kraus, p. 10). As we discussed earlier, privacy is a social construct, and its social value and impact may change depending on the circumstances. For example, we might feel comfortable sharing our medical records with the hospital, educational records with an employer, and income statements with the Internal Revenue Service, but it creates a different reality when someone is able to put it all together. It gives the impression that someone knows as much about you as you do, and that you are no longer in charge of your privacy. This kind of public opinion is detrimental to civic life, even if it is not based on facts, because perception becomes reality and people inhibit their own self-expression.

One of the most influential philosophers of the 20th century, Michel Foucault, put forward the concept of panopticism. Its central argument is that people change their behavior even when there is only a modest chance that those in a position of power could be watching them. Originally, the panopticon was an architectural design for prisons proposed by the English philosopher Jeremy Bentham in the late 18th century. The idea is that the prison floor is designed in a circular form, where the prison guard sits in the very center and can see all the inmates, but they cannot see the guard. Foucault argued that this creates a power dynamic in which inmates surveil themselves because they do not know when they could be monitored.

5.5 Privacy: statistics vs. surveillance

However, there is an important distinction between the statistical analysis envisioned by the EBPA and the type of surveillance assumed by the panopticon approach. Surveillance focuses on specific targets, whereas statistics processes aggregate data, and as we mentioned earlier, EBPA puts a heavy emphasis on data being used for statistical purposes only. Part B of the Act is titled “Confidential Information Protection” and has several safeguards against abuses of the federal databases. For example, it provides that those who handle the data will take a pledge of confidentiality; violators will be liable for a Class E felony and could be imprisoned for up to 5 years and/or fined up to $250,000.[4] EBPA also obliges the statistical agencies to clearly distinguish any information that could be used for non-statistical purposes and provide public notice about the actual purpose of the data. However, the legislation leaves open what the mechanisms and conditions for public communication will be. A 2019 survey by the Pew Research Center showed that 64% of Americans are concerned over the government’s use of public data, while 78% do not understand what the government does with the collected data (Auxier, et al). It would be good to have more legal encouragement for executive agencies, such as the proposed NSDS, to prioritize public accountability and engagement.

5.6 Privacy vs. confidentiality vs. anonymity

People working for the NSDS will also face technical challenges in preserving the confidentiality and anonymity of data. First, let us look at the distinctions between privacy, confidentiality, and anonymity. Confidentiality and anonymity are only about a person’s actions and data, but privacy is also about the person (Privacy and Confidentiality). For example, whether someone may ask you personal questions is a matter of privacy. However, whether they can share your responses with another person is a question of confidentiality. Confidentiality implies that the surveyor knows your identity but will not share it outside a certain social group. Anonymity refers to a condition where even the primary surveyor does not know or register your identity. Both confidentiality and anonymity fall under the bigger umbrella of privacy, but neither captures its full meaning.

5.7 Privacy legislation

Not only in the United States but around the world, privacy regulations do not apply to anonymized data. For example, the European Union’s well-known privacy law, the General Data Protection Regulation, has a provision that states that “The principles of data protection should apply to any information concerning an identified or identifiable natural person… this Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes” (Recital 26: Not Applicable to Anonymous Data). The United States Privacy Act of 1974 also has a specific exemption for a “statistical record,” which means “a record in a system of records maintained for statistical research or reporting purposes only and not used in whole or in part in making any determination about an identifiable individual” (Privacy Act of 1974). EBPA complies with this provision of the privacy law. However, in the half century since the Privacy Act was passed, much has changed both in statistical science and in the technical capacity of machines to process data.

5.8 Reidentification

We have mentioned earlier the new methods and techniques for reidentification by triangulating data points from several anonymized data sets. Paul Ohm, Georgetown Law professor and a member of the CEP, wrote in his 2010 paper that “Reidentification science disrupts the privacy policy landscape by undermining the faith we have placed in anonymization… advances in reidentification expose these promises as too often illusory” (Ohm, 2010). To avoid the traps of reconstruction algorithms, statistical experts have developed several data protection mechanisms. For example, for many years, the Census Bureau, which is required to publish blocks of its data sets, has been using various noise-infusion techniques, such as ‘swapping,’ ‘blank-and-impute,’ ‘partially synthetic data’ and, most recently, differential privacy (boyd and Sarathy, p. 7). These approaches preserve the integrity of the datasets and maintain their full value for most purposes without compromising confidentiality. However, in a few scenarios, these methods could result in minor deviations since the data sets are manipulated. These manipulation methods cannot be shared publicly because that would undermine the confidentiality of the datasets. Consequently, these disclosure control methods create friction between data users and the Census Bureau.
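To make the idea of noise infusion more concrete, here is a minimal Python sketch of the Laplace mechanism, a textbook building block of differential privacy. It is an illustration under stated assumptions, not the Census Bureau’s production method: the `noisy_count` function, the block population of 120, and the epsilon values are all hypothetical. The sketch assumes a simple counting query, where adding or removing one person changes the answer by at most one.

```python
import random


def noisy_count(true_count, epsilon, rng=random):
    """Return a count with Laplace noise added (hypothetical helper).

    Assumes a counting query with sensitivity 1, so the noise scale is
    1 / epsilon. Smaller epsilon means stronger privacy and more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with the same
    # rate follows a Laplace distribution centered at zero.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(0)   # fixed seed so the illustration is repeatable
true_count = 120         # hypothetical population of one census block

for epsilon in (0.1, 1.0, 10.0):
    draws = [noisy_count(true_count, epsilon, rng) for _ in range(5)]
    print(epsilon, [round(d, 1) for d in draws])
```

The published figure stays close to the true count on average, yet any single release deviates from it, and the deviation grows as epsilon shrinks. This is the trade-off behind the friction described above: stronger confidentiality guarantees come at the price of less exact numbers for data users.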

6. Recommendations:

6.1 Study the impact on civic activism

In the early 1970s, the Advisory Committee on Automated Personal Data Systems was established under the Department of Health, Education, and Welfare to research the potentially harmful consequences of automated personal data systems, effective safeguards to protect against those negative consequences, as well as “policy and practice relating to the issuance and use of Social Security Numbers” (U. S. Department of Health, Education and Welfare). The Committee published its final report, titled “Records, Computers, and the Rights of Citizens,” in 1973, which had a ripple effect on privacy laws and regulations around the world for the following decades. In the United States, it laid the foundations for the Fair Information Practice Principles, applied by the Federal Trade Commission to the private sector, and made an impact on the Privacy Act of 1974.

Much has changed since the 1970s, and more changes will come after the EBPA is fully rolled out. Now, the federal government needs to conduct a similar study to assess the impact of the EBPA and data centralization on civic activism and freedom of expression. The United States is by far the biggest experiment in human history, testing the power of a society built on individual liberties. One of the cornerstones of America’s success story is the value and emphasis it puts on freedom of self-expression. Even nominal burdens on privacy and civil liberties could be a very high price to pay for the promises of EBPA.

6.2 Revisiting the legislation

The findings of that report should inform a revision of the Privacy Act of 1974. The latest change to the legislation was made in 1988 when Congress passed the Computer Matching and Privacy Protection Act, which requires that federal agencies “enter into written agreements with other agencies or non-Federal entities before disclosing records for use in computer matching programs.” On the official online database of the Congress (Congress.gov), there are 1,200 bills that have the word privacy in the title. Most of them have not passed the House floor, but the number shows how complicated the legal terrain on privacy is in the United States. Ninety-four privacy bills were introduced in 1973-74, and since then, on average, twenty bills on privacy have been initiated every year.

Current legislation puts a heavy burden on the statistical agencies to respond to three competing demands. They have to produce good quality data, but they also have to protect the privacy of their respondents. Now, they are also obliged to make these datasets publicly available, which forces them to use various techniques such as differential privacy. However, that makes certain data consumers unhappy, as we can see from the experience of the Census Bureau. So, it would be good to relieve the statistical agencies of some of this burden and provide legal tools and justifications for the privacy protections applied to public datasets.

7. Conclusion

Over the years, U.S. federal statistical agencies have accumulated tremendous institutional expertise and technical capacity to produce large-scale, high-quality data. Now EBPA is rallying the forces of the federal statistical agencies into a cohesive unit to provide numerical insight into the performance of the executive branch. It will create an administrative mechanism for informing the government’s policy decisions, as well as a public accountability mechanism since large segments of the government data will be made publicly accessible. However, it also consolidates the statistical information of the federal government into centralized databases, which creates new privacy risks and vulnerabilities. EBPA is yet to be fully rolled out, but one of its main consequences is expected to be the establishment of the NSDS, which will have an enormous weight on its shoulders as it will need to satisfy several competing demands. On both ends of the line, the NSDS will be working with and for the American people, so it is very important to keep them informed and to understand the public impact and expectations. It is the right time for the U.S. government to conduct a study on the impact of centralized, automated databases on civic life, akin to the one conducted in 1973, and incorporate that into updating the privacy legislation.

References:

About Us. (2020). Federal CDO Council. https://www.cdo.gov/about-us/

Advisory Committee on Data for Evidence Building. (2022). U.S. Bureau of Economic Analysis (BEA). https://www.bea.gov/evidence

Advisory Committee on Data for Evidence Building: Year 1 Report. (2021, October). Office of Management and Budget. https://www.bea.gov/system/files/2021-10/acdeb-year-1-report.pdf

Auxier, B., Rainie, L., Anderson, M., Perrin, A., Kumar, M., & Turner, E. (2020, August 17). Americans and Privacy: Concerned, Confused and Feeling Lack of Control Over Their Personal Information. Pew Research Center: Internet, Science & Tech. https://www.pewresearch.org/internet/2019/11/15/americans-and-privacy-concerned-confused-and-feeling-lack-of-control-over-their-personal-information/

A Timeline of Census History. United States Census Bureau. https://www.census.gov/history/img/timeline_census_history.bmp

Beales, Howard, & Muris, Timothy. “Choice or Consequences: Protecting Privacy in Commercial Information.” 75 U. Chi. L. Rev. 109 2008 pp. 109-120

Bipartisan Policy Center. Frequently Asked Questions Related to the Commission on Evidence-Based Policymaking’s Report. (2019, March). https://bipartisanpolicy.org/download/?file=/wp-content/uploads/2019/03/CEP-FAQs.pdf

boyd, d. & Sarathy, J. “Differential Perspectives: Epistemic Disconnects Surrounding the US Census Bureau’s Use of Differential Privacy”

Branscomb, Anne (1994). Who Owns Information?: From Privacy To Public Access.

Commission on Evidence-Based Policymaking. (2017, September). THE PROMISE OF EVIDENCE-BASED POLICYMAKING. Bipartisan Policy Center. https://bipartisanpolicy.org/download/?file=/wp-content/uploads/2019/03/Appendices-e-h-The-Promise-of-Evidence-Based-Policymaking-Report-of-the-Comission-on-Evidence-based-Policymaking.pdf

Data.Gov. (2022) About. https://data.gov/about/

Dr. Latanya Sweeney’s Home Page. (2021). http://latanyasweeney.org/

Gauthier, J. H. S. (2021). 1790 Overview – History – U.S. Census Bureau. United States Census Bureau. https://www.census.gov/history/www/through_the_decades/overview/1790.html

H.R.1770 – 115th Congress (2017–2018): OPEN Government Data Act. Congress.Gov | Library of Congress. https://www.congress.gov/bill/115th-congress/house-bill/1770

Igo, S. E. (2020). The Known Citizen: A History of Privacy in Modern America. Harvard University Press.

Mark MacCarthy, (2017). “Privacy Policy and Contextual Harm” 13 I/S: Journal of Law and Policy. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3093253

Nissenbaum, Helen (2007). Privacy in Context. Stanford University Press. Kindle Edition

44 USC 3561: Definitions. Office of the Law Revision Counsel (2022). https://uscode.house.gov/view.xhtml?req=(title:44%20section:3561%20edition:prelim)%20OR%20(granuleid:USC-prelim-title44-section3561)&f=treesort&edition=prelim&num=0&jumpTo=true

The National Constitution Center (2022). The Constitution – Full Text. https://constitutioncenter.org/interactive-constitution/full-text

Paul Ohm. (n.d.). PaulOhm.Com. https://www.paulohm.com/

Puckett, C. (2009, July 1). The Story of the Social Security Number. Social Security Administration Research, Statistics, and Policy Analysis. https://www.ssa.gov/policy/docs/ssb/v69n2/v69n2p55.html

U. S. Senator Patty Murray (2017, November 1). Senator Murray, Speaker Ryan Introduce Evidence-Based Policymaking Legislation. https://www.murray.senate.gov/senator-murray-speaker-ryan-introduce-evidence-based-policymaking-legislation/

Legislative Bulletin (2019). The President Signs H.R. 4174, “Foundations for Evidence-Based Policymaking Act of 2018.” Social Security Administration. https://www.ssa.gov/legislation/legis_bulletin_021519.html

Meyer, M. (2018, October 31). Law, Ethics & Science of Re-identification Demonstrations. Harvard Law Petrie Flom Center. https://blog.petrieflom.law.harvard.edu/symposia/law-ethics-science-of-re-identification-demonstrations/

Ohm, Paul. “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization” (August 13, 2009). UCLA Law Review, Vol. 57, p. 1701, 2010. Available at SSRN: https://ssrn.com/abstract=1450006

Posner, Richard. “The Right of Privacy.” Georgia Law Review 393 (1978) pp. 393 – 404

Privacy Act of 1974. (2021, April 30). Department of Justice. https://www.justice.gov/opcl/privacy-act-1974

Privacy and Confidentiality. (n.d.). CHOP Research Institute. https://irb.research.chop.edu/privacy-and-confidentiality

“Privacy.” Merriam-Webster.com Dictionary, Merriam-Webster, https://www.merriam-webster.com/dictionary/privacy. Accessed 3 May. 2022.

Public Law No: 115–435. Foundations for Evidence-Based Policymaking Act of 2018 Congress.gov. (2019). https://www.congress.gov/bill/115th-congress/house-bill/4174

Recital 26: Not Applicable to Anonymous Data. General Data Protection Regulation. (2016). https://gdpr-info.eu/recitals/no-26

Ruyle, M. (2019, March 1). New Law Offers Reforms to Improve Access to Data, Confidentiality Protections | Amstat News. Magazine of the American Statistical Association. https://magazine.amstat.org/blog/2019/02/01/law-improves-data-confidentiality/

Statistical Programs & Standards. (2021, December 22). The White House. https://www.whitehouse.gov/omb/information-regulatory-affairs/statistical-programs-standards/

The Structure of the Federal Statistical System. (n.d.). The White House. https://obamawhitehouse.archives.gov/omb/inforeg_statpolicy/bb-structure-federal-statistical-system

Understanding Confidentiality and Anonymity. (n.d.). The Evergreen State College. https://www.evergreen.edu/humansubjectsreview/confidentiality

U. S. Department of Health, Education and Welfare. (1973, July). Records, Computers and the Rights of Citizens. DHEW Publication. https://www.justice.gov/opcl/docs/rec-com-rights.pdf

Weir, M. (2001). Welfare State. International Encyclopedia of the Social & Behavioral Sciences. https://doi.org/10.1016/B0-08-043076-7/01094-9

[1] General Service Administration proposed a similar idea in the 1970s to create an inter-connected network of federal government data systems, which did not succeed either.

[2] They have a very user-friendly website operated by the Department of Justice at https://www.foia.gov/

[3] OIRA is also in charge of the cost-benefit analysis laid out in the President’s Executive Order 12866

[4] It is important to note that the Privacy Act of 1974 imposed a fine of not more than $5,000, which equals around $30,000 in today’s money: “Any member, officer, or employee of the Commission… who knowing that disclosure of the specific material is so prohibited, willfully discloses the material in any manner to any person or agency not entitled to receive it, shall be guilty of a misdemeanor and fined not more than $5,000.”

A Critical Review of UNEP’s Food Waste Index

April 5, 2022 | May 26, 2025 | hpanahov

Its Impact and Limitations on Sustainable Consumption Policies

I. Introduction

Sustainable consumption is one of the priority areas in the international development agenda. In 2015, 193 UN member states adopted the 2030 Agenda for Sustainable Development, which consists of seventeen interlinked Sustainable Development Goals. It is a comprehensive development framework that also focuses on “responsible consumption and production.” However, it is a strategic-level document, which did not take into account the operational-level challenges of developing indicators to measure progress towards these goals. In 2021, the United Nations Environment Programme (UNEP) published its first Food Waste Index (FWI) report, which was presented as the most comprehensive report on global food waste and made many news headlines.[1][2] UNEP has done an enormous job building the groundwork for producing global data on food waste, but the organization attributes a low or very low confidence level to nearly 80% of the data used to construct the FWI. Given this context, the FWI is not a reliable benchmark for either measuring progress or informing adequate policy decisions.

II. Background

In September 2015, at the landmark UN Sustainable Development Summit in New York, countries worldwide agreed on a post-2015 global development agenda “to achieve a better and more sustainable future for all people and the world by 2030.”[3] They agreed on 17 Sustainable Development Goals, which are broken down into 169 SDG Targets, which in turn have 232 unique indicators (as of February 2022) to track progress.[4] In particular, SDG 12 focuses on “responsible consumption and production,” which is about “decoupling economic growth from environmental degradation, increasing resource efficiency and promoting sustainable lifestyles.”[5] There are eight targets under SDG 12. Most of them focus on national policies and large-scale producers, but two are about consumer behavior and thus fall within the scope of our research: Target 12.3, which aims to reduce food losses along production and supply chains and halve global per capita food waste at the retail and consumer levels;[6] and Target 12.8, which aims to promote universal understanding of sustainable lifestyles.

SDG Target 12.3 has two indicators: the Food Loss Index, produced by the Food and Agriculture Organization of the UN (FAO), and the Food Waste Index, produced by the UN Environment Programme (UNEP). The Food Loss Index (FLI) measures the percentage of food lost from production up to (but not including) the retail level. The Food Waste Index (FWI) focuses on the percentage of food wasted at the retail and consumption stages. Since the focus of this paper is on sustainable consumption, I will take a closer look at the Food Waste Index, analyze the data behind it, and assess its impact.

After carefully examining the datasets used for the Food Waste Index, I concluded that the existing data are not reliable enough for measuring progress towards SDG Target 12.3 or for designing tailored policy interventions. However, these conclusions should not undermine the importance of the food waste issue, since every data point, study, and observation demonstrates that there is a significant food waste problem in both economically developed and underperforming countries. It is a major concern, as hundreds of millions of people around the world suffer from malnutrition because their caloric intake falls below minimum energy requirements.[7] That is also why we need to understand the limitations of the currently available data.

III. Data Analysis

UNEP worked together with the Waste and Resources Action Programme (WRAP), a non-profit organization based in the United Kingdom, to produce its first Food Waste Index, which is considered the “most comprehensive report into global food waste in homes.”[8] The report was published in 2021, but the numbers represent the situation in 2019. According to the report, 17% of all food that reaches retail ends up in the dumpster. Of that amount, households account for 61%, the food service industry (restaurants) for 26%, and retail for 13%.[9]

These are staggering numbers. To put them in perspective, roughly 931 million tonnes of food are wasted every year, which is more than the total consumption of a country as big as India. If we combine the Food Waste Index with the Food Loss Index, more than a third of all food is either lost or wasted somewhere along the chain, which also accounts for nearly 10% of global carbon emissions. However, what if we scratch the surface and look behind the report at the raw data[10] that shaped it? How reliable are the food waste numbers?
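As a rough cross-check of these figures, the sector shares can be applied to the 931 million tonne total. The short Python sketch below does only that. The shares are the rounded percentages quoted in the text, so the results are approximations and may differ slightly from the tonnages UNEP publishes; the variable and function names are illustrative.

```python
# Back-of-the-envelope split of the global food waste total by sector.
# Inputs are the rounded figures quoted in the text, so outputs are approximate.
TOTAL_WASTE_MT = 931  # million tonnes per year (2019 estimate)

SECTOR_SHARES = {
    "households": 0.61,
    "food service": 0.26,
    "retail": 0.13,
}

def split_by_sector(total_mt, shares):
    """Return the approximate tonnage (million tonnes) attributed to each sector."""
    return {sector: total_mt * share for sector, share in shares.items()}

sector_tonnage = split_by_sector(TOTAL_WASTE_MT, SECTOR_SHARES)

for sector, tonnes in sector_tonnage.items():
    print(f"{sector}: ~{tonnes:.0f} million tonnes")
```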

The authors of the report acknowledge that it is very challenging to collect data on food waste and admit that they have high-quality data from only 14 countries,[11] while they have medium confidence in reports from 42 countries. The dataset of the report lists 233 geographic units (mainly UN Member States) and assigns no estimate, very low confidence, or low confidence to the data estimates for 183 of them, or 79%.[12] The pie chart below presents a visual breakdown of the data source confidence levels:[13]
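The 79% figure follows directly from the two counts given above. A minimal Python check, using only those counts (the names are illustrative):

```python
# Share of geographic units whose estimates carry little or no confidence.
# Both counts are the ones quoted in the text from the report's dataset.
TOTAL_UNITS = 233           # geographic units listed in the dataset
LOW_OR_NO_CONFIDENCE = 183  # no estimate, very low, or low confidence

def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole

low_confidence_pct = share(LOW_OR_NO_CONFIDENCE, TOTAL_UNITS)

print(f"Low or no confidence: {low_confidence_pct:.1f}% of units")
```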

Evidently, there is not much confidence in the credibility of the reported figures. The authors of the report also explain that, overall, they were able to collect 152 data points from 54 countries and then extrapolated from those data to calculate estimates for other geographic areas where data were not available. However, even the credibility of the available data points can be questioned. For example, Poland is assigned a medium confidence level, even though the data source for Poland is a small study by local civil society actors. “The Pilot Study of Characteristics of Household Waste Generated in Suburban Parts of Rural Areas” (Steinhoff-Wrześniewska, Aleksandra) mentions that:

21 households, representing 83 people, were audited. None of them were involved in agricultural production. They were provided with three bags for sorting (bio-waste, hygenic waste, all other waste) and had waste collected in each of the four seasons. It is unclear for how long during each season the measurement took place. As a result of small sample size and unknown length, we cannot have high confidence in the estimate.

Poland has a population of 38 million, of whom only 15 million live in rural areas, while 61% reside in urban centers. A sample of only 21 households from suburban parts of rural Poland, observed over undefined periods of time, is not strongly representative of food management habits across the whole country.

The question is whether these numbers can serve as reliable metrics to measure progress or calibrate policy actions. SDG Target 12.3 aims to halve global per capita food waste by 2030. According to UNEP’s 2021 Index, average household food waste per capita equals 79 kg a year in high-income countries, 76 kg in upper middle-income countries, and 91 kg in lower middle-income countries, while the data for low-income countries are insufficient. The 2021 Food Waste Index Report mentions that “The next questionnaire will be sent to Member States in September 2022, and results will be reported to the SDG Global Database by February 2023.” What if the next report shows that annual household food waste per capita in upper middle-income countries is 86 kg? It would lead to the conclusion that food waste in this category of countries is increasing, while in fact it could have been decreasing. American biochemist Erwin Chargaff once said: “I thought it was the task of the natural sciences to discover the facts of nature, not to create them.” Relying on inaccurate data for measuring progress could set in motion mismatched policy interventions and do more harm than good.
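The risk described above can be made concrete with a small sketch. The Python snippet below compares the 76 kg estimate with the hypothetical 86 kg follow-up figure from the text. The ±20% margin of error is purely an assumption chosen for illustration; the report does not publish a margin in this form, and the function names are hypothetical.

```python
# Illustration only: can an apparent rise from 76 kg to 86 kg be told apart
# from noise? ASSUMED_MARGIN is a hypothetical relative margin of error,
# not a figure taken from the UNEP report.
ASSUMED_MARGIN = 0.20

def interval(estimate, margin):
    """Return the (low, high) range implied by a relative margin of error."""
    return (estimate * (1 - margin), estimate * (1 + margin))

def intervals_overlap(a, b):
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

baseline = interval(76, ASSUMED_MARGIN)   # 2021 report, upper middle-income
follow_up = interval(86, ASSUMED_MARGIN)  # hypothetical next report

if intervals_overlap(baseline, follow_up):
    print("Ranges overlap: the apparent increase may be measurement noise.")
else:
    print("Ranges do not overlap: the change is more likely to be real.")
```

Under this assumed margin the two ranges overlap, so the figures alone could not show whether waste rose or fell. A tighter margin, which would require better underlying data, is what would make such a comparison meaningful.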

IV. Theoretical Framework

There are no easy shortcuts to producing global data such as the Food Waste Index. It requires the formation of a specific global knowledge infrastructure focused on food waste. This entails standardizing measurements and processes, disciplining staff, and synchronizing reporting timelines. Achieving this subject-specific institutional interoperability on a global scale requires significant amounts of money and resources. I therefore explain the current shortcomings of the Food Waste Index by looking at the global knowledge infrastructure behind it, drawing mainly on two scholarly works for theoretical backing: A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming, by Paul Edwards, and Standards and Their Stories: How Quantifying, Classifying, and Formalizing Practices Shape Everyday Life, by Martha Lampland and Susan Leigh Star.

The Food Waste Index is not a legitimate scientific fact, because there is no well-founded knowledge infrastructure behind it. In his book “A Vast Machine,” Paul Edwards writes that “an established fact is one supported by an infrastructure,”[14] and elaborates that “knowledge infrastructures comprise robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds.”[15] If we get rid of the infrastructure, we are left with claims and facts that can neither be backed up nor verified.

In the modern world, infrastructures are all around us and we use them on a daily basis without paying much attention, unless there is a problem with them or we have to change them.[16] For example, behind the tap water we use, there is a complex infrastructure of plumbing and water regulation. In a similar fashion, global data require an elaborate knowledge infrastructure that consists of national communities of scientists, government bureaucrats, and civil society activists who understand each other and can inform and hold each other accountable. These communities need physical facilities, such as offices and laboratories, as well as legal space to conduct their work with respect to intellectual property.[17] They require media of communication, such as conferences, journals, and web portals, to exchange knowledge and keep up to date.

Most importantly, however, for these national information ecosystems to reach beyond their borders and co-produce global data, they need standardized methods and measures. The amount of reported food waste can change depending on how countries define food waste, when they measure it, and what factors they take into account. For example, according to UNEP, “food waste is defined as edible parts and associated inedible parts going directly to the following destinations: landfill, controlled combustion, litter discards/refuse, compost/aerobic digestion, land application, co/anaerobic digestion, sewer, but does not include food waste used for biomaterial/processing, animal feed or not harvested.”[18] In some countries, associated inedible parts of food that are used for compost might not be considered food waste. A more accurate report should also take into account seasonal fluctuations in food waste.

V. UNEP’s Food Waste Index

Bottom line up front, there is no global knowledge infrastructure around food waste and UNEP did not have the resources to build it up in the given time frame. UNEP has been working on food waste reduction since 2013, when it launched the global campaign Think Eat Save, but it became a priority task for UNEP only in 2019, following the UN Environment Assembly Resolution 4/2, which mandated UNEP to accelerate global action on food waste reduction.[19]

Established in 1972 and headquartered in Nairobi, Kenya, UNEP has around 860 staff members worldwide.[20] The mission statement of UNEP, which is celebrating its 50th anniversary this year, “is to provide leadership and encourage partnership in caring for the environment by inspiring, informing, and enabling nations and peoples to improve their quality of life without compromising that of future generations.”[21] By default, the top priority for UNEP has been to lead international efforts against climate change.

In 2013, UNEP, in partnership with the Food and Agriculture Organization of the UN (FAO), launched the Save Food Initiative and its subcomponent program “Think Eat Save: Reduce Your Footprint.” The primary goal of the FAO, established in 1945, is to “achieve food security that people have regular access to enough high-quality food to lead active, healthy lives.”[22] In 2011, FAO had released its estimate that nearly one third of the world’s food was lost or wasted every year, which led to its joint Save Food Initiative with UNEP two years later.

So, until recently, food waste data were entangled with research into food loss and fell under the purview of FAO. The inherent structure of the UN system and its scheme for resource distribution incentivize UN agencies to compete for more responsibilities and programmatic oversight. In a 2019 survey by the UN Office of Internal Oversight Services, 80% of UNEP staff “noted that there was critical competition for donor sources with other UN entities.”[23] This institutional contest between FAO and UNEP could potentially explain why, between 2015 and 2019, no organization was assigned as custodian of the Food Waste Index.

The first time that food waste showed up in UNEP’s programme of work and budget was for the 2018-2019 biennium, approved by the UN Environment Assembly (UNEA) in May 2016.[24] It includes planned work outputs such as “Within sustainable food and agriculture policy frameworks, urban planning and/or existing sustainable consumption strategies, technical and policy guidance provided to public and private actors to measure, prevent and reduce food waste and increase the uptake of sustainable diet strategies and activities,” as well as “Outreach and communication campaigns to raise awareness of citizens (particularly young people) on the benefits of shifting to more sustainable consumption and production practices.” The previous work plan for 2016-2017, proposed in 2014, made no mention of food waste.[25]

In May 2016, UNEA also adopted a resolution on “Prevention, reduction and reuse of food waste,” which requests the UNEP Executive Director, in cooperation with the Food and Agriculture Organization, to “continue to raise awareness of the environmental dimensions of the problem of food waste, and of potential solutions and good practices for preventing and reducing food waste and promoting food reuse and environmentally sound management of food waste.”[26] However, UNEP became the custodian of the Food Waste Index only in 2019, and solidified itself as the lead agency on tackling food waste pursuant to UNEA Resolution 4/2.[27]

In 2019, UNEP gained a new Executive Director, Inger Andersen, a competent professional who is well versed in both sustainable development and food security issues. She has more than 30 years of experience in international development organizations, including roles as Vice President of the World Bank for Sustainable Development and Head of the CGIAR Fund Council.[28] CGIAR is the Consortium of International Agricultural Research Centers, which brings together international organizations engaged in research about food security. Her predecessor came from a diplomatic background and was asked to resign as a result of an internal audit. Media reports, citing leaks from the internal audit documents, mentioned that the head of UNEP spent “$500,000 on air travel and hotels in just 22 months, and was away 80% of the time.”[29] Under the new leadership, positive changes took place in the organization and the Food Waste Index became one of UNEP’s top priorities.

When UNEP was first assigned as custodian in 2019, the Food Waste Index was still classified as a Tier III indicator by the UN’s Inter-agency and Expert Group on SDG Indicators (IAEG-SDGs). The UN breaks down all SDG indicators into three tiers:

“Tier 1: Indicator is conceptually clear, has an internationally established methodology and standards are available, and data are regularly produced by countries for at least 50 per cent of countries and of the population in every region where the indicator is relevant.

Tier 2: Indicator is conceptually clear, has an internationally established methodology and standards are available, but data are not regularly produced by countries.

Tier 3: No internationally established methodology or standards are yet available for the indicator, but methodology/standards are being (or will be) developed or tested.”

Tier classifications change over time as the quality of data for indicators improves. For example, as of February 2022, the IAEG-SDGs lists 136 Tier I indicators, 91 Tier II indicators and 4 indicators that have multiple tiers (different components of the indicator are classified into different tiers),[30] while in September 2016, there were 81 Tier I indicators, 57 Tier II indicators and 88 Tier III indicators.[31] According to the IAEG-SDGs reports, the Food Waste Index was upgraded from Tier III to Tier II within two years.

The work plan of the UN Environment Programme for 2020-2021 has seven subprograms, and collecting data for the Food Waste Index falls under Subprogram 6, which covers Resource Efficiency. For 2020-2021, UNEP allocated $95.6 million to Subprogram 6, or roughly $48 million per annum. It had 114 staff members working towards the 20 planned work outputs under the Resource Efficiency subprogram.

These work outputs were mainly geared towards developing the information infrastructure for delivering the SDG indicators. For example, “Resource use assessments and related policy options are developed and provided to countries to support planning and policy-making, including support for the application and monitoring of relevant SDG indicators.” Or, “Database services providing enhanced availability and accessibility of life cycle assessment data are provided through an interoperable global network, methods for environmental and social indicators and the ways to apply them in decision-making.”[32] Most of these programmatic activities are about capacity development, technical assistance, training, policy support, etc.

As a result of UNEP’s active engagement, the number of countries that have a common global measurement approach for consistent reporting under SDG 12.3 increases every year. On average, UNEP adds around 10 countries a year to its list of countries compatible for food waste reporting. This shows that UNEP is on the right track in building the knowledge infrastructure for a more reliable global Food Waste Index.

UNEP’s methodology for data collection is to send out the Questionnaire on Environment Statistics (Waste Section) to National Statistical Offices and Ministries of Environment. If the respective authorities in these countries do not respond, UNEP refers to alternative sources of information. However, we should be clear-eyed that the national executive agencies that collaborate with UNEP are not politically neutral entities and their responses to questionnaires can be subject to the political interests of their respective governments.[33] So, these agencies might have the capacity to produce reliable numbers, but not the intention. For this reason, it would benefit the credibility of the Food Waste Index if UNEP increased its engagement with civil society organizations that can serve as alternative sources of reporting on food waste.

VI. Conclusion

The 2021 Food Waste Index Report does not just provide us with numbers about food waste; it also informs us about the state of the knowledge infrastructure around food waste. The formation of a knowledge infrastructure is a lengthy and complicated process. The institutional resources of the UN system, its global reach, and modern technologies have enabled UNEP to make tremendous progress towards building this infrastructure within a very short period of time. However, it is still unclear when UNEP will be able to produce reliable global data on food waste. UNEP can draw many valuable lessons from its 2021 report on food waste, but the report should not be used as a benchmark for progress, since it could lead to many misplaced conclusions down the road.

Looking into the future, the importance of sustainable consumption will only increase. Over the course of the past century, humanity experienced unprecedented growth in global wealth and food production. Surging food production rates create enormous pressure on the environment, even though hundreds of millions are still not getting their fair share. One of the big reasons for this failure is the food waste problem. Unfortunately, until recently the food waste issue has been largely neglected, and calculating exactly how much food is wasted has remained an elusive target. If UNEP stays consistent with its action plan, the global Food Waste Index will become increasingly reliable as more and more countries plug into the global knowledge infrastructure on food waste. However, there is a lot of work ahead. In the meantime, I would like to reiterate the call from UNEP Executive Director Inger Andersen’s opening message in the 2021 Food Waste Index Report: “let us all shop carefully, cook creatively and make wasting food anywhere socially unacceptable.”

[1] “U.N. Report Says 17% of Food Wasted at Consumer Level.” U.S., Reuters, 4 Mar. 2021.

[2] Merchant, Natalie. “Global Food Waste Twice the Size of Previous Estimates.” World Economic Forum, 26 Mar. 2021.

[3] Sustainable Development. (2022). UN Department of Economic and Social Affairs. https://sdgs.un.org/

[4] “Measuring Progress towards the Sustainable Development Goals.” Our World in Data, SDG Tracker, sdg-tracker.org. Accessed 5 Mar. 2022.

[5] Sustainable consumption and production policies. (2022). UNEP – UN Environment Programme.

[6] UNEP Food Waste Index Report 2021. (2021). UNEP – UN Environment Programme. https://www.unep.org/resources/report/unep-food-waste-index-report-2021

[7] Roser, M. (2019, October 8). Hunger and Undernourishment. Our World in Data. https://ourworldindata.org/hunger-and-undernourishment

[8] “New UNEP Report Developed in Collaboration with WRAP Reveals True Scale of Global Food Waste.” The Waste and Resources Action Programme, 2021, wrap.org.uk/FoodWasteIndex.

[9] UNEP Food Waste Index Report 2021. (2021). UNEP – UN Environment Programme.

[10] SDG Indicators Database. (2021). UN Department of Economic and Social Affairs. https://unstats.un.org/sdgs/UNSDG/IndDatabasePage

[11] According to the UNEP Food Waste Index Report 2021, countries with high-quality data on food waste are Australia, Austria, Canada, China, Denmark, Estonia, Germany, Ghana, Italy, Malta, the Netherlands, New Zealand, Norway, the Kingdom of Saudi Arabia, Sweden, the United Kingdom and the United States.

[12] “Food Waste Index Level 1 Annex.” UNEP- UN Environment Program, 2021, wedocs.unep.org/bitstream/handle/20.500.11822/35355/FWD.xlsx.

[13] Ibid

[14] Edwards, P. N. (2013). A Vast Machine, p. 22

[15] Edwards, P. N. (2013). A Vast Machine, p. 17

[16] Lampland, Martha, and Susan Leigh Star. Standards and Their Stories.

[17] Ibid

[18] UNEP Food Waste Index Report 2021. (2021), p. 14

[19] “Promoting Sustainable Practices and Innovative Solutions for Curbing Food Loss and Waste.” United Nations Environment Assembly, UNEP – UN Environment Programme, Mar. 2019, wedocs.unep.org/bitstream/handle/20.500.11822/28499/English.pdf.

[20] UNEP | International Organizations. (2005). IGPN – International Green Purchasing Network. http://www.igpn.org/global/interorg/unep.html

[21] “About UN Environment Programme.” UNEP – UN Environment Programme, http://www.unep.org/about-un-environment. Accessed 5 Mar. 2022.

[22] “About FAO.” Food and Agriculture Organization of the United Nations, http://www.fao.org/about/en. Accessed 5 Mar. 2022.

[23] Ivanova, Maria (Feb 23, 2021). The Untold Story of the World’s Leading Environmental Institution: UNEP at Fifty, p. 62

[24] “Programme of Work and Budget for the Biennium 2018‒2019.” United Nations Environment Assembly, UNEP – UN Environment Program, May 2016

[25] “Proposed Biennial Programme of Work and Budget for 2016–2017.” United Nations Environment Assembly, UNEP – UN Environment Programme, June 2014

[26] “Prevention, Reduction and Reuse of Food Waste.” United Nations Environment Assembly, UNEP – UN Environment Program, May 2016.

[27] “Promoting Sustainable Practices and Innovative Solutions for Curbing Food Loss and Waste.” United Nations Environment Assembly, UNEP – UN Environment Programme, Mar. 2019.

[28] Inger Andersen. (2019). UNEP – UN Environment Program

[29] Carrington, D. (2018, November 20). UN environment chief resigns after frequent flying revelations. The Guardian.

[30] “Tier Classification for Global SDG Indicators.” UN Statistics Division, Feb. 2019.

[31] “Tier Classification for Global SDG Indicators.” UN Statistics Division, Sept. 2016.

[32] “Proposed Programme of Work and Budget for the Biennium 2020‒2021.” UN Environment Assembly, p. 98

[33] In her book “Shades of Citizenship,” Melissa Nobles presents a very illuminating discussion about the impact of the political interests of the data collecting agencies on the data they produce

On Facial Recognition Technology

April 5, 2022 | January 22, 2023 | hpanahov

Why the US needs a federal law on Facial Recognition Technology

Originally published on Intersect: The Stanford Journal of Science, Technology, and Society

Introduction

Since the beginning of the 2000s, Facial Recognition Technology (FRT) has become significantly more accurate and more accessible. Both government and commercial entities use it in increasingly innovative ways. News agencies use it to spot celebrities at big events. Car companies install it on dashboards to alert drivers falling asleep at the wheel. Governments have used it to track Covid-19 patients’ compliance with quarantine regimes, or to reunite missing children with their families.[1] However, as the use of the technology has become more widespread, the controversies around it have also grown. The technology offers tremendous opportunities, but there are reasons to be concerned about its impact on privacy and civil liberties if it is not used properly. In this paper, I give a brief introduction to facial recognition technology, look separately at its commercial and government applications, and present my argument for why the US needs federal legislation on FRT.

1. The Nuts and Bolts of FRT

Facial recognition falls under the category of biometric data. The software pinpoints facial landmarks, measures the distance between them, and creates a geometric shape of your face.[2] It is less accurate than other biometric identifiers, such as iris or fingerprint scanning, for two reasons. One, facial images are not always of high quality. Two, unlike other biometric identifiers, facial features can change over time due to aging, plastic surgery, cosmetics, the effects of drug abuse or smoking, etc.[3] However, FRT has become a lot more popular, because it can be used remotely and is a lot easier to apply in high-traffic places.

Today, facial recognition is used mainly for two purposes. The first is face verification, also known as “one-to-one” matching, which is used to verify that you are who you say you are. It is commonly applied to unlock a smartphone or replace ID checks.[4] The second is face identification, also known as “one-to-many” matching. It is usually used to search for persons of interest: the search starts with an image of an unknown person and aims to determine his or her identity.[5]
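The difference between the two modes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular FRT product works: the three-number “templates” stand in for the landmark measurements described earlier, the names in the gallery are invented, and the 0.5 threshold is an arbitrary assumed cut-off.

```python
import math

# Hypothetical face templates: short lists of numbers standing in for the
# landmark-distance measurements an FRT system would extract from an image.
# Real systems use far longer feature vectors and carefully tuned thresholds.
MATCH_THRESHOLD = 0.5  # assumed cut-off; smaller distance means more similar

def distance(a, b):
    """Euclidean distance between two templates of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(probe, enrolled, threshold=MATCH_THRESHOLD):
    """One-to-one matching: does the probe match a single claimed identity?"""
    return distance(probe, enrolled) <= threshold

def identify(probe, gallery, threshold=MATCH_THRESHOLD):
    """One-to-many matching: return the closest gallery identity, or None
    when no enrolled template is within the threshold."""
    best_name, best_dist = None, float("inf")
    for name, template in gallery.items():
        d = distance(probe, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None

gallery = {
    "alice": [0.42, 0.31, 0.77],
    "bob": [0.10, 0.95, 0.33],
}
probe = [0.40, 0.30, 0.80]  # new image of an unknown person

print(verify(probe, gallery["alice"]))  # one-to-one check against one record
print(identify(probe, gallery))         # one-to-many search of the gallery
```

Verification touches only the record of the person making the claim, whereas identification compares the probe against every record in the gallery, which is why the size and provenance of that gallery matter so much for the second mode.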

Another category is facial analysis, where the algorithm analyzes facial features to determine “age, gender, ethnicity, emotions, fitness for certain jobs.”[6] For example, McDonald’s has used facial analysis in its Japanese stores to check whether employees are smiling when assisting customers.[7] Walmart is working on a facial analysis system that will help to process shoppers’ moods while they are in a store.[8] There have been numerous reports that China is using facial analysis to track ethnic Uighurs, a largely Muslim group in the western province of Xinjiang. Reportedly, the technology can distinguish “Uighur/non-Uighur attributes” and allows the Chinese police to track the movements of the minority group.[9] While the reports of the Chinese government crackdown on Uighurs have been confirmed, the credibility of the claim that software can distinguish Uighurs purely on facial features is questionable.[10][11]

These news stories give us a good idea about how FRT can evolve in the future, but at this point in time, facial analysis software is mainly in the research and trial phase. So, this paper will keep the focus on facial recognition. The truth is that even facial recognition technology is prone to mistakes. On several occasions, police have arrested the wrong person because of a mistake by the FRT. In June 2020, the Detroit Police Chief said that the software they use misidentifies 96% of the time, so they use it only to narrow down their search sample.[12] In 2018, the American Civil Liberties Union (ACLU) tested Amazon’s facial recognition software by comparing images of members of Congress with a database of 25,000 mugshots of convicted criminals.[13] Amazon’s “Rekognition” software falsely identified 28 members of Congress as criminals. (Amazon’s software is available for public use and cost the ACLU only $12.33.)

FRT is more likely to make a mistake with women and people with darker skin tones than with white men. In the ACLU test, 40% of the false matches were people of color, even though they comprise only 20% of Congress. A 2018 MIT study of gender and skin-type bias in commercial artificial-intelligence systems showed a 34.7% error rate for dark-skinned women and only 0.8% for light-skinned men.[14] There are two likely explanations for this bias: first, darker skin does not reflect light as well as fair skin; second, the training data contain fewer images of minorities.
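To make the size of these gaps concrete, the figures quoted above can be turned into simple ratios. The snippet below is a back-of-the-envelope sketch, not part of either study; the function name is hypothetical and the inputs are just the percentages cited in this paragraph.

```python
def overrepresentation(share_of_errors, share_of_population):
    """How many times more often a group appears among errors than in the population."""
    return share_of_errors / share_of_population

# ACLU test: 40% of false matches versus 20% of Congress.
false_match_ratio = overrepresentation(40, 20)

# MIT study: 34.7% error rate versus 0.8% error rate.
error_rate_ratio = 34.7 / 0.8

print(f"Share of false matches relative to share of Congress: {false_match_ratio:.1f}x")
print(f"Error rate for dark-skinned women relative to light-skinned men: {error_rate_ratio:.1f}x")
```

In other words, the affected group appeared among the false matches twice as often as its share of Congress would suggest, and the error rate in the MIT study differed by a factor of more than forty.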

However, this pattern is changing, and every year FRT gets better at recognizing people of all skin tones. A big reason for this is that both the quantity and the quality of facial images are going up. According to the National Institute of Standards and Technology (NIST) under the US Department of Commerce, the best face identification algorithm in 2014 had an error rate of 4.1%, while by 2020 the leading algorithm had an error rate of less than 1%.[15]

2. Commercial use of FRT

The market for FRT emerged only around 2001, but it has grown rapidly ever since. According to various estimates, it is expected to reach somewhere between 7 and 10 billion USD in 2022. More and more organizations are using FRT to replace ID checks. Schools use it to track attendance and to keep unwanted people away. It is widely used to group and catalog images and video files. We have already mentioned some other innovative ways in which FRT can be used. However, it is important to note that not all uses of the technology have equal social impact, and the US Congress needs to take action and set legal boundaries for commercial use of FRT.

If we go back to the two sub-categories of facial recognition discussed earlier, the main issue in the commercial use of FRT concerns facial identification. In the case of facial verification, or one-to-one searches, there is a set limit to the database, and everyone involved is usually aware that they are part of a certain facial verification system. Usually, the facial features of more people are processed to train the algorithm, but that is less problematic since those images are anonymized. In the case of facial identification, or one-to-many searches, there is no set limit to the database, and many people are not aware that their information is in a certain database. This raises a question about consent.

Can companies use the images we share on public platforms online to build their databases without asking for permission? On November 2, 2021, Facebook announced that it was shutting down its facial recognition system and deleting “more than a billion people’s individual facial recognition templates”.[16] That is why we no longer see little squares around faces when we scroll over Facebook photos. The decision came 6 months after Facebook had to pay $650 million for violating the Illinois Biometric Information Privacy Act (BIPA), which bans collecting and storing the facial geometry of Illinois residents without their consent.[17] Facebook made an elaborate argument that it inflicted no harm on its users, but it still lost the case, since BIPA clearly states that processing the biometric data of Illinois residents without opt-in consent is illegal.[18]

Most big tech companies in the US have or had their own facial recognition software, but following the controversies over the racial bias issue, they have restricted investments in FRT. Within a week in June 2020, IBM announced that it is getting out of the facial recognition business altogether, while Microsoft and Amazon declared a moratorium on selling their facial recognition technology to law enforcement agencies. However, these tech giants are not the biggest in the facial recognition market. Table 1 lists 10 of the biggest companies in the FRT market.

Table 1: Some of the biggest companies in the FRT market

Company	Country	Founded in	Web info
Ayonix	Japan	2007	https://ayonix.com
Clearview AI	USA	2017	https://www.clearview.ai/
Clear Secure	USA	2010	https://www.clearme.com
Cognitec	Germany	2002	https://www.cognitec.com/
iOmniscient	Australia	2001	https://iomni.ai
Kairos	USA	2012	https://www.kairos.com
Megvii	China	2011	https://en.megvii.com
NVISO	Switzerland	2009	https://www.nviso.ai/en
Oosto*	Israel	2015	https://oosto.com
SenseTime	China	2014	https://www.sensetime.com/en

* Former AnyVision

In January 2020, a New York Times investigation revealed that the New York-based company Clearview AI had built a database of 3 billion images scraped from the internet and was selling its software to 600 law enforcement agencies.[19] A month later, BuzzFeed did a follow-up investigation and found that Clearview “had provided its facial recognition tool to more than 2,200 police departments, government agencies, and companies across 27 countries.”[20] Now the company is facing lawsuits in at least 7 countries, including the United States, Canada, Australia, Germany, the United Kingdom, France, Italy and Greece.[21] In November 2021, the UK government imposed a $23 million fine on Clearview for violating its national data privacy law. Twitter, Google, and Facebook have also sent cease-and-desist letters demanding that it stop using the public information of their users.[22]

When sued under BIPA, Clearview responded that it would delete the data of all Illinois residents. Currently, Clearview offers an opt-out form on its website for residents of Illinois and California, which has legislation similar to BIPA.[23] The United States Congress should pass a federal law similar to BIPA or California’s Consumer Privacy Act that would introduce clearly defined limits for commercial use of FRT. However, considering that this technology also adds value to the efforts of security agencies, the federal legislation should not be overly restrictive. Opt-out consent might be a reasonable solution.

We should also consider that with every passing day it is becoming easier to build a photo-matching search engine like Clearview’s. Two weeks after the attack on the US Capitol on January 6, 2021, a website named Faces of the Riot appeared online. It catalogued the faces of 6,000 individuals who were present during the incident, extracted from 827 videos posted on the social media platform Parler. The author of the website, who self-identified as a student in the Washington DC area, told journalists that he intended to help the police investigation and that he used only open-source software.[24] Thus, a heavily restricted legal environment might not achieve its intended purpose and might instead create a lucrative black market for FRT. The federal law on commercial use of FRT should define feasible legal boundaries and find the right balance between the right to privacy and public security efforts.

3. Government use

The number of governments using facial recognition is growing every year. They use it mainly for security and traffic control purposes. However, facial recognition technology and the artificial intelligence behind it are very powerful tools that can be used in many different ways that are not always in the public interest. The federal legislative bodies need to intervene and establish certain standards, impose responsibilities and delineate restrictions for the public use of the FRT.

If facial recognition becomes overly pervasive, then regardless of intent, it could lead to constraints on public freedom. It is important for governments to evaluate the potential impact of facial recognition on civil liberties and establish ethical principles and regulatory guidelines before expanding the use of FRT. A privacy impact assessment by The International Justice and Public Safety Network, which is composed mainly of seasoned law enforcement officers, notes that “the mere possibility of surveillance has the potential to make people feel extremely uncomfortable, cause people to alter their behavior, and lead to self-censorship and inhibition.”[25] There are various reports that this is happening in China, where facial recognition is commonplace. The German journalist Kai Strittmatter, who has studied China for more than 30 years, writes about the government use of facial recognition in China: “What the Communist Party is doing with all this high-tech surveillance technology now is they’re trying to internalize control. … Once you believe it’s true, it’s like you don’t even need the policemen at the corner anymore, because you’re becoming your own policeman.”[26] To provide better context, I present a brief overview of government use of FRT in China, the European Union, and the United States.

China

According to one estimate in 2020, there were around 770 million surveillance cameras installed around the world, and roughly 54% of those cameras were in China.[27] Based on the number of cameras per 1,000 people, 16 of the 20 most surveilled cities are in China.[28] Facial recognition technology is omnipresent in most parts of the country and is used by both government and private entities. For example, at KFC China you can pay by smiling into a camera. According to new guidelines issued by China’s Supreme People’s Court, since August 1, 2021, commercial venues such as hotels, shopping malls, and airports need to get consent from customers to use facial recognition.[29] The new rules also impose restrictions on the use of the technology and responsibilities for protecting the data it collects.[30] The decision of the Supreme People’s Court came about two years after residents of Hong Kong staged mass protests against ubiquitous facial recognition and toppled 20 lampposts equipped with cameras.[31] However, there are no restrictions on government use of FRT, and it continues to be an integral part of the social credit score system. If a Chinese citizen decides to jaywalk on a street equipped with a facial identification camera, she will receive a private message with a fine, and that will negatively affect her social credit score.

European Union

In late November 2021, a coalition in the German parliament, led by the Social Democratic Party, said it wants to ban “biometric recognition in public spaces as well as automated state scoring systems by AI.”[32] In April 2021, the European Commission proposed a new regulation titled Harmonized Rules on Artificial Intelligence, which also suggests a ban on facial recognition, with certain exceptions for security purposes. According to the proposed regulation, the use of “real time remote biometric identification systems in publicly accessible spaces for the purpose of law enforcement is prohibited unless certain limited exceptions apply.”[33] The exceptions cover uses that are strictly necessary for a targeted search for potential victims of a crime, the prevention of a specific imminent threat to life, or the detection or identification of a perpetrator. The act has already been criticized, and various improvements have been proposed, but it is a great starting point on this very important issue.

The United States

The United States, the world leader in the AI industry, does not have a regulation on the fair use of facial recognition either, but the issue is on the agenda of political debates in Congress. In March 2021, the National Security Commission on Artificial Intelligence, a bipartisan working group, released its final report, in which it recommends that Congress require prior risk assessments “for privacy and civil liberties impacts” of AI systems, including facial recognition. In 2020, the “Facial Recognition and Biometric Technology Moratorium Act” was proposed but did not pass. Such a moratorium would give time to improve the accuracy of facial recognition technology and to conduct an assessment of its potential implications.

4. Conclusion

One of the biggest concerns in the United States has been the bias of facial recognition software. As discussed earlier, facial recognition systems have been biased against minorities, which has led to several wrongful arrests by police. For example, in the summer of 2020 Robert Williams, a resident of Michigan, was detained and kept in a police station overnight because a facial recognition algorithm made a flawed match. Usually, these cases get resolved within hours, but they create a tremendous inconvenience for innocent people and their families. The United States needs a national law that sets out the legal framework for public use of FRT and addresses all the possible side effects. For example, an effective way to address this issue would be to require third-party testing and approval of the facial recognition software used by police.[34] Police would use only software that is certified by an independent agency. It is also important that police do not use low-quality images in their queries.[35]

Facial recognition technology is a powerful new tool that requires a comprehensive approach, one that takes into account its impact on the economy, national security, and civic life. It presents incredible opportunities, especially in aiding the work of law enforcement agencies, but finding the right balance between security and civil liberties will be one of the biggest challenges. Federal law is required to regulate both commercial and government use of FRT and to establish quality and credibility standards for facial recognition software. The law should not force the police to work with analog technologies in a digital age,[36] but it should enforce high ethical standards that minimize the potentially negative impact on civic life.

[1] Nagaraj, A. (2020, Feb 14). Indian police use facial recognition app to reunite families with lost children. Reuters

[2] Symanovich, S. (2021, Aug 20). What is facial recognition? How facial recognition works. Norton.

[3] Facial Recognition. (2021, October). INTERPOL.

[4] Nature Editorial, & Castelvecchi, D. (2020, Nov 18). Is facial recognition too biased to be let loose? Nature.

[5] Ibid

[6] Ibid

[7] Kaspersky. (2021, August 23). What is Facial Recognition – Definition and Explanation. Kaspersky.Com

[8] Nothing personal? How private companies are using facial recognition tech. (2020, Jun 8). TechHQ.

[9] Mozur, P. (2019, May 6). One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority. The New York Times

[10] Crawford, K., Dobbe, R., Dryer, T., & Fried, G. (2019, December). 2019 Report. AI Now Institute. New York University

[11] Rollet, C. (2019, November 11). Hikvision Markets Uyghur Ethnicity Analytics, Now Covers Up. IPVM.

[12] Koebler, J. (2020, June 29). Detroit Police Chief: Facial Recognition Software Misidentifies 96% of the Time. Vice.

[13] Snow, J. (2018, August 3). Amazon’s Face Recognition Falsely Matched 28 Members of Congress With Mugshots. American Civil Liberties Union.

[14] Hardesty, L. (2018, February 12). Study finds gender and skin-type bias in commercial artificial-intelligence systems. MIT News | Massachusetts Institute of Technology.

[15] Crumpler, W. (2020, April 14). How Accurate are Facial Recognition Systems – and Why Does It Matter? Center for Strategic and International Studies.

[16] Pesenti, J. (2021, Nov 3). An Update On Our Use of Face Recognition. Meta.

[17] 740 ILCS 14/ Biometric Information Privacy Act. (2008, October 3). Illinois General Assembly.

[18] MacCarthy, M. (2020, Aug 20). Who thought it was a good idea to have facial recognition software? Brookings.

[19] Hill, K. (2021, November 2). The Secretive Company That Might End Privacy as We Know It. The New York Times.

[20] Mac, R. (2020, May 8). Clearview AI Says It Will No Longer Provide Facial Recognition To Private Companies. BuzzFeed News.

[21] Webster, S. (2021, May 27). Clearview AI Hit With Dozens of Lawsuit in Europe Over Method of Collecting Data. Tech Times.

[22] Horowitz, J. (2020, Jul 3). Tech companies are still selling facial recognition tools to the police. CNN Business.

[23] Illinois Opt-Out Request Form. (2021). Clearview AI. Retrieved December 9, 2021, from https://clearviewai.typeform.com/to/HDz8tJ?typeform-source=www.clearview.ai

[24] Greenberg, A. (2021, January 20). This Site Published Every Face From Parler’s Capitol Riot Videos. Wired.

[25] Garvie, C., & Moy, L. M. (2019, May 16). America Under Watch | Face Surveillance in the United States. America Under Watch – Real-Time Facial Recognition in America. https://www.americaunderwatch.com

[26] Davies, D. (2021, Jan 5). Facial Recognition And Beyond: Journalist Ventures Inside China’s ‘Surveillance State’. NPR.

[27] Keegan, M. (2020, August 14). The Most Surveilled Cities in the World. US News.

[28] Bischoff, P. (2021, May 17). Surveillance camera statistics: which cities have the most CCTV cameras? Comparitech.

[29] Dou, E. (2021, July 30). China built the world’s largest facial recognition system. Now, it’s getting camera-shy. Washington Post. https://www.washingtonpost.com/world/facial-recognition-china-tech-data/2021/07/30/404c2e96-f049-11eb-81b2-9b7061a582d8_story.html

[30] Ibid

[31] Fussell, S. (2019, August 30). Why Hong Kong Protesters Are Cutting Down Lampposts. The Atlantic.

[32] Heikkilä, M. (2021, November 24). German coalition backs ban on facial recognition in public places. POLITICO.

[33] HARMONISED RULES ON ARTIFICIAL INTELLIGENCE. (2021, April 21). European Union Law.

[34] MacCarthy, M. (2021, May 25). Mandating fairness and accuracy assessments for law enforcement facial recognition systems. Brookings.

[35] Hill, K. (2020, August 3). Wrongfully Accused by an Algorithm. The New York Times.

[36] Porter, T. (2019, March 21). The debate on automatic facial recognition continues. Surveillance Camera Commissioner’s Office.

From cybernetics to posthumanism: Biological humans vs synthetic machines

December 15, 2021; January 22, 2023; hpanahov

Cyberspace, cybersecurity, cyberinfrastructure and cyborg are some of the most popular words in the modern vocabulary. If we look up the etymology of the prefix cyber, we find that it is an abbreviation of cybernetics, which in turn traces its roots back to the Greek word “kybernētēs”, meaning steersman, governor or pilot. In the mid-20th century, cybernetics emerged as a transdisciplinary scientific approach that applies to engineering and computer science as well as to philosophy and psychology. One of its many definitions is that cybernetics is “the study of systems of any nature which are capable of receiving, storing, and processing information so as to use it for control” (Umpleby, 1982). Since its first public introduction, cybernetics has paved a new path of research comparing the human mind and computing machines. Over the decades, this line of inquiry has evolved and gained new layers as both the computer and cognitive sciences have advanced and reached new frontiers. By now there is a substantial scientific literature which argues that in the near future we will be able to upload the human mind onto computers, the line between biological human and synthetic machine will dissolve, and humans will no longer be identified by their physical bodies.

Modern cybernetics emerged in the post-World War II period as a result of the Macy Conferences, but the first scholarly works comparing humans to machines go back to the philosophers of the French Enlightenment in the 18th century. For example, in 1748 Julien Offray de La Mettrie published the book “Man a Machine”, where, as the title suggests, he argued that humans are basically machines. However, neither La Mettrie nor his like-minded contemporaries, such as Pierre Cabanis and Baron d’Holbach, had the depth and breadth of knowledge that the scientists attending the Macy Conferences had. Held in New York between 1941 and 1960, the Macy Conferences aimed to stimulate a cross-disciplinary scientific discussion. The conferences were attended by the most influential scientists of the century, including physicists John von Neumann and Heinz von Foerster, mathematicians Norbert Wiener and Claude Shannon, neurophysiologists Warren McCulloch and John Young, anthropologist Margaret Mead, psychologist Heinrich Klüver, psychiatrist Ross Ashby, sociologist Paul Lazarsfeld, and ecologist George Hutchinson, among many others. This created a rare opportunity for the emergence of a transdisciplinary concept like cybernetics.

Norbert Wiener first introduced cybernetics to the general public in 1948 in his seminal book “Cybernetics: Or Control and Communication in the Animal and the Machine.” Wiener was a child prodigy who earned his BA in mathematics at the age of 14 and enrolled in graduate studies in zoology at Harvard, but a year later transferred to Cornell, where he completed a graduate program in philosophy by the age of 17. This background explains how Wiener is able to intertwine mathematical formulas with philosophical ideas in his research. His first book on cybernetics includes the chapters “Computing Machines and the Nervous System”, “Cybernetics and Psychopathology”, and “On Learning and Self-Reproducing Machines”, where one of the underlying themes is the comparison of the human mind and computing machines. For example, Wiener writes that “a very important function of the nervous system, and, as we have said, a function equally in demand for computing machines, is that of memory, the ability to preserve the results of past operations for use in the future” (Wiener, p. 121).

Another giant in the field of cybernetics is Ross Ashby, whose books “An Introduction to Cybernetics”, published in 1956, and “Design for a Brain”, from 1960, made him one of the most influential voices in the field. A psychiatrist by profession, Ashby analyzed the human mind as a complex system and proposed to reduce it to well-defined constraints, rules and algorithms that shape our thinking and behavior. Ashby believed that cybernetics lifted the mystery of the “brain and its higher functions” (Ashby, R. Mechanisms of Intelligence, p. 334) and that, if properly taught, future scientists would be able “to demonstrate that the science of brain-like mechanisms is essentially clear, practical and useful” (Ashby, R. Mechanisms of Intelligence, p. 334).

From the perspective of cybernetics, the human mind is a complex system that receives, stores, and processes information, which makes it essentially similar to a computing machine. The main issue is to find the right code and build a machine that is powerful enough. In many ways, the computational power of modern artificial intelligence can surpass that of a human brain, but whether it can replace the human mind completely is another question. One of the most outspoken scholars who argue that computers can only simulate certain functions of a human brain, but never replace it entirely, is John Searle (Searle, 1980). Searle is the author of the well-known thought experiment called the Chinese Room Argument. Searle imagines himself alone in a room, where he is handed strings of Chinese characters and numerals under the door and is expected to answer queries in Chinese, even though he does not speak the language. Searle says that with the help of a rule book (the right code, in the case of machines), he could produce the right answers to the questions and yet not understand a word of them (Stanford Encyclopedia of Philosophy). Searle’s famous proposition is that a computer can learn the syntax, but syntax is not sufficient for semantic content.

Katherine Hayles took this debate to a whole new level in her book “How We Became Posthuman,” where she suggests that not only can computers have consciousness, but we can also upload a human consciousness onto a machine. She builds on the findings of the cyberneticians to propose that we are the information we have constructed and that our body is just a prosthesis that stores and processes that information. According to Hayles, the creation of the cyborg “as a technological artifact and cultural icon” in the post-World War II years is not a coincidence but a sign of the direction in which we are heading. Hayles proposes that we are already in the middle of a historical process that is transforming the conventional definition of the human into a new construct called the posthuman (Hayles, p. 2).

Hayles offers four characteristics for her definition of the posthuman: first, it privileges “information pattern over material instantiation”; second, it identifies the human solely with consciousness; third, our physical body is “the original prosthesis we all learn to manipulate”; fourth, there are no fundamental differences between physical existence and computer simulation, “cybernetic mechanism and biological organism” (Hayles, p. 2-3). Basically, Hayles argues that an individual is not a physical body, but a cloud of information that could be transferred from one prosthesis to another. Hayles cites a poignant quote from Gillian Brown’s influential study of the relation between humanism and anorexia: “you make out of your body your very own kingdom where you are the tyrant, the absolute dictator” (Hayles, p. 5).

Post-humanism has a different meaning in social philosophy, but the definition of a cyborg-like posthuman emerged shortly after the invention of cybernetics. Around the 1960s a new philosophical movement called transhumanism emerged. It represents people who firmly believe in the coming of the posthuman and identify themselves as transitional between human and posthuman. Today, the most influential transhumanists, such as Ray Kurzweil, Hans Moravec, and Vernor Vinge, believe that sometime between the 2030s and 2040s humanity will reach the point of technological singularity, when humans will no longer be able to either control or contain artificial intelligence. They believe machines will outsmart humans and then build even smarter machines. For example, Kurzweil writes that beyond that point of singularity, we humans will be able to scan our brains and upload them onto a computer, and thus Human Body Version 3.0 will emerge. Human 3.0 will be able to transfer from one body to another and will not be constrained by the biological weaknesses characteristic of humans of our time. According to Kurzweil, individuals will be compelled to upgrade to 3.0 in order not to lose in competition either to machines or to other humans (Kurzweil, p. 310).

From the perspective of post-humanists and transhumanists, the information stored in the human mind captures the entirety of our consciousness, and it is possible to separate consciousness from the physical body. This is a contentious topic that relates to the centuries-old mind-body problem in philosophy. One of the earliest and most influential thinkers to discuss this subject is the 17th-century French philosopher René Descartes, who rejected the Aristotelian school of thought that all knowledge comes from our sensory experiences and started his philosophical investigation with external world skepticism (Fieser, 2020). Some observers compare Descartes to Neo, the protagonist of the Matrix film series, for this form of methodological skepticism, which is called Cartesian doubt in philosophy.

Descartes proposed that an evil demon could be misleading us, so we cannot blindly trust our sensory experiences. However, Descartes then concluded that if he can doubt the world around him and question the potentially evil plot, then he can think and has an independent mind. Descartes famously proclaimed “I doubt therefore I think, I think therefore I exist” and built on this premise to conclude that consciousness is distinct from the body and can exist on its own. Descartes was a devout Christian who believed that only the soul can be conscious, not the physical body or the brain. It is hard to tell now whether Descartes would agree that the conscious soul would follow the memory, if it ever becomes possible to transfer all the information in the human mind onto a machine.

Conversely, scholars like John Searle believe that “conscious states are entirely caused by lower-level neurobiological processes in the brain” and “they have absolutely no life of their own” (Searle, Mind: A Brief Introduction, p. 113). Searle argues that consciousness is a purely biological phenomenon, the same as “photosynthesis or digestion” (Searle, Theory of mind and Darwin’s legacy). From this perspective, even if you have the most powerful computers, you cannot separate consciousness from the physical body, since the former cannot exist without the latter.

Conclusion

The human mind is a very complex system, and claims that we will be able to upload our consciousness onto machines are open to discussion. However, with regard to the dawn of artificial intelligence and its impact on our collective identity as a human species, there are certain trends that are easily observable and undeniable.

First, as transhumanists like to emphasize, technology is developing very rapidly. Moore’s law, which proposed that the computing power that fits in a given device (measured by the number of transistors in a circuit) would double every 2 years (initially every year), has held true for more than 50 years now. Second, we are growing increasingly dependent on technology. It is already turning into a basic necessity both for our mundane daily lives and for professional industries. The average cell phone user touches his or her phone 2,617 times a day (Naftulin). A 2019 study demonstrated that algorithms are responsible for 92% of trading in the Forex market (Kissell). Third, evolution is a scientific fact. It might be hard to imagine that our species could change, but in the grand scheme of things evolution is an inevitability, not just a possibility.
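To get a sense of what such a doubling rate implies, the compounding can be worked out directly. The snippet below is an idealized sketch that assumes perfectly regular doubling, which real chips follow only approximately; the function name is hypothetical.

```python
def growth_factor(years, doubling_period_years):
    """Total multiplication after `years` if capacity doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

# Doubling every 2 years over 50 years means 25 doublings.
two_year_pace = growth_factor(50, 2)

# The original formulation, doubling every year, would mean 50 doublings.
one_year_pace = growth_factor(50, 1)

print(f"Every 2 years for 50 years: about {two_year_pace:,.0f} times")
print(f"Every year for 50 years: about {one_year_pace:,.0f} times")
```

Twenty-five doublings already amount to a factor of roughly 33.5 million, which is why the difference between the one-year and two-year formulations matters so much over the long run.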

Given these trends, I also believe that fundamental changes are in the making for our species, changes so big that they will transform our very essence as a species. However, I think these changes will take a little more time than one or two decades. I also find it plausible that in that future it will be possible to scan a human mind and upload it onto a computer, but I do not think that will be the same person. At best it will be a very good clone that will not be able to associate with the human feelings of its original.

References

Ashby, R. (1960). Design for a Brain. Butler and Tanner LTD

Ashby, R., & Conant, R. (1981). Mechanisms of Intelligence. Intersystem Publications. http://www.rossashby.info/Ashby-Mechanisms_of_intelligence.pdf

Bell, L. (2016, August 28). What is Moore’s Law? WIRED explains the theory that has defined the tech industry. WIRED UK. https://www.wired.co.uk/article/wired-explains-moores-law

Bostrom, N. (2005). A History of Transhuman Thought. Journal of Evolution and Technology, 14(1). https://www.nickbostrom.com/papers/history.pdf

Dembski, W. A. (1999, October 1). Are We Spiritual Machines? | William A. Dembski. First Things. https://www.firstthings.com/article/1999/10/are-we-spiritual-machines

Descartes, R. (2021). Discourse on the Method Annotated. Independently published.

His famous work, where he proclaims Cogito, ergo sum.

Descartes, R., & Cress, D. A. (1993). Meditations on First Philosophy (Hackett Classics) (3rd ed.). Hackett Publishing Company.

Fieser, J. (2020, June 1). The History of Philosophy: A Short Survey. The University of Tennessee at Martin. https://www.utm.edu/staff/jfieser/class/110/8-empiricism.htm

Kissell, R. (2020, September 18). Algorithmic Trading Methods. Academic Press

Hayles, N. K. (1999). How We Became Posthuman. The University of Chicago Press.

Huxley, J. (1942). Evolution: The Modern Synthesis. London: George Allen & Unwin Ltd.

Keeling, D. M. and Lehman M. N. (2018, April 26). Posthumanism. Oxford Research Encyclopedias.

Kurzweil, R. (2006). The Singularity is Near. Penguin Books

Moravec, H. (1988). Mind Children: The future of Robot and Human Intelligence. Harvard University Press

Naftulin, J. (2016, July 14). Here’s how many times we touch our phones every day. Business Insider. https://www.businessinsider.com/dscout-research-people-touch-cell-phones-2617-times-a-day-2016-7

Rushkoff, D. (2019). Team Human. W. W. Norton & Company.

Searle, J. (2013, June 18). Theory of mind and Darwin’s legacy. PNAS. https://www.pnas.org/content/110/Supplement_2/10343

Searle, J. R. (2005). Mind: A Brief Introduction (Fundamentals of Philosophy Series) (Illustrated ed.). Oxford University Press.

Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–424. https://doi.org/10.1017/s0140525x00005756

Stanford Encyclopedia of Philosophy. The Chinese Room Argument. (2020, February 20). https://plato.stanford.edu/entries/chinese-room/

Umpleby, S. (1982). Definitions of Cybernetics. American Society for Cybernetics. https://asc-cybernetics.org/definitions/

Weiss, D. M. (1999). Posthuman Pleasures: Review of N. Katherine Hayles’ How We Became Posthuman. University of Chicago Press. https://jcrt.org/archives/01.3/weiss.shtml

Wiener, N. (1968). Cybernetics: Or Control and Communication in the Animal and the Machine (2nd revised ed.). MIT Press.

Terrorists’ Quest on the Dark Web

November 20, 2020 | January 5, 2023 | hpanahov

Why counter-terrorism strategies need an update

Introduction

Two decades after the Global War on Terror was launched, the terrorist organizations that rallied around the Salafi-jihadist ideology have been defeated. Their plots are proactively disrupted, their most recognized leaders have been eliminated, and, following the defeat of ISIS, they are forced to operate from hideouts. However, the ideology survives with far more followers today than in the wake of the 9/11 attacks that prompted the war on terrorism. Through effective use of internet platforms, these terrorist organizations have recruited and indoctrinated a large number of supporters, who can operate semi-autonomously without the need for a strict organizational hierarchy. In the last several years, a number of restrictions have been introduced against violent extremist content on the internet, but these measures have pushed the terrorists to the dark web, where privacy and anonymity features make it nearly impossible to regulate content or monitor data traffic. In the face of these new realities, it is high time to rethink and recalibrate the war on terrorism and to allocate more resources to fighting terrorist groups on digital platforms, especially on the dark web, which offers them an unprecedented array of new opportunities. The new counter-terrorism strategy should be about chasing terrorists in the dark corners of the internet rather than in the mountains of Afghanistan or the deserts of Syria.

Content:

1. Evolution of the Salafi-Jihadist movement
2. Dark Web
3. Terrorists’ quest
4. The counter-terrorism strategies
5. Conclusion

1. Evolution of the Salafi-Jihadist movement

Deaths from terrorism fell by 15% to about 13,800, the fifth consecutive annual decline, according to the Global Terrorism Index 2020 (Institute for Economics & Peace, 2020), which draws on data from the US National Consortium for the Study of Terrorism and Responses to Terrorism (START). ISIS has been forced out of its strongholds in Iraq and Syria and has retreated to hideouts. Al Qaeda has not made the news headlines in the past five years. The most recognized terrorist leaders, such as Osama Bin Laden, Abu Bakr Al Baghdadi, Abu Musab al-Zarqawi, and Anwar al-Awlaki, were killed by the US-led anti-terrorist coalition. These are the positive results of the 20 years of the War on Terror that was launched in the aftermath of the 9/11 attacks in 2001. However, the problem of the Salafi-Jihadist ideology that inspires organizations like ISIS and Al Qaeda is far from resolved. According to the datasets of the Center for Strategic and International Studies, there were between 100,000 and 230,000 Salafi-Jihadists around the world in 2018, which is several times more than in September 2001 (Jones et al., 2018, p. 9). Their organizations have been defeated and their leaders taken down, but the ideology survives with many more followers today, who are connected over the internet.

The increase in the number of supporters was paralleled by structural changes in the Salafi-Jihadist movement, which has become more decentralized and more diffuse. Jarret Brachman, a terrorism expert at the University of Maryland, wrote an analytical brief in 2014 suggesting that, as a result of innovative approaches on social media platforms, the global jihadist movement has achieved a critical mass of supporters and can maintain itself without organizational leadership. In his view, there is a paradigm shift in the global jihadist movement, “moving away from the organization-centric model advanced by Al-Qaida, to a movement unhindered by organizational structures” (Brachman, 2014). “The global Salafi-jihadi movement was and remains more than just al Qaeda—or ISIS… it consists of individuals worldwide, some of whom have organized, who seek to destroy current Muslim societies and resurrect in their place a true Islamic society,” according to Katherine Zimmerman, a resident fellow at the American Enterprise Institute (Zimmerman, 2017). Another subject matter expert, Charles Lister of the Middle East Institute, responding in 2018 to the question “Where is ISIS today?”, suggested that, having lost much of its territory, ISIS is retreating to its virtual caliphate to recruit new members, inspire new terrorist attacks, and capitalize on its past achievements and its thousands of operatives in various countries (Lister, 2018).

Al Qaeda and ISIS, the flag bearers of the Salafi-Jihadist movement, might be incapable of launching a strategic attack today, but that does not mean their movement is no longer a critical threat to international peace and security. The violent jihadis might project the image of medieval barbarians, but they have demonstrated advanced skills in modern technologies, which have allowed them to recruit thousands of followers in all parts of the world. Many researchers studying ISIS agree that effective online marketing tactics drove the initial breakthrough successes of the terrorist group. ISIS exploited social media platforms such as Twitter, YouTube, and Facebook to recruit new sympathizers, collect money, and deceive its opponents in the region by projecting the image of a more powerful organization than it really was. With more than 40 thousand tweets in a day, #AllEyesonISIS was one of the top trending hashtags on Twitter at the time of the 2014 Mosul attack (Berger, 2015).

The United Nations estimates that more than 25,000 foreign terrorist fighters from more than 100 countries traveled to Syria and Iraq between 2011 and 2015, the heyday of ISIS (United Nations Security Council, 2015). The online propaganda of the terrorist organizations did not just recruit foot soldiers to join the war in the Middle East; it also radicalized vulnerable people to launch attacks in their home cities and supplied them with technical know-how. For example, Sayfullo Saipov, an Uzbek migrant in the United States who killed 8 people in New York City in 2017 by driving a truck into a crowd, was inspired by the Islamic State propaganda videos found on his phone. The New York Police Department revealed that Saipov followed the instructions in the online propaganda materials of the terrorist group to the letter (Mueller et al., 2017). The Tsarnaev brothers from the North Caucasus, who carried out the 2013 Boston bombings, followed the instructions of an article titled “Make a Bomb in the Kitchen of Your Mom”, published in Al Qaeda’s online magazine “Inspire” (Khan, 2013). During 2014-2016, ISIS published numerous online magazines for propaganda purposes, including Istok (in Russian), Konstantiniyye (in Turkish), Dar al-Islam (in French), and Dabiq and Rumiya (in multiple languages).

Initially, giant tech companies were reluctant to intervene and block extremist content, emphasizing the “right to freedom of expression”, but the scale and consequences of the terrorists’ online campaigns convinced them to take action. In February 2016, Twitter revealed that in the preceding six months it had suspended 125,000 accounts associated with the Islamic State, adding that “there is no ‘magic algorithm’ for identifying terrorist content on the internet, so global online platforms are forced to make challenging judgement calls based on very limited information and guidance” (Twitter Inc., 2016). Other big tech companies took similar actions, but content moderation is not as simple as it may sound. Google employees who watch 4-5 hours of extremist videos per day to moderate content on YouTube have reported PTSD, chronic anxiety, and other long-term mental health issues (Newton, 2019). Terrorists know that their communications may be monitored and deliberately adjust their public messages to avoid detection by either automated or human reviewers on social media. However, big tech companies have increased the labor force and engineering resources devoted to content moderation, and it is paying off. Terrorists and violent extremists are a lot more restricted on the internet today than they were five years ago. Consequently, they rely more and more on the dark web, which offers less control and more room for maneuver.

2. Dark Web

Now, let us clarify what the dark web is and what it is not, since there are many popular misconceptions. Some people refer to it as a darknet, which is correct. However, dark web is often used interchangeably with deep web, which is a mistake. The deep web comprises all the data on the world wide web that you cannot access through regular search engines. For example, all the data that is publicly available through a Google search is part of the “surface web” or “Clearnet”, which are synonyms. Everything else is part of the deep web. Password-protected information, such as email contents, bank accounts, or company intranets, is considered part of the deep web, which accounts for more than 95% of the information on the internet. Most of us use the deep web regularly, maybe without even knowing it. The dark web is only a tiny part of the deep web, and it was initially created by the US Government for secure online communication.

In the mid-1990s, a mathematician and two computer scientists working on a project funded by the US Navy developed software called onion routing, which protects the privacy of the internet user with many layers of protection (which explains the choice of the name onion). Since 2004, The Onion Router (TOR) has been free and open-source software, so anyone can download, use, edit, improve, and share the program. In a short period of time, it became popular among netizens who value anonymity and privacy. According to the TOR project statistics, in the period of January-February 2021, an average of 2.5 million users per day accessed the internet directly via TOR (The TOR Project).

While the initial purpose of TOR was to provide a secure line of communication for government agencies, it later became a popular platform among civil rights activists and, unfortunately, among criminals for illicit engagements online. It has turned into a contentious topic in the public discourse on security versus freedom of expression. Today, the TOR Project is supported with funding from the U.S. Department of State Bureau of Democracy, Human Rights, and Labor, the Swedish International Development Cooperation Agency, and the Media Democracy Fund, among many other organizations and thousands of individual donors. Many famous news outlets, such as the New York Times, Deutsche Welle, and the Guardian, have .onion websites for sources who would like to share a tip but remain anonymous. Unlike domains on the surface web, which end in .com, .org, .net, etc., domains on TOR end in .onion.

According to various estimates, the number of .onion websites ranges between 55 and 80 thousand, but only around 15% of them are active (Stone, 2020). The dark web is a relatively new phenomenon, and research in this area is limited. Moreover, the secretive nature of the data traffic on the dark web creates additional challenges for researchers. One insightful study was conducted by Dr. Gareth Owen from the University of Portsmouth, who observed the traffic on the darknet over six months in 2014. Dr. Owen concluded that illicit websites and markets accounted for more than 80% of the data traffic on the dark web, even though the majority of the websites on it can be classified as legal (Greenberg, 2017).

There are several other software tools for accessing the dark web, such as I2P and Freenet, but TOR takes up the lion’s share of the market. “Traffic to hidden services on Tor represents approximately 1.5% of all the data passing across the internet on any given day” (Bausch, 2015). It is a big number and makes an even bigger difference. The dark web is very challenging turf for law enforcement in the fight against illegal enterprises, because onion routing shields the privacy of its users. That is exactly why it has become an attractive platform for terrorists and violent extremists. Imagine a scenario where law enforcement agencies have to detain criminals who have no identity, who speak their own encrypted language, and who can make payments in their own currency without a trace. That is the opportunity that the dark web presents to terrorist groups.

3. Terrorists’ quest

Terrorists have been using the dark web for as long as it has been available to the general public. However, two important changes in the past several years have made the dark web even more attractive for terrorist groups: 1. Shrinking space for terrorist content on the surface web; 2. New opportunities on the dark web. Previously, they relied on the dark web mainly for communication and coordination. However, since the invention of cryptocurrencies and the boom in illicit darknet markets, they have had access to a whole new array of opportunities. They can raise funds and buy or sell drugs, hacker services, small arms, and even chemical weapons.

Terrorists feel the restrictions on the surface web not only because of the counter-measures taken by tech companies and law enforcement agencies, but also because of cyber-attacks from independent hacker groups. One of the pivotal points in the terrorists’ transition to the dark web was the aftermath of the Paris attacks of Friday, November 13, 2015, when terrorists backed by the Islamic State coordinated three simultaneous attacks in various parts of the city. The horrible attacks, which took the lives of 130 people, outraged Anonymous, an international decentralized group of hacktivists (activist hackers), who announced on November 14 that they were launching their biggest operation ever against ISIS. The same day, Al-Hayat Media Center, the media wing of ISIS, shared through forums and its Twitter and Telegram channels a new “.onion” domain, a mirror of the ISIS propaganda site on the darknet, adding that it was not able to maintain its website on the surface web. This was one of the posts on ISIS-affiliated Telegram channels, viewed by 7,629 followers of the channel:

“Due to severe constraints imposed on the #Caliphate_Publications [Isdarat Releases] website, any new domain is deleted after being posted.
We announce the launch of the website for “dark web.”
*It will work for the Tor users and the normal users.
Link for the Tor users: http://isdratetp4donyfy.onion” (INSITE Blog).

This was not the beginning of the war between Anonymous and ISIS, but only the start of a new major operation. By November 13, 2015, the hacktivists were already claiming to have dismantled close to 149,000 Islamic State-linked websites and flagged roughly 101,000 Twitter accounts and 5,900 propaganda videos (Brooking, 2019). In December 2015, the “al-Aqsa IT Team”, affiliated with Al-Qaeda, distributed a manual titled “Tor Browser Security Guidelines” among its networks to ensure online anonymity (Weimann, 2016).

An even more important turning point for terrorist engagements on the dark web was the introduction of Bitcoin, a digital currency that is the “first decentralized peer-to-peer payment network that is powered by its users with no central authority or middlemen”, according to the Bitcoin Foundation, established in 2012. Bitcoin has been around since 2009, following the international banking crisis of 2008, but it started gaining traction after 2011 (Lopatto, 2019). The main attraction of bitcoin for illicit users is that it conceals the identities behind transactions: “names of buyers and sellers are never revealed – only their wallet IDs” (Yellin et al., 2013). This has made bitcoin the currency of choice for illicit activities online. A number of other cryptocurrencies have emerged in the last several years, including Ethereum, Litecoin, and Cardano. However, Bitcoin holds the largest share of the crypto market. The exchange rates of cryptocurrencies are very unstable, but as of spring 2022 the overall value of the crypto market fluctuates around two trillion dollars.

The anonymous currency factor has stimulated the resurgence of black markets on the darknet, where hackers and drug dealers were even offering Black Friday deals in 2018 (Gilbert, 2018). According to the Chainalysis 2021 Crypto Crime Report, $10 billion worth of bitcoins were spent on criminal activities in 2020, while in 2019 that number was roughly $21.4 billion (Chainalysis Team, 2021). The darknet markets, fueled by cryptocurrencies, have thrown a new lifeline to money launderers, drug dealers, criminal hackers, human traffickers, weapons dealers, and several other illicit ventures. Bill Conner, CEO of the cybersecurity company SonicWall, writes that “the world of cybercrime has evolved from a hacker hobby into a capitalist market”, with hacker products such as ransomware-as-a-service (RaaS), malware-as-a-service, and phishing-as-a-service (Conner, 2018). According to one report, full credit card details, including associated data, cost $12-20 on the dark web, while a complete set of documents and account details allowing identity theft can be obtained for $1,500 (Gomez, 2021). RAND Europe researchers, who collected data from the dark web over a weeklong period, found 18 darknet markets that were involved in arms dealing (Paoli et al., 2017). In the summer of 2018, a hacker sold US military drone documents on the dark web for just $200 (Brewster, 2018).

This presents a whole array of unprecedented opportunities for terrorists. Combatting terrorist financing has been a long-standing challenge for law enforcement agencies. The rise of cryptocurrencies has taken this challenge to a new level. There are numerous reports of terrorists attempting to raise funds in bitcoin, pay for their logistical needs, or even buy weapons of mass destruction. In 2016, the Ibn Taymiyya Media Center (ITMC), an online jihadist propaganda unit based in the Gaza Strip, ran a social media fundraising campaign using bitcoin, which is the first verifiable instance of a terrorist group using bitcoin (Fanusie, 2016). In January 2017, terrorist activities of Islamist militants in Indonesia were funded through bitcoin (Zenko, 2017). According to the United States Department of Justice, at the beginning of 2019 the al-Qassam Brigades posted a call on their social media page for bitcoin donations to fund their campaign of terror, in which the group also “boasted that bitcoin donations were untraceable and would be used for violent causes” (The Department of Justice, 2020).

There have also been numerous warnings and alerts from high-level state officials and subject matter experts about what these cryptocurrencies could be used for on the dark web. In April 2016, speaking to a group of 50 heads of state and foreign ministers in Washington, D.C., President Obama described how a terrorist group had bought isotopes through brokers on the dark web (Weimann, 2016). At a meeting of the UN Security Council on June 28, 2017, U.N. High Representative for Disarmament Affairs Izumi Nakamitsu said that “the global reach and anonymity of the dark web provides non-state actors with new marketplaces to acquire dual-use equipment and materials”, while a senior official with the Organization for the Prohibition of Chemical Weapons (OPCW), Joseph Ballard, added that “the use by non-state actors of chemical weapons is no longer a threat, but a chilling reality” (Besheer, 2017).

4. The counter-terrorism strategies

The United Nations Global Counter-Terrorism Strategy was adopted in 2006, but it has been reviewed several times to adjust it to the changing security landscape. However, the latest review from 2018 makes no reference to the dark web. The 17-page document uses the word “internet” four times but does not capture the complexity of the threats posed by the darknet markets. The most notable reference to the internet is that the UN General Assembly “expresses concern at the increasing use, in a globalized society, by terrorists and their supporters, of information and communications technologies, in particular the Internet and other media, and the use of such technologies to commit, incite, recruit for, fund or plan terrorist acts” (United Nations General Assembly, 2018).

The latest US National Strategy for Counterterrorism was adopted in 2018, but it makes no mention of the “dark web”. The word internet is used only twice in the 25-page document, both times in the context of “terrorist propaganda”: “They take advantage of technology, such as the Internet and encrypted communications, to promote their malicious goals and spread their violent ideologies” (United States, 2018). There is only one reference to the “dark web” in the Council of Europe Counter-Terrorism Strategy (2018-2022), which is a good step forward, but not enough to highlight the importance of the issue:

As such, it could be of benefit to member States to examine and share effective practices to monitor, survey, disrupt and interdict opportunistic collaboration between organised crime and terrorist actors… including where such activities take place on the internet and the so-called “dark web” (Council of Europe).

5. Conclusion

All this demonstrates that, at the strategic level, the approaches to counter-terrorism have not been adequately calibrated. The internet and the dark web play an essential role in terrorist activities today. That means these digital platforms should be at the center of counter-terrorism strategies, but they are not. The war on terror has achieved many tangible successes and met most of its strategic objectives. The anti-terrorist coalition was able to hit and destroy the center of gravity of the Salafi-Jihadist movement, which includes terrorist organizations such as Al Qaeda and ISIS. However, the violent jihadist movement was able not only to survive but even to grow in numbers. This was possible largely due to the role of the internet. In recent years, terrorists have found completely new prospects on the dark web that enable their malicious plans and activities. These developments require a revised counter-terrorism strategy that sets new targets and objectives. Tackling terrorist engagements on the internet, and especially on the dark web, should be a top priority.

Bibliography

Bausch, J. (2015, January 5). Researcher explores ‘dark net’ for 6 months, lists most visited hidden sites on the Web. Electronic Products. https://www.electronicproducts.com/researcher-explores-dark-net-for-6-months-lists-most-visited-hidden-sites-on-the-web/#

Berger, J. M. (2015, February 4). How ISIS Games Twitter. The Atlantic. https://www.theatlantic.com/international/archive/2014/06/isis-iraq-twitter-social-media-strategy/372856/

Besheer, M. (2017, June 28). UN: Terrorists Using “Dark Web” in Pursuit of WMDs. Voice of America. https://www.voanews.com/europe/un-terrorists-using-dark-web-pursuit-wmds

Brachman, J. (2014). Transcending Organization: Individuals and “The Islamic State” (pp. 1-2, Issue brief). National Consortium for the Study of Terrorism and Responses to Terrorism.

Brewster, T. (2018, July 11). A Hacker Sold U.S. Military Drone Documents On The Dark Web For Just $200. Forbes. https://www.forbes.com/sites/thomasbrewster/2018/07/11/a-hacker-sold-u-s-military-drone-documents-on-the-dark-web-for-just-200/ 

Brooking, E. (2019, July 23). Anonymous vs. the Islamic State. Foreign Policy. https://foreignpolicy.com/2015/11/13/anonymous-hackers-islamic-state-isis-chan-online-war/

Chainalysis Team. (2021, January 9). Chainalysis 2021 Crypto Crime Report. Chainalysis. https://blog.chainalysis.com/reports/2021-crypto-crime-report-intro-ransomware-scams-darknet-markets

Conner, B. (2018, February 21). Ransomware-As-A-Service: The Next Great Cyber Threat? Forbes. https://www.forbes.com/sites/forbestechcouncil/2017/03/17/ransomware-as-a-service-the-next-great-cyber-threat/

Council of Europe. Counter-Terrorism Strategy (2018-2022). Brussels: Committee of Ministers. https://search.coe.int/cm/Pages/result_details.aspx?ObjectId=09000016808afc96

Fanusie, Y. (2016, August 24). The New Frontier in Terror Fundraising: Bitcoin. The Cipher Brief. https://www.thecipherbrief.com/column/private-sector/the-new-frontier-in-terror-fundraising-bitcoin

Gilbert, D. (2018, November 22). Hackers and drug dealers are offering Black Friday deals. Vice News. https://www.vice.com/en/article/zmdm54/hackers-and-drug-dealers-are-offering-black-friday-deals

Gomez, M. (2021, February 25). Dark Web Price Index 2020. PrivacyAffairs. https://www.privacyaffairs.com/dark-web-price-index-2020/

Greenberg, A. (2017, July 20). Over 80 Percent of Dark-Web Visits Relate to Pedophilia, Study Finds. Wired. https://www.wired.com/2014/12/80-percent-dark-web-visits-relate-pedophilia-study-finds/

INSITE Blog. (2015, November 18). IS Shifts Propaganda Archive to the Dark Web. SITE Intelligence Group. https://news.siteintelgroup.com/blog/index.php/categories/jihad/entry/406-is-shifts-propaganda-archive-to-the-dark-web

Institute for Economics & Peace. Global Terrorism Index 2020: Measuring the Impact of Terrorism, Sydney, November 2020. Available from: http://visionofhumanity.org/reports (accessed 25 February 2021).

Jones, S. G., Vallee, C., Newlee, D., Harrington, N., Sharb, C., & Byrne, H. (2018). The Evolution of the Salafi-Jihadist Threat (p. 9, Rep.). Center for Strategic and International Studies.

Khan, A. (2013, May 1). The Magazine that “Inspired” the Boston Bombers. The PBS Frontline. https://www.pbs.org/wgbh/frontline/article/the-magazine-that-inspired-the-boston-bombers/

Lister, C. (Researcher). (2018, June 26). Where is Isis today? [Video file]. Retrieved February 27, 2021, from https://www.mei.edu/multimedia/video/where-isis-today

Lopatto, E. (2019, January 3). How bitcoin grew up and became big money. The Verge. https://www.theverge.com/2019/1/3/18166096/bitcoin-blockchain-code-currency-money-genesis-block-silk-road-mt-gox

Mueller, B., Rashbaum, W. K., Baker, A., & Goldman, A. (2017, November 2). Prosecutors Describe Driver’s Plan to Kill in Manhattan Terror Attack. The New York Times. https://www.nytimes.com/2017/11/01/nyregion/driver-had-been-planning-attack-in-manhattan-for-weeks-police-say.html

Newburger, E. (2021, February 20). Elon Musk says bitcoin seems high after surpassing $1 trillion market value. CNBC. https://www.cnbc.com/2021/02/20/elon-musk-says-bitcoin-seems-high-after-surpassing-1-trillion-market-cap.html

Newton, C. (2019, December 16). Google and YouTube moderators speak out on the work that’s giving them PTSD. The Verge. https://www.theverge.com/2019/12/16/21021005/google-youtube-moderators-ptsd-accenture-violent-disturbing-content-interviews-video

Paoli, G. P., Aldridge, J., Ryan, N., & Warnes, R. (2017). The illicit trade of firearms, explosives and ammunition on the dark web. RAND Corporation. https://www.rand.org/content/dam/rand/pubs/research_reports/RR2000/RR2091/RAND_RR2091.pdf

Stone, J. (2020, May 5). How many dark web marketplaces actually exist? About 100. CyberScoop. https://www.cyberscoop.com/dark-web-marketplaces-research-recorded-future/

The Department of Justice. (2020, August 13). Global Disruption of Three Terror Finance Cyber-Enabled Campaigns. https://www.justice.gov/opa/pr/global-disruption-three-terror-finance-cyber-enabled-campaigns

Twitter Inc. (2016, February 5). Combating Violent Extremism. Twitter. https://blog.twitter.com/en_us/a/2016/combating-violent-extremism.html

United Nations General Assembly. (2018). The United Nations Global Counter-Terrorism Strategy Review. New York: UN Office of Counter-Terrorism. https://www.un.org/en/ga/search/view_doc.asp?symbol=A/RES/72/284

United Nations Security Council. (2015, May 19). Analysis and recommendations with regard to the global threat from foreign terrorist fighters. United Nations. https://www.un.org/sc/ctc/wp-content/uploads/2015/06/N1508457_EN.pdf

United States. (2018). National strategy for Counterterrorism. Washington, D.C.: Executive Office of the President. https://www.dni.gov/files/NCTC/documents/news_documents/NSCT.pdf

Users – Tor Metrics. (2020–2021). The TOR Project. https://metrics.torproject.org/userstats-relay-country.html

Weimann, G. (2016). Terrorist Migration to the Dark Web. Perspectives on Terrorism, 10(3), 40–44. https://www.jstor.org/stable/26297596

Yellin, T., Aratari, D., & Paglieri, J. (2013, December). What is bitcoin? CNNMoney. https://money.cnn.com/infographic/technology/what-is-bitcoin/index.html

Zenko, M. (2017, August 17). Bitcoin for Bombs. Council on Foreign Relations. https://www.cfr.org/blog/bitcoin-bombs

Zimmerman, K. (2017, July 18). America’s Real Enemy: The Salafi-Jihadi Movement. Critical Threats. https://www.criticalthreats.org/analysis/americas-real-enemy-the-salafi-jihadi-movement
