Urban Density Data

Data and replication files for 'The economics of urban density'

by Gilles Duranton and Diego Puga

This page distributes and documents computer programs and data to replicate the results obtained by Gilles Duranton and Diego Puga in their article 'The economics of urban density', published in Journal of Economic Perspectives, 34(3), Summer 2020: 3-26.

Urban density boosts productivity and innovation, improves access to goods and services, reduces typical travel distances, encourages energy-efficient construction and transport, and facilitates sharing scarce amenities. However, density is also synonymous with crowding, makes living and moving in cities more costly, and concentrates exposure to pollution and disease. In this article, we explore the appropriate measurement of density and describe how it is both a cause and a consequence of the evolution of cities. We then discuss whether and how policy should target density and why the trade-off between its pros and cons is unhappily resolved by market and political forces.

This replication package calculates two measures of density: "naive density" (population per square kilometre) and "experienced density" (population within ten kilometres of the average resident) for US metropolitan areas and uses these data to produce the two panels in figure 1 in the article. It also calculates three elasticities for US metropolitan areas reported in the text of the article: the elasticity of experienced density with respect to city population, the elasticity of naive density with respect to city population, and the elasticity of average distance to the centre with respect to city population. Finally, it calculates experienced density for the whole of Canada and for the whole of the United States.

Population or employment density is often used as a summary statistic to describe the spatial concentration of economic activity. In this context, density is commonly defined as the number of individuals per unit geographic area. Such "naive density" is easy to calculate. However, it may not appropriately reflect the density actually faced by the individual or firm at hand. One problem is that economic units are traditionally defined as aggregates of administrative units: for example, US metropolitan areas are defined based on counties. However, if a metro area includes some counties with substantial rural portions, such a calculation will understate the density experienced by most economic actors. In particular, the match between urban and county boundaries is systematically looser for younger and less dense metropolitan areas in the West.

De la Roca and Puga (2017) and Henderson, Kriticos, and Nigmatulina (2020) have proposed measuring "experienced density" by counting population within a given radius around each individual. Such experienced density, in addition to dealing with the uneven tightness of area boundaries, captures better how close the typical individual is to other people when population is unevenly distributed. To give an illustrative example at the level of countries, where boundaries are given, the United States has nearly nine times the population of Canada with a slightly smaller surface area, so its naive density is ten times higher. And yet walking around cities and towns in both countries, one likely perceives similar concentrations of people nearby. Indeed, the average inhabitant in Canada has about 343,000 people living within a ten-kilometre radius, compared with about 306,000 in the United States.

The replication files

The full replication package is available for download from this site as a zip file: density_replication.zip (6.92 GB).

For researchers not intending to replicate the Python/ArcGIS part of the analysis, a much smaller partial replication package is also available for download as a zip file: density_replication_notif.zip (0.71 GB). This still replicates all the results, but relies on intermediate data files from our own run of the Python/ArcGIS scripts. The only difference with respect to the full replication package is that two large population grids (data/src/grid/can_ppp_2010_UNadj.tif and data/src/grid/usa_ppp_2010_UNadj.tif) are not included.

Instructions and overview of the replication files

After downloading and placing the full uncompressed replication package under some directory on your computer that will be the root directory of the replication files:

Edit code/_density_run.do to specify in the line with global PathProjectRoot the path to the root directory of the replication files. This is the directory where the subdirectories code, data and results are located on your computer.
On a first run of the replication code, leave the flag global InstallPackages = 1 in code/_density_run.do to install the required Stata packages (see Software and hardware notes below for details). Change the flag to global InstallPackages = 0 for subsequent runs.
To run only the Stata portion of the data construction, leave the flags global GeocodeAgain = 0 and global DisableArc = 1 in code/_density_run.do, as in the distributed version of the file.
To also re-geocode city centres, make sure you satisfy the requirements under Python and Google API Key in the Software and hardware notes below and set the flag global GeocodeAgain = 1 in code/_density_run.do. This is not required for a full replication of the results, since the intermediate file generated by this step is provided with this replication package.
To also perform the experienced density calculations, make sure you satisfy the requirements under ArcGIS Pro in the Software and hardware notes below and set the flag global DisableArc = 0 in code/_density_run.do and the correct PathProjectRoot in both code/arcgis/density_exp.py and density_exp_isocode.py. This is not required for a full replication of the results, since the intermediate files generated by this step are provided with this replication package.
Run code/_density_run.do in Stata.

The Stata script code/_density_run.do first runs code/1_density_data.do to perform the data construction. This is done on the basis of the data described under Source data below and located in the directories data/src/blkg, data/src/county, and data/src/grid.

If the flag global GeocodeAgain = 1 is set in code/_density_run.do, then code/1_density_data.do in turn runs the Python script code/python/python_batch_geocoding.py to re-geocode city centres, otherwise it relies on the intermediate data file from our run of this Python script. If the flag global DisableArc = 0 is set in code/_density_run.do, then code/1_density_data.do in turn runs the ArcGIS/Python scripts code/arcgis/density_exp.py and code/arcgis/density_exp_isocode.py, otherwise it relies on the intermediate data files from our run of these ArcGIS/Python scripts. The intermediate data files, described under Intermediate data below, are located in the directory data/intermediate.

After the Stata script code/1_density_data.do creates all data files used for the analysis and places them in the directory data/processed, the Stata script code/_density_run.do automatically runs code/2_density_analysis.do to perform the analysis of the processed data (described under Processed data below) and stores all the results (described under Results below) in the results/ directory.

Experienced density calculations for these and other data

The ArcGIS/Python scripts used to calculate experienced density have been written so that they can easily be used on data for other areas as well. We now discuss some important considerations to keep in mind when doing so.

We define experienced density as population within 10 kilometres of the average resident. To calculate experienced density for US metropolitan areas, we first measure the number of people within a 10-kilometre radius of each cell in a population grid for the entire United States. We then compute, for all grid cells in each metropolitan area, the population-weighted average of this count of people within 10 kilometres. Weighting by population is important, since otherwise we would be calculating population within ten kilometres of the average place instead of within ten kilometres of the average person.
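The two-step definition above can be illustrated on a toy grid. The following is a minimal sketch in plain Python, not the ArcGIS/Python code shipped with the package: the function name, the toy grid, and the one-cell radius are all hypothetical, and no correction for cell area is applied.

```python
def experienced_density(pop, radius_cells):
    """Return (population-weighted, unweighted) average of nearby population.

    pop is a list of rows, each a list of cell populations (a toy grid).
    radius_cells is the neighbourhood radius expressed in grid cells.
    """
    rows, cols = len(pop), len(pop[0])
    r = int(radius_cells)
    # Offsets of every cell whose centre lies within the radius of a focal cell.
    offsets = [
        (dy, dx)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if dy * dy + dx * dx <= radius_cells ** 2
    ]
    total_pop = 0.0
    weighted_sum = 0.0
    unweighted_sum = 0.0
    for y in range(rows):
        for x in range(cols):
            # Step 1: count people within the radius of this cell.
            nearby = sum(
                pop[y + dy][x + dx]
                for dy, dx in offsets
                if 0 <= y + dy < rows and 0 <= x + dx < cols
            )
            # Step 2: accumulate the averages with and without population weights.
            weighted_sum += pop[y][x] * nearby
            unweighted_sum += nearby
            total_pop += pop[y][x]
    return weighted_sum / total_pop, unweighted_sum / (rows * cols)


# Hypothetical 5x5 grid with all 100 residents living in the central cell.
toy = [[0] * 5 for _ in range(5)]
toy[2][2] = 100
per_person, per_place = experienced_density(toy, radius_cells=1)
print(per_person, per_place)  # prints "100.0 20.0"
```

In this toy case every resident has 100 people nearby, while the average place has only 20, which is the gap between "average person" and "average place" that the population weighting is meant to close.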

Measuring the number of people within a 10-kilometre radius of each cell in a population grid requires approximating a circle with a jagged shape made up of cells (48,301 cells of 3 arc-seconds by 3 arc-seconds each in our case). We must then take into account that, on a grid with a geographic spatial projection, the actual surface area of those cells varies across the grid in proportion to the cosine of the latitude. Our code applies a correction factor so that our calculations reflect the number of people in a neighbouring area corresponding to a circle with a 10-kilometre radius.
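To make the role of latitude concrete, the sketch below computes the approximate area covered by a window of 48,301 cells of 3 arc-seconds at a given latitude and compares it with the area of a circle with a 10-kilometre radius. It is only an illustration under stated assumptions: a spherical Earth with a radius of 6,371 km, and a correction that simply rescales the raw count by the ratio of the two areas. The actual correction is the one implemented in code/arcgis/density_exp.py, which may differ; the function names here are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumption: spherical Earth with mean radius
CELL_ARCSEC = 3.0          # grid resolution stated in the text
WINDOW_CELLS = 48301       # cells in the jagged circle, as stated in the text
TARGET_RADIUS_KM = 10.0    # radius used to define experienced density


def cell_area_km2(lat_deg):
    """Approximate area of one grid cell at the given latitude."""
    side_km = EARTH_RADIUS_KM * math.radians(CELL_ARCSEC / 3600.0)
    # North-south extent is constant; east-west extent shrinks with cos(latitude).
    return side_km * side_km * math.cos(math.radians(lat_deg))


def correction_factor(lat_deg):
    """Ratio of the target circle's area to the area of the cell window."""
    window_area_km2 = WINDOW_CELLS * cell_area_km2(lat_deg)
    return math.pi * TARGET_RADIUS_KM ** 2 / window_area_km2


for lat in (25.0, 40.0, 49.0):
    print(lat, round(correction_factor(lat), 3))
```

Under these assumptions the factor is below one where the window covers more than the target circle and above one where it covers less, so a count rescaled this way refers to the same surface area at every latitude.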

The ArcGIS/Python script code/arcgis/density_exp.py calculates experienced density for US metropolitan areas. It takes as inputs the population grid in Geotiff format data/src/grid/usa_ppp_2010_UNadj.tif and the geographical boundaries for metropolitan areas in Shapefile format data/src/grid/msa1999_boundaries.shp (and the associated files with .dbf, .prj and .shx extensions).

By editing the header of code/arcgis/density_exp.py to point to a different population grid and a different set of city boundaries, interested users can easily calculate experienced density for cities in any other country or for US cities with alternative city definitions. Note that the script expects a population grid with a geographic projection and 3 arc-seconds by 3 arc-seconds resolution; such grids are readily available for countries throughout the world from https://www.worldpop.org.

The ArcGIS/Python script code/arcgis/density_exp_isocode.py calculates experienced density for the whole of the United States and for the whole of Canada. It is written so that the same calculation can be done for any country in the world simply by placing in data/src/grid the https://www.worldpop.org grid for that country (with number of people per pixel and total country population matching the corresponding official United Nations population estimates) and then running the script code/arcgis/density_exp_isocode.py with the country's ISO Alpha-3 Code as an argument. For instance, running "C:/Program Files/ArcGIS/Pro/bin/Python/Scripts/propy.bat" density_exp_isocode.py CAN from the command prompt calculates experienced density for Canada. The Stata script code/1_density_data.do defines local isocodelist "CAN USA". If one also wanted to calculate experienced density for Mexico, it would just be a matter of editing this line to local isocodelist "CAN USA MEX" and re-running the replication code after downloading mex_ppp_2010_UNadj.tif from https://www.worldpop.org and placing it in data/src/grid/mex_ppp_2010_UNadj.tif.
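Before adding a country to isocodelist, it can help to check which grid file the naming pattern above implies. The helper below is a hypothetical convenience written for this page, not part of the replication package; it only encodes the file-name pattern visible in the Canada, United States and Mexico examples.

```python
from pathlib import PurePosixPath


def grid_path(isocode, year=2010, root="data/src/grid"):
    """Path where the WorldPop grid for a country is expected to be placed.

    Follows the pattern seen in the examples: lower-case ISO Alpha-3 code,
    then "_ppp_", the year, and the "_UNadj.tif" suffix.
    """
    return str(PurePosixPath(root) / f"{isocode.lower()}_ppp_{year}_UNadj.tif")


def expected_grids(isocodelist):
    """Map each code in a space-separated list such as "CAN USA MEX" to its grid path."""
    return {code: grid_path(code) for code in isocodelist.split()}


for code, path in expected_grids("CAN USA MEX").items():
    print(code, path)
```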

Software and hardware notes

All of the results and figures in the Journal of Economic Perspectives article have been produced using the code and data provided, Stata version 16.1, Python version 3.8.2, and ArcGIS Pro version 2.5.

The code has been written to be as portable as possible. Nevertheless, the following considerations should be kept in mind (most, if not all, of these considerations will be irrelevant if one skips the re-geocoding of city centres and the re-calculation of experienced density and relies on the intermediate data files provided for these two steps of the data construction):

Stata: The following additional Stata packages are required:
- shp2dta: module to convert shape boundary files to Stata datasets, by Kevin Crow.
- grstyle: module to customize the overall look of graphs, by Ben Jann.
- palettes: module to provide color palettes, symbol palettes, and line pattern palettes, by Ben Jann.
These required Stata packages can be installed automatically by setting the flag global InstallPackages = 1 in code/_density_run.do. This should be done on the first run of the code, then changing the flag to global InstallPackages = 0 for subsequent runs.

The code has been tested with Stata version 16.1. If re-geocoding city centres (controlled by the flag global GeocodeAgain = 1 in code/_density_run.do), Stata version 16 or higher is needed because the code takes advantage of the Python integration in Stata introduced in version 16 to run the Python code used for this geocoding. If one skips the re-geocoding of city centres (setting the flag global GeocodeAgain = 0), the code will run in versions of Stata older than 16, and it is still possible to replicate all the results, relying on the provided intermediate data file data/intermediate/msa1999_geocoded.csv. In either case, if run on versions newer than 16.1, the code will impose version 16.1 for more robust replicability.
Python: Python is run from within Stata only if re-geocoding city centres. This is controlled by the flag global GeocodeAgain = 1 in code/_density_run.do. If one skips the re-geocoding of city centres (setting the flag global GeocodeAgain = 0), Python is not needed. If re-geocoding city centres, the following additional Python libraries are required:
- pandas: a library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language, by Wes McKinney and many others.
- requests: an elegant and simple HTTP library for Python, built for human beings, by Kenneth Reitz and many others.
These required Python libraries will not be installed automatically. They can be installed following the standard procedure in your Python setup (e.g. pip3 install --upgrade pandas requests). If the flag global GeocodeAgain = 1 is set, the code will check the required Python libraries are present and return an error otherwise.

The code has been tested with Python version 3.8.2. The Python script code/python/python_batch_geocoding.py adapts the script of the same name written by Shane Lynn and available at https://github.com/shanealynn/python_batch_geocode. To re-geocode city centres, a Google API Key is also needed.
Google API Key: The Python code used to determine the coordinates of the centre of the main city in each US metropolitan area queries the Google Maps Platform using its Geocoding API for this purpose. To use the Geocoding API you must have an API key. The API key is a unique identifier that is used to authenticate requests associated with your project for usage and billing purposes. At the time of writing, instructions can be found at https://developers.google.com/maps/documentation/geocoding/get-api-key?hl=ko and Google provides a $200 monthly free credit that can be used to perform small-scale geocoding tasks such as this one for free. Since the API key links the requests to your billing account and free credit, it is important to secure your API key, for instance by restricting its scope to the Geocoding API, by restricting its use to requests coming from a narrow range of IP addresses, by setting a quota on the maximum number of requests per day, and by keeping it private. The API key must be specified in code/python/python_batch_geocoding.py (replacing the mock key in API_KEY = 'AIzaSy012345678901234567890123456789012' with your own actual key). As already noted, one can skip the re-geocoding of city centres (setting the flag global GeocodeAgain = 0 in code/_density_run.do) and still replicate all the results, relying on the provided intermediate data file data/intermediate/msa1999_geocoded.csv. By default, the flag global GeocodeAgain = 0 is set because over time Google Maps may make minor modifications to the coordinates assigned to the centre of the main city in each US metropolitan area and an exact replication will need to rely on the coordinates obtained in our run of the code on 2 May 2020.
ArcGIS Pro: ArcGIS Pro is called from Stata only if performing the experienced density calculations. This is controlled by the flag global DisableArc = 0 in code/_density_run.do. If one skips this part of the data construction (setting the flag global DisableArc = 1), ArcGIS Pro is not needed. It is still possible to replicate all the results, relying on the provided intermediate data files from our own run of the ArcGIS/Python part of the data construction. Note that while the ArcGIS Pro code is written in Python, it must be run within an ArcGIS Pro conda Python environment on a computer where this software is licensed. For this reason, the ArcGIS/Python code is called by Stata using the shell command and the batch file propy.bat that is part of the ArcGIS Pro installation instead of Stata's python command. The standard location of propy.bat is specified in code/_density_run.do as global PathArcPy "C:/Program Files/ArcGIS/Pro/bin/Python/Scripts/propy.bat" and you most likely will not need to change this. The batch file propy.bat launches an ArcGIS/Python script using the active ArcGIS Pro conda environment. By default, ArcGIS Pro has a single conda Python environment, arcgispro-py3, that includes all Python libraries used by ArcGIS Pro as well as several others. The following additional Python library, not part of the default arcgispro-py3 environment, is required:
- simpledbf: a Python library for converting basic DBF files to CSV files, and pandas dataframes, by Ryan Nelson.
The default arcgispro-py3 environment is read-only and cannot be modified, so to install this additional library, you must first clone the default environment. This can be done using the Manage Environments dialog box in ArcGIS Pro, simply by clicking the Clone button next to the default arcgispro-py3 environment. The new environment is a copy of the arcgispro-py3 environment, but can be modified. Activate the new environment by ticking the radio button next to it and clicking OK. Then install simpledbf with the command conda install -c rnelsonchem simpledbf. The code has been tested with ArcGIS Pro version 2.5.
Operating system: None of the Stata code is operating-system specific. The Python code used to geocode the city centres is also fully portable. The ArcGIS/Python code used to calculate experienced density requires ArcGIS Pro, which is only available for Microsoft Windows. MacOS and Linux/Unix users as well as Windows users without an ArcGIS Pro license can set the flag global DisableArc = 1 in code/_density_run.do and still replicate all the results relying on the provided intermediate data files from our own run of the ArcGIS/Python code.
Hardware: The run of the code producing the results reported in the published version of the article was done on a VMWare vSphere virtual machine running Microsoft Windows Server 2016. The virtual machine was allocated 12 virtual cores and 96 GB of RAM. The physical machine where the virtual machine was installed is a 2012 Dell PowerEdge R715 with two 16-core AMD Opteron 6282SE processors and 128 GB of DDR3-1600MHz RAM. The run was started on 2 May 2020 and took 6 hours and 40 minutes, of which only 16 seconds was devoted to data analysis and the rest to data construction. The log of this run is provided with the replication files in code/logs/log_2020.05.02_18.47.36.txt. The replication files take about 9 GB of disk space. Running the code does not require a particularly powerful computer, but if running the ArcGIS/Python portion of the data construction, we recommend having at least 100 GB of disk space available for temporary data files. If one skips the re-geocoding of city centres (setting the flag global GeocodeAgain = 0) and the re-calculation of experienced density (setting the flag global DisableArc = 1), the code will typically run in less than 30 seconds, including the remainder of the data construction process.
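For readers unfamiliar with the Geocoding API mentioned under Google API Key above, the following sketch shows what a single request and the handling of its JSON response might look like. It is not the code in code/python/python_batch_geocoding.py: it uses only the Python standard library rather than requests, the function names are hypothetical, and reading the key from an environment variable called GOOGLE_API_KEY is an assumption made here as one way of keeping the key out of the script. The output keys mirror column names of data/intermediate/msa1999_geocoded.csv, but the mapping from response fields to those columns is also an assumption.

```python
import json
import os
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for one address string."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return f"{GEOCODE_URL}?{query}"


def parse_geocode_response(payload):
    """Extract the first match from a decoded Geocoding API response, or None."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    first = payload["results"][0]
    location = first["geometry"]["location"]
    return {
        "formatted_address": first.get("formatted_address"),
        "latitude": location["lat"],
        "longitude": location["lng"],
        "accuracy": first["geometry"].get("location_type"),
        "google_place_id": first.get("place_id"),
        "type": ",".join(first.get("types", [])),
        "number_of_results": len(payload["results"]),
        "status": payload["status"],
    }


def geocode(address):
    """Query the API for one address (needs network access and a valid key)."""
    api_key = os.environ["GOOGLE_API_KEY"]  # hypothetical variable name
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_geocode_response(json.load(response))
```

Calling geocode("Boston, MA, USA") with a valid key set in the environment would send one billable request; the two helper functions can be exercised without network access.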

Source data

To calculate experienced density (population within ten kilometres of the average resident), we use gridded population data at 3 arc-second resolution (approximately 100m at the equator) from WorldPop (2018). These gridded population data are available to download in Geotiff format from https://www.worldpop.org. The units are number of people per pixel, with total country population matching the corresponding official United Nations population estimates. We use 2010 population grids for Canada (data/src/grid/can_ppp_2010_UNadj.tif) and the United States (data/src/grid/usa_ppp_2010_UNadj.tif).

For the United States, we calculate experienced density not just for the entire country, but for all 275 metropolitan areas in the conterminous United States. This calculation also uses the geographical boundaries for these metropolitan areas in Shapefile format in data/src/grid/msa1999_boundaries.shp (and the associated files with .dbf, .prj and .shx extensions). This Shapefile merges three Shapefiles obtained from the US Bureau of the Census (https://www.census.gov/geographies/mapping-files.html): ma99_99.shp for Metropolitan Statistical Areas, cm99_99.shp for Consolidated Metropolitan Statistical Areas, and ne99_d00.shp for New England County Metropolitan Areas. The Shapefile data/src/grid/msa1999_boundaries.shp also contains the area of each metropolitan area (variable area_ha, expressed in hectares), and we use this to calculate naive density for them.

To calculate naive density for metropolitan areas in the conterminous United States, in addition to their area, we need their population. We use population for US Counties from the 2010 Census obtained from US Census Bureau (2011) in data/src/county/co-est00int-tot.csv. This was downloaded from https://www2.census.gov/programs-surveys/popest/datasets/2000-2010/intercensal/county/co-est00int-tot.csv.

To assign 2010 County populations to metropolitan areas, we use Metropolitan Statistical Area (MSA) and Consolidated Metropolitan Statistical Area (CMSA) definitions outside of New England and New England County Metropolitan Area (NECMA) definitions in New England, as set by the Office of Management and Budget on 30 June 1999. These definitions are available in data/src/county/99mfips.txt for MSA/CMSAs and in data/src/county/99nfips.txt for NECMAs. These files were downloaded from https://www.census.gov/population/estimates/metro-city/99mfips.txt and https://www.census.gov/population/estimates/metro-city/99nfips.txt.
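The arithmetic behind naive density can be sketched as follows: county populations are summed within each metropolitan area and divided by the area converted from hectares to square kilometres. This is a plain-Python illustration, not the Stata code in code/1_density_data.do; the function name and all codes and figures in the example are made up.

```python
def naive_density(county_pop, county_to_msa, msa_area_ha):
    """Population per square kilometre for each metropolitan area.

    county_pop:    {county code: population}
    county_to_msa: {county code: metropolitan area code}
    msa_area_ha:   {metropolitan area code: area in hectares}
    """
    msa_pop = {}
    for county, pop in county_pop.items():
        msa = county_to_msa.get(county)
        if msa is None:
            continue  # county does not belong to any metropolitan area
        msa_pop[msa] = msa_pop.get(msa, 0) + pop
    # 100 hectares make one square kilometre.
    return {msa: pop / (msa_area_ha[msa] / 100.0) for msa, pop in msa_pop.items()}


# Hypothetical inputs: two counties form metro "M1", a third is non-metropolitan.
density = naive_density(
    county_pop={"A1": 600000, "A2": 400000, "B1": 50000},
    county_to_msa={"A1": "M1", "A2": "M1"},
    msa_area_ha={"M1": 250000},
)
print(density)  # prints "{'M1': 400.0}"
```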

To estimate the elasticity of average distance to the city centre with respect to city population, we first determine the location of the centre of each metropolitan area from the location of its core municipality reported by Google Maps. This query is automatically done by the replication code. We then compute, for each metropolitan area, the population-weighted average distance to the centre of its Census block groups, using five-year 2008-2012 data from the 2012 American Community Survey obtained from the IPUMS-NHGIS project (Manson, Schroeder, Van Riper, and Ruggles, 2019). The data were downloaded from https://www.nhgis.org/. This includes the geographical boundaries for Census block groups corresponding to the 2012 American Community Survey in Shapefile format in data/src/blkg/US_blck_grp_2012.shp (and the associated files with .dbf, .prj, .shp.xml, and .shx extensions) and also the total population of each block group (files data/src/blkg/nhgis0026_ds191_20125_2012_blck_grp.dat, data/src/blkg/nhgis0026_ds191_20125_2012_blck_grp.do, and data/src/blkg/nhgis0026_ds191_20125_2012_blck_grp_codebook.txt). Only the subset of the 2012 American Community Survey data set strictly required for the replication is redistributed with this replication package, as per the guidelines in https://www.nhgis.org/research/citation.
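The population-weighted average distance described above can be sketched in a few lines. The example uses the haversine great-circle formula on a sphere of radius 6,371 km, which is an assumption made for illustration: the distance formula actually used by the replication code is not stated here. The function names and the sample coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def mean_distance_to_centre(block_groups, centre_lat, centre_lon):
    """Population-weighted average distance (km) from block groups to the centre.

    block_groups is an iterable of (latitude, longitude, population) tuples.
    """
    block_groups = list(block_groups)
    total_pop = sum(pop for _, _, pop in block_groups)
    weighted = sum(
        pop * haversine_km(lat, lon, centre_lat, centre_lon)
        for lat, lon, pop in block_groups
    )
    return weighted / total_pop


# Hypothetical city: 300 people at the centre, 100 people one degree of longitude away.
sample = [(0.0, 0.0, 300), (0.0, 1.0, 100)]
print(round(mean_distance_to_centre(sample, 0.0, 0.0), 1))  # prints "27.8"
```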

Intermediate data

Since over time Google Maps may make minor modifications to the coordinates assigned to the centre of the main city in each US metropolitan area, and since querying Google Maps for these coordinates also requires a Google API Key, we provide the coordinates obtained in our run of the code on 2 May 2020. This allows skipping the re-geocoding of city centres (setting the flag global GeocodeAgain = 0 in code/_density_run.do) and still replicating all the results.

For the benefit of MacOS and Linux/Unix users as well as Windows users without an ArcGIS Pro license, we also provide the intermediate data files from our own run of the ArcGIS/Python code to calculate experienced density. This allows skipping this part of the data construction (setting the flag global DisableArc = 1 in code/_density_run.do) and still replicating all the results.

The intermediate data consist of the following files and variables:

data/intermediate/msa1999_geocoded.csv. This data file provides the coordinates assigned to the centre of the main city in each US metropolitan area and contains the following variables:
- msa. MSA/CMSA/NECMA FIPS code (1999 definitions).
- msa_name. MSA/CMSA/NECMA name.
- msa_maincity. Main city in MSA/CMSA/NECMA.
- formatted_address. Address as formatted by Google Maps.
- latitude. MSA/CMSA/NECMA centre latitude.
- longitude. MSA/CMSA/NECMA centre longitude.
- accuracy. Accuracy field within the geocoding response (APPROXIMATE when geocoding a city).
- google_place_id. Place ID uniquely identifying a place in the Google Places database and on Google Maps.
- type. Address type(s) (cities are typically tagged by Google with the locality and political type).
- input_string. Input string sent to Google Maps for geocoding.
- number_of_results. Number of potential location matches returned (1 when unambiguous, as in this case since city, state and country were all specified).
- status. Status field within the geocoding response.
- time. Date and time of the geocoding response.
data/intermediate/msa1999_expden.csv. This data file provides the result of the experienced density calculation for each US metropolitan area and contains the following variables:
- msa. MSA/CMSA/NECMA FIPS code (1999 definitions).
- msa_name. MSA/CMSA/NECMA name.
- density_exp10k_2010. MSA/CMSA/NECMA population within 10km of average resident 2010.
data/intermediate/can_expden.csv and data/intermediate/usa_expden.csv. These data files provide the result of the experienced density calculation for, respectively, Canada and the United States and contain the following variables:
- isocode. Country ISO Alpha-3 Code.
- density_exp10k_2010. Population within 10km of average resident 2010.

Processed data

The processed data on which the data analysis is performed are provided with this replication package, but also fully recreated by the replication code from the original sources. The processed data consist of the following files and variables:

data/processed/county2msa1999.dta. This data file maps US counties to 1999 county-based metropolitan areas and contains the following variables:
- msa. MSA/CMSA/NECMA FIPS code (1999 definitions).
- msa_name. MSA/CMSA/NECMA name.
- pmsa. PMSA FIPS code (1999 definitions).
- pmsa_name. PMSA name.
- fips. County FIPS code.
- state. State FIPS code for the county.
- county_name. County name.
data/processed/density_msa.dta. This file includes the data at the MSA/CMSA/NECMA level and contains the following variables:
- msa. MSA/CMSA/NECMA FIPS code (1999 definitions).
- msa_name. MSA/CMSA/NECMA name.
- msa_maincity. Main city in MSA/CMSA/NECMA.
- msa_lon. MSA/CMSA/NECMA centre longitude.
- msa_lat. MSA/CMSA/NECMA centre latitude.
- west_mississippi. West of Mississippi indicator.
- msa_area. MSA/CMSA/NECMA land area (hectares).
- msa_pop_2010. MSA/CMSA/NECMA population 2010.
- msa_density_exp10k_2010. MSA/CMSA/NECMA population within 10km of average resident 2010.
data/processed/density_blkg.dta. This file includes the data at the Census Block Group level and contains the following variables:
- state. State code.
- state_name. State name.
- county. County code.
- county_name. County name.
- tract. Census Tract code.
- blkg. Census Block Group code.
- blkg_lat. Census Block Group latitude.
- blkg_lon. Census Block Group longitude.
- blkg_area. Census Block Group land area (square metres).
- msa_lat. MSA/CMSA/NECMA centre latitude.
- msa_lon. MSA/CMSA/NECMA centre longitude.
- dist2cbd. Distance to city centre (km).
- blkg_pop. Census Block Group population 2008-2012.
- msa. MSA/CMSA/NECMA FIPS code (1999 definitions).
- msa_name. MSA/CMSA/NECMA name.
data/processed/density_country.dta. This file includes the data at the country level (for the United States and Canada) and contains the following variables:
- isocode. Country ISO Alpha-3 Code.
- density_exp10k_2010. Population within 10km of average resident 2010.

All of these processed data files are also provided in comma-delimited format with the same file names, but a .csv instead of .dta extension. These comma-delimited files are also fully recreated by the replication code.

Results

All the results are placed in the results/ directory.

Figure 1 plots density vs. population for US metropolitan areas. Panel (a), for experienced density, is saved in Encapsulated PostScript format as results/density_fsrc_exp_pop.eps. Panel (b), for naive density, is saved in Encapsulated PostScript format as results/density_fsrc_raw_pop.eps. Both panels are also saved in Portable Network Graphics format with the same file names, but a .png instead of .eps extension.

Results mentioned in the text are saved to the file results/density_text_results.txt, in which the relevant paragraphs are automatically written incorporating the numbers calculated by code/2_density_analysis.do. This output file reads as follows:

'The economics of urban density', by Gilles Duranton and Diego Puga

Results mentioned in the text

Section 2

The average inhabitant in Canada has about 343,000 people living within a ten-kilometre radius, compared with about 306,000 in the United States.

Panel (a) of figure 1 plots for US metropolitan areas experienced density, measured as population within ten kilometres of the average resident, against total population. The implied elasticity is 0.51. If we use instead naive density, dividing total population by total land area within the official boundaries of the metropolitan areas, we find the same elasticity with respect to total population, 0.51, but the fit is poorer with an R² of 0.49 instead of 0.76.

Section 4

Earlier, we provided an estimate of the elasticity of density with respect to population for US metropolitan areas of 0.51. In addition to lowering their housing consumption, residents also react to higher housing prices by moving to cheaper, less-accessible locations. When we estimate the elasticity of average distance to the centre with respect to city population, we get 0.30.
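For readers who want to see how an elasticity and an R² of the kind quoted in this output file can be obtained, the sketch below fits a straight line to the logarithms of two variables by ordinary least squares; the slope of that line is the elasticity. This is a generic illustration on synthetic numbers and assumes a simple bivariate log-log regression. The exact specification is the one in code/2_density_analysis.do, which may differ, and the function name is hypothetical.

```python
import math


def loglog_fit(x, y):
    """OLS of log(y) on log(x); returns (slope, R-squared).

    The slope is the elasticity of y with respect to x.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Synthetic data generated with a true elasticity of 0.5 and no noise.
population = [1e5, 5e5, 1e6, 5e6, 1e7]
density = [3.0 * p ** 0.5 for p in population]
slope, r2 = loglog_fit(population, density)
print(round(slope, 2), round(r2, 2))  # prints "0.5 1.0"
```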

References

De la Roca, Jorge and Diego Puga. 2017. Learning by working in big cities. Review of Economic Studies 84(1): 106-142.

Duranton, Gilles and Diego Puga. 2020. The economics of urban density. Journal of Economic Perspectives 34(3): 3-26.

Henderson, J. Vernon, Sebastian Kriticos, and Jamila Nigmatulina. 2020. Measuring urban economic density. Journal of Urban Economics (forthcoming).

Manson, Steven, Jonathan Schroeder, David Van Riper, and Steven Ruggles. 2019. Integrated Public Use Microdata Series, National Historical Geographic Information System: Version 14.0. Minneapolis: IPUMS.

US Census Bureau. 2011. Intercensal Estimates of the Resident Population for Counties and States: April 1, 2000 to July 1, 2010. Washington, DC: US Census Bureau.

WorldPop. 2018. Global High Resolution Population Denominators Project. Southampton: WorldPop (https://www.worldpop.org). Funded by The Bill and Melinda Gates Foundation (OPP1134076).