Residential Land Data
Data and replication files for 'Decomposing the growth in residential land in the United States'
by Henry G. Overman, Diego Puga, and Matthew A. Turner
This site distributes and documents the dataset of residential land, population, and households for individual US states and large metropolitan areas in 1976 and 1992 created by Henry G. Overman, Diego Puga, and Matthew A. Turner and used in their article 'Decomposing the growth in residential land in the United States', published in Regional Science and Urban Economics 38(5), September 2008: 487-497, as well as the computer code required to replicate their results. Users of this dataset are asked to cite the Regional Science and Urban Economics article as the source. We would also appreciate it if you let us know the details of any paper in which you use the data by sending an email to Diego Puga (diego.puga@cemfi.es).
These data and replication files, documented below, are freely available for download from this site as a zip file: landpop_data.zip (12 Kb). This contains:
- The state-level data in Stata version 8/9 format: landpop_state.dta.
- The state-level data in comma-delimited ASCII format: landpop_state.csv.
- The metropolitan-level data in Stata version 8/9 format: landpop_msa.dta.
- The metropolitan-level data in comma-delimited ASCII format: landpop_msa.csv.
- A Stata do file that replicates the decompositions, tables, and figure contained in the article 'Decomposing the growth in residential land in the United States': sprawl_regr.do.
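For users who prefer not to work in Stata, the comma-delimited files can be read with standard tools. The following is a minimal Python sketch, not part of the distributed replication files. It assumes that the CSV files start with a header row carrying the variable names documented on this page and that numeric cells are never empty; the function name read_landpop and the sample row are hypothetical and purely illustrative.

```python
import csv
import io

# Columns treated as numeric, identified by their documented name prefixes.
# Identifier columns (FIPS codes, abbreviations, names) are kept as strings.
NUMERIC_PREFIXES = ("respix_", "pop_", "hhds_", "sprawl_")


def read_landpop(handle):
    """Read a landpop CSV from an open text handle into a list of dicts.

    Assumes a header row with the documented variable names.
    """
    rows = []
    for record in csv.DictReader(handle):
        row = {}
        for name, value in record.items():
            if name.startswith(NUMERIC_PREFIXES):
                row[name] = float(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Made-up sample standing in for a real file; the values are NOT from the dataset.
sample = io.StringIO("state,state_ab,state_name,pop_1992\n99,ZZ,Example,12345\n")
rows = read_landpop(sample)
```

With the unzipped files at hand, the same function could be applied to a handle obtained from open("landpop_state.csv", newline="").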
The state-level data includes the following variables:
- state: State FIPS code.
- state_ab: State 2-letter code.
- state_name: State name.
- respix_1992: Residential 30x30m pixels 1992. Derived from the dataset created by Marcy Burchfield, Henry G. Overman, Diego Puga, and Matthew A. Turner and used in their article 'Causes of sprawl: A portrait from space', published in the Quarterly Journal of Economics 121(2), May 2006: 587-633. This in turn combines the 1992 National Land Cover Data and the Land Use and Land Cover GIRAS Spatial Data as detailed in https://diegopuga.org/data/sprawl/. The variable is the number of 30x30m pixels classified as being more than 30% covered by constructed materials and primarily in residential use in 1992 (codes 21 and 22) in the National Land Cover Data. Multiply by 900 to convert to square meters.
- respix_1976: Residential 30x30m pixels 1976. Number of 30x30m pixels classified as being more than 30% covered by constructed materials (codes 21, 22, and 23) in the National Land Cover Data that were classified as primarily in residential use in 1976 (code 11) in the Land Use and Land Cover GIRAS Spatial Data. Since the Land Use and Land Cover GIRAS Spatial Data actually correspond to different dates circa 1976, we correct for data not from 1976 in four steps: first, determining the portions of each county with data collected in each given year; second, estimating the percentage of urban land in each of these county portions by assuming a constant local annual growth rate over the period; third, splitting urban land into residential and commercial according to the proportions recorded in the data for each county portion; and finally, aggregating up to the county level. The metropolitan area, state, and national figures used in our calculations are computed as aggregates of the county numbers. Multiply by 900 to convert to square meters.
- pop_1992: Population 1992. Aggregated from intercensal county-level population estimates for 1992 from the US Bureau of the Census, obtained from http://www.census.gov/popest/archives/EST90INTERCENSAL/STCH-Intercensal/STCH-icen1992.txt.
- pop_1976: Population 1976. Aggregated from intercensal county-level population estimates for 1976 from the US Bureau of the Census, obtained from http://www.census.gov/popest/archives/pre-1980/co-asr-1976.xls.
- hhds_1992: Households 1992. Number of households in 1992, aggregated from county-level numbers. The county-level numbers were obtained by linearly interpolating the total number of households in each county in 1990 and 2000 from the US Bureau of the Census to calculate a county-level average number of people per household in 1992, and then combining this with the intercensal county-level population estimates to obtain the number of households in each county in 1992.
- hhds_1976: Households 1976. Number of households in 1976, aggregated from county-level numbers. The county-level numbers were obtained by linearly interpolating the total number of households in each county in 1970 and 1980 from the US Bureau of the Census to calculate a county-level average number of people per household in 1976, and then combining this with the intercensal county-level population estimates to obtain the number of households in each county in 1976.
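Two of the constructions described above, the date correction for the circa-1976 land data and the household estimates, can be illustrated with a short Python sketch. This is one possible reading of the descriptions, not the authors' code. It assumes that the "constant local annual growth rate" is a compound rate between the observation year and 1992, and that people per household is computed from census population and household counts interpolated to the same year. The function names, argument names, and numbers are hypothetical.

```python
def share_in_target_year(share_obs, year_obs, share_end, year_end, target_year):
    """Move an urban land share observed in year_obs to target_year.

    Assumption: the share grows at a constant compound annual rate
    between year_obs and year_end.
    """
    annual_growth = (share_end / share_obs) ** (1.0 / (year_end - year_obs)) - 1.0
    return share_obs * (1.0 + annual_growth) ** (target_year - year_obs)


def estimate_households(hh_start, hh_end, pop_start, pop_end,
                        year_start, year_end, target_year, pop_target):
    """Estimate households in target_year for one county.

    Assumption: census households and census population are both
    linearly interpolated between the two census years, their ratio
    gives people per household, and the intercensal population
    estimate (pop_target) is divided by that ratio.
    """
    weight = (target_year - year_start) / (year_end - year_start)
    hh_interp = hh_start + weight * (hh_end - hh_start)
    pop_interp = pop_start + weight * (pop_end - pop_start)
    people_per_household = pop_interp / hh_interp
    return pop_target / people_per_household


# Made-up county portion: urban share of 10% observed in 1972,
# growing at 2% per year up to 1992.
share_1976 = share_in_target_year(0.10, 1972, 0.10 * 1.02 ** 20, 1992, 1976)

# Made-up county: census counts for 1990 and 2000, intercensal
# population estimate of 2528 for 1992.
hhds_1992 = estimate_households(1000, 1100, 2500, 2640, 1990, 2000, 1992, 2528)
```

County-level results computed this way would then be summed to obtain metropolitan area, state, and national figures, as described above.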
The metropolitan-level data includes the following variables (calculated as for the state-level data unless otherwise indicated):
- msa: MSA/CMSA/NECMA FIPS code. We use the Metropolitan Statistical Area and Consolidated Metropolitan Statistical Area definitions (New England County Metropolitan Area definitions for New England) for 1999.
- msa_name: MSA/CMSA/NECMA name.
- respix_1992: Residential 30x30m pixels 1992.
- respix_1976: Residential 30x30m pixels 1976.
- pop_1992: Population 1992.
- pop_1976: Population 1976.
- hhds_1992: Households 1992.
- hhds_1976: Households 1976.
- sprawl_1992: Sprawl index for 1992 development. Percentage of land not developed in the square kilometer around an average residential development in each metropolitan area in 1992. This is part of the urban sprawl data from Burchfield, Overman, Puga, and Turner (2006).
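As an illustration of how the variables above can be combined, the following Python sketch converts pixel counts to square kilometers and splits the log growth of residential land into three terms using the identity land = population x (households / population) x (land / households). This accounting identity is chosen here for illustration only; it is not taken from the article or from sprawl_regr.do, and the function names and the sample numbers are hypothetical.

```python
import math

PIXEL_AREA_M2 = 30 * 30  # each pixel is 30x30m, so 900 square meters


def residential_km2(pixels):
    """Convert a count of 30x30m pixels to square kilometers."""
    return pixels * PIXEL_AREA_M2 / 1_000_000


def log_growth_terms(row):
    """Split log growth in residential land, 1976 to 1992, into three terms.

    Uses land = pop * (hhds / pop) * (land / hhds), so the three
    terms add up to the total by construction.
    """
    land_76 = residential_km2(row["respix_1976"])
    land_92 = residential_km2(row["respix_1992"])
    pop_76, pop_92 = row["pop_1976"], row["pop_1992"]
    hh_76, hh_92 = row["hhds_1976"], row["hhds_1992"]
    return {
        "total": math.log(land_92 / land_76),
        "population": math.log(pop_92 / pop_76),
        "households_per_person": math.log((hh_92 / pop_92) / (hh_76 / pop_76)),
        "land_per_household": math.log((land_92 / hh_92) / (land_76 / hh_76)),
    }


# Made-up row using the documented variable names; NOT values from the dataset.
example = {
    "respix_1976": 400_000, "respix_1992": 600_000,
    "pop_1976": 1_000_000, "pop_1992": 1_200_000,
    "hhds_1976": 350_000, "hhds_1992": 460_000,
}
terms = log_growth_terms(example)
```

The same function applies to state-level rows, since both files share the respix, pop, and hhds variable names.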
References
Burchfield, Marcy, Henry G. Overman, Diego Puga, and Matthew A. Turner. 2006. Causes of sprawl: A portrait from space. Quarterly Journal of Economics 121(2): 587-633.
Overman, Henry G., Diego Puga, and Matthew A. Turner. 2008. Decomposing the growth in residential land in the United States. Regional Science and Urban Economics 38(5): 487-497.
U.S. Environmental Protection Agency. 1994. 1:250,000-scale Quadrangles of Landuse/Landcover GIRAS Spatial Data in the Conterminous United States. Washington, DC: United States Environmental Protection Agency, Office of Information Resources Management.
U.S. Geological Survey. 1990. Land Use and Land Cover Digital Data from 1:250,000- and 1:100,000-scale Maps: Data User Guide 4. Reston, VA: United States Geological Survey.
Vogelmann, James E., Stephen M. Howard, Limin Yang, Charles R. Larson, Bruce K. Wylie, and Nick Van Driel. 2001. Completion of the 1990s National Land Cover data set for the conterminous United States from Landsat Thematic Mapper data and ancillary data sources. Photogrammetric Engineering & Remote Sensing 67(6): 650-684.