Digital Data Used to Relate Nutrient Inputs to Water Quality in the Chesapeake Bay Watershed, Version 1.0.
By John W. Brakebill and Stephen D. Preston
The use of trade, product, or firm names in this report is for identification purposes only and does not imply endorsement by the U.S. Geological Survey.
Abstract
Digital data sets were compiled by the U.S. Geological Survey (USGS) and used as input for a collection of Spatially Referenced Regressions On Watershed attributes for the Chesapeake Bay region. These regressions relate streamwater loads to nutrient sources and the factors that affect the transport of these nutrients throughout the watershed. A digital segmented network based on watershed boundaries serves as the primary foundation for spatially referencing total nitrogen and total phosphorus source and land-surface characteristic data sets within a Geographic Information System. Digital data sets of atmospheric wet deposition of nitrate, point-source discharge locations, land cover, and agricultural sources such as fertilizer and manure were created and compiled from numerous sources and represent nitrogen and phosphorus inputs. Some land-surface characteristics representing factors that affect the transport of nutrients include land use, land cover, average annual precipitation and temperature, slope, and soil permeability. Nutrient input and land-surface characteristic data sets merged with the segmented watershed network provide the spatial detail by watershed segment required by the models. Nutrient stream loads were estimated for total nitrogen, total phosphorus, nitrate/nitrite, ammonium, phosphate, and total suspended solids at as many as 109 sites within the Chesapeake Bay watershed. The total nitrogen and total phosphorus load estimates are the dependent variables for the regressions and were used for model calibration. Other nutrient-load estimates may be used for calibration in future applications of the models.
Introduction
The U.S. Geological Survey's (USGS) Chesapeake Bay Ecosystem Program began in May 1996 and is part of the overall USGS scientific effort to develop an understanding of, and provide scientific information for, the restoration of the Chesapeake Bay and its watershed. The objectives of the program are to: (1) determine the response of water quality and selected living resources of the bay watershed and estuary to changes in nutrient inputs and climatic variability; (2) better define and evaluate the natural and anthropogenic controls on water quality and living-resource response; and (3) provide resource managers with the management implications of the scientific findings so they may evaluate the effectiveness of various nutrient-reduction strategies (Phillips and Caughron, 1997).
The Chesapeake Bay Program (CBP), a multi-agency taskforce charged with coordinating efforts to improve water-quality conditions in the bay, developed restoration goals and tributary strategies from management actions designed to reduce nutrient inputs entering the bay from its watershed. The USGS, an active member of the CBP, is cooperating with other agencies to provide preliminary water-quality and living-resource information that is being used to evaluate the success of the Nutrient-Reduction Goals and revision of the tributary strategies set forth by the CBP (Phillips and Caughron, 1997).
Currently, the CBP is using a hydrologic and water-quality watershed model developed on the basis of a Hydrologic Simulation Program - Fortran (HSPF) modeling framework (Donigian and others, 1994). This model is used to estimate nutrient loads and evaluate land-use changes and the potential benefits of best management practices throughout the watershed.
In support of the CBP's modeling efforts and their need to target areas in the Chesapeake Bay watershed for nutrient-reduction actions, the USGS has initiated the development of a set of spatially referenced regression models. Using finer resolution nutrient inputs and water-quality loads, these models provide a spatial statistical approach to relating water quality to nutrient sources and the land-surface characteristics that affect the transport of these nutrients throughout the watershed (Preston and others, 1998). SPAtially Referenced Regressions On Watershed attributes (SPARROW) is the method used for developing the regression models (Smith and others, 1997), and is currently being applied in the Chesapeake Bay watershed.
Purpose and Scope
This report describes digital spatial data sets used by the USGS in the initial applications of SPARROW models relating total nitrogen and total phosphorus inputs to water quality in the Chesapeake Bay watershed. Revised and updated digital spatial data sets will be created and distributed by the USGS as planned enhancements and applications for the SPARROW models are completed.
Metadata were produced for the spatial data sets used with the SPARROW models that are being distributed by the USGS. The metadata describe in detail the data sources, applications, methods, and procedures used to create the data sets, in addition to distribution information. Spatial data sets discussed in this report that are not distributed by the USGS can be obtained from the agencies referenced in this report.
Acknowledgments
The authors thank the Chesapeake Bay Program Office for providing necessary data sets. We would also like to thank Richard Smith, Gregory Schwarz, and Richard Alexander of the U.S. Geological Survey, developers of the SPARROW approach, for providing data sets and technical assistance. The reviewers of this report are also acknowledged for their contributions.
Data Sets
Numerous digital data sets were created within the Chesapeake Bay watershed and surrounding area (figure 1) from various sources using Environmental Systems Research Institute's (ESRI) Arc/Info Geographic Information System (GIS) software. These data sets include a segmented watershed network, nitrogen and phosphorus input sources, water-quality data, and land-surface characteristics. All data sets reside in the Albers Equal-Area Conic projection with a central meridian of ninety-six degrees in the North American Datum (NAD) of 1983 (Snyder, 1987). Data sets mentioned in this report are being distributed for general use by the USGS, except where noted.
Figure 1. The Chesapeake Bay watershed and the surrounding area.
Water-Quality Data
Nutrient stream-loading estimates were derived from water-quality and flow data collected by State and Federal agencies (Langland and others, 1995) using the methods described in Smith and others (1997). Load-estimate regressions were fit on the basis of actual flow and concentration data. Loads were then estimated using those regressions and an average discharge time series (average daily flow values for the period 1950 to 1995), specifying the year 1987 for the trend component. Because average discharge values were used as input for the regressions, load estimates produced by this method may differ from those previously published. Nutrient stream loads were estimated for total nitrogen, total phosphorus, nitrate/nitrite, ammonium, phosphate, and total suspended solids at as many as 109 sites within the Chesapeake Bay watershed (figure 2). The nitrogen (79 sites) and phosphorus (94 sites) load estimates were used as the dependent variables in the calibration of the SPARROW models. Other nutrient-load estimates may be used for calibration in future applications of the SPARROW models.
Figure 2. Streamflow data-collection sites used in SPARROW applications in the Chesapeake Bay watershed.
Latitude and longitude coordinates of the streamflow sampling sites used for nutrient stream-load estimates were obtained from the USGS National Water Information System (NWIS) and used to generate a point-location data set. Attributes for this point data set consist of a station identification number for the streamflow sampling sites and their associated water-quality sites, and 1987 load estimates for total nitrogen and total phosphorus.
Segmented-Watershed Network
A digital segmented-watershed network was created using approximately 1,300 stream reaches modified from 1:500,000 scale [River Reach File 1 (RF1), DeWald and others, 1985] and a flow-direction grid (Verdin, 1997) derived from a 1-kilometer grid cell DEM (Verdin and Greenlee, 1996) (figure 3). The modified stream-reach data contain a unique identification number (erf1##) as well as estimates of mean streamflow and mean water velocity (R.B. Alexander and R.E. Brew, written commun., 1996). Other attributes contained in the data set and used as input for the SPARROW models include reservoir time of travel (days), reach time of travel (days), and stream length.
Figure 3. A segmented-watershed network used in the Chesapeake Bay watershed.
Stream reaches were split into two separate reach segments at the location of the streamflow collection sites. A new identification number (erf1##) was calculated for each split reach upstream and downstream of the streamflow collection sites. Stream reaches representing shorelines of major estuaries within the Chesapeake Bay watershed were also split at arbitrary locations along the shoreline and attributed with a unique identification number. Reaches representing shorelines do not include streamflow, velocity estimates, or any other attributes mentioned above, and were not used in model applications.
A 1-kilometer cell-size raster grid of the stream reaches was generated and used with the flow-direction raster grid to generate a 1-kilometer cell-size raster grid of watershed boundaries for each stream reach (Environmental Systems Research Institute, 1992). The watershed boundaries were attributed with the unique identification number (erf1##) found in the stream-reach data set and converted to a polygon-vector format. The watershed boundaries could then be analyzed within a grid cell (raster) or polygon (vector) environment, providing the basis for SPARROW's segmented-watershed network.
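The raster step described above can be illustrated with a short sketch. It is not the Arc/Info GRID procedure used for the report. It assumes that flow directions are stored as D8 codes in the ESRI convention (1 = east, 2 = southeast, 4 = south, and so on) and that the reach grid holds the reach identification number in stream cells and 0 elsewhere. The function name and the tiny example grids are hypothetical.

```python
# D8 flow-direction codes (ESRI convention): code -> (row step, column step)
D8_STEPS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
            16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}


def delineate(flow_dir, reach_ids):
    """Give every cell the ID of the first reach cell it drains to (0 = none)."""
    n_rows, n_cols = len(flow_dir), len(flow_dir[0])
    watershed = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            rr, cc = r, c
            visited = set()  # guards against loops in a bad flow-direction grid
            while (0 <= rr < n_rows and 0 <= cc < n_cols
                   and (rr, cc) not in visited):
                if reach_ids[rr][cc] != 0:
                    watershed[r][c] = reach_ids[rr][cc]
                    break
                visited.add((rr, cc))
                step = D8_STEPS.get(flow_dir[rr][cc])
                if step is None:  # sink or undefined direction
                    break
                rr, cc = rr + step[0], cc + step[1]
    return watershed


# Hypothetical 3 x 3 example: a stream runs down the middle column,
# the left column drains east and the right column drains west.
flow_dir = [[1, 4, 16],
            [1, 4, 16],
            [1, 4, 16]]
reach_ids = [[0, 101, 0],
             [0, 101, 0],
             [0, 102, 0]]
watershed = delineate(flow_dir, reach_ids)
```

In this example the top two rows are assigned to reach 101 and the bottom row to reach 102, which mirrors how each stream reach receives its own watershed segment.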
Nutrient-Input Sources
Digital data sets of atmospheric wet deposition of nitrate, point-source locations of nutrient discharges, land cover, and agricultural sources such as fertilizer and manure, were created and compiled from numerous sources to represent nutrient-input sources to the Chesapeake Bay watershed. This section describes the sources and processes used to compile and create the data sets.
Atmospheric Deposition
Linear spatial interpolation of 188 National Atmospheric Deposition Program (NADP) point measurements within the United States provided 1987 mean atmospheric wet-deposition estimates for nitrate (Smith and others, 1997; National Atmospheric Deposition Program, 1988). The latitude and longitude coordinates of the monitoring stations were used to create a spatial data set of the sampling sites attributed with the 1987 mean deposition value. The point data set was then converted into a Triangulated Irregular Network (TIN) (Environmental Systems Research Institute, 1992) to interpolate data values between the sampling locations (Smith and others, 1997). The area within the Chesapeake Bay region was extracted and converted into a 1-kilometer cell-size grid (figure 4). Using Arc/Info's GRID module (Environmental Systems Research Institute, 1992), the atmospheric-deposition grid was then used with the watershed grid to calculate mean wet deposition of nitrate for each watershed segment.
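Linear interpolation within a TIN amounts to a barycentric weighting of the three station values at the corners of the triangle that contains a given location. The sketch below shows that calculation for a single triangle. It is an illustration only, not the Arc/Info TIN implementation, and the coordinates and deposition values are hypothetical.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate a value at p = (x, y) inside triangle a, b, c.

    Each vertex is given as (x, y, value).
    """
    x, y = p
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights of p with respect to the three vertices
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * v1 + w2 * v2 + w3 * v3


# Hypothetical stations: (x in km, y in km, deposition value)
station_a = (0.0, 0.0, 1.0)
station_b = (10.0, 0.0, 2.0)
station_c = (0.0, 10.0, 3.0)
midpoint_value = interpolate_in_triangle((5.0, 0.0), station_a, station_b, station_c)
```

A point halfway between the first two stations receives the average of their values, and a point at a station receives that station's value.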
Figure 4. Atmospheric wet deposition of nitrate in the Chesapeake Bay region.
Point Sources
Locations of point-source discharges (figure 5), average annual flow, average annual concentrations, and loads of nitrogen and phosphorus for sites within the Chesapeake Bay watershed were obtained from the CBP office for the years 1984 through 1995 (Wiedeman and Cosgrove, 1997). The data are based on information from the Environmental Protection Agency's (EPA) Permit Compliance System (PCS) State National Pollutant Discharge Elimination System (NPDES) discharge monitoring reports, with modifications from individual State agencies responsible for monitoring point-source discharges. The original point-source data compiled by the CBP reside as monthly-discharge data, by facility, in pounds per year (lb/yr). Details on the calculation of annual loads using concentration and flow data by discharging facility can be found in Appendix F of the EPA report Chesapeake Bay Watershed Model Application and Calculation of Nutrient and Sediment Loadings (Wiedeman and Cosgrove, 1997).
Figure 5. Point-source discharge locations in the Chesapeake Bay watershed.
Latitude and longitude coordinates of discharging facilities were used to generate a point data set attributed with facility, facility type, NPDES number, and nitrogen and phosphorus loads. The loads were averaged by facility and contaminant for 1986, 1987, and 1988 and then converted to kilograms per year (kg/yr). SPARROW watershed segment numbers (erf1##) were assigned to the locations by merging the polygon data set representing SPARROW watershed segments with the point-source location data set, except where watersheds were generated from reaches representing shorelines of major estuaries. Loads of nitrogen and phosphorus were then summed for each contaminant within each watershed segment.
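The averaging, unit conversion, and summation described above can be sketched as follows. The record layout, segment numbers, and load values are hypothetical. Only the pound-to-kilogram factor is a fixed constant.

```python
LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)


def segment_loads(facilities):
    """Average each facility's annual loads, convert to kg/yr, sum by segment."""
    totals = {}
    for facility in facilities:
        loads = facility["loads_lb_yr"]  # for example 1986, 1987, and 1988
        mean_kg = sum(loads) / len(loads) * LB_TO_KG
        segment = facility["segment"]
        totals[segment] = totals.get(segment, 0.0) + mean_kg
    return totals


# Hypothetical total nitrogen loads for three facilities in two segments
facilities = [
    {"segment": 10, "loads_lb_yr": [1000.0, 1100.0, 1200.0]},
    {"segment": 10, "loads_lb_yr": [200.0, 200.0, 200.0]},
    {"segment": 20, "loads_lb_yr": [300.0, 300.0, 300.0]},
]
nitrogen_by_segment = segment_loads(facilities)
```

The same function would be run once per contaminant, so that nitrogen and phosphorus totals are kept separate for each watershed segment.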
Land Cover
The spatial land-cover data set used in the SPARROW applications was provided by the CBP. It was derived from a combination of land-cover data sets from various years including the Environmental Monitoring and Assessment Program (EMAP), National Oceanic and Atmospheric Administration (NOAA) Coastal Change Assessment Program (C-CAP), and USGS Geographic Information Retrieval and Analysis System (GIRAS) (Gutierrez-Magness and others, 1997).
The original data set (see footnote 1) was a 25-meter cell-size raster grid, with land-cover classifications of high intensity urban, low intensity urban, woody urban, herbaceous urban, herbaceous, woody, herbaceous wetland, and exposed land. A separate grid for each land-cover classification was created and re-sampled to a 1-kilometer cell size. Similar land-cover areas were combined and acres of agriculture (herbaceous), forest (woody), urban (high intensity, low intensity, woody, and herbaceous urban), and wetlands (herbaceous wetland) within each SPARROW watershed segment were used as model inputs for land-surface characteristics (figure 6). It was assumed for the SPARROW applications that the herbaceous category within the data set represents agricultural land, and the woody category represents forest.
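The grouping of land-cover classes and the tally of acres by watershed segment can be sketched as shown below. This is a simplification that assumes each 1-kilometer cell carries a single land-cover class, which may not match how the re-sampled grids were actually combined. The segment numbers and cell list are hypothetical, and exposed land is left out of the tally because the text does not assign it to a group.

```python
from collections import defaultdict

ACRES_PER_KM2 = 247.105  # approximate acres in one square kilometer

# Grouping of the original classes, following the text
GROUPS = {
    "high intensity urban": "urban",
    "low intensity urban": "urban",
    "woody urban": "urban",
    "herbaceous urban": "urban",
    "herbaceous": "agriculture",
    "woody": "forest",
    "herbaceous wetland": "wetland",
}


def acres_by_segment(cells):
    """cells: iterable of (segment_id, land_cover_class) for 1-km cells."""
    acres = defaultdict(lambda: defaultdict(float))
    for segment, land_cover in cells:
        group = GROUPS.get(land_cover)
        if group is not None:  # classes without a group are skipped
            acres[segment][group] += ACRES_PER_KM2
    return {segment: dict(groups) for segment, groups in acres.items()}


# Hypothetical cells in two watershed segments
cells = [(1, "herbaceous"), (1, "herbaceous"), (1, "woody urban"),
         (2, "woody"), (2, "exposed land")]
land_cover_acres = acres_by_segment(cells)
```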
Figure 6. Modified land cover in the Chesapeake Bay watershed.
Acres of 1985 conventional-till, conservation-till, and hay land uses were calculated by the CBP within the Chesapeake Bay watershed for each county and CBP watershed model segment (CBPWS) (Donigian and others, 1994) using Crop Tillage and county Agricultural Census data bases (Gutierrez-Magness and others, 1997). To spatially distribute the acres of land use within each SPARROW watershed segment, it was assumed that conventional-till, conservation-till, and hay land uses were equally distributed throughout the herbaceous (agricultural) classification of the land-cover data set. A weighting factor for each SPARROW watershed segment was calculated by dividing the acres of agricultural land (herbaceous) that lie within a county, within a CBPWS, and within the SPARROW watershed segment by the total acres of agricultural land within that county and CBPWS. The resulting weighting factor was used with fertilizer application rates described in the agricultural sources section of this report (below) to calculate loading estimates of nitrogen and phosphorus from agricultural sources for each SPARROW watershed segment.
1 A copy of the original land-cover data set can be obtained from the Chesapeake Bay Program Office, 410 Severn Avenue, Suite 110, Annapolis, Md. 21403, (800)-968-7229.
Agricultural Sources
Manure and commercial fertilizer application rates by conventional-till, conservation-till, and hay land uses for total nitrogen and total phosphorus by CBPWS were obtained from the CBP (Donigian and others, 1994). A weighting factor was calculated by dividing the acres of agricultural land (herbaceous) that lie within a county, within a CBPWS, and within a SPARROW watershed segment by the total acres of agricultural land within that county and CBPWS. A load for each SPARROW watershed segment was calculated by multiplying the application rate for each land use by the weighting factor and by the total acres of conventional-till, conservation-till, and hay land uses within each watershed segment. The load values for each application (manure and commercial fertilizer) by land use were then summed to calculate load estimates of total nitrogen and total phosphorus by manure (figure 7) and fertilizer (figure 8) for each watershed segment. These load estimates are the final attributes contained in the data set.
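One reading of the weighting calculation can be sketched as follows. It assumes that the weighting factor is the share of a county and CBPWS unit's agricultural acres that falls inside a SPARROW watershed segment, that land-use acres are reported for the county and CBPWS unit, and that application rates are expressed per acre per year. The field names, rates, and acreages are hypothetical and are not taken from the data sets.

```python
def weighting_factor(ag_acres_in_segment, ag_acres_in_unit):
    """Share of a county/CBPWS unit's agricultural acres inside one segment."""
    if ag_acres_in_unit <= 0:
        return 0.0
    return ag_acres_in_segment / ag_acres_in_unit


def segment_load(pieces):
    """Sum rate * weight * acres over land uses and overlapping units.

    pieces: one dict per county/CBPWS unit that overlaps the segment.
    """
    total = 0.0
    for piece in pieces:
        weight = weighting_factor(piece["ag_acres_in_segment"],
                                  piece["ag_acres_in_unit"])
        for land_use, acres in piece["land_use_acres"].items():
            total += piece["rates"][land_use] * weight * acres
    return total


# Hypothetical unit: one quarter of its agricultural land lies in the segment
pieces = [{
    "ag_acres_in_segment": 250.0,
    "ag_acres_in_unit": 1000.0,
    "land_use_acres": {"conventional_till": 400.0,
                       "conservation_till": 200.0,
                       "hay": 400.0},
    "rates": {"conventional_till": 50.0,   # assumed kg per acre per year
              "conservation_till": 40.0,
              "hay": 10.0},
}]
fertilizer_nitrogen = segment_load(pieces)
```

The calculation would be repeated for each application type (manure and commercial fertilizer) and each nutrient to produce the separate load attributes.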
Figure 7. Load estimates of nitrogen and phosphorus from manure in the Chesapeake Bay watershed.
Figure 8. Load estimates of nitrogen and phosphorus from commercial fertilizer in the Chesapeake Bay watershed.
Land-Surface Characteristics
Land-surface characteristics represent potential factors that affect the transport of nutrients and were used in SPARROW model calibrations. The sources and processes used to create and compile data sets of average annual precipitation and temperature, slope, and soil permeability are described in this section.
Precipitation
Total monthly precipitation data and point locations for 1,695 sites within the Chesapeake Bay region were obtained from the National Climatic Data Center for 1950-94. Average precipitation was calculated for each site and a point data set was created from the latitude and longitude coordinates provided. A TIN of the point data set was then created to interpolate between the points within the Chesapeake Bay region (Environmental Systems Research Institute, 1992). The TIN was converted to a 1-kilometer cell-size grid (figure 9) and used with the watershed grid to calculate an average precipitation value for each SPARROW watershed segment.
Figure 9. Average annual precipitation in the Chesapeake Bay region.
Temperature
Monthly temperature data and point locations for 149 sites were obtained from the U.S. Historical Climatology Network (HCN) for 1950-94. Average annual temperature was calculated for each site and a point data set was created from the latitude and longitude coordinates provided. A TIN of the point data set was then created to interpolate between the points within the Chesapeake Bay region (Environmental Systems Research Institute, 1992). The TIN was converted to a 1-kilometer cell-size grid (figure 10) and used with the watershed grid to calculate an average temperature value for each SPARROW watershed segment.
Figure 10. Average annual temperature in the Chesapeake Bay region.
Slope
Slope for the Chesapeake Bay watershed was calculated from a 1-kilometer grid cell DEM (Verdin, 1997) generated by the USGS. The area containing the Chesapeake Bay watershed was extracted, and the slope function in Arc/Info's GRID module (Environmental Systems Research Institute, 1992) was used to create a 1-kilometer cell-size grid of percent slope, with cell values ranging from 0 to 125 percent (figure 11). An average percent slope for each watershed was then calculated by using the zonal functions of GRID (zonalmean) (Environmental Systems Research Institute, 1992) and the SPARROW watershed segment grid.
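A zonal mean of this kind averages the cells of a value grid that share the same zone number in a second grid. The sketch below illustrates the idea in plain Python rather than reproducing the GRID zonalmean function. The grids are hypothetical, and None is used here to mark cells with no data.

```python
def zonal_mean(zone_grid, value_grid):
    """Average value_grid cells within each zone of zone_grid."""
    sums, counts = {}, {}
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            if zone is None or value is None:  # skip cells with no data
                continue
            sums[zone] = sums.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical watershed-segment grid and percent-slope grid
segments = [[1, 1, 2],
            [1, 2, 2]]
slope_percent = [[2.0, 4.0, 10.0],
                 [6.0, 20.0, None]]
mean_slope = zonal_mean(segments, slope_percent)
```

The same pattern applies to the other gridded characteristics in this report, such as deposition, precipitation, temperature, and soil permeability, each averaged over the watershed segment grid.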
Figure 11. Slope shown as percent in the Chesapeake Bay region.
Soil Permeability
Soil data originating from the State Soil Geographic Data Base (STATSGO) (Schwarz and Alexander, 1995) were converted into a raster grid format using Arc/Info's GRID module. The grid was attributed with a numeric value representing the permeability of the soil in inches per hour (in/hr). For each soil layer, the high and low permeability measurements contained in the original STATSGO data set (U.S. Soil Conservation Service, 1994) were combined as a simple average, and the layer averages were then combined as a layer-thickness weighted average across soil layers. The Chesapeake Bay watershed area was extracted (figure 12), and an average permeability was calculated for each SPARROW watershed segment.
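The permeability value described above can be written as a short calculation: a simple average of the high and low measurements for each layer, followed by a thickness-weighted average across layers. The layer thicknesses and permeability values below are hypothetical and are not taken from STATSGO.

```python
def soil_permeability(layers):
    """Thickness-weighted mean of each layer's (low + high) / 2 permeability.

    layers: list of (thickness_in, perm_low_in_hr, perm_high_in_hr).
    """
    total_thickness = sum(thickness for thickness, _, _ in layers)
    if total_thickness <= 0:
        raise ValueError("total layer thickness must be positive")
    weighted = sum(thickness * (low + high) / 2.0
                   for thickness, low, high in layers)
    return weighted / total_thickness


# Hypothetical two-layer soil: a thin permeable layer over a thicker, tighter one
layers = [(10.0, 0.6, 2.0),
          (30.0, 0.2, 0.6)]
permeability_in_hr = soil_permeability(layers)
```

In this example the layer averages are 1.3 and 0.4 in/hr, and the thicker lower layer pulls the weighted result toward its value.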
Figure 12. Soil permeability in the Chesapeake Bay watershed.
Summary
Approximately 1,300 stream reaches were used to generate a digital segmented network based on watershed boundaries generated from 1:500,000-scale stream data and a 1-kilometer grid-cell Digital Elevation Model. The segmented network provides the primary foundation for spatially referencing total nitrogen and total phosphorus source and delivery-factor data sets within a Geographic Information System. Nutrient source and land-surface characteristic data sets were compiled from various sources, and are being distributed for general use. Data sets not distributed by the U.S. Geological Survey can be obtained from other agencies referenced in this report. Revised and updated digital spatial data sets will be created and distributed as planned enhancements and applications for the Spatially Referenced Regressions On Watershed models in the Chesapeake Bay region are completed.
Selected References
Alexander, R.B., Brakebill, J.W., Brew, R.E., and Smith, R.A., 1999, Enhanced River Reach File 1.2 (ERF1): U.S. Geological Survey Open-File Report 99-457 (scale 1:500,000), accessed January 10, 1999, at URL http://water.usgs.gov/GIS/metadata/usgswrd/erf1.html
DeWald, T., Horn, R., Greenspun, R., Taylor, P., Manning, L., and Montalbano, A., 1985, STORET Reach Retrieval Documentation: U.S. Environmental Protection Agency, Washington, D.C.
Donigian, A.S., Bicknell, B.R., Patwardhan, A.S., Linker, L.C., and Chang, C., 1994, Chesapeake Bay Program Watershed Model Application to Calculate Bay Nutrient Loadings-Final Facts and Recommendations: Report # EPA 903-R-94-042, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 283 p.
Environmental Systems Research Institute, Inc., 1992, Arc/Info Users Guide, Cell-based Modeling with Grid, 2nd Edition: Redlands, California, 267 p.
____1992, Arc/Info Users Guide, Surface Modeling with Tin, 2nd Edition: Redlands, California, 192 p.
Gutierrez-Magness, A.L., Hannawald, J.E., Linker, L.L., and Hopkins, K.J., 1997, Chesapeake Bay Watershed Model Application and Calculation of Nutrient and Sediment Loadings, Appendix E: Report # EPA 903-R-97-019, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 142 p.
Langland, M.J., Lietman, P.L., and Hoffman, S.A., 1995, Synthesis of Nutrient and Sediment Data for Watersheds Within the Chesapeake Bay Drainage Basin: U.S. Geological Survey Water-Resources Investigations Report 95-4233, 121 p.
National Atmospheric Deposition Program, 1988, NADP/NTN Annual data summary: Precipitation chemistry in the United States 1987: National Resource Ecology Laboratory, Colorado State University, Fort Collins.
National Climatic Data Center, 1997, Monthly Precipitation Data for U.S. Cooperative and National Weather Service Sites, accessed January 12, 1998, at URL http://www.ncdc.noaa.gov/ol/ncdc.html
Phillips, S.W., and Caughron, W.R., 1997, Overview of the U.S. Geological Survey Chesapeake Bay Ecosystem Program: U.S. Geological Survey Fact Sheet FS-124-97, 4 p.
Preston, S.D., and Brakebill, J.W., 1999, Applications of Spatially Referenced Regression Modeling for the Evaluation of Total Nitrogen Loading in the Chesapeake Bay Watershed: U.S. Geological Survey Water-Resources Investigations Report 99-4054, 12 p. Accessed at URL http://md.usgs.gov/publications/wrir-99-4054/report.html
Preston, S.D., Smith, R.A., Schwarz, G.E., Alexander, R.B., and Brakebill, J.W., 1998, Spatially Referenced Regression Modeling of Nutrient Loading in the Chesapeake Bay: Proceedings of the First Federal Interagency Hydrologic Conference: Las Vegas, Nevada, (April 19-23), 1998, 8 p.
Smith, R.A., Schwarz, G.E., and Alexander, R.B., 1997, Regional Interpretation of Water-Quality Monitoring Data: Water Resources Research, 33 (12). Accessed at URL http://water.usgs.gov/nawqa/sparrow/wrr97/results.html
Schwarz, G.E., and Alexander, R.B., 1995, State Soil Geographic (STATSGO) Data Base for the Conterminous United States version 1.1: U.S. Geological Survey Open-File Report 95-449 (scale 1:250,000), accessed February 10, 1998, at URL http://water.usgs.gov/nsdi/usgswrd/ussoils.html
Snyder, J. P., 1987, Map Projections-A Working Manual: U.S. Geological Survey Professional Paper 1395, 383 p.
U.S. Historical Climatology Network (HCN) Serial Temperature Data, accessed January 12, 1998, at URL http://www.ncdc.noaa.gov/ol/climate/research/ushcn/ushcn.html
U.S. Soil Conservation Service, 1994, State Soil Geographic (STATSGO) Data Base: National Soil Survey Center, Publication number 1492, 88 p.
Verdin, K.L., 1997, HYDRO 1K Elevation Derivative Database: U.S. Geological Survey, accessed January 12, 1998, at URL http://edcwww.cr.usgs.gov/landdaac/gtopo30/hydro/namerica.html
Verdin, K.L., and Greenlee, S.K., 1996, Development of Continental Scale Digital Elevation Models and Extraction of Hydrographic Features: Proceedings, Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe, New Mexico, (January 21-26), 1996. National Center for Geographic Information and Analysis, Santa Barbara, California: accessed November 3, 1997, at URL http://edcwww.cr.usgs.gov/landdaac/gtopo30/gtopo30.html
Wiedeman, Allison, and Cosgrove, Amy, 1997, Chesapeake Bay Watershed Model Application and Calculation of Nutrient and Sediment Loadings, Appendix F: Report # EPA 903-, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 50 p.
For additional information contact:
District Chief
U.S. Geological Survey, WRD
8987 Yellow Brick Road
Baltimore, Md. 21237


