Application of Spatially Referenced Regression Modeling for the Evaluation of Total Nitrogen Loading in the Chesapeake Bay Watershed
by Stephen D. Preston and John W. Brakebill
The reduction of stream nutrient loads is an important part of current efforts to improve water quality in the Chesapeake Bay. To design programs that will effectively reduce stream nutrient loading, resource managers need spatially detailed information that describes the location of nutrient sources and the watershed factors that affect delivery of nutrients to the Bay. To address this need, the U.S. Geological Survey has developed a set of spatially referenced regression models for the evaluation of nutrient loading in the watershed. The technique applied for this purpose is referred to as "SPARROW" (SPAtially Referenced Regressions On Watershed attributes), which is a statistical modeling approach that retains spatial referencing for illustrating predictions, and for relating upstream nutrient sources to downstream nutrient loads. SPARROW is based on a digital stream-network data set that is composed of stream segments (reaches) that are attributed with traveltime and connectivity information. Drainage-basin boundaries are defined for each stream reach in the network data set through the use of a digital elevation model. For the Chesapeake Bay watershed, the spatial network was developed using the U.S. Environmental Protection Agency's River Reach File 1 digital stream network, and is composed of 1,408 stream reaches and watershed segments.
To develop a SPARROW model for total nitrogen in the Chesapeake Bay watershed, data sets for sources and basin characteristics were incorporated into the spatial network and related to stream-loading information by using a nonlinear regression model approach. Total nitrogen source variables that were statistically significant in the model include point sources, urban area, fertilizer application, manure generation and atmospheric deposition. Total nitrogen loss variables that were significant in the model include soil permeability and instream-loss rates for four stream-reach classes. Applications of SPARROW for evaluating total nitrogen loading in the Chesapeake Bay watershed include the illustration of the spatial distributions of total nitrogen yields and of the potential for delivery of those yields to the Bay. This information is being used by the Chesapeake Bay Program to target nutrient-reduction areas (Priority Nutrient Reduction Areas) and to design nutrient-load reduction plans that are specific to each tributary (Tributary Strategies).
The Chesapeake Bay has been adversely affected by excessive nutrient loading from tributaries that drain the watershed. Excessive nutrient loading has resulted in eutrophication of the Bay and in related ecological shifts that have adversely affected water quality and important aquatic species. Specific adverse impacts have included depression of dissolved oxygen levels, which affect benthic organisms, and loss of submerged aquatic vegetation, which provides fish habitat. Excessive nutrient loading results from human activities in the watershed, and efforts are currently underway to identify these activities and mitigate their effects.
The Chesapeake Bay Program (CBP) is a multi-agency effort that was established to help restore the water quality and ecological integrity of the Bay. The CBP was established in 1983 through an agreement between the U.S. Environmental Protection Agency (USEPA), the District of Columbia, and the States of Pennsylvania, Maryland, and Virginia. Other Federal Agencies such as the U.S. Geological Survey (USGS) support the CBP by collecting, compiling, and interpreting water-quality and related data. In the 1987 Chesapeake Bay agreement, the CBP established a goal to improve dissolved oxygen in the Bay by reducing 1985 controllable nutrient loads by 40 percent by the year 2000. As part of the effort to reach that goal, the CBP has been conducting studies to define the sources of nutrient loading, the processes that affect delivery of nutrients to the Bay, and appropriate management actions to limit nutrient loading.
Through the efforts of the CBP and its partner agencies, various types of data have been collected that document the extent of nutrient loading and the environmental factors that affect it. Stream-discharge and nutrient-concentration data have been collected at many locations throughout the watershed. Data sets that document major sources of nutrients have also been developed, including those that quantify point-source discharges, agricultural-fertilizer application, manure generation and atmospheric deposition. In addition, data sets that describe the geographic characteristics of the watershed that may affect loading, such as land cover, physiography, and soil characteristics, have been compiled. All of this information is important for understanding nutrient loading and accomplishing the goals of the CBP. The environmental factors that affect loading are interrelated; thus the different types of data that describe these factors must be integrated to understand the role of any individual factor.
The CBP is integrating nutrient-source, delivery, and load data through the implementation of a deterministic watershed model that describes water movement and nutrient transport throughout the watershed. The CBP watershed model is based on software referred to as the Hydrologic Simulation Program - FORTRAN (HSPF) (Donigian and others, 1994). The CBP HSPF model is a temporally and spatially variable model that allows the simulation of nutrient loading on the basis of information collected in the watershed. The CBP is using the model to simulate nutrient loading under various land-cover and land-management scenarios to evaluate the effects of environmental factors in the watershed, and to design nutrient-management strategies. The CBP HSPF model is spatially defined by 87 watershed segments that are on average more than 700 square miles in area. Calibration is performed for 14 of the 87 watershed segments on the basis of loading information collected at the downstream end of the segments (Donigian and others, 1994).
To supplement the CBP watershed model, the USGS has developed a set of spatially referenced regression models. These models can be used to provide a statistical basis to watershed modeling and to provide additional spatial detail on nutrient sources and transport processes. The method used for developing the regressions is referred to as "SPARROW" (SPAtially Referenced Regressions On Watershed attributes) (Smith and others, 1997). The SPARROW methodology has been successfully applied at the national scale for estimating total nitrogen and total phosphorus loads for streams in the continental United States (Smith and others, 1997). This report describes a separate application of SPARROW at the scale of the Chesapeake Bay watershed (fig. 1).
The application of SPARROW for watershed assessment is a new approach that offers three major features when compared to other techniques. First, the statistical basis of SPARROW provides an objective means of identifying relations between stream-water quality and environmental factors such as contaminant sources in the watershed, and land-surface characteristics that affect contaminant delivery to streams. Second, SPARROW's spatially detailed network and traveltime data provide a means of estimating instream-loss rates. These loss rates allow upstream watershed factors to be related to downstream loads in a more integrated way than has been possible previously, and allow the simultaneous evaluation of many factors that affect loads. Third, SPARROW provides a means of retaining detailed spatial information about all environmental factors considered in the regression model. Because the regression models are linked to spatial information, predictions and subsequent analytical results can be illustrated through detailed maps that provide information about nutrient loading at detailed spatial scales.
This report describes the basic methodology of SPARROW and its application in the Chesapeake Bay watershed. As an example, the total nitrogen SPARROW model is presented for the year 1987. As part of the development of that model, digital data sets that describe nutrient sources and geographic characteristics of the Bay watershed were compiled (Brakebill and Preston, 1999). In addition, a stream-load data set that includes total nitrogen load estimates for 79 stream locations was also compiled (Brakebill and Preston, 1999; Langland and others, 1995). All of these data sets have been placed within a spatial framework that is defined by a digital stream network and related basin segmentation. SPARROW allows the source and basin characteristic information to be related to stream loads, while maintaining all referencing within the spatial framework. The intent of this report is to provide a basic understanding of SPARROW and its capabilities.
Work described in this report was funded by the Integrated Natural Resource Sciences Program (formerly the Ecosystem Initiative Program) of the USGS and is part of a broader effort by the USGS to contribute to and support the work of the CBP and others to understand and restore the water quality and ecological integrity of the Chesapeake Bay.
The Chesapeake Bay watershed extends over 64,000 square miles and covers parts of New York, Pennsylvania, Maryland, Delaware, West Virginia, and Virginia (fig. 1). Major urban areas within the watershed include Scranton, Pa., Harrisburg, Pa., Baltimore, Md., Washington, D.C., Richmond, Va., and Norfolk, Va. Major tributaries that drain the watershed include: (1) the Susquehanna River, which drains much of Pennsylvania and New York; (2) the Potomac River, which drains much of Maryland, Virginia, and West Virginia; and (3) the James River, which drains much of southern Virginia. Numerous smaller tributaries drain the coastal margins of the Bay and the Eastern Shore of Maryland and Virginia.
On the basis of data from 1987, most of the area in the Chesapeake Bay watershed can be classified in three major land-cover categories: urban area (10 percent), agricultural area (29 percent) and forest area (60 percent) (fig. 2) (Gutierrez-Magness and others, 1997). Urban area is concentrated around the major urban centers, and to a lesser degree, around numerous smaller cities that are located throughout the watershed. Agricultural area is located primarily in valley areas of the western and central part of the watershed, and on the Eastern Shore of Maryland and Virginia. The remainder of the watershed is dominated by forest area.
Areas of similar physiographic and geologic characteristics in the Chesapeake Bay watershed have been grouped into units called hydrogeomorphic regions (HGMR's) (fig. 3) that describe the major features of the Bay watershed (Bachman and others, 1998). These features are important to nutrient loading because the hydrology and environmental chemistry of each unit vary with the characteristics of that unit. For example, the permeable soils and flat terrain of the Coastal Plain cause a substantial amount of precipitation to percolate through the soils and flow through groundwater aquifers to streams or directly to the Bay. In contrast, less permeable soils in the Mesozoic Lowland cause more water to flow overland to streams, which in turn causes stream runoff to occur more rapidly in response to precipitation. These features can have an important effect on nutrient loading by limiting or delaying the amount of nutrients that are delivered from the land surface to streams.
Methodology for Spatially Referenced Regression Modeling
SPARROW is a statistical model that relates stream-nutrient loads to upstream sources and land-surface characteristics. Spatial referencing is accomplished by linking nutrient-source, land-surface characteristic, and loading information to a geographically defined stream-reach data set that serves as a network for relating upstream and downstream loads. Nutrient inputs to each stream reach include loads from upstream and loads from individual sources in the part of the basin that drains directly to the reach. Land-surface characteristics that affect delivery of nutrients to the stream reach are included by quantifying the relative amount of each characteristic for each reach drainage. For example, the average land-surface slope is calculated for the area draining to each stream reach. Spatial referencing is retained for all variables in the model so that predictions can be presented and interpreted in a spatial context. Further details of the methodology are described in Smith and others (1997) and Smith and others (1993).
The mathematical form of the SPARROW model is a nonlinear regression model in which source data are weighted by estimates of loss due to land-surface and instream processes (fig. 4). Stream-load estimates from throughout the watershed represent the "dependent" variables that are used for model calibration. The model relates the stream-load estimates to three types of "independent" or explanatory variables, including source variables, land-to-water delivery variables, and instream-loss variables. Separate model parameters are estimated for each of the independent variables to evaluate the statistical significance of that variable for explaining the spatial variation in stream load.
Source parameters ($\beta_n$) are included to determine the statistical significance of nutrient sources ($S_{n,j}$) in explaining the variation of loads among reaches. Total nitrogen sources considered in the Chesapeake Bay SPARROW model include point sources, urban land area, fertilizer application rates, livestock production, and atmospheric deposition.
The land-to-water delivery parameters ($\alpha$) determine the statistical significance of different types of land-surface characteristics ($Z_j$) for increasing or decreasing the delivery of nutrients from the land surface to the stream reach. For example, large percentages of impermeable surface area might be expected to increase delivery from the land surface to stream reaches. Land-surface characteristics that were considered in the Chesapeake Bay SPARROW model include air temperature, precipitation, land-surface slope, soil permeability, stream density, and wetland area. Delivery of point-source loads to stream reaches was assumed to be unaffected by land-surface characteristics, and the value of the delivery term ($e^{-\alpha' Z_j}$) for point sources is set equal to one.
Estimating instream-loss parameters ($\delta$) is important for relating upstream sources to downstream loads because a percentage of nutrients is usually lost due to stream processes, such as denitrification. These losses must be accounted for to determine the importance of upstream sources for delivery of nutrients to the Chesapeake Bay. There are a variety of chemical and biological processes that contribute to instream loss of nutrients. SPARROW does not distinguish or identify individual processes, however, because adequately detailed information on them generally is not available. Instead, SPARROW combines the effects of all processes into one instream-loss estimate that is simulated as an exponential loss ($e^{-\delta' T_{i,j}}$), and is a function of the instream-loss rate ($\delta$) and the traveltime in the stream reach.
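Figure 4 is not reproduced in this text. As a sketch only, a form of the model that is consistent with the three parameter types described above, and that follows the notation of Smith and others (1997), can be written as shown below. The symbols are assumptions where the text does not state them: $L_i$ is the load in reach $i$, $N$ is the number of source types, $J(i)$ is the set of reaches that includes reach $i$ and all reaches upstream from it, $\beta_n$ is the coefficient for source $n$, $S_{n,j}$ is the amount of source $n$ in the drainage of reach $j$, $\alpha$ is the vector of land-to-water delivery coefficients applied to the land-surface characteristics $Z_j$, $\delta$ is the vector of instream-loss coefficients applied to the traveltimes $T_{i,j}$ between reach $j$ and reach $i$, and $\varepsilon_i$ is a multiplicative error term.

```latex
L_i = \left[ \sum_{n=1}^{N} \sum_{j \in J(i)}
      \beta_n \, S_{n,j} \, e^{-\alpha' Z_j} \, e^{-\delta' T_{i,j}} \right] \varepsilon_i
```

For point sources the delivery term $e^{-\alpha' Z_j}$ is fixed at 1, so that only the source coefficient and the instream-loss term act on those loads.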
Instream losses can vary by stream size; for the Chesapeake Bay SPARROW model, instream-loss parameters were estimated for stream-reach classes defined by discharge level. Initially, 10 stream-reach size classes were arbitrarily defined by discharges that ranged from less than or equal to 25 cubic feet per second (ft³/s) to greater than 10,000 ft³/s. These stream-reach classes were adjusted to optimize the fit of the model, and classes that were not significant in the model were combined with others. The final set of stream-reach classes was defined by discharge intervals of less than or equal to 200 ft³/s, greater than 200 and less than or equal to 1,000 ft³/s, and greater than 1,000 ft³/s. A fourth instream-loss parameter was defined to account for losses with transport through reservoirs.
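The class-based exponential loss described above can be illustrated with a minimal Python sketch. The discharge breakpoints (200 and 1,000 ft3/s) come from the text; the loss rates, the function names, and the use of days as the traveltime unit are hypothetical and are chosen only for illustration. The calibrated rates are those reported in table 1, and the separate reservoir parameter is not included here.

```python
import math

# Hypothetical first-order loss rates (per day) for the three discharge
# classes. These are illustrative values, NOT the calibrated estimates.
ASSUMED_LOSS_RATE_PER_DAY = {
    "small": 0.30,   # discharge <= 200 ft3/s
    "medium": 0.10,  # 200 < discharge <= 1,000 ft3/s
    "large": 0.02,   # discharge > 1,000 ft3/s
}


def reach_class(discharge_cfs):
    """Assign a reach to a size class using the breakpoints in the text."""
    if discharge_cfs <= 200:
        return "small"
    if discharge_cfs <= 1000:
        return "medium"
    return "large"


def fraction_remaining(discharge_cfs, traveltime_days):
    """Fraction of the load that survives transport through one reach,
    modeled as exp(-rate * traveltime)."""
    rate = ASSUMED_LOSS_RATE_PER_DAY[reach_class(discharge_cfs)]
    return math.exp(-rate * traveltime_days)
```

With these assumed rates, a load that spends one day in a small reach loses a larger fraction than the same load spending one day in a large reach, which is the behavior that separate class parameters are meant to capture.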
Parameters in the Chesapeake Bay SPARROW model were estimated by applying a nonlinear least-squares algorithm to the equation presented in figure 4. The error term in the model is assumed to be multiplicative, and the estimation algorithm is applied after both sides of the equation are converted to logarithmic form. The robustness of the parameter estimates is evaluated by applying a bootstrap algorithm in which the model parameters are repeatedly estimated on the basis of subsamples of the stream load and predictor data. This procedure provides statistical distributions of model parameters that can be used to evaluate the potential range of parameter estimates. Further details of the bootstrap analysis are described in Smith and others (1997).
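The two ideas in this paragraph, estimation in logarithmic form and bootstrap evaluation of the estimates, can be sketched on a deliberately simplified problem. The example below assumes a one-source toy model, load = beta * source * error, which is not the SPARROW equation; in log form its least-squares estimate has a closed form, whereas the full model requires an iterative nonlinear solver. The data values are hypothetical, and resampling with replacement is assumed for the bootstrap.

```python
import math
import random
import statistics


def fit_log_coefficient(loads, sources):
    """Least-squares estimate of ln(beta) for load = beta * source * error.

    Taking logarithms gives ln(load) - ln(source) = ln(beta) + ln(error),
    so the estimate of ln(beta) is the mean of the log ratios.
    """
    return statistics.fmean(math.log(l / s) for l, s in zip(loads, sources))


def bootstrap_coefficients(loads, sources, n_boot=500, seed=0):
    """Re-estimate beta on resampled data; returns the sorted estimates."""
    rng = random.Random(seed)
    n = len(loads)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_loads = [loads[i] for i in idx]
        boot_sources = [sources[i] for i in idx]
        estimates.append(math.exp(fit_log_coefficient(boot_loads, boot_sources)))
    return sorted(estimates)


# Hypothetical source inputs and stream loads, both in lb/yr.
sources = [100.0, 250.0, 400.0, 800.0, 1500.0, 3000.0]
loads = [55.0, 120.0, 210.0, 390.0, 760.0, 1450.0]

beta_hat = math.exp(fit_log_coefficient(loads, sources))
boot = bootstrap_coefficients(loads, sources)
# Approximate 90-percent interval taken from the bootstrap distribution.
lower = boot[int(0.05 * len(boot))]
upper = boot[int(0.95 * len(boot)) - 1]
```

The spread between `lower` and `upper` gives a sense of the potential range of the parameter estimate, which is the purpose of the bootstrap procedure described above.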
Description of Data Sets
In the current version (Version 1.0) of the Chesapeake Bay SPARROW model, most dependent and independent variable data sets were compiled from published data bases that are consistent with the CBP HSPF model-input data sets (Donigian and others, 1994; Brakebill and Preston, 1999). An important initial goal of the Chesapeake Bay SPARROW model is to provide information that is consistent with and supplemental to the CBP HSPF model. For this reason, the same input data sets were used for both models whenever possible. A separate load data base was developed for the SPARROW model because its statistical nature allows calibration using loading information from many more locations than are used by the CBP HSPF model. Most of the nutrient-source and land-characteristic data sets used in the SPARROW model, however, are the same as those used in the CBP HSPF model.
An important difference between the Chesapeake Bay SPARROW models and the CBP HSPF model is that HSPF is spatially and temporally variable, whereas SPARROW is only spatially variable. HSPF provides load predictions for a limited number of locations over a specific time period and can be used for temporal trend evaluation. SPARROW provides load predictions for many more locations but only for one time period, typically one year. In this manner, SPARROW provides detailed spatial information that represents a "snapshot" in time. As such, a time period must be selected for the application of SPARROW and all data sets must be defined for that time period. SPARROW can be applied for evaluating temporal changes; however, a separate model must be developed for each point in time.
For the purposes of Version 1.0 of the Chesapeake Bay SPARROW model, the year 1987 was selected as the time period of interest and all data sets have been defined for that year. The year 1987 was chosen because the land-cover data set was based on that time period and few land-cover data sets exist for more recent years. Furthermore, the CBP HSPF model (Donigian and others, 1994) and national SPARROW models (Smith and others, 1997) were based on the time period around 1987, and that year was selected to provide a basis of comparison with those modeling efforts.
Sections that follow describe the data sets that were used in developing this version of the Chesapeake Bay SPARROW models. These descriptions are brief and are provided here primarily for context in interpreting the information presented in the remainder of this report. More detailed descriptions of the data sets with geographic illustrations can be found in Brakebill and Preston (1999).
Stream-Reach and Basin Network
The network for developing the Chesapeake Bay SPARROW model (fig. 5) is based on USEPA's River Reach File 1 (RF1) (DeWald and others, 1985). RF1 is a 1:500,000-scale, digital stream data set that is attributed with stream-reach length, average stream discharge, and average flow-velocity data. This information is used to classify stream reaches into size categories and to calculate traveltime (length/velocity) for estimating instream-loss rates. Stream reaches are also attributed with connectivity information, which is used to relate upstream and downstream reaches. Nationally, RF1 consists of about 60,000 stream reaches, which include 1,366 stream reaches in the Chesapeake Bay watershed. For the purposes of the Bay SPARROW model, some reaches were subdivided at stream-load site locations so that load estimates were consistently at the downstream end of a reach. This modification resulted in 1,408 stream reaches in the Chesapeake Bay watershed.
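The way reach length, velocity, and connectivity work together can be sketched in Python. The three-reach network, the units (miles and feet per second), the single loss rate, and the function names below are all hypothetical; for simplicity the sketch also applies the full-reach loss to both upstream and local loads, which is an assumption and not necessarily how the calibrated model treats them.

```python
import math

FEET_PER_MILE = 5280.0
SECONDS_PER_DAY = 86400.0


def traveltime_days(length_miles, velocity_ft_per_s):
    """Traveltime = length / velocity, converted to days."""
    return length_miles * FEET_PER_MILE / velocity_ft_per_s / SECONDS_PER_DAY


# Hypothetical network: reaches 1 and 2 both flow into reach 3.
# Each entry: (downstream reach id or None, length in miles,
#              velocity in ft/s, local load delivered to the reach in lb/yr)
reaches = {
    1: (3, 10.0, 1.0, 1000.0),
    2: (3, 20.0, 2.0, 2000.0),
    3: (None, 30.0, 2.5, 500.0),
}
ASSUMED_UNIFORM_LOSS_RATE_PER_DAY = 0.2  # illustrative value only


def route_loads(reaches, loss_rate):
    """Return the load at the downstream end of every reach."""
    inflow = {rid: 0.0 for rid in reaches}
    outlet = {}
    remaining = dict(reaches)
    while remaining:
        # A reach is ready once no unprocessed reach drains into it.
        targets = {entry[0] for entry in remaining.values()}
        ready = [rid for rid in remaining if rid not in targets]
        if not ready:
            raise ValueError("connectivity data contain a cycle")
        for rid in ready:
            down, length, velocity, local = remaining.pop(rid)
            t = traveltime_days(length, velocity)
            outlet[rid] = (inflow[rid] + local) * math.exp(-loss_rate * t)
            if down is not None:
                inflow[down] += outlet[rid]
    return outlet


outlet = route_loads(reaches, ASSUMED_UNIFORM_LOSS_RATE_PER_DAY)
```

Subdividing a reach at a stream-load site, as described above, corresponds to adding an entry to such a table so that the site falls at the downstream end of its own reach.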
To provide a basis for relating continuous spatial information to stream reaches, the Chesapeake Bay watershed was segmented using RF1 and a 1-kilometer grid-cell digital elevation model (DEM) (Verdin and Greenlee, 1996). A 1-kilometer flow-direction grid was generated from the DEM and used to delineate drainage-basin boundaries for each digital stream reach (Environmental Systems Research Institute, Inc., 1992). Basin delineation produced 1 basin unit for each reach, or 1,408 basins in all. Thus, the final network consists of 1,408 stream reaches and watershed segments for estimating loads and for illustrating load predictions.
Stream reaches in the coastal margins of the watershed are not included in the network because streams in these areas were not included in RF1. As a result, the coastal margin of the Bay is not represented in the current Chesapeake Bay SPARROW models (fig. 5); this represents a significant limitation because some large total nitrogen sources, such as sewage-treatment plants, are located in coastal areas, and these are not represented in the SPARROW models. Efforts are underway to address this limitation in future versions of SPARROW.
Stream Nutrient-Loading Data
Stream nutrient-loading data were derived from stream-discharge and water-quality data collected at gaged monitoring sites throughout the Bay watershed by a variety of State and Federal agencies (Langland and others, 1995). Data from 147 sites were initially compiled for a period extending from 1972 through 1992. Data from many of those sites were eliminated because they were collected for an inappropriate time period. For example, sites that did not have data within one year of 1987 were eliminated because of potential error related to extrapolating temporal trends in load estimation. Data also were eliminated from sites that were representative of a drainage-basin size that was inappropriate for the network. For example, sites that were representative of small drainage basins (less than 10 square miles) were eliminated because they were inconsistent with the scale of the stream-network data set. The final stream-load data set consisted of loading information from 109 sites (shown in fig. 5), 79 of which included total nitrogen data.
Stream-load estimates at gaged monitoring sites were generated from stream-discharge and water-quality data through the use of a log-linear regression model called ESTIMATOR (Cohn and others, 1989). The ESTIMATOR model estimates daily concentration values that are based on flow, season, and temporal-trend terms. Estimated daily concentration values are subsequently multiplied by measured daily-discharge values and summed to calculate an annual stream load. Typically, the ESTIMATOR model is applied to estimate stream loads for a period extending over a number of years or for evaluating trends in stream loads.
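The final step described above, multiplying daily concentration by daily discharge and summing over the year, can be illustrated with a short Python sketch. It does not implement the ESTIMATOR regression itself; the function name and the choice of units (milligrams per liter, cubic feet per second, and pounds) are assumptions made for the example.

```python
LITERS_PER_CUBIC_FOOT = 28.316846592
SECONDS_PER_DAY = 86400
POUNDS_PER_MILLIGRAM = 2.2046226218e-6

# Pounds per day carried by 1 ft3/s of water at a concentration of 1 mg/L.
LB_PER_DAY_FACTOR = LITERS_PER_CUBIC_FOOT * SECONDS_PER_DAY * POUNDS_PER_MILLIGRAM


def annual_load_lb(daily_conc_mg_per_l, daily_discharge_cfs):
    """Sum of daily (concentration x discharge) products, in pounds."""
    if len(daily_conc_mg_per_l) != len(daily_discharge_cfs):
        raise ValueError("concentration and discharge series differ in length")
    return sum(
        conc * flow * LB_PER_DAY_FACTOR
        for conc, flow in zip(daily_conc_mg_per_l, daily_discharge_cfs)
    )
```

For example, a constant concentration of 1 mg/L in a constant flow of 1 ft3/s corresponds to roughly 5.4 lb/d, so a 365-day year would sum to roughly 1,970 lb.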
For the purposes of generating stream-load data for calibrating the SPARROW model, the ESTIMATOR model was used to estimate annual load that was based on a long-term average daily-discharge time series, which is made up of the average flow for each day of the year over the period of record. The long-term average daily-discharge time series is used in the analysis to prevent error due to random spatial variations in precipitation and discharge during any given year. It is assumed that the long-term average hydrologic time series minimizes random spatial variations in meteorological processes that could interfere with accurate estimation of spatial variations in loading processes. To generate load estimates that are based on the long-term average daily discharge, the ESTIMATOR model was calibrated by using the actual data. The average daily-discharge time series was subsequently used as input to the calibrated ESTIMATOR model to calculate the load estimate used in the SPARROW models. For the remainder of this report, stream loads that were estimated as described above are referred to as "observed" loads to distinguish them from those predicted by the SPARROW models.
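The construction of a long-term average daily-discharge time series can be sketched as follows. The record values and the function name are hypothetical, and keying the averages by calendar month and day (so that February 29 is averaged over leap years only) is an assumption about how days of the year are matched.

```python
from collections import defaultdict
from datetime import date


def average_daily_discharge(records):
    """Average the discharge for each calendar day over the period of record.

    records: iterable of (datetime.date, discharge) pairs.
    Returns a dict mapping (month, day) to the mean discharge.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (month, day) -> [sum, count]
    for day, discharge in records:
        entry = totals[(day.month, day.day)]
        entry[0] += discharge
        entry[1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}


# Hypothetical record with a gap on January 2, 1986.
records = [
    (date(1985, 1, 1), 100.0),
    (date(1986, 1, 1), 200.0),
    (date(1987, 1, 1), 300.0),
    (date(1985, 1, 2), 50.0),
    (date(1987, 1, 2), 150.0),
]
avg = average_daily_discharge(records)
```

The resulting series would then serve as the discharge input to the calibrated load model in place of the discharge measured in any single year.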
Data sets that document the size and spatial distribution of nutrient sources throughout the Bay watershed were generated on the basis of data sets developed for the CBP HSPF model. Nutrient sources considered for the Bay SPARROW model include municipal and commercial point sources, urban area, agricultural sources (fertilizer and manure application), and atmospheric deposition. Detailed descriptions of these data sets with illustrations showing the spatial distributions of the sources can be found in Brakebill and Preston (1999).
Point-source discharge information was compiled by the CBP (Wiedeman and Cosgrove, 1997) from the USEPA's Permit Compliance System (PCS) point-source discharge monitoring reports that are generated for each State. Included in the original data sets are locations of each point source and monthly estimates of waste discharge in pounds per year (lb/yr). For application in the SPARROW model, each point source was linked with a stream reach on the basis of the digital network described previously. The point-source load for each stream reach was calculated by taking the average annual waste discharge for the period 1986-88.
Urban area was considered a possible source of nutrients because of the potential for accumulation of nitrogen on impermeable surfaces and subsequent wash-off to streams. An urban-area variable was defined on the basis of the digital land-cover data set developed by the CBP (Gutierrez-Magness and others, 1997) (fig. 2). That data set includes a number of land-cover categories that were aggregated for the purposes of SPARROW application. Urban area was defined by combining the high-intensity, low-intensity, woody, and herbaceous urban categories. Acres of urban area within each watershed segment were calculated and included as input to the SPARROW model.
Loading from agricultural fertilizer and manure sources was quantified by using land-cover data, county-level agricultural statistics, and CBP application rates. Total agricultural land within SPARROW model segments was quantified by the herbaceous category in the CBP land-cover data set. The acreage of herbaceous area was subdivided into conventional till, conservation till, and hay land area by using county-level agricultural statistics. The fractions of each category were quantified for each county, and the total agricultural area within a given watershed segment and county was multiplied by those fractions to calculate the area of each of the three classes of agricultural land. Loading due to fertilizer or manure application was calculated by multiplying the acres of each type of agricultural area by application rates in pounds per acre per year (lb/acre/yr) that were defined for each area of the watershed by the CBP (Gutierrez-Magness and others, 1997). The total fertilizer or manure load was subsequently calculated by combining the loads of all three agricultural land categories.
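The arithmetic in this paragraph can be made concrete with a small Python sketch. All numbers, the category keys, and the function name are hypothetical; the actual fractions come from county-level agricultural statistics and the application rates from the CBP.

```python
def agricultural_load_lb_per_yr(herbaceous_acres, class_fractions, rates_lb_per_acre_yr):
    """Load for the part of one watershed segment that lies in one county.

    herbaceous_acres: agricultural (herbaceous) area in that part, in acres.
    class_fractions: fraction of the area in each agricultural class.
    rates_lb_per_acre_yr: application rate for each class.
    """
    if abs(sum(class_fractions.values()) - 1.0) > 1e-6:
        raise ValueError("class fractions should sum to 1")
    return sum(
        herbaceous_acres * fraction * rates_lb_per_acre_yr[cls]
        for cls, fraction in class_fractions.items()
    )


# Hypothetical segment that spans two counties.
rates = {"conventional_till": 100.0, "conservation_till": 80.0, "hay": 20.0}
pieces = [
    (1000.0, {"conventional_till": 0.5, "conservation_till": 0.3, "hay": 0.2}),
    (500.0, {"conventional_till": 0.2, "conservation_till": 0.2, "hay": 0.6}),
]
segment_load = sum(
    agricultural_load_lb_per_yr(acres, fractions, rates)
    for acres, fractions in pieces
)
```

In this example the first county contributes 78,000 lb/yr and the second 24,000 lb/yr, for a segment total of 102,000 lb/yr. The same calculation would be carried out separately for fertilizer and for manure, each with its own set of application rates.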
Atmospheric deposition of nitrogen for the Chesapeake Bay watershed was quantified on the basis of point-location measurements collected by the National Atmospheric Deposition Program (NADP) (National Atmospheric Deposition Program, 1988). Point measurements of atmospheric deposition were converted to a spatial data set through linear spatial interpolation (Smith and others, 1997). The spatial data set was then merged with the watershed network, and the total nitrogen from atmospheric deposition was calculated for each watershed segment.
Watershed Characteristics Data
Meteorological and land-surface characteristics data were compiled from a variety of tabular and spatial data sets. Each of the variables was considered to be a potentially important factor in controlling the delivery of nutrients from the land surface to streams. Application of the statistical methods in SPARROW provides a way of testing the significance of each of the variables so that the environmental factors that are most related to nutrient loading can be identified. Land-characteristic variables considered as potentially important include air temperature, precipitation, land-surface slope, soil permeability, stream density, and wetland area.
Meteorological variables that are potentially important to stream nitrogen loading include air temperature and precipitation. Air temperature can affect the amount of nitrogen that reaches streams by affecting the rate of biological processes such as denitrification. Precipitation can affect delivery by determining the volume and rate of overland flow in areas of the watershed. To evaluate the potential importance of these variables, tabular air temperature and precipitation data sets were obtained from the National Climatic Data Center for the period 1950-94 (National Climatic Data Center, 1997). Long-term average values were calculated for that period, compiled into a spatial data set, and an average air temperature and precipitation value was then calculated for each watershed segment.
Basin characteristics, such as slope and stream density, were calculated by using the network data sets. The average slope of each watershed segment was calculated from the DEM by averaging the slope (in percent) of individual 1-square-kilometer (0.39-square-mile) cells within the segment. Stream density was calculated as the ratio of the reach length to the segment area. Slope and stream density are assumed to enhance overland delivery to stream reaches. Higher slope is assumed to increase the rate at which water flows overland or in small streams and thus, increase the potential for delivery to streams. Higher stream density is assumed to increase the potential delivery to streams because it is an indication that there are shorter traveltimes from the land surface to a stream. In both cases, the variables were included in the model in reciprocal form, in order to mathematically impose an assumption of a positive effect on land-to-water delivery.
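The calculations described above, and the effect of the reciprocal form, can be sketched in Python. The sketch assumes that the delivery term has the exponential form exp(-alpha * Z) used by Smith and others (1997); the coefficient value, the units, and the function names are hypothetical.

```python
import math
from statistics import fmean


def segment_predictors(cell_slopes_percent, reach_length_miles, segment_area_sq_miles):
    """Mean slope, stream density, and their reciprocals for one segment."""
    if segment_area_sq_miles <= 0:
        raise ValueError("segment area must be positive")
    mean_slope = fmean(cell_slopes_percent)
    stream_density = reach_length_miles / segment_area_sq_miles
    if mean_slope <= 0 or stream_density <= 0:
        raise ValueError("reciprocal form needs positive slope and stream density")
    return {
        "mean_slope_percent": mean_slope,
        "stream_density_mi_per_sq_mi": stream_density,
        "reciprocal_slope": 1.0 / mean_slope,
        "reciprocal_stream_density": 1.0 / stream_density,
    }


def delivery_factor(alpha, z):
    """Assumed land-to-water delivery term, exp(-alpha * z)."""
    return math.exp(-alpha * z)


# Hypothetical segment: three DEM cells, 10 miles of reach, 40 square miles.
p = segment_predictors([2.0, 4.0, 6.0], 10.0, 40.0)
```

Under this assumed form and with a positive coefficient, entering slope as its reciprocal means that a steeper segment has a smaller predictor value and therefore a larger delivery factor, which is the positive effect that the reciprocal form is intended to impose.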
The remaining two land-characteristic variables, soil permeability and wetland area, are assumed to limit delivery to streams. Higher soil permeability is an indication of increased potential for percolation to groundwater, which results in longer traveltime and more potential for biological modification. Similarly, wetland area is associated with slower traveltime, and greater potential for biological uptake and fixation of nutrients. An average soil-permeability value for each watershed segment was derived from the State Soil Geographic Data Base (Schwarz and Alexander, 1995) and wetland areas were derived from the land-cover data set (fig. 2) by quantifying the acreage of wetland area in each watershed segment.
Calibration of the Total Nitrogen SPARROW Model
Calibration results of the total nitrogen SPARROW model for the Chesapeake Bay watershed are summarized in table 1 and in figure 6. Table 1 summarizes the application of the statistical regression model, including parameter estimates and probability levels of significance. The table lists all explanatory variables that were considered in the model, but provides parameter estimates only for those variables that were statistically significant at the 0.10 probability level. Figure 6 is used to compare the predicted and the observed total-nitrogen load values and shows the statistical distribution of the regression residuals. In general, the fit of the model to the load data in the Chesapeake Bay watershed was good as indicated by a coefficient of determination (R²) of 0.961. The coefficient of determination is a measure of the fraction of variance in the load data that was accounted for by the independent variables used in the regression model. In this case, the model as defined in figure 4 accounted for nearly all of the variance in the load data, and this is shown graphically in figure 6. Although the coefficient of determination value of 0.961 is considered good, it is not necessarily the only statistical diagnostic tool that should be considered to evaluate the reliability of the model. It is presented here as part of a set of diagnostics that are intended to provide a means of evaluating the calibration of the model.
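The coefficient of determination described above can be computed as in the following Python sketch. The report does not state here whether the value of 0.961 was computed on logarithms of load; the sketch assumes log space by default because the model is estimated in logarithmic form. The load values and the function name are hypothetical.

```python
import math
from statistics import fmean


def r_squared(observed, predicted, log_space=True):
    """Fraction of variance in the observed loads explained by predictions."""
    if log_space:
        observed = [math.log(x) for x in observed]
        predicted = [math.log(x) for x in predicted]
    mean_obs = fmean(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical observed and predicted loads, in lb/yr.
observed = [100.0, 1000.0, 10000.0]
predicted = [120.0, 900.0, 11000.0]
r2 = r_squared(observed, predicted)
```

Because loads in a large watershed can span several orders of magnitude, a high value can be driven largely by the contrast between small and large basins, which is one reason the statistic is best read alongside the residual plots in figure 6.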
All the source variables were strongly significant as indicated by probability levels of less than or equal to 0.01 in all cases (table 1). Urban area was the weakest of the five source variables, but was statistically significant. Because sources are already in units of load, parameter estimates for sources should approximate the value 1, if all other total nitrogen sources are described, and if losses are accounted for through land-to-water delivery factors. For example, point sources discharge directly to streams and because point-source loads are in the same units as stream loads, the point-source parameter should be close to the value 1. In contrast, urban area is in units of area because the loading rate from urban area is not known. Thus, the urban-area parameter estimate must compensate for the differences in units, and is not expected to approximate the value 1.
Results of the calibration (table 1) indicate that the value of many of the source coefficients differs substantially from 1, indicating that some other sources or losses are unaccounted for. The value of the point-source parameter is greater than 1 (approximately 1.5), and reasons for this are still being investigated. Possible explanations are that: (1) the available point-source total nitrogen load data do not fully account for all of the point-source loads; or (2) septic tanks, which are not accounted for, contribute a substantial amount of nitrogen to streams. The parameter estimates for the agricultural sources (fertilizer and manure) are both much less than 1, and this is probably a reflection of crop uptake and other environmental processes and factors that are not accounted for elsewhere in the model. The urban-area parameter estimate is much greater than 1 because the area units are in acres, and must be converted to load in lb/yr.
Only one of the land-to-water delivery factors was statistically significant in the model. Soil permeability was weakly significant (probability level of 0.095) as a factor that limits transport of nitrogen from the land surface to streams. Presumably, soil permeability was significant because it is an indicator of the potential for nitrate to flow through groundwater pathways that are slower and provide more potential for loss through denitrification and other processes. Potential reasons that more land-to-water delivery factors were not significant are that: (1) the number of load data that are available for model development is small and may not have provided a sufficient level of statistical detail; and/or (2) the accuracy or level of detail in the land-to-water delivery factor data sets was inadequate to establish statistical significance. Expanding the size of the load data base to increase the potential for identifying relations between stream load and other watershed characteristics that may affect loading is being evaluated. In addition, alternative land-to-water delivery factor data sets are also being evaluated for potential improvements in the level of accuracy and detail.
Instream-loss parameters were statistically significant for all three stream-size classes and for reservoirs (table 1). In all four cases, the parameters were strongly significant (probability level of <0.005), indicating the importance of instream processing in limiting the amount of nitrogen that reaches the Chesapeake Bay. Of the loss rates for the three stream-size classes, the highest was for the smallest stream class, and the value decreased monotonically with increasing stream size. This is consistent with the calibration results of the national SPARROW model and could be related to stream depth (Smith and others, 1997). Smaller, shallower streams have more contact with bottom sediments and may have a greater potential for total nitrogen loss due to biological processing at benthic surfaces. The estimated loss rate for reservoirs is high (0.4145; table 1), and may indicate the importance of reservoirs in trapping particulate nitrogen, which may be in the form of algal biomass. This result is especially important because recent studies have indicated that the life spans of the major reservoirs of the Susquehanna River are limited, and future loads to the Bay could be increased substantially as the reservoirs fill with sediment (Langland and Hainly, 1997).
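The effect of class-specific loss rates can be sketched as follows, assuming that instream loss behaves as first-order decay with traveltime. The decay form, the class names, and the loss-rate values are assumptions for illustration only; they are not the calibrated values in table 1, and reservoirs are omitted.

```python
import math

# Hypothetical loss rates (per day of traveltime) for three stream-size classes.
# Values decrease with increasing stream size, as described in the text.
LOSS_RATE = {"small": 0.8, "medium": 0.3, "large": 0.05}

def fraction_remaining(reach_class, traveltime_days):
    """Fraction of total nitrogen remaining after travel through one reach,
    assuming first-order decay: exp(-rate * traveltime)."""
    return math.exp(-LOSS_RATE[reach_class] * traveltime_days)

# For the same traveltime, more nitrogen is lost in the smaller stream classes
remaining_by_class = {name: fraction_remaining(name, 1.0) for name in LOSS_RATE}
```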
As previously stated, an important goal of the Chesapeake Bay application of SPARROW is to support the ongoing work of the CBP and its HSPF watershed modeling effort. To ensure that the models are consistent and to provide an added basis of evaluation of the SPARROW model, predicted loads from the two models were compared. A plot of predicted total nitrogen load from the SPARROW model against predicted total nitrogen load from the HSPF model is shown in figure 7. In general, the two models are in close agreement (r² = 0.86); however, differences are greater among the smaller basins. In particular, the SPARROW model predicts much higher loads for two basins at the lower end of the scale. Reasons for these differences are being investigated, but in general the models provide consistent values of predicted total nitrogen load.
Selected Applications of the Total Nitrogen SPARROW Model
A calibrated SPARROW model can be applied in a number of ways to provide information about nutrient loading. Because all information in the SPARROW model is spatially referenced, many of these applications can be shown in the form of maps. Such maps provide information about the spatial distribution of environmental factors that affect nutrient loading and provide spatial detail that is currently not available using most other watershed modeling tools. Spatially detailed illustrations of nutrient loads and the factors that affect them can be used as tools for targeting management practices or for other purposes. To illustrate ways in which a SPARROW model can be applied, selected examples of the total nitrogen model applications are shown in figures 8-12, and are described below.
Knowledge of the spatial distribution of local nutrient generation is important for water-resource managers and planners in prioritizing areas for management actions. To show local generation of total nitrogen, the SPARROW model was used to quantify what is referred to here as "incremental" yield (fig. 8). Incremental yield is the amount (load per area) of total nitrogen generated in each reach basin independent of upstream load. Thus, each reach basin is treated as an independent unit, and the amount of total nitrogen generated in each unit is quantified. These independent yields are referred to as "incremental," because in reality the total nitrogen generated by each unit is accumulated with progression downstream. Treating them separately is a way of quantifying the increment of load added by each watershed unit. In the Chesapeake Bay watershed, there are 1,408 watershed units for which an incremental yield is calculated. These yields illustrate the spatial distribution of local generation of total nitrogen in the watershed. As shown in figure 8, nutrient generation is particularly high in areas of south-central Pennsylvania, central Maryland and Virginia, and the Eastern Shore of Maryland and Delaware.
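The incremental-yield calculation described above can be sketched for a few watershed units. The reach identifiers, loads, and areas below are hypothetical and are used only to show the arithmetic.

```python
# Hypothetical watershed units: locally generated load (lb/yr) and area (acres)
UNITS = {
    "reach_101": {"local_load": 48000.0, "area": 6000.0},
    "reach_102": {"local_load": 9000.0, "area": 4500.0},
    "reach_103": {"local_load": 60000.0, "area": 5000.0},
}

def incremental_yield(unit):
    """Load generated within the unit per unit area (lb/acre/yr),
    independent of any load entering from upstream."""
    return unit["local_load"] / unit["area"]

# Yield for each unit, and units ranked from highest to lowest yield
yields = {reach_id: incremental_yield(unit) for reach_id, unit in UNITS.items()}
ranked = sorted(yields, key=yields.get, reverse=True)
```

Ranking units by incremental yield, as in `ranked`, is one simple way to identify areas of high local generation.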
In addition to understanding the spatial distribution of local generation of total nitrogen, it is also important to understand how much of the total nitrogen that is generated locally reaches the Chesapeake Bay. Only a fraction of the total nitrogen generated in each watershed unit will reach the Bay because of losses due to instream processes. The amount of total-nitrogen loss is dependent on the traveltime and on the instream-loss rate in each reach. The cumulative loss of total nitrogen over all the reaches between the local watershed unit and the Bay determines the amount that will be delivered.
The spatial distribution of incremental yield of total nitrogen that is delivered to the Chesapeake Bay is shown in figure 9. "Delivered" yield is calculated by weighting the incremental yield of each reach by the fraction that remains after instream processing in travel from the reach to the Bay. The delivered yields are much lower than the incremental yields for many reaches, particularly those that are farther from the Bay, and therefore have longer traveltimes and greater potential for instream loss (fig. 9). Comparison of figures 8 and 9 provides a contrast between the spatial distributions of local generation of total nitrogen and the potential for delivery of total nitrogen to the Bay. Both of these pieces of information are important for the control of total-nitrogen loading because areas need to be prioritized for management actions on the basis of both total-nitrogen generation and the potential for delivery of the nitrogen to the Bay.
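The weighting of incremental yield by the fraction that survives transport can be sketched by following the reach network downstream. The three-reach network, loss rates, and traveltimes are hypothetical, and the sketch assumes first-order decay applied over the full traveltime of every reach from the local reach to the Bay, which is a simplification.

```python
import math

# Hypothetical network: reach_id -> (downstream reach or None at the Bay,
# loss rate per day, traveltime in days)
NETWORK = {
    "A": ("B", 0.5, 1.0),   # small headwater reach
    "B": ("C", 0.2, 2.0),   # intermediate reach
    "C": (None, 0.05, 1.0), # large reach that discharges to the Bay
}

def delivery_fraction(reach_id, network):
    """Fraction of nitrogen generated at `reach_id` that reaches the Bay,
    as the product of per-reach survival fractions along the flow path."""
    fraction = 1.0
    current = reach_id
    while current is not None:
        downstream, rate, traveltime = network[current]
        fraction *= math.exp(-rate * traveltime)
        current = downstream
    return fraction

def delivered_yield(incremental_yield, reach_id, network):
    """Incremental yield weighted by the delivery fraction."""
    return incremental_yield * delivery_fraction(reach_id, network)
```

In this sketch, reach "A" has the lowest delivery fraction because nitrogen generated there passes through all three reaches before reaching the Bay.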
Another way to evaluate delivery of total nitrogen in the watershed is to quantify the "potential for delivery" to the Bay. The delivered load shown in figure 9 represents the amount of total nitrogen generated in each watershed unit and the potential for its delivery. It may also be valuable to consider the potential for delivery separately, so that water-resource managers can consider the effect of future total-nitrogen generation in areas of the watershed when prioritizing areas for total-nitrogen controls. As described above, the potential for delivery to the Bay can be represented as the fraction of total nitrogen that remains after accounting for instream losses that occur in travel from each reach to the Bay. The spatial distribution of the percentage of total nitrogen that is delivered from each reach is shown in figure 10. Areas with high potential for delivery include those that are close to the Bay, have short traveltimes, and thus have little potential for instream loss. Other areas with high potential for delivery include watershed units that are associated with the larger reaches. The larger reaches have lower estimated loss rates (table 1), thereby reducing the amount of instream loss that might occur in transit to the Bay.
Land managers in the Chesapeake Bay community need to know which sources of total nitrogen are largest and where they are most important. SPARROW provides a means of separating the effects of each statistically significant total-nitrogen source and evaluating its relative contribution in a spatial context. The spatial distributions of the incremental yields of total nitrogen due to point sources and due to agricultural sources throughout the Bay watershed are shown in figures 11 and 12. Figure 11 indicates that point sources are most important in the major urban areas where large sewage-treatment plants discharge to stream reaches. In contrast, agricultural total-nitrogen generation is much more widely dispersed (fig. 12), and is greater than 7.94 lb/acre/yr over a large part of the watershed, including many of the same areas where the total incremental yields are high (fig. 8). These types of results provide a basis for separating the effects of different sources and for determining where they are most important. This information can be used by land- and water-resource decision makers to prioritize areas for management actions and to select management actions that are appropriate for the largest sources in any given area.
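Separation of source contributions within a single watershed unit can be sketched as a coefficient-weighted sum of source inputs. The point-source coefficient of 1.5 follows the approximate value reported above; the fertilizer and manure coefficients and all input loads are hypothetical, and land-to-water delivery factors are omitted for simplicity.

```python
# Source coefficients: "point" follows the approximate reported value;
# "fertilizer" and "manure" are hypothetical placeholders (both less than 1)
COEFFICIENTS = {"point": 1.5, "fertilizer": 0.3, "manure": 0.1}

def source_contributions(inputs):
    """Load attributed to each source: coefficient times source input (lb/yr)."""
    return {name: COEFFICIENTS[name] * value for name, value in inputs.items()}

def source_shares(inputs):
    """Fraction of the unit's incremental load attributed to each source."""
    contributions = source_contributions(inputs)
    total = sum(contributions.values())
    return {name: value / total for name, value in contributions.items()}

# Hypothetical source inputs for one watershed unit, in lb/yr
unit_inputs = {"point": 1000.0, "fertilizer": 20000.0, "manure": 15000.0}
shares = source_shares(unit_inputs)
largest_source = max(shares, key=shares.get)
```

Mapping `largest_source` or the individual shares across all watershed units would give the kind of source-specific spatial view shown in figures 11 and 12.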
Potential Improvements to Chesapeake Bay SPARROW Models
Potential improvements of SPARROW for the Chesapeake Bay watershed include enhancements to the data sets and to the capabilities of the models. The current version of Chesapeake Bay SPARROW models (Version 1.0) has a number of limitations. First, the load data set on which the models are based was limited to 109 sites, of which only 79 had enough total-nitrogen data for estimating loads. Limited load data sets reduce the potential for demonstrating statistical significance of explanatory variables in SPARROW models, and could be one reason why more land-to-water delivery variables were not significant in the model described in this report. A second limitation of the current models is that they do not have the capability for prediction of yields in the coastal margins. Digital stream reaches are not defined in the coastal margins at the scale of the current stream-reach data set. Thus, there are no traveltime data for estimating instream loss. New versions of SPARROW for the Chesapeake Bay watershed could include enhancements that would be designed to mitigate the limitations described above.
One potential new version of Chesapeake Bay SPARROW models is similar to the current version, but would address the limitations discussed above and would explore the benefits of alternative data sets for defining explanatory variables. Efforts are currently underway to expand the stream-load data set and to develop the capability for predicting nutrient yields in the coastal margins. These improvements would benefit any new versions of Chesapeake Bay SPARROW models. In addition, previously unavailable data sets would be useful for creating new explanatory variables for consideration in the model, or to improve explanatory variables already in the current model. New explanatory variables, if they are statistically significant, could potentially improve the fit of the models and/or identify previously undetected relations between stream nutrient loads and watershed characteristics.
Another potential version of Chesapeake Bay SPARROW models could provide a tool for evaluating the influence of groundwater on stream-nutrient loading. Much of the nitrogen reaching the Chesapeake Bay is transported from the land surface to streams through groundwater pathways (Bachman and others, 1998). For this reason, there is interest in methods that are designed to evaluate the importance of watershed factors that influence groundwater transport of nitrogen. A SPARROW model could be designed to predict total discharge, base flow, total load, and base-flow load. The spatial distribution of the environmental factors that affect these variables would provide another piece of information that would be useful to resource managers in prioritizing areas for management actions.
References Cited
Bachman, L.J., Lindsey, B.D., Brakebill, J.W., and Powars, D.S., 1998, Groundwater discharge and base-flow nitrate loads of nontidal streams, and their relation to a hydrogeomorphic classification of the Chesapeake Bay watershed, Middle Atlantic Coast: U.S. Geological Survey Water-Resources Investigations Report 98-4059, 71 p.
Brakebill, J.W., and Preston, S.D., 1999, Digital data used to relate nutrient inputs to water quality in the Chesapeake Bay watershed, Version 1.0: U.S. Geological Survey Open-File Report 99-60, at URL http://md.water.usgs.gov/publications/ofr-99-60/.
Cohn, T.A., Delong, L.L., Gilroy, E.J., Hirsch, R.M., and Wells, D.K., 1989, Estimating constituent loads: Water Resources Research, v. 25, no. 5, p. 937-942.
Cohn, T.A., Caulder, D.L., Gilroy, E.J., Zynjuk, L.D., and Summers, R.M., 1992, The validity of a simple statistical model for estimating fluvial constituent loads: An empirical study involving nutrient loads entering Chesapeake Bay: Water Resources Research, v. 28, no. 9, p. 2353-2363.
DeWald, T., Horn, R., Greenspun, R., Taylor, P., Manning, L., and Montalbano, A., 1985, STORET Reach Retrieval Documentation: U.S. Environmental Protection Agency, Washington, D.C.
Donigian, A.S., Bicknell, B.R., Patwardhan, A.S., Linker, L.C., and Chang, C., 1994, Chesapeake Bay Program watershed model application to calculate Bay nutrient loadings--Final facts and recommendations: Report No. EPA 903-R-94-042, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 283 p.
Environmental Systems Research Institute (ESRI), Inc., 1992, Arc/Info Users Guide, Cell-based Modeling with Grid, 2d Edition: Redlands, California, 192 p.
Gutierrez-Magness, A.L., Hannawald, J.E., Linker, L.L., and Hopkins, K.J., 1997, Chesapeake Bay watershed model application and calculation of nutrient and sediment loadings, Appendix E: Report No. EPA 903-R-97-019, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 142 p.
Langland, M.J., and Hainly, R.A., 1997, Changes in bottom-surface elevations in three reservoirs on the lower Susquehanna River, Pennsylvania and Maryland, following the January 1996 flood--Implications for nutrient and sediment loads to Chesapeake Bay: U.S. Geological Survey Water-Resources Investigations Report 97-4138, 34 p.
Langland, M.J., Leitman, P.L., and Hoffman, S.J., 1995, Synthesis of nutrient and sediment data for watersheds within the Chesapeake Bay drainage basin: U.S. Geological Survey Water-Resources Investigations Report 95-4233, 121 p.
National Atmospheric Deposition Program, 1988, NADP/NTN Annual data summary: Precipitation chemistry in the United States 1987: National Resources Ecology Laboratory, Colorado State University, Fort Collins, Colorado.
National Climatic Data Center, 1997, National Climatic Data Monthly Precipitation Data for U.S. Cooperative and National Weather Service Sites, accessed January 12, 1998, at URL http://www.ncdc.noaa.gov/ol/ncdc.html.
Schwarz, G.E., and Alexander, R.B., 1995, State soil geographic (STATSGO) data base for the conterminous United States version 1.1: U.S. Geological Survey Open-File Report 95-449 (scale 1:250,000), accessed February 10, 1998, at URL http://water.usgs.gov/-usgswrd/ussoils.html.
Smith, R.A., Alexander, R.B., Tasker, G.D., Price, C.V., Robinson, K.W., and White, D.A., 1993, Statistical modeling of water quality in regional watersheds, in Proceedings of Watershed 93, A national conference on watershed management, Alexandria, Virginia, March 21-24, 1993.
Smith, R.A., Schwarz, G.E., and Alexander, R.B., 1997, Regional interpretation of water-quality monitoring data: Water Resources Research, v. 33, no. 12, p. 2781-2798.
Verdin, K.L., and Greenlee, S.K., 1996, Development of Continental Scale Digital Elevation Models and Extraction of Hydrographic Features, in Proceedings, Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe, New Mexico, January 21-26, 1996, National Center for Geographic Information and Analysis, Santa Barbara, California.
Wiedeman, A., and Cosgrove, A., 1997, Chesapeake Bay Watershed Model Application and Calculation of Nutrient and Sediment Loadings, Appendix F: Report No. EPA 903-R-94-042, U.S. Environmental Protection Agency Chesapeake Bay Program Office, Annapolis, Maryland, 50 p.
We gratefully acknowledge the role of Richard A. Smith, Gregory E. Schwarz, and Richard B. Alexander of the USGS, who developed SPARROW and made this work possible by providing data sets, computer code, feedback, and assistance.
We also acknowledge the efforts of James M. Gerhart, Jeffrey M. Fischer, Earl A. Greene, and Scott W. Phillips of the USGS, and Gary Shenk of the CBP, who reviewed this report and provided valuable feedback on its content and organization.
Publication support staff included Valerie M. Gaine, who provided editorial assistance, and Timothy W. Auer, who designed and produced the layout of this report.
For further information contact:
Chesapeake Bay Program Coordinator
U.S. Geological Survey
5522 Research Park Drive
Baltimore, MD 21228
or visit the USGS Chesapeake Bay Homepage:
For more information on the Chesapeake Bay Program: http://www.chesapeakebay.net
Report WRIR 99-4054