Digital data sets compiled by the U.S. Geological Survey were used as input for a collection of Spatially Referenced Regressions On Watershed (SPARROW) attributes for the Chesapeake Bay region including parts of Delaware, Maryland, New York, Pennsylvania, Virginia, West Virginia, and the District of Columbia. These regressions use a nonlinear statistical approach to relate nutrient sources and land-surface characteristics to nutrient loads of streams throughout the Chesapeake Bay watershed. A digital segmented-watershed network serves as the primary framework for spatially referencing nutrient-source and land-surface characteristic data within a geographic information system.
Flow direction and flow accumulation generated from a 30-meter cell-size Digital Elevation Model and attributes from 1:500,000-scale stream data were used to generate stream and watershed networks. Spatial data sets representing nutrient inputs of total nitrogen and total phosphorus from the early 1990's were created and compiled from numerous sources. Data include atmospheric deposition, septic systems, point-source locations, land use, land cover, and agricultural sources such as commercial fertilizer and manure. Some land-surface characteristic data sets representing factors that affect the transport of nutrients also were compiled. Data sets include land use, land cover, average-annual precipitation and temperature, slope, hydrogeomorphic regions, and soil permeability.
Nutrient-input and land-surface characteristic data sets merged with the segmented-watershed network provide the spatial detail by watershed segment required by SPARROW. Stream-nutrient load estimates for 132 sampling sites representing the early 1990's (103 for total nitrogen and 121 for total phosphorus) serve as the dependent variables for the regressions. These estimates were used to calibrate models of total nitrogen and total phosphorus depicting 1992 land-surface conditions. Examples of model predictions consist of stream-nutrient load and source percentages contributed locally to each stream reach, as well as percentages of the load that reach Chesapeake Bay.
The data set CBWS_2 represents an attributed segmented-watershed network generated from digital elevation data (USGS NED, 1999) that was used to support SPARROW for the early 1990's (Version 2.0) in the Chesapeake Bay watershed. The data are distributed in both raster and vector formats and are essential for displaying the nutrient-yield estimates presented in USGS Open-File Report 01-251, Digital Data Used to Relate Nutrient Inputs to Water Quality in the Chesapeake Bay Watershed, Version 2.0.
The data set CBWS_2 was created specifically to enhance the application of SPARROW in the Chesapeake Bay watershed. Using an automated process, watershed boundaries for each stream reach segment in the Chesapeake Bay watershed (Brakebill and others, ERF1_2, 2000) were generated from 30-meter digital elevation data (USGS NED, 1999). These segmented watersheds were used to aggregate the nutrient-input and land-surface characteristic information necessary for SPARROW. They are also used to display the nutrient-loading predictions and source percentages estimated by SPARROW.
Revised and updated digital spatial data sets will be created and distributed by the USGS as planned enhancements and applications for SPARROW are completed.
Any use of trade, product, or firm names is for descriptive purposes
only and does not imply endorsement by the
U.S. Geological Survey.
Although this Federal Geographic Data Committee-compliant metadata
file is intended to document the data set in nonproprietary form,
as well as in ARC/INFO format, this metadata file may include some
ARC/INFO-specific terminology.
In order to improve the existing network, a new stream reach network
was created using synthetic stream channels generated from 30-meter
grid digital elevation data. The USGS DEMs, organized by 4-digit
hydrologic units (0205, 0206, 0207, and 0208), were acquired
from preliminary versions of the National Elevation Data Base
(USGS NED, 1999). These elevation data had been projected into an
Albers Equal Area projection and converted into an integer grid
(IMAGEGRID) by team members at the USGS EROS Data Center.
Arc/INFO's flowdirection function was used to create a flow
direction GRID. Flow accumulation was then calculated for each
30-meter cell within each hydrologic unit.
FLOWDIRECTION (DEM)
FLOWACCUMULATION(Direction GRID)
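The two GRID functions above can be illustrated with a minimal Python
sketch. It is not the ARC/INFO implementation: it assumes a small DEM
held as a list of lists, uses the D8 direction codes of the GRID
convention (1 = east, 2 = southeast, 4 = south, and so on), and ignores
sink filling and the special handling of edge cells. The function names
are hypothetical.

```python
# D8 direction codes (GRID convention): code -> (row step, column step)
D8 = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
      16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}


def flow_direction(dem):
    """Return D8 codes pointing to the steepest downslope neighbor.

    A cell with no lower neighbor inside the grid gets code 0.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_drop, best_code = 0.0, 0
            for code, (dr, dc) in D8.items():
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Diagonal neighbors are sqrt(2) cell widths away.
                    dist = 2 ** 0.5 if dr and dc else 1.0
                    drop = (dem[r][c] - dem[rr][cc]) / dist
                    if drop > best_drop:
                        best_drop, best_code = drop, code
            out[r][c] = best_code
    return out


def flow_accumulation(direction):
    """Count the upstream cells draining into each cell (cell itself excluded)."""
    rows, cols = len(direction), len(direction[0])

    def downstream(r, c):
        code = direction[r][c]
        if code not in D8:
            return None
        rr, cc = r + D8[code][0], c + D8[code][1]
        return (rr, cc) if 0 <= rr < rows and 0 <= cc < cols else None

    # Number of neighbors that flow directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = downstream(r, c)
            if d:
                inflow[d[0]][d[1]] += 1

    acc = [[0] * cols for _ in range(rows)]
    # Start from cells with no inflow and pass counts downstream.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if inflow[r][c] == 0]
    while stack:
        r, c = stack.pop()
        d = downstream(r, c)
        if d is None:
            continue
        acc[d[0]][d[1]] += acc[r][c] + 1
        inflow[d[0]][d[1]] -= 1
        if inflow[d[0]][d[1]] == 0:
            stack.append(d)
    return acc
```

For a 3 x 3 DEM that slopes toward its lower-right corner, the corner
cell receives an accumulation of 8, because all other cells drain
through it.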
A synthetic stream network was generated for each unit using a
threshold of 5,000 flow-accumulated cells that will flow into a
single cell (condition > 5000). The number 5,000 was chosen as a
threshold because it yielded the best results in test areas that
were comparable to the desired final scale of 1:500,000, or
comparable to the modified stream reaches used in Version I of the
SPARROW applications (Brakebill and Preston, 1999).
MAINC = con(Flowaccumulation > 5000)
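The threshold step can be sketched in Python as follows. The sketch
assumes the flow-accumulation grid is a list of lists and that cells
failing the condition become NoData (represented here as None); both are
assumptions made for illustration. The helper threshold_area_km2 is
hypothetical and only converts the cell count to a contributing area:
5,000 cells of 30 meters by 30 meters is 4.5 square kilometers.

```python
def synthetic_streams(flow_acc, threshold=5000):
    """Mark cells whose flow accumulation exceeds the threshold.

    Stream cells are set to 1; all other cells are None (NoData).
    """
    return [[1 if value > threshold else None for value in row]
            for row in flow_acc]


def threshold_area_km2(cell_count, cell_size_m):
    """Contributing area, in square kilometers, of a cell-count threshold."""
    return cell_count * cell_size_m * cell_size_m / 1_000_000


# Hypothetical accumulation values for a 2 x 3 grid.
example = [[120, 5000, 5001],
           [0, 7400, 36000]]
streams = synthetic_streams(example)
```

Note that the comparison is strict, as in the original expression, so a
cell with exactly 5,000 upstream cells is not part of the network.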
The synthetic stream network was converted from a raster form to a
vector form (GRIDLINE) and was compared to the National Hydrography
Dataset (NHD) 1:100,000 stream data for positional accuracy
(USGS NHD, 1999). Where the DEMs failed to yield satisfactory
stream networks (typically in flat coastal areas or near wide
rivers, lakes and reservoirs), the NHD vector data was inserted. The
stream data by hydrologic unit was then appended together, forming
one dataset which was tested for connectivity to ensure proper
topology using TRACE in Arcplot. FLIP was used to correct any reaches
pointed in the wrong direction.
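The connectivity test can be pictured with a short Python sketch. It
does not reproduce what TRACE or FLIP do internally; it assumes each
reach is stored as a (from-node, to-node) pair, traces upstream from a
chosen outlet node, and reports the reaches that drain to it. A reach
digitized in the wrong direction is not found until it is flipped. The
function names and node numbers are hypothetical.

```python
from collections import defaultdict, deque


def trace_upstream(reaches, outlet_node):
    """Return the ids of all reaches that drain to outlet_node.

    reaches maps reach id -> (from_node, to_node).
    """
    into = defaultdict(list)  # to_node -> reaches ending at that node
    for reach_id, (_from_node, to_node) in reaches.items():
        into[to_node].append(reach_id)

    found, queue = set(), deque([outlet_node])
    while queue:
        node = queue.popleft()
        for reach_id in into[node]:
            if reach_id not in found:
                found.add(reach_id)
                queue.append(reaches[reach_id][0])
    return found


def flip(reaches, reach_id):
    """Reverse the direction of one reach in place."""
    from_node, to_node = reaches[reach_id]
    reaches[reach_id] = (to_node, from_node)


# Hypothetical network: reach 2 was digitized pointing upstream.
network = {1: (1, 3), 2: (3, 2), 3: (3, 4), 4: (4, 5)}
missing = set(network) - trace_upstream(network, outlet_node=5)
```

Here missing contains only reach 2; after flip(network, 2) the trace
reaches every reach in the network.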
The process using flow direction and flow accumulation generated more
synthetic stream reaches than were necessary to build the improved
segmented network. A flag in the dataset was attributed if the
existing stream reach corresponded to a modified ERF1 stream reach
used in Version I (FLAG = 1). This "main channel" was then selected
out of the data set to produce a subset of stream reaches that
corresponded to the Version I reach data (Reselect FLAG = 1). These
reaches were then attributed with the same unique reach identifier
that corresponds to the Version I reach file (ERF1##), creating a
one-to-one relationship between the two data sets. This was
accomplished with the following commands compiled in AML and AML menus:
editc erf1_1;editf arc;sel
&s id [show select 1];&s erfno [show arc %id% ITEM ERF1##]
removeedit erf1_1
editc mainc
editf arc;draw;sel
calc ERF1## = %erfno%
In earlier reach files (RF1, ERF1, ERF1_1), one ARC was equivalent
to one reach. The process of converting the raster stream network to
a vector coverage exceeded the 500 vertices per arc limitation for
some reaches. Therefore, since the attribute data is reach based, it
was necessary to duplicate some information on an ARC level.
An example is reach 4278 (E2RF1## = 4278).
Three arcs represent this reach. The traveltime for the reach
is stored in the RCHTOT attribute and is duplicated for each
ARC record (cover# number). Because traveltime is a function of
length, local traveltime (LRCHTOT) was also calculated for each
ARC. The sum of LRCHTOT over all ARCs with the same reach number
(E2RF1##) equals the traveltime (RCHTOT) for that reach.
In addition to reach numbers being repeated across ARCs, LRESTOT
and TOTLEN are also affected by this limitation.
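The relation between the per-ARC and per-reach traveltime attributes can
be checked with a small Python sketch. The three records below are
illustrative only: the LRCHTOT and RCHTOT values are invented for the
example and are not taken from the data set, and the record layout (a
list of dictionaries) is an assumption.

```python
from collections import defaultdict


def reach_traveltime(arcs):
    """Sum local traveltime (LRCHTOT) over all arcs of each reach."""
    totals = defaultdict(float)
    for arc in arcs:
        totals[arc["E2RF1##"]] += arc["LRCHTOT"]
    return dict(totals)


# Hypothetical arc records for a reach split into three arcs.
arcs = [
    {"E2RF1##": 4278, "LRCHTOT": 0.25, "RCHTOT": 0.75},
    {"E2RF1##": 4278, "LRCHTOT": 0.125, "RCHTOT": 0.75},
    {"E2RF1##": 4278, "LRCHTOT": 0.375, "RCHTOT": 0.75},
]
totals = reach_traveltime(arcs)
```

Because RCHTOT is repeated on every ARC record, summing RCHTOT itself
over the arcs of a reach would triple-count the traveltime in this
example; LRCHTOT is the attribute to sum.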
To ensure that the load estimates used for model calibration were
consistently referenced to the downstream end of a reach, a node was
placed at each streamflow sampling location and attributed with the
USGS station identification number (STAID).
sel;split;ef node;sel;moveitem
All reaches upstream of a sampling station were assigned a new unique
reach identifier (E2RF1##) between 10000 and 11000 along with the
STAID of the USGS streamflow sampling site. Placing a node at the
streamflow sampling location and assigning it a unique E2RF1## value
also ensured that a watershed would be generated at each calibration
site on its associated reach. ERF1## and E2RF1## values are the same
for any reach that is not associated with a sampling site.
Calc E2RF1## = ERF1##
For any reach associated with a sampling location from Version I
and not Version II, the downstream ERF1## value from Version I was
calculated in E2RF1##. E2RF1## now represents the new unique
stream-reach identifier for Version II and is used as a common
field identifier.
Attributes essential to the modeling were transferred for each
reach from the modified ERF1 data set using the reach identification
number (ERF1##) as the related field in each data set. The
attributes transferred included: mean water velocity, mean
streamflow, hydrologic unit code, primary stream name, time of
travel for reaches within reservoirs, and stream segment number.
Stream reaches that did not previously exist in ERF1 and were added
to the data set were attributed with estimated velocity and
streamflow values based on existing reaches with similar lengths
and watershed characteristics. Traveltime for stream reaches
(length/velocity), a necessary attribute used to estimate
instream-losses, was calculated separately for each reach, as was
reach length. Node topology (FNODE# and TNODE#), which determines
upstream and downstream connectivity of the stream reach network,
is also necessary input for the models.
relate add;r;erf1_1.aat;INFO;ERF1##;ERF1##;linear
calc MEANV = r//MEANV
calc MEANQ = r//MEANQ
calc HUC = r//HUC
calc PNAME = r//PNAME
calc RESTOT = r//RESTOT
calc SEG = r//SEG
calc RCHTOT = length * .00003797 / MEANV
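The constant in the RCHTOT calculation is not explained in this record.
It is numerically consistent with converting a length in meters to feet
(3.28084 feet per meter) and a time in seconds to days (86,400 seconds
per day), since 3.28084 / 86,400 is approximately 0.00003797. Under that
reading, which is an assumption, MEANV is in feet per second and RCHTOT
is in days. The Python sketch below compares the two forms; the function
names are hypothetical.

```python
FEET_PER_METER = 3.28084
SECONDS_PER_DAY = 86400.0


def traveltime_days(length_m, velocity_fps):
    """Traveltime in days for a length in meters and a velocity in ft/s.

    The units are assumed; they are not stated in the metadata.
    """
    length_ft = length_m * FEET_PER_METER
    seconds = length_ft / velocity_fps
    return seconds / SECONDS_PER_DAY


def rchtot(length, meanv):
    """The calculation exactly as written in the AML listing."""
    return length * 0.00003797 / meanv


# Hypothetical reach: 10,000 meters long with a mean velocity of 1.5 ft/s.
explicit = traveltime_days(10000.0, 1.5)
as_written = rchtot(10000.0, 1.5)
```

For this hypothetical reach both forms give a traveltime of roughly a
quarter of a day, and they differ only by the rounding of the constant.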
Shoreline locations of major estuaries within the Chesapeake Bay
watershed were added to the stream data set. Nodes were placed at
arbitrary locations along the shoreline creating new reach segments
and each E2RF1## was calculated with a unique value greater than
80,000. Streamflow and velocity information for these reaches was
then estimated based on various watershed characteristics from
existing reaches of similar length.
The 30-meter flow direction grid and the new attributed synthetic
reach network were used together to generate a 30-meter grid of
watershed drainage areas for each stream reach. This was
accomplished by converting the reach network back into a 30-meter
grid using the unique reach identifier (E2RF1##) as a value item.
Watershed areas for each reach (all cells with the same E2RF1##
value) were generated using all reach cells, which represent the
stream channel or the lowest points within the watershed, as the
source GRID in the WATERSHED function
(Environmental Systems Research Institute, 1992a). By using this
method, all cells that represent a single reach are used as pour
points, rather than only the single cell at the absolute lowest
point on the downstream end of a reach, as is typical.
This method also maintains the E2RF1## value from the associated
reach network in the watershed data set. It serves as an
identification tool as well as a common field to related data sets.
LINEGRID mainc mainc_g E2RF1##
CBWS_G = watershed(flowdirection,mainc_g)
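The idea of using every reach cell as a pour point can be sketched in
Python. This is not the GRID WATERSHED implementation: it assumes a D8
flow-direction grid and a source grid held as lists of lists, with
reach cells carrying their E2RF1## value and all other cells set to
None. Each non-reach cell is labeled with the value of the first reach
cell met when following the flow direction downstream. The small grids
and reach values (10 and 20) are hypothetical.

```python
# D8 direction codes (GRID convention): code -> (row step, column step)
D8 = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
      16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}


def watershed(direction, source):
    """Label each cell with the id of the reach cell it drains to."""
    rows, cols = len(direction), len(direction[0])
    out = [row[:] for row in source]  # reach cells keep their own id
    for r in range(rows):
        for c in range(cols):
            path, rr, cc = [], r, c
            # Follow the flow until a labeled cell, a sink, or the edge.
            while (0 <= rr < rows and 0 <= cc < cols
                   and out[rr][cc] is None
                   and direction[rr][cc] in D8
                   and len(path) <= rows * cols):  # guard against loops
                path.append((rr, cc))
                dr, dc = D8[direction[rr][cc]]
                rr, cc = rr + dr, cc + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            label = out[rr][cc] if inside else None
            # Every cell on the path drains to the same reach.
            for pr, pc in path:
                out[pr][pc] = label
    return out


# Hypothetical 3 x 3 example with two reaches.
flowdir = [[2, 2, 4],
           [2, 2, 4],
           [1, 1, 0]]
reach_grid = [[None, None, None],
              [None, None, 10],
              [None, 20, 20]]
basins = watershed(flowdir, reach_grid)
```

Cells that drain into any cell of reach 20 receive the value 20, even
when they enter the reach upstream of its downstream end, which is the
behavior described above for the segmented watersheds.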
CBWS_G.VAT - GRID value attribute table
COLUMN  ITEM NAME  WIDTH  OUTPUT  TYPE  N.DEC  ALTERNATE NAME
     1  VALUE          4      10  B     -      E2RF1##
     5  COUNT          4      10  B     -
Value - Value of cell, represents unique reach identification
number (E2RF1##). Used as common column to relate data
presented in appendix of the report.
Count - Automatic item, number of cells containing specific value
CBWS.PAT - Vector polygon attribute table
COLUMN  ITEM NAME  WIDTH  OUTPUT  TYPE  N.DEC
     1  AREA           4      12  F     3
     5  PERIMETER      4      12  F     3
     9  CBWS#          4       5  B     -
    13  CBWS-ID        4       5  B     -
    17  E2RF1##        4       8  B     -
E2RF1## - represents unique reach identification number (E2RF1##).
Used as common column to relate data presented in
this report.
Although these data have been used by the U.S. Geological Survey, U.S. Department of the Interior, no warranty expressed or implied is made by the U.S. Geological Survey as to the accuracy of the data.
The act of distribution shall not constitute any such warranty, and no responsibility is assumed by the U.S. Geological Survey in the use of this data, software, or related materials.
Generated by mp version 2.2.5 on Tue Mar 13 14:05:17 2001