section will emphasize an approach which makes no such statistical assumptions and is called
'exploratory data analysis'.
Exploratory data analysis is the approach to analysis popularized by John W. Tukey (1977)
and his students. This approach assumes only that data are available; there may have been no design
whatsoever behind their collection. The task of exploratory data analysis is to find trends or statistical
relationships as efficiently as possible and with few preconceived assumptions.
Because this approach relies on well-established statistical procedures and exists mainly
as a point of view, it can be employed in two ways, just as in other statistical endeavors.
The first is to identify trends and use them to form predictive statistical relationships (models). The
second is to treat the procedure as a means of discovering possible relationships from which to
form new hypotheses for future studies. The two differ mostly in intent. The power of the approach
is that it allows the most complete analysis of the data at hand.
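The first way of employing the approach, turning an observed trend into a predictive relationship, can be sketched as follows. This is only a minimal illustration, assuming a straight-line trend fitted by ordinary least squares; the function name `fit_line`, the variable names, and the data values are all hypothetical and are not taken from any study.

```python
# Minimal sketch: fit a least-squares line to an observed trend and use it
# to predict. All names and values below are invented for illustration.

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented example values (hypothetical depth and temperature readings)
depth_m = [1.0, 2.0, 3.0, 4.0, 5.0]
temp_c = [24.0, 22.0, 20.0, 18.0, 16.0]

slope, intercept = fit_line(depth_m, temp_c)

# Use the fitted relationship as a simple predictive model
predicted_at_6m = slope * 6.0 + intercept
```

Used in the second way, the same fitted line would not be treated as a model but as a hint that a relationship may exist and deserves a designed follow-up study.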
The tools for data analysis begin with examination of the data. This is sometimes facilitated by
tables of summary statistics but most often by simple graphs or scatter plots, in which the analyst can
visually detect potential trends. Repeating the plots with subsets or novel sorts of the data can
bring out many potential relationships. Statistical procedures can then be used to attach confidence
to these relationships or to help locate hidden ones.
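The value of re-examining subsets can be sketched numerically. The example below is a hedged illustration only: the records, the grouping variable `season`, and the helper names `summarize`, `pearson_r`, and `correlation_by_group` are hypothetical, and the values were invented so that a relationship hidden in the pooled data appears once the data are split.

```python
import math
import statistics

def summarize(values):
    """Basic summary statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def correlation_by_group(records):
    """Compute the x-y correlation separately for each group label."""
    groups = {}
    for label, x, y in records:
        xs, ys = groups.setdefault(label, ([], []))
        xs.append(x)
        ys.append(y)
    return {label: pearson_r(xs, ys) for label, (xs, ys) in groups.items()}

# Invented records: (season, x, y). Not real measurements.
records = [
    ("summer", 1.0, 2.0), ("summer", 2.0, 4.0),
    ("summer", 3.0, 6.0), ("summer", 4.0, 8.0),
    ("winter", 1.0, 8.0), ("winter", 2.0, 6.0),
    ("winter", 3.0, 4.0), ("winter", 4.0, 2.0),
]

all_x = [x for _, x, _ in records]
all_y = [y for _, _, y in records]

summary_y = summarize(all_y)            # table of summary statistics
pooled_r = pearson_r(all_x, all_y)      # correlation over all records
by_group = correlation_by_group(records)  # correlation within each subset
```

With these invented values the pooled correlation is zero, while each seasonal subset shows a perfect linear relationship of opposite sign. Real data would rarely be this clean, but the same subsetting step is what brings such relationships into view.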
The tools required for statistical and graphical analysis are often different. However, the first
approach to data management discussed here listed two software tools capable of doing nearly
everything. Software packages designed specifically for statistical analysis include, for example, Systat,
a comprehensive statistics package that is also capable of graphical representation. One package,
Data Desk, was written specifically for exploratory data analysis. Graphical display can be
accomplished with many off-the-shelf software packages such as SigmaPlot and DeltaGraph.
Graphics software of this kind offers fuller capability than is available in spreadsheet programs.
Isoplots (three-dimensional graphics) may be created with DeltaGraph and with some other very powerful
programs such as Surfer and Spyglass. All of these programs continue to add capability and power.
2.3.5.1 Specialized Analyses
Data analysis can pursue simple exploratory methods or it can employ methods with specific
goals. The data can be used to provide input to a variety of computational techniques ranging from
tools such as SELECT to simple models such as PROFILE or BATHTUB to complex models such as
CEQUAL and its many manifestations.
In one view, all of these tools and models may be compared in terms of their simplicity,
ease of use, generality, precision, and realism. As is often stated, the last three qualities
are seldom found together. The quest for realism has led to very sophisticated approaches such as