New Technologies for Data Analysis
D. Jensen, "New Technologies for Data Analysis," Office of Technology
Assessment. Washington, DC. December 12, 1991.
- Abstract
- Most familiar data analysis methods were developed before 1950.
These methods include statistical significance testing, linear
regression, and discriminant analysis. They emphasize mathematical
equations, statistical tables, and theoretical assumptions. The
speed and accessibility of these methods changed with the introduction
of the computer in the 1950s, but their essential character remained
the same. Their theoretical assumptions are often violated by
real-world data, and their results can be cryptic and unenlightening.
Instead of assisting analysts, these methods often artificially
limit inquiry.
Over the past two decades, researchers at Bell Labs and several
universities developed a new generation of computer-based tools
that focus on interactive graphics and computer-intensive statistical
methods. They facilitate data exploration and often avoid limiting
assumptions. Compared to conventional methods, such tools allow
analysts to interact with data in a much more flexible and natural
way.
Software implementing these tools has recently become available outside
research laboratories. The talk discussed available software
and its application to OTA studies. Such tools could assist the
design of graphics for presentation, make quantitative analysis
more accessible, and reveal previously hidden relationships.
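As an illustration of the computer-intensive statistical methods mentioned above, one well-known example is the bootstrap, which replaces a distributional assumption with repeated resampling of the observed data. The sketch below is not taken from the talk. The function name `bootstrap_mean_interval`, its parameters, and the sample values are all hypothetical, chosen only to show the idea of a percentile interval for a mean.

```python
import random
import statistics


def bootstrap_mean_interval(data, n_resamples=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for the mean of `data`.

    Hypothetical helper for illustration only: it resamples the data
    with replacement, records the mean of each resample, and reads the
    interval off the sorted resample means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [rng.choice(data) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * n_resamples)
    hi_idx = int((1.0 - tail) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Made-up measurements, including one unusually large value.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 9.5, 3.1, 4.4]
low, high = bootstrap_mean_interval(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"90% bootstrap interval = ({low:.2f}, {high:.2f})")
```

The interval comes entirely from the data and the computer's ability to repeat a simple calculation thousands of times, so no normality assumption or statistical table is needed. Other interval constructions exist; the percentile form is used here only because it is the shortest to write down.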
- Recommended Reading
- Chambers, J., W. Cleveland, B. Kleiner, and P. Tukey (1983). Graphical Methods for Data Analysis, Belmont, CA: Wadsworth and Boston: Duxbury.
Graphics for data analysis.
Cleveland, W. (1985). Elements of Graphing Data. Monterey, CA: Wadsworth.
Graph construction.
Cleveland, W. and R. McGill (1985). "Graphical Perception and
Graphical Methods for Analyzing Scientific Data." Science 229:828-833.
How people perceive graphs.
Cleveland, W. and R. McGill, Eds. (1988). Dynamic Graphics for Statistics. Pacific Grove, CA: Wadsworth & Brooks/Cole.
Early work on dynamic graphics.
Efron, B. and R. Tibshirani (1991). "Statistical Data Analysis
in the Computer Age." Science 253: 390-395.
Computer-intensive statistical techniques.
Tufte, E. (1983). The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
Wide variety of static graphics. Delightful reading.
Tufte, E. (1990). Envisioning Information. Cheshire, CT: Graphics Press.
More delightful reading about graphics.
Tukey, J. (1977). Exploratory Data Analysis. Reading, MA: Addison-Wesley.
Graphical and non-graphical methods for exploring data.
Velleman, P. (1989). Learning Data Analysis with Data Desk. New York: Freeman.
Basic statistics using conventional statistical tools and dynamic
graphics. Examples use Velleman's software, Data Desk.