Building Simple Models: A Case Study with Decision Trees
David Jensen, Tim Oates, and Paul R. Cohen. "Building Simple Models:
A Case Study with Decision Trees." To appear in Proceedings of
the Second International Symposium on Intelligent Data Analysis.
July 1997.
- Abstract
- Building correctly sized models is a central challenge for induction
algorithms. Many approaches to decision tree induction fail this
challenge. Under a broad range of circumstances, these approaches
exhibit a nearly linear relationship between training set size
and tree size, even after accuracy has ceased to increase. These
algorithms fail to adjust for the statistical effects of comparing
multiple subtrees. Adjusting for these effects produces trees
with little or no excess structure.
- Text
- A PostScript version of this paper is available (187K).
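
The abstract's point about comparing multiple subtrees can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes k independent comparisons, each tested at a fixed significance level, and uses a Bonferroni-style correction as one common way to adjust for them. The function names are hypothetical and chosen only for this illustration.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Chance that at least one of k independent comparisons passes a
    test at level alpha when none of them reflects real structure."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-comparison level that keeps the overall error at or below alpha.
    Assumption: a Bonferroni-style adjustment, used here for illustration."""
    return alpha / k


if __name__ == "__main__":
    alpha = 0.05
    for k in (1, 10, 100):
        unadjusted = familywise_error(alpha, k)
        adjusted = familywise_error(bonferroni_alpha(alpha, k), k)
        print(f"k={k:>3}  unadjusted={unadjusted:.3f}  adjusted={adjusted:.3f}")
```

Under these assumptions, the unadjusted chance of accepting at least one spurious comparison grows quickly with the number of comparisons, while the adjusted chance stays at or below the chosen level. This is the kind of effect that could let excess structure accumulate in a tree as more candidate subtrees are examined.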