Mining of Concurrent Text and Time Series

V. Lavrenko, M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, and J. Allan (2000). Mining of concurrent text and time series. Papers of the ACM SIGKDD 2000 Workshop on Text Mining. pp. 37-44.

Abstract
We present a unique approach to identifying text segments that forecast the behavior of a time series. Specifically, we describe the design and implementation of AEnalyst, a system that correlates news stories with trends in time series. We identify trends in time series using piecewise linear fitting and then assign labels to the trends according to an automated clustering algorithm. We use language models to find and represent text that is highly associated with particular labeled trends. AEnalyst can then identify news stories that are highly indicative of future trends, and thereby forecast the behavior of the time series. In this paper we evaluate our system by correlating financial news stories with stock prices. We show that language models successfully capture relationships between news stories and trends. We also evaluate the system in terms of its ability to predict the behavior of the stock market. Combining the two models gives us the ability to mine relationships between concurrent text and time series data.
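
The trend-identification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which piecewise linear fitting procedure AEnalyst uses, so the code below assumes a generic top-down segmentation: fit a least-squares line, and if the squared error exceeds a threshold, split at the point that minimizes the combined error of the two halves. The names fit_line, segment, label, and the max_error and flat parameters are hypothetical. The slope-sign labels are a simple stand-in for the automated clustering described in the paper, not a reproduction of it.

```python
from typing import List, Tuple


def fit_line(ys: List[float], start: int, end: int) -> Tuple[float, float, float]:
    """Least-squares line through ys[start..end]; returns (slope, intercept, squared error)."""
    n = end - start + 1
    xs = range(start, end + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys[start:end + 1]) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ys[x] - mean_y) for x in xs)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    err = sum((ys[x] - (slope * x + intercept)) ** 2 for x in xs)
    return slope, intercept, err


def segment(ys: List[float], start: int, end: int,
            max_error: float) -> List[Tuple[int, int, float]]:
    """Top-down piecewise linear segmentation (assumed variant, not the paper's)."""
    slope, _, err = fit_line(ys, start, end)
    # Stop when the fit is good enough or the span is too short to split.
    if err <= max_error or end - start < 2:
        return [(start, end, slope)]
    # Adjacent segments share the breakpoint index.
    best = min(
        range(start + 1, end),
        key=lambda k: fit_line(ys, start, k)[2] + fit_line(ys, k, end)[2],
    )
    return segment(ys, start, best, max_error) + segment(ys, best, end, max_error)


def label(slope: float, flat: float = 0.1) -> str:
    """Stand-in for trend clustering: label a segment by the sign of its slope."""
    if slope > flat:
        return "up"
    if slope < -flat:
        return "down"
    return "flat"


if __name__ == "__main__":
    prices = [0, 1, 2, 3, 4, 3, 2, 1, 0]  # toy series, not real stock data
    for s, e, m in segment(prices, 0, len(prices) - 1, max_error=0.01):
        print(s, e, label(m))
```

Each returned tuple gives the start index, end index, and slope of one trend. In a system like the one described, labeled trends of this kind would then be paired with the news stories that precede them.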
Text
A PDF version of this paper is available.

