DocumentCode :
397938
Title :
Machine quantification of text-based economic reports for use in predictive modeling
Author :
Gao, Lu ; Beling, Peter A.
Author_Institution :
Dept. of Syst. & Inf. Eng., Virginia Univ., Charlottesville, VA, USA
Volume :
4
fYear :
2003
fDate :
5-8 Oct. 2003
Firstpage :
3536
Abstract :
To quantify text-based unstructured information, we propose a method called the direct scoring algorithm (DSA). DSA uses keywords in the document, subjectively-determined numerical weights, and subjectively-designed grammar rules to score individual sentences. We use our methods to score the Beige books produced by the U.S. Federal Reserve, which contain subjective text-based commentary on the state of the economy. To assess whether our scores have value in a predictive sense, we use them to construct a linear regression model of future growth in U.S. gross domestic product (GDP). We then compare the performance characteristics of this model with those of a similar model based on scores of the same documents produced through subjective reading by professional economists. The comparison demonstrates that the DSA model using the Beige book significantly contributes to the prediction of GDP growth, explaining as much as 69% of the variance compared to the scores created by economic experts. We also add the extracted section scores to a GDP time series prediction model, which uses only structured data as input. The results of this experiment suggest the unstructured information in the Beige books has predictive value that goes beyond that of the structured information used in the time series model, and that our approach has some potential as a means of extracting this information in a semi-automated fashion.
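As an illustration only, the sentence-scoring idea described in the abstract (keywords, numerical weights, grammar rules) might be sketched as follows. This record does not give the paper's actual keywords, weights, or rules, so `KEYWORD_WEIGHTS`, `NEGATORS`, the negation rule, and the averaging step in `score_document` are all hypothetical stand-ins rather than the authors' method.

```python
import re

# Hypothetical keyword weights; the paper's actual weights are not given here.
KEYWORD_WEIGHTS = {"strong": 1.0, "growth": 0.5, "weak": -1.0, "decline": -0.5}

# Hypothetical grammar rule: a negator flips the sign of the next keyword
# that follows it in the same sentence.
NEGATORS = {"not", "no", "little"}


def score_sentence(sentence):
    """Sum the weights of the keywords in one sentence, applying negation."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    score = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
        elif tok in KEYWORD_WEIGHTS:
            weight = KEYWORD_WEIGHTS[tok]
            score += -weight if negate else weight
            negate = False  # the negator is consumed by this keyword
    return score


def score_document(text):
    """Average the sentence scores (an assumed aggregation, not the paper's)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(score_sentence(s) for s in sentences) / len(sentences)
```

Document-level scores produced this way could then serve as a regressor in a linear model of future GDP growth, which is the use the abstract describes.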
Keywords :
data analysis; data mining; economic indicators; regression analysis; text analysis; Beige books; U.S. Federal Reserve; U.S. gross domestic product; direct scoring algorithm; linear regression model; machine quantification; predictive modeling; subjectively-designed grammar rules; subjectively-determined numerical weights; syntactical analysis; text mining; text-based economic reports; Books; Data mining; Economic forecasting; Economic indicators; Electric breakdown; Frequency; Linear regression; Predictive models; Systems engineering and theory; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 2003. IEEE International Conference on
ISSN :
1062-922X
Print_ISBN :
0-7803-7952-7
Type :
conf
DOI :
10.1109/ICSMC.2003.1244437
Filename :
1244437