Title :
Algorithms for Fast Large Scale Data Mining Using Logistic Regression
Author :
Rouhani-Kalleh, Omid
Author_Institution :
Microsoft, Redmond, WA
fDate :
March 1 2007-April 5 2007
Abstract :
This paper proposes two new efficient algorithms for training logistic regression classifiers on very large data sets. Our algorithms lower the upper bound on the time complexity of the existing algorithm in the literature, and our experiments confirm that they significantly improve execution time. For our data sets, which come from Microsoft's Web logs, execution time was reduced by a factor of up to 353 compared with the algorithm often referenced in the literature. The improvement will be even greater for larger data sets.
Keywords :
data mining; regression analysis; very large databases; fast large scale data mining; logistic regression classifiers; upper bound time complexity; very large data sets; Classification algorithms; Computational intelligence; Data mining; Equations; Large-scale systems; Least squares methods; Logistics; Proposals; USA Councils; Upper bound
Conference_Titel :
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0705-2
DOI :
10.1109/CIDM.2007.368867