• DocumentCode
    2724399
  • Title

    Algorithms for Fast Large Scale Data Mining Using Logistic Regression

  • Author

    Rouhani-Kalleh, Omid

  • Author_Institution
    Microsoft, Redmond, WA
  • fYear
    2007
  • fDate
    March 1 2007-April 5 2007
  • Firstpage
    155
  • Lastpage
    162
  • Abstract
    This paper proposes two new efficient algorithms for training logistic regression classifiers on very large data sets. Our algorithms lower the upper-bound time complexity of the existing algorithm in the literature, and our experiments confirm that they significantly improve execution time. For our data sets, which come from Microsoft's Web logs, execution time was reduced by a factor of up to 353 compared with the algorithm often referenced in the literature. The improvement will be even greater for larger data sets.
  • Keywords
    data mining; regression analysis; very large databases; fast large scale data mining; logistic regression classifiers; upper bound time complexity; very large data sets; Classification algorithms; Computational intelligence; Data mining; Equations; Large-scale systems; Least squares methods; Logistics; Proposals; USA Councils; Upper bound
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0705-2
  • Type

    conf

  • DOI
    10.1109/CIDM.2007.368867
  • Filename
    4221291