• DocumentCode
    2774426
  • Title
    An Efficient Decision Tree Construction for Large Datasets
  • Author
    Uyen Nguyen Thi Van; Chung, Tae Choong
  • Author_Institution
    KyungHee Univ., Seoul
  • fYear
    2007
  • fDate
    18-20 Nov. 2007
  • Firstpage
    21
  • Lastpage
    25
  • Abstract
    In this paper, we propose a new data structure and a new framework for building decision tree classifiers that are especially suitable for large datasets. The most prominent feature of our algorithm is that it needs only one scan over the entire database to build a decision tree. Previous methods scan the whole database once at each level of the tree, so our algorithm is much more efficient. Moreover, our algorithm sorts numeric attributes only once, which significantly reduces the sorting cost and hence the overall execution time. The experimental results show that our algorithm outperforms RainForest, a well-known and efficient algorithm for decision tree construction, in execution time. This shows that our algorithm can be applied to large datasets efficiently.
  • Keywords
    decision trees; pattern classification; tree data structures; data structure; decision tree classifiers; decision tree construction; Buildings; Classification tree analysis; Costs; Data structures; Decision trees; Partitioning algorithms; Sorting; Spatial databases; Training data; Tree data structures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Information Technology, 2007. IIT '07. 4th International Conference on
  • Conference_Location
    Dubai
  • Print_ISBN
    978-1-4244-1840-4
  • Electronic_ISBN
    978-1-4244-1841-1
  • Type
    conf
  • DOI
    10.1109/IIT.2007.4430464
  • Filename
    4430464