• DocumentCode
    2913987
  • Title

    A bottom-up oblique decision tree induction algorithm

  • Author

    Barros, Rodrigo C. ; Cerri, Ricardo ; Jaskowiak, Pablo A. ; De Carvalho, André C P L F

  • Author_Institution
    Dept. of Comput. Sci., Univ. of São Paulo (USP), São Carlos, Brazil
  • fYear
    2011
  • fDate
    22-24 Nov. 2011
  • Firstpage
    450
  • Lastpage
    456
  • Abstract
    Decision tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. Most oblique and univariate decision tree induction algorithms follow a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose a novel bottom-up algorithm for inducing oblique trees named BUTIA. It does not require an impurity measure for dividing nodes, since the data resulting from each split are known a priori. For generating the splitting hyperplanes, our algorithm implements a support vector machine solution, and a clustering algorithm is used for generating the initial leaves. We compare BUTIA to traditional univariate and oblique decision tree algorithms (C4.5, CART, OC1 and FT), as well as to a standard SVM implementation, using real gene expression benchmark data. Experimental results show the effectiveness of the proposed approach in several cases.
  • Keywords
    data mining; decision trees; support vector machines; BUTIA; SVM implementation; axis-parallel splits; bottom-up oblique decision tree induction algorithm; clustering algorithm; impurity-based measure; knowledge discovery; model comprehensibility; real gene expression benchmark data; splitting nodes; support vector machine solution; top-down strategy; univariate trees; Accuracy; Clustering algorithms; Decision trees; Gene expression; Merging; Partitioning algorithms; Support vector machines; SVM; bottom-up induction; clustering; hybrid intelligent systems; oblique decision trees
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
  • Conference_Location
    Cordoba
  • ISSN
    2164-7143
  • Print_ISBN
    978-1-4577-1676-8
  • Type

    conf

  • DOI
    10.1109/ISDA.2011.6121697
  • Filename
    6121697