• DocumentCode
    1680664
  • Title
    MR-MNBC: MaxRel based feature selection for the multi-relational Naïve Bayesian Classifier
  • Author
    Vaghela, Vimalkumar B. ; Vandra, Kalpesh H. ; Modi, Nilesh K.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Bharathiar Univ., Coimbatore, India
  • fYear
    2013
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    Data for today's real-time applications are stored in relational databases. In the conventional approach to mining such data, several relations are joined into a single relation using foreign key links, a step known as flattening. Flattening can cause problems such as long processing time, data redundancy, and statistical skew in the data. Hence, how to mine data directly from multiple relations becomes a critical issue. One solution is multi-relational data mining (MRDM), which has been successfully applied in a variety of areas such as marketing, sales, finance, fraud detection, the natural sciences, and biology. Many ILP-based methods have been proposed in previous research, but problems such as scalability remain unsolved, i.e., how to handle a large dataset with a large number of relations and a large number of features. Irrelevant or redundant attributes in a relation may contribute nothing to classification accuracy. Thus, feature selection is an essential data preprocessing step in multi-relational data mining. By filtering out irrelevant or redundant features from relations before mining, we improve classification accuracy, achieve good time performance, and improve the comprehensibility of the models. We propose the MR-MNBC method, which uses MaxRel feature selection as a preprocessing step for the multi-relational Naïve Bayesian classifier. The MaxRel method uses InfoDist and Pearson's correlation parameters to filter out irrelevant and redundant features from the multi-relational database and thereby enhance classification accuracy. We evaluated our algorithm on the PKDD financial dataset and obtained better accuracy compared to existing feature selection methods.
  • Keywords
    Bayes methods; data mining; pattern classification; relational databases; ILP-based methods; InfoDist correlation parameters; MR-MNBC; MRDM; MaxRel based feature selection; PKDD financial dataset; Pearson correlation parameters; data redundancy; flatten; multirelational data mining; multirelational naïve Bayesian classifier; statistical skew; Accuracy; Algorithm design and analysis; Classification algorithms; Correlation; Data mining; Filtering algorithms; Relational databases; Feature Selection; Multi-relational classification; Naive Bayesian; Relational data mining; Semantic Relationship Graph; Tuple ID Propagation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering (NUiCONE), 2013 Nirma University International Conference on
  • Conference_Location
    Ahmedabad
  • Print_ISBN
    978-1-4799-0726-7
  • Type
    conf
  • DOI
    10.1109/NUiCONE.2013.6780067
  • Filename
    6780067