  • DocumentCode
    1289987
  • Title
    Maximum Margin Bayesian Network Classifiers
  • Author
    Pernkopf, Franz; Wohlmayr, Michael; Tschiatschek, Sebastian
  • Author_Institution
    Dept. of Electr. Eng., Graz Univ. of Technol., Graz, Austria
  • Volume
    34
  • Issue
    3
  • fYear
    2012
  • fDate
    3/1/2012
  • Firstpage
    521
  • Lastpage
    532
  • Abstract
    We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation [1]. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.
  • Keywords
    belief networks; conjugate gradient methods; convex programming; feature extraction; learning (artificial intelligence); maximum likelihood estimation; pattern classification; CG-based optimization; conditional likelihood learning; conjugate gradient method; convex relaxation; discriminative parameter learning; margin-optimized Bayesian network classifiers; maximum likelihood learning; maximum margin Bayesian network classifiers; maximum margin optimization approach; maximum margin parameter learning algorithm; missing feature handling; normalization constraints; probabilistic interpretation; Algorithm design and analysis; Bayesian methods; Fasteners; Niobium; Optimization; Random variables; Training; Bayesian network classifier; convex relaxation; discriminative classifiers; discriminative learning; large margin training; missing features; Algorithms; Bayes Theorem; Humans; Learning; Pattern Recognition, Automated; Speech
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2011.149
  • Filename
    5975159