• DocumentCode
    3519689
  • Title
    Multi-Modal Multiple-Instance Learning with the application to the cannabis webpage recognition
  • Author
    Wang, Yinjuan ; Xie, Nianhua ; Hu, Weiming ; Yang, Jinfeng
  • Author_Institution
    Coll. of Aviation Autom., Civil Aviation Univ. of China, Tianjin, China
  • fYear
    2011
  • fDate
    28-28 Nov. 2011
  • Firstpage
    105
  • Lastpage
    109
  • Abstract
    With the development of the World Wide Web, a growing number of illicit drug Webpages has appeared. Screening cannabis Webpages on the Internet is therefore an important issue. Conventional methods that rely only on keyword-based or image-based approaches are not sufficient. We propose a Multi-Modal Multiple-Instance Learning (MMMIL) approach that combines text and image information for cannabis Webpage recognition. The main technical contributions of our work are two-fold. First, the text information associated with images is used to build a pre-classifier, which pre-selects pseudo-positive training bags from new Webpages to update the multi-modal classifier. This can be seen as a pseudo active learning process. Second, we design an efficient instance selection technique that uses text information to speed up training without compromising performance. Experiments on a dataset containing over 40,000 images from more than 4,000 Webpages demonstrate the effectiveness and efficiency of the proposed approach.
  • Keywords
    Internet; learning (artificial intelligence); pattern classification; text analysis; World Wide Web; cannabis Web page recognition; illicit drug Web page; image information; image-based approach; instance selection technique; keyword-based approach; multimodal classifier; multimodal multiple-instance learning; preclassifier; pseudoactive learning process; pseudopositive training bag; text information; Bismuth; Educational institutions; Learning systems; Machine learning; Support vector machines; Training; Vectors; Cannabis Webpage Recognition; MIL; Multi-Modal;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ACPR), 2011 First Asian Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4577-0122-1
  • Type
    conf
  • DOI
    10.1109/ACPR.2011.6166680
  • Filename
    6166680