• DocumentCode
    1499913
  • Title

    Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning

  • Author

    Ying-Xin Li ; Shuiwang Ji ; Kumar, S. ; Jieping Ye ; Zhi-Hua Zhou

  • Author_Institution
    Nat. Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
  • Volume
    9
  • Issue
    1
  • fYear
    2012
  • Firstpage
    98
  • Lastpage
    112
  • Abstract
    In studies of Drosophila embryogenesis, a large number of two-dimensional digital images of gene expression patterns have been produced to build an atlas of spatio-temporal gene expression dynamics across developmental time. Gene expressions captured in these images have been manually annotated with anatomical and developmental ontology terms using a controlled vocabulary (CV), and these annotations are useful in research aimed at understanding gene functions, interactions, and networks. With the rapid accumulation of images, the process of manual annotation has become increasingly cumbersome, and computational methods to automate this task are urgently needed. However, the automated annotation of embryo images is challenging. This is because the annotation terms spatially correspond to local expression patterns of images, yet they are assigned collectively to groups of images, and it is unknown which term corresponds to which region of which image in the group. In this paper, we address this problem using a new machine learning framework, Multi-Instance Multi-Label (MIML) learning. We first show that the underlying nature of the annotation task is a typical MIML learning problem. Then, we propose two support vector machine algorithms under the MIML framework for the task. Experimental results on the FlyExpress database (a digital library of standardized Drosophila gene expression pattern images) reveal that exploiting the MIML framework leads to significant performance improvements over state-of-the-art approaches.
  • Keywords
    bioinformatics; biological techniques; genetics; learning (artificial intelligence); support vector machines; Drosophila embryogenesis; Drosophila gene expression pattern annotation; anatomical ontology terms; annotation task; automated annotation; computational methods; controlled vocabulary; developmental ontology terms; embryo imaging; FlyExpress database; gene functions; local expression patterns; machine learning framework; manual annotation; multiinstance multilabel learning; rapid accumulation; spatio-temporal gene expression dynamics; state-of-the-art approaches; support vector machine algorithms; two-dimensional digital imaging; Bioinformatics; Computational biology; Databases; Embryo; Gene expression; Head; Machine learning; Drosophila; Gene expression pattern; image annotation; machine learning; multi-instance multi-label (MIML) learning; support vector machine; Animals; Computational Biology; Databases, Factual; Drosophila; Embryo, Nonmammalian; Gene Expression Regulation, Developmental; Molecular Sequence Annotation; Support Vector Machines
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2011.73
  • Filename
    5753882