• DocumentCode
    3777693
  • Title

    Extraction of latent concepts from an integrated human gene database: Non-negative matrix factorization for identification of hidden data structure

  • Author

    Katsuhiko Murakami

  • Author_Institution
    School of Bioscience and Biotechnology, Tokyo University of Technology, Tokyo, Japan
  • fYear
    2015
  • Firstpage
    346
  • Lastpage
    350
  • Abstract
    Information in genetic databases often describes complex concepts, such as diseases and gene functions, that have implicit relationships. However, such information is presented as independent concepts (for example, “genes” and “function”), making it difficult for users, even specialists, to understand their meaning in relation to one another. This creates a need to extract hidden relationships among biological concepts and to add this information to databases. Therefore, we factorized a gene data matrix and extracted hidden relationships among both genes and their functional terms. We successfully identified composite concepts, each explained by multiple genes and multiple terms. This reorganization provides new insights for researchers and helps in interpreting the information.
  • Keywords
    "Databases","Gene expression","Proteins","Matrix decomposition","Data mining","DNA","Cost function"
  • Publisher
    ieee
  • Conference_Titel
    Soft Computing and Pattern Recognition (SoCPaR), 2015 7th International Conference of
  • Type

    conf

  • DOI
    10.1109/SOCPAR.2015.7492771
  • Filename
    7492771
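
The factorization named in the title and abstract can be illustrated with a minimal sketch. The record does not say which update rule, rank, or data the author used, so everything below is an assumption made for illustration: the toy gene-by-term matrix `V`, the rank of 2, the function name `nmf`, and the choice of Lee-Seung multiplicative updates minimizing squared Frobenius error.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x terms) as W x H.

    Uses multiplicative updates for the squared Frobenius error.
    This is an illustrative choice, not necessarily the paper's method.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is locked at zero.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


# Hypothetical toy data: rows are genes, columns are functional terms,
# and 1 means the gene is annotated with the term.
V = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

W, H = nmf(V, rank=2)

# Squared reconstruction error between V and W x H.
WH = matmul(W, H)
err = sum((V[i][j] - WH[i][j]) ** 2 for i in range(4) for j in range(4))
```

Each column of `W` weights the genes belonging to one latent concept and the matching row of `H` weights its terms, so a single factor groups several genes with several terms, which is the kind of composite concept the abstract describes.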