• DocumentCode
    2251536
  • Title

    Feature Extraction Using Restricted Bootstrapping

  • Author

    Hirokawa, Sachio

  • Author_Institution
    Res. Inst. for Inf. Technol., Kyushu Univ., Fukuoka, Japan
  • fYear
    2012
  • fDate
    May 30 2012-June 1 2012
  • Firstpage
    283
  • Lastpage
    288
  • Abstract
    The bootstrapping method is known as an application of the PageRank technique to documents and words. The technique calculates word scores by mutually propagating scores between words and documents. However, the result sometimes drifts far from the initial query word, a problem known as "topic drift". This paper proposes restricting the words to the top t words at each step of the bootstrapping process. The method is simpler than previously known techniques. It is applied to real bankruptcy information documents to extract the bankruptcy causes strongly related to the query, and it is confirmed that the method prevents topic drift.
  • Keywords
    computer bootstrapping; document handling; feature extraction; query processing; bankruptcy cause extraction; initial query word; page-rank technique; real bankruptcy information documents; restricted bootstrapping method; topic drift; Companies; Feature extraction; Materials; Search engines; Standards; Text mining; bankruptcy information; feature word; text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Science (ICIS), 2012 IEEE/ACIS 11th International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4673-1536-4
  • Type

    conf

  • DOI
    10.1109/ICIS.2012.50
  • Filename
    6211110
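
The restricted bootstrapping idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the record gives no weighting scheme, normalization, iteration count, or value of t, so the sum-based propagation, the normalization to 1, the tie-breaking rule, the function name restricted_bootstrap, and the toy corpus below are all assumptions made for illustration.

```python
from collections import defaultdict


def restricted_bootstrap(docs, query, t=2, iterations=5):
    """Score words by mutual propagation, keeping only the top t words.

    docs is a list of token lists. The scoring rules here are assumed,
    not taken from the paper.
    """
    word_scores = {query: 1.0}
    for _ in range(iterations):
        # Words -> documents: a document scores the sum of the scores
        # of the currently kept words it contains.
        doc_scores = [
            sum(word_scores.get(w, 0.0) for w in set(doc)) for doc in docs
        ]

        # Documents -> words: a word scores the sum of the scores of
        # the documents that contain it.
        new_scores = defaultdict(float)
        for doc, doc_score in zip(docs, doc_scores):
            if doc_score == 0.0:
                continue
            for w in set(doc):
                new_scores[w] += doc_score
        if not new_scores:
            break

        # Restriction step: keep only the top t words (ties broken
        # alphabetically), then normalize so the scores sum to 1.
        top = sorted(new_scores.items(), key=lambda kv: (-kv[1], kv[0]))[:t]
        total = sum(score for _, score in top)
        word_scores = {w: score / total for w, score in top}
    return word_scores


# Hypothetical toy corpus, invented for illustration only.
toy_docs = [
    ["bankruptcy", "debt", "sales"],
    ["bankruptcy", "debt", "loan"],
    ["debt", "loan", "interest"],
    ["weather", "rain", "sales"],
]
```

Calling restricted_bootstrap(toy_docs, "bankruptcy", t=2) returns a dictionary of at most two words. Because every word outside the top t is discarded before the next propagation step, documents that share only low-ranked words with the query (the "weather" document here) never receive a score, which is the intuition behind limiting topic drift.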