• DocumentCode
    2276231
  • Title
    Experience Mining: Building a Large-Scale Database of Personal Experiences and Opinions from Web Documents
  • Author
    Inui, Kentaro ; Abe, Shuya ; Hara, Kazuo ; Morita, Hiraku ; Sao, Chitose ; Eguchi, Megumi ; Sumida, Asuka ; Murakami, Koji ; Matsuyoshi, Suguru
  • Author_Institution
    Nara Inst. of Sci. & Technol., Ikoma
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    314
  • Lastpage
    321
  • Abstract
    This paper proposes a new UGC-oriented language technology application, which we call experience mining. Experience mining aims to automatically collect instances of personal experiences and opinions from the rapidly growing volume of user-generated content (UGC), such as weblog and forum posts, and to store them in an experience database with semantically rich indices. After discussing the technical issues of this new task, we focus on its central problem, factuality analysis, and propose both a task definition and a machine learning-based solution. Our empirical evaluation indicates that the factuality analysis task is sufficiently well defined to achieve high inter-annotator agreement and that our factorial CRF-based model considerably outperforms the baseline. We also present an application system, which currently stores over 50M experience instances extracted from 150M Japanese blog posts with semantic indices and is scheduled to start serving as an experience search engine for unrestricted users in October.
  • Keywords
    data mining; document handling; learning (artificial intelligence); very large databases; UGC-oriented language; Web document; experience mining; factorial CRF-based model; factuality analysis; large-scale database; machine learning; personal experience; task definition; user generated content; Consumer products; Deductive databases; Explosives; Information services; Intelligent agent; Intelligent structures; Internet; Large-scale systems; User-generated content; Web sites; blog; natural language processing; opinion mining; semantic analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.373
  • Filename
    4740466