• DocumentCode
    3136743
  • Title
    Japanese Ellipsis Resolution in "A NO B" Noun Phrases for Colloquial Inquiry Text Using Latent Topic Models
  • Author
    Harada, Tatsuya ; Suzuki, Nobuhiro ; Tsuda, Kazuhiko
  • Author_Institution
    Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tokyo, Japan
  • fYear
    2013
  • fDate
    2-5 Dec. 2013
  • Firstpage
    901
  • Lastpage
    908
  • Abstract
    Inquiries submitted through Web forms and e-mail are increasing. These inquiry texts usually contain many informal expressions in a colloquial, spoken-language style, as well as many omitted words. An omitted word makes the meaning of a sentence ambiguous and may cause the reader to misread or misunderstand the context. In this paper, we focus on the frequently omitted noun "B" in the noun phrase "A NO B" (usually meaning "B of A") found in colloquial-style inquiry text, and propose a method that predicts the omitted noun "B" from context and knowledge using topic information. In the evaluation experiment, our method improved on the conventional method by 11.34 points and predicted the omitted word with an accuracy of more than 75% using Latent Dirichlet Allocation (LDA).
  • Keywords
    natural language processing; text analysis; A NO B noun phrases; Japanese ellipsis resolution; LDA; colloquial style inquiry text; informal expressions; latent Dirichlet allocation; latent topic models; omitted noun prediction; omitted words; spoken language; topic information; Accuracy; Context; DVD; Educational institutions; Electronic mail; Mathematical model; Probability; Colloquial expressions; Ellipsis; Gibbs sampling; LDA; Statistical topic models
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal-Image Technology & Internet-Based Systems (SITIS), 2013 International Conference on
  • Conference_Location
    Kyoto
  • Type
    conf
  • DOI
    10.1109/SITIS.2013.147
  • Filename
    6727297