• DocumentCode
    3089479
  • Title

    Proposal and Evaluation of an Extraction Method for Inaccurate Example Sentences Using a Web Search Engine for Multilingual Parallel Texts

  • Author

    Fukushima, Taku ; Yoshino, Takashi ; Shigeno, Aguri

  • Author_Institution
    Grad. Sch. of Syst. Eng., Wakayama Univ., Wakayama, Japan
  • fYear
    2011
  • fDate
    22-25 March 2011
  • Firstpage
    538
  • Lastpage
    543
  • Abstract
    In this study, we propose an extraction method for inaccurate example sentences in multilingual parallel texts using a Web search engine. We developed TackPad, a multilingual parallel-text sharing system for multilingual communication in the medical field. However, parallel texts created by people can be inaccurate, so they cannot be used in fields where a high level of accuracy is required. Moreover, because the parallel texts are large in number, it is difficult for people to evaluate them sufficiently. Therefore, we proposed and evaluated an extraction method for inaccurate example sentences. This method uses the contents of the Web as the wisdom of crowds. It splits an example sentence into n-grams and uses the Web search engine to search for the resulting word sequences. Moreover, the method uses two thresholds to detect several kinds of mistakes, such as typographical and grammatical errors. The contributions of this paper are as follows: (1) we proposed an extraction method that improves the accuracy of the example sentences using the Web search engine, and (2) we showed that using two thresholds improves the accuracy of the example sentences.
  • Keywords
    data mining; medical information systems; natural language processing; search engines; text analysis; TackPad; Web search engine; grammatical errors; inaccurate example sentences extraction; multilingual communication; multilingual parallel text sharing system; typographical errors; Accuracy; Data mining; Engines; Hospitals; Proposals; Systems engineering and theory; Web search; automatic evaluation; data mining; example sentence; parallel-text;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications (WAINA), 2011 IEEE Workshops of International Conference on
  • Conference_Location
    Biopolis
  • Print_ISBN
    978-1-61284-829-7
  • Electronic_ISBN
    978-0-7695-4338-3
  • Type

    conf

  • DOI
    10.1109/WAINA.2011.97
  • Filename
    5763557
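
The n-gram and threshold idea described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say what the two thresholds are, so `min_hits` (hit count below which an n-gram is treated as rare) and `max_rare` (number of rare n-grams tolerated before a sentence is flagged) are hypothetical interpretations. The `hit_count` callable stands in for a Web search engine query and is replaced here by a small made-up lookup table. Whitespace word n-grams are also an assumption; languages without word spacing would need a different tokenizer.

```python
from typing import Callable, Dict, List


def split_ngrams(sentence: str, n: int) -> List[str]:
    """Split a sentence into overlapping word n-grams (whitespace tokens)."""
    words = sentence.split()
    if len(words) < n:
        # Sentence shorter than n: treat the whole sentence as one unit.
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def is_suspect(
    sentence: str,
    hit_count: Callable[[str], int],
    n: int = 3,
    min_hits: int = 10,   # hypothetical threshold 1: hits below this mark an n-gram as rare
    max_rare: int = 0,    # hypothetical threshold 2: rare n-grams tolerated per sentence
) -> bool:
    """Flag a sentence whose n-grams are rarely found by the search function."""
    rare = [g for g in split_ngrams(sentence, n) if hit_count(g) < min_hits]
    return len(rare) > max_rare


# Stand-in for a Web search engine: made-up hit counts, for illustration only.
fake_hits: Dict[str, int] = {
    "please take this": 5000,
    "take this medicine": 4000,
    "please take thiss": 2,
}


def fake_hit_count(phrase: str) -> int:
    """Return the made-up hit count for a phrase (0 if unknown)."""
    return fake_hits.get(phrase, 0)


ok_flag = is_suspect("please take this medicine", fake_hit_count)
typo_flag = is_suspect("please take thiss medicine", fake_hit_count)
```

With the made-up counts above, the sentence containing the typo "thiss" is flagged because its n-grams fall below `min_hits`, while the correctly spelled sentence is not. In practice the thresholds would have to be tuned against real search results.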