• DocumentCode
    2727633
  • Title

    Validating Transliteration Hypotheses Using the Web: Web Counts vs. Web Mining

  • Author

    Oh, Jong-Hoon ; Isahara, Hitoshi

  • Author_Institution
    Nat. Inst. of Inf. & Commun. Technol., Kyoto
  • fYear
    2007
  • fDate
    2-5 Nov. 2007
  • Firstpage
    267
  • Lastpage
    270
  • Abstract
    We describe a novel approach for validating transliteration hypotheses based on a Web mining technique. We implemented a machine transliteration system and generated Chinese, Japanese, and Korean transliteration hypotheses for given English words. Then, we mined the Web for features relevant to validating transliteration hypotheses. Finally, we validated the transliteration hypotheses using machine learning algorithms trained on the mined features. In a comparison with systems based on Web counts, our proposed Web mining method consistently performed better, regardless of the language.
  • Keywords
    Internet; data mining; language translation; learning (artificial intelligence); natural languages; Chinese transliteration hypothesis; English words; Japanese transliteration hypothesis; Korean transliteration hypothesis; Web counts; Web mining; machine learning; machine transliteration system; Communications technology; Computational intelligence; Computational linguistics; Frequency; Machine learning algorithms; Natural languages; Search engines; Web mining; Web pages; Web search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, IEEE/WIC/ACM International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3026-0
  • Type

    conf

  • DOI
    10.1109/WI.2007.139
  • Filename
    4427098