• DocumentCode
    2349287
  • Title

    MT on and for the Web

  • Author

    Boitet, Christian ; Blanchon, Hervé ; Seligman, Mark ; Bellynck, Valérie

  • Author_Institution
    GETALP, UPMF, Grenoble, France
  • fYear
    2010
  • fDate
    21-23 Aug. 2010
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    A Systran MT server became available on the Minitel network in 1984, and on the Internet in 1994. Since then we have come to a better understanding of the nature of MT systems by separately analyzing their linguistic, computational, and operational architectures. Also, thanks to the CxAxQ metatheorem, the systems' inherent limits have been clarified, and design choices can now be made in an informed manner according to the translation situations. MT evaluation has also matured: tools based on reference translations are useful for measuring progress; those based on subjective judgments for estimating future usage quality; and task-related objective measures (such as post-editing distances) for measuring operational quality. Moreover, the same technological advances that have led to “Web 2.0” have brought several futuristic predictions to fruition. Free Web MT services have democratized assimilation MT beyond belief. Speech translation research has given rise to usable systems for restricted tasks running on PDAs or on mobile phones connected to servers. New man-machine interface techniques have made interactive disambiguation usable in large-coverage multimodal MT. Increases in computing power have made statistical methods workable, and have led to the possibility of building low-linguistic-quality but still useful MT systems by machine learning from aligned bilingual corpora (SMT, EBMT). In parallel, progress has been made in developing interlingua-based MT systems, using hybrid methods. Unfortunately, many misconceptions about MT have spread among the public, and even among MT researchers, because of ignorance of the past and present of MT R&D. A compensating factor is the willingness of end users to freely contribute to building essential parts of the linguistic knowledge needed to construct MT systems, whether corpus-related or lexical.
Finally, some developments we anticipated fifteen years ago have not yet materialized, such as online writing tools equipped with interactive disambiguation, and as a corollary the possibility of transforming source documents into self-explaining documents (SEDs) and of producing corresponding SEDs fully automatically in several target languages. These visions should now be realized, thanks to the evolution of Web programming and multilingual NLP techniques, leading towards a true Semantic Web, “Web 3.0”, which will support ubilingual (ubiquitous multilingual) computing.
  • Keywords
    language translation; learning (artificial intelligence); natural language processing; semantic Web; CxAxQ metatheorem; Systran MT server; Web 2.0; machine learning; machine translation; man-machine interface techniques; multilingual NLP techniques; self-explaining documents; semantic Web; speech translation research; Computer architecture; Dictionaries; Humans; Internet; Pragmatics; Speech; Speech recognition; MT; Semantic Web MT; computational architecture; interactive disambiguation; linguistic architecture; operational architecture; self-explaining documents; speech MT; task-related evaluation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-6896-6
  • Type

    conf

  • DOI
    10.1109/NLPKE.2010.5587865
  • Filename
    5587865