• DocumentCode
    3581198
  • Title

    How to read the web in Portuguese using the never-ending language learner's principles

  • Author

    Duarte, Maisa C. ; Hruschka, Estevam R.

  • Author_Institution
    Dept. of Comput. Sci., Fed. Univ. of Sao Carlos, Sao Carlos, Brazil
  • fYear
    2014
  • Firstpage
    162
  • Lastpage
    167
  • Abstract
    An alternative to the traditional single function approximation method is the never-ending learning (NEL) approach, i.e., a learning paradigm in which the learner autonomously evolves constantly, incrementally, and continuously over time. More important than simply continuing to evolve, in this new paradigm acquired knowledge can be used dynamically to expand the scope and improve the performance of the learning task as a whole. The first never-ending learning system reported in the literature, NELL (Never-Ending Language Learner), is applied to the task of autonomously building a knowledge base by reading the web. Results reported so far show that NELL performs very well when reading the web in English. However, when the same machine reading task (the task of reading the web) is applied to web pages written in Portuguese, previously reported approaches could not match the performance achieved in English. In this paper we describe an approach different from those previously proposed in the literature, and we present empirical results that corroborate the hypothesis that preprocessing a sufficiently large corpus can be key to using the very same architecture proposed in NELL to read the web in Portuguese (that is, to read and extract knowledge from web pages written in Portuguese).
  • Keywords
    Internet; Web sites; learning (artificial intelligence); natural language processing; English; NELL; Portuguese; Web pages; World Wide Web; function approximation method; machine reading task; never-ending language learner principles; never-ending learning system; Blogs; ISO; Irrigation; Pipelines; Machine Learning; Never-Ending Learning; Read The Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2014 14th International Conference on
  • Print_ISBN
    978-1-4799-7937-0
  • Type

    conf

  • DOI
    10.1109/ISDA.2014.7066260
  • Filename
    7066260