• DocumentCode
    445506
  • Title

    Evolutionary algorithm for noun phrase detection in natural language processing

  • Author

    Serrano, J. Ignacio ; Araujo, Lourdes

  • Author_Institution
    Inst. de Automatica Ind., CSIC, Madrid
  • Volume
    1
  • fYear
    2005
  • fDate
    5-5 Sept. 2005
  • Firstpage
    640
  • Abstract
    The noun phrases of a document are usually its main information bearers. Thus, the detection of these units is crucial in many applications related to information retrieval, such as the collection of relevant documents by search engines according to a user query, text summarization, etc. We present an evolutionary algorithm for obtaining a probabilistic finite-state automaton able to recognize valid noun phrases, defined as sequences of lexical categories. This approach is highly flexible in the sense that the automaton is able to recognize noun phrases similar enough to the ones given by the inferred noun phrase grammar. This flexibility is made possible by the very accurate set of probabilities provided by the evolutionary algorithm. The algorithm works with both positive and negative examples of the language, thus improving the system's coverage while maintaining its precision. Experimental results show a clear improvement in performance with respect to other systems.
  • Keywords
    evolutionary computation; finite state machines; grammars; natural languages; probabilistic automata; evolutionary algorithm; lexical categories; natural language processing; noun phrase detection; probabilistic finite-state automaton; Automata; Data mining; Evolutionary computation; Humans; Information retrieval; Magnetic heads; Natural language processing; Neural networks; Proposals; Search engines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2005. The 2005 IEEE Congress on
  • Conference_Location
    Edinburgh, Scotland
  • Print_ISBN
    0-7803-9363-5
  • Type

    conf

  • DOI
    10.1109/CEC.2005.1554743
  • Filename
    1554743