• DocumentCode
    389550
  • Title
    Next word prediction in a connectionist distributed representation system
  • Author
    Rosa, João Luis Garcia
  • Author_Institution
    Mestrado em Sistemas de Computacao, PUC-Campinas, Campinas, Brazil
  • Volume
    3
  • fYear
    2002
  • fDate
    6-9 Oct. 2002
  • Abstract
    Connectionist natural language processing models that account for the temporal extension of sentence analysis often use local representations, allocating a single unit to each word at the input and output layers of the connectionist architecture. Enlarging the lexicon therefore requires modifying the architecture and retraining the network. The proposed system, Pred-DR, instead attempts to predict the next word in declarative sentences presented sequentially, one word at a time, giving meaning to the units of the connectionist architecture by means of distributed representations based on semantic features. Each word is decomposed into an array of semantic microfeatures. Consequently, Pred-DR can generalize to new words without increasing the number of processors in its architecture, provided that their semantic features are supplied. In this way, it achieves considerable performance in connectionist natural language processing using the classical semantic microfeature framework. The system learns to relate the input word array to its possible next words, "remembering" the words previously seen in a semantically sound sentence. For each input word, Pred-DR outputs a list of probabilities of occurrence of the next words in the sentence context.
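    The distributed-representation idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the feature inventory `FEATURES`, the toy `LEXICON`, and the distance-plus-softmax scoring in `next_word_probabilities` are hypothetical, and the trained network that would produce the predicted array is not modelled. The sketch only shows that a new word can be added without changing the width of the input and output arrays, and how a predicted microfeature array might be turned into a list of next-word probabilities.

```python
import math

# Hypothetical microfeature inventory (illustrative, not from the paper).
FEATURES = ["human", "animate", "edible", "action", "object"]

# Hypothetical lexicon: every word is a fixed-length binary microfeature array.
LEXICON = {
    "man":   [1, 1, 0, 0, 0],
    "dog":   [0, 1, 0, 0, 0],
    "eats":  [0, 0, 0, 1, 0],
    "bread": [0, 0, 1, 0, 1],
}


def encode(word):
    """Return the microfeature array for a word in the lexicon."""
    return LEXICON[word]


def next_word_probabilities(predicted, lexicon):
    """Map a predicted microfeature array to probabilities over the lexicon.

    Assumed scoring rule: negative squared Euclidean distance to each
    word's array, normalised with a softmax.
    """
    scores = {
        word: -sum((p - f) ** 2 for p, f in zip(predicted, feats))
        for word, feats in lexicon.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {word: math.exp(s - top) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


# Adding a word only extends the lexicon; the array width is unchanged.
LEXICON["cat"] = [0, 1, 0, 0, 0]

# Stand-in for a network output after some input word (made-up values).
probs = next_word_probabilities([0.1, 0.9, 0.0, 0.1, 0.0], LEXICON)
```

    In this toy setup, words whose microfeature arrays lie closest to the predicted array receive the highest probability, and "cat" is scored like any other word even though it was added after the lexicon was defined.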
  • Keywords
    generalisation (artificial intelligence); learning (artificial intelligence); linguistics; multilayer perceptrons; natural languages; neural net architecture; Pred-DR; connectionist architecture; connectionist distributed representation system; connectionist natural language processing models; generalisation; learning; lexicon; multilayer neural network; next word prediction; performance; probability; semantic features; semantic microfeature arrays; semantic microfeature framework; sentence analysis; Cognition; Distributed computing; Encoding; Fractionation; Humans; Natural languages; Neural networks; Psychology; Signal design; Sliding mode control;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2002 IEEE International Conference on
  • ISSN
    1062-922X
  • Print_ISBN
    0-7803-7437-1
  • Type
    conf
  • DOI
    10.1109/ICSMC.2002.1176109
  • Filename
    1176109