DocumentCode
389550
Title
Next word prediction in a connectionist distributed representation system
Author
Rosa, João Luis Garcia
Author_Institution
Mestrado em Sistemas de Computacao, PUC-Campinas, Campinas, Brazil
Volume
3
fYear
2002
fDate
6-9 Oct. 2002
Abstract
Connectionist natural language processing models that take the temporal dimension of sentence analysis into account often use local representations, allocating a single unit to each word at the input and output layers of the connectionist architecture. Enlarging the lexicon therefore requires modifying the architecture and retraining the network. In contrast, the proposed system, Pred-DR, attempts to predict the next word in declarative sentences presented sequentially, one word at a time, and gives meaning to the units of the connectionist architecture by means of distributed representations based on semantic features. Words are decomposed into arrays of semantic microfeatures. Consequently, Pred-DR is able to generalize to new words without increasing the number of processors in its architecture, provided that their semantic features are supplied. In this way, it achieves considerable performance in connectionist natural language processing using the classical semantic microfeature framework. The system learns to relate the input word array to its possible next word, "remembering" the words previously seen in a semantically sound sentence. For each input word, Pred-DR outputs a list of probabilities of occurrence of next words in the sentence context.
Keywords
generalisation (artificial intelligence); learning (artificial intelligence); linguistics; multilayer perceptrons; natural languages; neural net architecture; Pred-DR; connectionist architecture; connectionist distributed representation system; connectionist natural language processing models; generalisation; learning; lexicon; multilayer neural network; next word prediction; performance; probability; semantic features; semantic microfeature arrays; semantic microfeature framework; sentence analysis; Cognition; Distributed computing; Encoding; Fractionation; Humans; Natural languages; Neural networks; Psychology; Signal design; Sliding mode control
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2002 IEEE International Conference on
ISSN
1062-922X
Print_ISBN
0-7803-7437-1
Type
conf
DOI
10.1109/ICSMC.2002.1176109
Filename
1176109