DocumentCode
238196
Title
Sentence generation from a bag of words using N-gram model
Author
Yadav, Arun Kumar ; Borgohain, Samir Kumar
Author_Institution
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Silchar, Silchar, India
fYear
2014
fDate
8-10 May 2014
Firstpage
1771
Lastpage
1776
Abstract
This paper presents a method for generating sentences from a given bag of words. Sentence generation is used in text summarization, question answering systems, and similar applications. The goal is to generate all possible correct sentences from a given bag of words. The technique applied is an N-gram language model, trained on a text corpus so that only candidate sequences are generated from the bag of words. For N input words, instead of treating all N! permutations as candidate sequences, we generate fewer than N! candidates by applying a DFS (depth-first search) filtering technique at run time. We use two corpora: a text corpus and a corpus annotated with POS tags. All valid POS tag trigrams are extracted from the annotated corpus. Each generated candidate sequence has a probability score. The candidate sequences are ranked by matching them against the valid POS trigram tag signatures and by their probability scores. Preliminary experiments with this model show promising results.
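The DFS filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a bigram model (the abstract only says "N-gram"), unsmoothed maximum-likelihood probabilities, and a toy corpus. The function names `train_bigrams` and `generate_candidates` are hypothetical, and the POS trigram ranking step is omitted. A partial sequence is extended only when the next bigram was seen in training, so most of the N! orderings are never visited.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count bigrams (with <s> and </s> markers) from whitespace-tokenized sentences."""
    bigram_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            bigram_counts[prev][curr] += 1
    return bigram_counts


def generate_candidates(bag, bigram_counts):
    """Return (sentence, log-probability) pairs, best first.

    Assumption: a branch is pruned as soon as it needs a bigram that
    never occurred in the training corpus (no smoothing).
    """
    remaining = Counter(word.lower() for word in bag)
    total = sum(remaining.values())
    results = []

    def transition_log_prob(prev, curr):
        followers = bigram_counts.get(prev)
        if not followers or followers[curr] == 0:
            return None  # unseen bigram
        return math.log(followers[curr] / sum(followers.values()))

    def dfs(prev, path, log_prob):
        if len(path) == total:
            # All words used: the sequence must also be able to end here.
            end = transition_log_prob(prev, "</s>")
            if end is not None:
                results.append((" ".join(path), log_prob + end))
            return
        for word in list(remaining):
            if remaining[word] == 0:
                continue
            step = transition_log_prob(prev, word)
            if step is None:
                continue  # prune: this ordering is never explored further
            remaining[word] -= 1
            path.append(word)
            dfs(word, path, log_prob + step)
            path.pop()
            remaining[word] += 1

    dfs("<s>", [], 0.0)
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy corpus, invented for illustration only.
corpus = ["the dog chased the cat", "the cat sat", "a dog barked"]
model = train_bigrams(corpus)
candidates = generate_candidates(["cat", "the", "sat"], model)
```

With this toy corpus, only one of the 3! = 6 orderings of the bag survives the filter. In the method the abstract describes, the surviving candidates would then also be checked against valid POS tag trigrams before final ranking.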
Keywords
computational linguistics; natural language processing; probability; speech processing; text analysis; tree searching; DFS filtering technique; POS trigram tag extraction; annotated corpus; bag-of-words; correct-sentence generation method; depth-first search filtering technique; n-gram language model; n-input words; probability score; run time analysis; sequence generation; sequence matching; sequence ranking; text corpus; trigram POS tag signature; Depth First Search; N-gram Language Model; Part of Speech Tagging; Sentence Generation; Syntax;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Communication Control and Computing Technologies (ICACCCT), 2014 International Conference on
Conference_Location
Ramanathapuram
Print_ISBN
978-1-4799-3913-8
Type
conf
DOI
10.1109/ICACCCT.2014.7019414
Filename
7019414