Title of article :
A New Method for Sentence Vector Normalization Using Word2vec
Author/Authors :
Abdolahi, Mohamad (Kharazmi International Campus, Shahrood University of Technology, Shahrood, Iran) , Zahedi, Morteza (Kharazmi International Campus, Shahrood University of Technology, Shahrood, Iran)
Abstract :
Word embeddings (WE) have received much attention recently as a word-to-numeric-vector
architecture for text processing approaches and have been a great asset for a large variety of
NLP tasks. Most text processing tasks convert text components such as sentences into numeric
matrices before applying their processing algorithms. However, the most important problem in
all word-vector-based text processing approaches is that sentences differ in length and, as a
result, their matrices differ in dimension. In this paper, we suggest an efficient but simple
statistical method to convert text sentences into normalized matrices of equal dimension. The
proposed method aims to combine the three most efficient methods (averaging-based, most likely
n-grams, and Word Mover's Distance) to use their advantages and reduce their constraints. The
resulting fixed-size matrix does not depend on the language, the subject and scope of the text,
or the semantic concepts of its words. Our results demonstrate that the normalized matrices
capture complementary aspects of most text processing tasks, such as coherence evaluation, text
summarization, text classification, automatic essay scoring, and question answering.
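
The dimension problem described above can be illustrated with a minimal sketch. It is not the method proposed in the paper: it only shows how sentences of different lengths give matrices with different row counts, and how the averaging-based baseline named in the abstract reduces each to a vector of fixed size. The embedding table, its values, and the function names are hypothetical and chosen only for illustration.

```python
# Toy 3-dimensional word vectors. The values are invented for illustration
# and do not come from a trained Word2vec model.
DIM = 3
TOY_EMBEDDINGS = {
    "cats": [1.0, 0.0, 2.0],
    "chase": [0.0, 1.0, 1.0],
    "mice": [2.0, 2.0, 0.0],
    "sleep": [1.0, 3.0, 1.0],
}


def sentence_matrix(sentence):
    """Stack word vectors: one row per known word, so the row count varies."""
    words = sentence.lower().split()
    return [TOY_EMBEDDINGS[w] for w in words if w in TOY_EMBEDDINGS]


def average_vector(sentence):
    """Averaging baseline: column-wise mean, always of length DIM."""
    rows = sentence_matrix(sentence)
    if not rows:
        # No known words: fall back to a zero vector (an assumption of this sketch).
        return [0.0] * DIM
    return [sum(column) / len(rows) for column in zip(*rows)]


short = "cats sleep"
longer = "cats chase mice"
print(len(sentence_matrix(short)), len(sentence_matrix(longer)))  # row counts differ
print(average_vector(short), average_vector(longer))              # both have DIM entries
```

Averaging gives every sentence the same dimension but discards word order, which is one of the constraints that motivates combining it with other methods.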
Keywords :
Text Preprocessing , Sentence Vector , Word Vector , Word Embedding , Sentence Normalization