Title of article :
Benefiting from Structured Resources to Present a Computationally Efficient Word Embedding Method
Author/Authors :
Jafarinejad, Fatemeh, Faculty of Computer Engineering - Shahrood University of Technology
From page :
505
To page :
514
Abstract :
In recent years, new word embedding methods have improved the accuracy of NLP tasks. A review of their progress shows that the complexity of these methods is growing, so methodological innovation is required to provide new word embeddings. Most current word embedding methods train word semantic vectors on a large corpus of unstructured data. The main idea of this paper is to directly use the knowledge embedded in the structure of structured data to produce embedding vectors. By exploiting structured resources and the conceptual knowledge hidden in them, the need for high processing power, a large amount of memory, and long processing time is eliminated. For this purpose, a new embedding method, Word2Node, is proposed. This method uses a well-known structured resource, WordNet, as its training corpus. Our hypothesis is that the linguistic knowledge that lies in WordNet's graph structure can be used directly to provide accurate and small embedding vectors. Evaluation on the text classification task shows that the proposed method performs the same as or better than Word2Vec. This result is achieved while the amount of training data has decreased by about 50000000%. Moreover, a comparison with several other knowledge-graph-based embedding methods indicates the superiority of the proposed method on the word semantic similarity task. These results show the capacity of structured data to improve the quality of existing word embedding methods and their resulting vectors.
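The abstract does not spell out the Word2Node algorithm, but the keywords mention Node2Vec, so the general idea of learning from a graph rather than a text corpus can be illustrated with a minimal sketch. The example below assumes a tiny hand-written graph (`TOY_GRAPH`, hypothetical and not real WordNet data) and uses uniform random walks, which correspond to node2vec with p = q = 1. It is not the author's implementation; it only shows how a graph can be turned into node sequences that play the role of sentences.

```python
import random

# Hypothetical toy graph standing in for WordNet relations (not real WordNet data).
# Each key is a word node; each value lists its neighbours (undirected edges).
TOY_GRAPH = {
    "animal": ["dog", "cat"],
    "dog": ["animal", "puppy"],
    "cat": ["animal", "kitten"],
    "puppy": ["dog"],
    "kitten": ["cat"],
}


def random_walk(graph, start, length, rng):
    """Uniform random walk of up to `length` nodes (node2vec with p = q = 1)."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(graph, node, length, rng))
    return walks


walks = generate_walks(TOY_GRAPH, walks_per_node=2, length=4)
```

Each walk is a short sequence of related words. In a node2vec-style pipeline, such sequences would be passed to a skip-gram trainer in place of corpus sentences, which is why the training input can be far smaller than a large unstructured corpus.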
Keywords :
Word Embeddings , WordNet , Graph Embeddings , Node2Vec , Word Semantic Similarity
Journal title :
Journal of Artificial Intelligence and Data Mining
Record number :
2736309