Title :
An Algorithm for Identifying Authors Using Synonyms
Author :
Clark, Jonathan H. ; Hannon, Charles J.
Author_Institution :
Texas Christian Univ., Fort Worth
Abstract :
An approach for identifying the human source of a text by leveraging the significance of synonyms in language is presented. While others have attempted to identify authors in the past, they have focused on purely statistical approaches such as word length distribution, number of distinct words, and language models. We claim that an author's choice of synonyms is idiosyncratic and can be used in determining the identity of an author, which we demonstrate via our algorithm for recognizing authors. This algorithm uses synonym sets from the WordNet lexical database to give more weight to words that have many common synonyms. The results of this method applied to the task of identifying the authors of classic literature show that there is a correlation between an author's synonym choice and the author's identity. With this new author recognition technology, we may now explore new avenues of intelligent and meaningful interaction with users.
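The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose scoring details the abstract does not give. It assumes a small hand-written table `SYNSETS` in place of WordNet, a weight equal to the number of alternative synonyms available for a word, and a simple weighted-overlap score. All function names here are hypothetical.

```python
from collections import Counter

# Hypothetical toy synonym sets standing in for WordNet synsets (assumption).
SYNSETS = [
    {"big", "large", "huge", "great"},
    {"happy", "glad", "joyful"},
    {"quick", "fast"},
]


def synonym_weight(word):
    """Count the alternatives the author passed over; 0 if no synonyms are listed."""
    return sum(len(s) - 1 for s in SYNSETS if word in s)


def profile(text):
    """Relative frequency of each word in the text that has at least one synonym."""
    words = [w for w in text.lower().split() if synonym_weight(w) > 0]
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()} if total else {}


def similarity(unknown, known):
    """Weighted overlap: shared words count more when they have more synonyms."""
    return sum(
        synonym_weight(w) * min(p, known[w])
        for w, p in unknown.items()
        if w in known
    )


def identify(text, corpora):
    """Return the author whose profile best matches the text (assumed scoring)."""
    unknown = profile(text)
    profiles = {author: profile(sample) for author, sample in corpora.items()}
    return max(profiles, key=lambda a: similarity(unknown, profiles[a]))


if __name__ == "__main__":
    corpora = {
        "A": "the big dog was happy and quick",
        "B": "the large dog was glad and fast",
    }
    print(identify("a big happy cat", corpora))
```

In this sketch a word such as "big", which has three listed alternatives, contributes more to the score than "quick", which has one. That mirrors the abstract's statement that words with many common synonyms receive more weight.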
Keywords :
natural language processing; text analysis; WordNet lexical database; author identification; author recognition; human source identification; idiosyncratic; synonyms; Artificial intelligence; Computer science; Databases; Humans; Information resources; Knowledge acquisition; Learning; Microcomputers; Microphones; Statistical distributions
Conference_Title :
Current Trends in Computer Science, 2007. ENC 2007. Eighth Mexican International Conference on
Conference_Location :
Michoacan
Print_ISBN :
978-0-7695-2899-1
DOI :
10.1109/ENC.2007.22