Title of article :
Design and Implementation of a Machine Learning-Based Authorship Identification Model
Author/Authors :
Anwar, Waheed (Department of Computer Science & IT, The Islamia University of Bahawalpur, Bahawalpur, Pakistan) ; Sarwar Bajwa, Imran (Department of Computer Science & IT, The Islamia University of Bahawalpur, Bahawalpur, Pakistan) ; Ramzan, Shabana (Department of Computer Science, Govt. Sadiq College Women University, Pakistan)
Pages :
15
From page :
1
To page :
15
Abstract :
This paper presents a novel approach to authorship identification in English and Urdu text that combines a latent Dirichlet allocation (LDA) model, built over n-grams of the authors' texts, with cosine similarity. The approach uses similarity metrics over learned representations of stylometric features to identify the writing style of a particular author. It covers both instance-based and profile-based classification of an author's text. LDA handles high-dimensional, sparse data well because it allows a more expressive representation of text. The approach is an unsupervised computational methodology that can handle the heterogeneity of the dataset, diversity in writing, and the inherent ambiguity of the Urdu language. A large corpus was used to test its performance. The experimental results show that the proposed approach outperforms state-of-the-art representations and other algorithms used for authorship identification. The contributions of this work are the use of cosine similarity with n-gram-based LDA topics to measure the similarity between vectors of text documents, and an overall accuracy of 84.52% on the PAN12 datasets and 93.17% on Urdu news articles, achieved without using any labels for the authorship identification task.
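The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes each document has already been reduced to a vector of LDA topic proportions; the three-topic vectors, the author names, and all function names below are hypothetical and are not taken from the paper. It contrasts instance-based attribution (nearest single document) with profile-based attribution (nearest averaged author profile), both scored by cosine similarity.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profiles(known_docs):
    """Profile-based view: average each author's document vectors into one profile vector."""
    grouped = defaultdict(list)
    for author, vector in known_docs:
        grouped[author].append(vector)
    return {
        author: [sum(column) / len(vectors) for column in zip(*vectors)]
        for author, vectors in grouped.items()
    }


def attribute_by_profile(query, profiles):
    """Return the author whose averaged profile is most similar to the query vector."""
    return max(profiles, key=lambda author: cosine_similarity(query, profiles[author]))


def attribute_by_instance(query, known_docs):
    """Instance-based view: return the author of the single most similar document."""
    author, _ = max(known_docs, key=lambda doc: cosine_similarity(query, doc[1]))
    return author


# Hypothetical topic proportions (assumed output of an LDA model over n-grams).
known_docs = [
    ("author_a", [0.70, 0.20, 0.10]),
    ("author_a", [0.60, 0.30, 0.10]),
    ("author_b", [0.10, 0.20, 0.70]),
    ("author_b", [0.20, 0.10, 0.70]),
]
query_doc = [0.65, 0.25, 0.10]

profiles = build_profiles(known_docs)
predicted_by_profile = attribute_by_profile(query_doc, profiles)
predicted_by_instance = attribute_by_instance(query_doc, known_docs)
print(predicted_by_profile, predicted_by_instance)
```

In this sketch the candidate authors' texts are only used as reference vectors for comparison; how the paper itself builds topic vectors and selects n-gram sizes is not stated in the abstract, so those steps are left out.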
Keywords :
Design , Implementation , Identification Model , Authorship , Machine Learning
Journal title :
Scientific Programming
Serial Year :
2019
Record number :
2611675