Title :
A Blended Text Mining Method for Authorship Authentication Analysis
Author :
Sallis, Philip ; Shanmuganathan, Subana
Author_Institution :
Auckland Univ. of Technol., Auckland
Abstract :
The paper presents interim results from an attempt to resolve the authorship of a few newly discovered 16th-century letters now alleged to have been written by Queen Mary of Scots (QMS). Despite significant progress in stylometry and its role in authorship attribution analysis, especially for disputed writings and texts, controversies over the authorship of Shakespeare's literary work still continue, as does research into this corpus of letters. Using more sophisticated computational and mathematical modelling techniques than previously published research, this study still employs stylometric measures to show a distinct variation between the authentic writings of QMS and the newly discovered letters, which numerous enthusiasts claim to be of her authorship. A text mining approach for this application has been developed by incorporating additional advanced statistical methods, such as principal component analysis (PCA) and artificial neural networks (ANNs), especially Kohonen's self-organising map (SOM) based visualisation technique. Similarities between different pairs of the new and authentic letters, and in some cases within individual letters, become apparent under "cusum" analysis, adding further complexity to the task of resolving the anomaly seen among QMS loyalists, archaeologists, linguists and the like. The reasons for the inconclusive results of this study are presented with suggestions for future work, but in essence the data mining method used is regarded as unique in its blend of conventional and non-conventional statistics and as useful for this class of text analysis problem.
Keywords :
data analysis; data mining; data visualisation; literature; message authentication; self-organising feature maps; statistical analysis; text analysis; Kohonen self-organising map based visualisation technique; Queen Mary of Scots; Shakespeare literary; authentic writing; authorship attribute authentication analysis; blended text mining method; cusum analysis; mathematical modelling technique; statistical method; stylometric measure; text analysis problem; text controversy; Artificial neural networks; Authentication; Data mining; Mathematical model; Principal component analysis; Statistical analysis; Text analysis; Text mining; Visualization; Writing; authorship authentication; stylometry; text mining
Conference_Title :
Modeling & Simulation, 2008. AICMS 08. Second Asia International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-0-7695-3136-6
Electronic_ISBN :
978-0-7695-3136-6
DOI :
10.1109/AMS.2008.99