Title :
Author Identification Using Compression Models
Author :
Pavelec, D. ; Oliveira, L.S. ; Justino, E. ; Neto, F. D Nobre ; Batista, L.V.
Author_Institution :
Pontificia Univ. Catolica do Parana, Curitiba, Brazil
Abstract :
In this paper we discuss the use of compression algorithms for author identification. We present basic background on compression algorithms and introduce the prediction by partial matching (PPM) algorithm, which we used in our experiments. To put the results produced by the PPM algorithm in context, we also present experiments using stylometric features that are often used by forensic examiners; in this case the authors are modeled using support vector machines. Comprehensive experiments performed on a database composed of 20 different authors show that the PPM algorithm is an interesting alternative for author identification, since the entire process of feature definition, extraction, and selection can be avoided.
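The general idea of compression-based author identification can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision rule, and the sketch swaps in zlib (DEFLATE) from the Python standard library as a stand-in for PPM. The function names `compressed_size`, `extra_bytes`, and `identify_author` are hypothetical. The assumption is that an unknown text should cost the fewest additional compressed bytes when appended to writing by its true author.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def extra_bytes(reference: str, unknown: str) -> int:
    """Additional compressed bytes needed to encode `unknown` after `reference`.

    Assumption: a smaller value means the unknown text shares more
    repeated patterns with the reference text.
    """
    ref = reference.encode("utf-8")
    both = ref + b" " + unknown.encode("utf-8")
    return compressed_size(both) - compressed_size(ref)


def identify_author(samples: Mapping[str, str], unknown: str) -> str:
    """Return the author whose sample text compresses `unknown` best.

    `samples` maps an author name to known text by that author.
    zlib is only a stand-in here; the paper itself uses PPM.
    """
    return min(samples, key=lambda author: extra_bytes(samples[author], unknown))
```

Note that zlib only finds repetitions within a 32 KB window, so this sketch is only meaningful for short reference texts. No features are defined, extracted, or selected at any point, which is the property the abstract highlights.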
Keywords :
data compression; feature extraction; pattern matching; support vector machines; PPM algorithm; author identification; compression algorithm; compression model; feature definition; feature selection; forensic examiner; prediction by partial matching algorithm; stylometric features; support vector machine; Algorithm design and analysis; Compression algorithms; Forensics; Frequency; History; Pediatrics; Spatial databases; Text analysis; compression models
Conference_Title :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2009.208