DocumentCode
2352488
Title
Improving the Efficiency of Legal E-Discovery Services Using Text Mining Techniques
Author
Joshi, Sachindra ; Deshpande, Prasad M. ; Hampp, Thomas
Author_Institution
IBM Res., India
fYear
2011
fDate
March 29 2011-April 2 2011
Firstpage
448
Lastpage
455
Abstract
E-Discovery Review is a type of legal service that aims to find relevant electronically stored information (ESI) in a legal case. This requires manual review of a large number of documents by legal analysts and thus involves huge costs. In this paper, we investigate the use of IT, specifically text mining techniques, to improve the efficiency and quality of the e-discovery review service. We employ near-duplicate detection and automatic classification techniques that can be used to create coherent groups of documents. Since a group characterizes a syntactic or a semantic theme, all the documents in a group can be reviewed together. This leads to a faster and more consistent review of documents. Our experimental results on the publicly available Enron email corpus show that we can achieve high precision and recall in identifying the syntactic and semantic groups. We also conduct a user study that demonstrates an 80% reduction in review time and improved consistency in the review results, leading to better service quality.
Keywords
data mining; pattern classification; public administration; text analysis; Enron email corpus; automatic classification techniques; electronically stored information; legal case; legal e-discovery services; near duplicate detection; text mining techniques; Electronic mail; Indexes; Law; Manuals; Semantics; Syntactics
fLanguage
English
Publisher
IEEE
Conference_Titel
SRII Global Conference (SRII), 2011 Annual
Conference_Location
San Jose, CA
Print_ISBN
978-1-61284-415-2
Electronic_ISBN
978-0-7695-4371-0
Type
conf
DOI
10.1109/SRII.2011.97
Filename
5958120