Title :
Text Mining for Finding Acronym-Definition Pairs from Biomedical Text Using Pattern Matching Method with Space Reduction Heuristics
Author :
Rafeeque, P.C. ; Abdul Nazeer, K.A.
Author_Institution :
Govt. Eng. Coll., Wayanad
Abstract :
This paper addresses the problem of mining acronyms and their definitions from biomedical text. We propose an effective text mining system based on a pattern matching method, and explain the stages of its design with pseudocode. We use the space reduction heuristic constraints of D. Nadeau and P. Turney (2005), which increase precision by reducing the number of candidate definitions while still retaining most of the true cases. Unlike learning techniques, the pattern matching method requires no training data, which keeps the process simple and fast. The system is evaluated with three metrics: recall (how much of the relevant information in the text the system extracts), precision (how much of the information returned by the system is actually correct) and f-factor (a combined value of recall and precision). In experiments, the system achieved 92% recall and 97.2% precision.
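To make the abstract's terms concrete, the following is a minimal Python sketch, not the paper's system. The names `find_pairs`, `evaluate` and `ACRONYM_IN_PARENS` are hypothetical. The matcher assumes the simple "long form (ACRONYM)" layout and strict first-letter matching. Restricting candidates to the words immediately before the parenthesis is only a stand-in for the Nadeau and Turney constraints, which the abstract does not spell out. The f-factor is assumed here to be the harmonic mean of precision and recall; the abstract only calls it a combined value.

```python
import re

# Hypothetical pattern: a parenthesised token of 2-10 characters that
# starts with an uppercase letter, e.g. "(PCR)".
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z][A-Za-z0-9-]{1,9})\)")


def find_pairs(text):
    """Return (acronym, definition) pairs of the form 'long form (ACR)'."""
    pairs = []
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        letters = [c.lower() for c in acronym if c.isalpha()]
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        if len(words) < len(letters):
            continue
        # Illustrative space reduction: the only candidate considered is
        # the run of words directly before the parenthesis, one word per
        # acronym letter.
        candidate = words[-len(letters):]
        initials = [w[0].lower() for w in candidate]
        if initials == letters:
            pairs.append((acronym, " ".join(candidate)))
    return pairs


def evaluate(extracted, gold):
    """Return (precision, recall, f) for extracted pairs against gold pairs.

    f is computed as the harmonic mean of precision and recall, which is
    an assumption about what the abstract calls the f-factor.
    """
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


if __name__ == "__main__":
    sample = (
        "We amplified the sample by polymerase chain reaction (PCR) and "
        "imaged it with magnetic resonance imaging (MRI)."
    )
    found = find_pairs(sample)
    print(found)
    # Made-up gold standard, purely for illustration.
    gold = found + [("DNA", "deoxyribonucleic acid")]
    print(evaluate(found, gold))
```

A strict first-letter matcher like this misses acronyms built from inner letters or hyphenated words, so it illustrates only the general idea of pattern matching over a reduced candidate space.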
Keywords :
constraint theory; data mining; medical computing; optimisation; pattern matching; text analysis; acronym-definition pairs; biomedical text; pattern matching; pseudo code; space reduction heuristic constraints; text mining; Abstracts; Biomedical computing; Biomedical engineering; Biomedical measurements; Data mining; Databases; Educational institutions; Pattern matching; Space technology; Text mining;
Conference_Title :
Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on
Conference_Location :
Guwahati, Assam
Print_ISBN :
0-7695-3059-1
DOI :
10.1109/ADCOM.2007.30