Title :
Sequential pattern mining to discover relations between genes and rare diseases
Author :
Béchet, Nicolas ; Cellier, Peggy ; Charnois, Thierry ; Crémilleux, Bruno ; Jaulent, Marie-Christine
Author_Institution :
GREYC, Univ. de Caen Basse-Normandie, Caen, France
Abstract :
Orphanet provides an international web-based knowledge portal for rare diseases, including a collection of review articles. However, reviewing and monitoring the literature are done manually. Producing new documentation about a rare disease is therefore a time-consuming process, and automatically discovering knowledge from a large collection of texts is a crucial issue. This context strongly motivates the problem of extracting relationships between genes and rare diseases from texts. In this paper, we tackle this problem by combining information extraction and data mining techniques (sequential pattern mining under constraints). Experiments show the value of the method for documenting rare diseases.
Keywords :
Internet; data mining; diseases; health care; information retrieval; medical computing; portals; text analysis; Orphanet; automatic knowledge discovery; data mining technique; gene-rare diseases relationship extraction problem; information extraction technique; international Web-based knowledge portal; rare diseases; relation discovery; sequential pattern mining; Data mining; Diseases; Itemsets; Natural language processing; Pragmatics; Training
Conference_Title :
Computer-Based Medical Systems (CBMS), 2012 25th International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4673-2049-8
DOI :
10.1109/CBMS.2012.6266367