DocumentCode :
2058380
Title :
Retrieval of Patent Documents from Heterogeneous Sources Using Ontologies and Similarity Analysis
Author :
Taduri, Siddharth ; Lau, Gloria T. ; Law, Kincho H. ; Kesan, Jay P.
Author_Institution :
Eng. Inf. Group, Stanford Univ., CA, USA
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
538
Lastpage :
545
Abstract :
In the past few years, there has been explosive growth in scientific and legal information related to the patent system. Patents and related documents are siloed in multiple heterogeneous sources. Retrieving relevant information from these diverse sources is a non-trivial task and poses many technical challenges. Among these challenges is the inconsistent terminology used across documents. We tackle this terminological inconsistency by exploring domain knowledge through the use of ontology standards. Furthermore, we take advantage of cross-references and structural dependencies between the information sources to enhance terminological comparison. In this paper, we present a similarity analysis methodology that combines knowledge from two distinct sources, (1) domain ontologies and (2) ontologies that describe the information sources, to assist a user in identifying relevant documents across several information sources simultaneously. Specifically, we explore the use of a rule-based system to infer relationships between documents based on pre-defined heuristics. We present our results through a use case in the bio-patent domain with a collection of 1150 patents and 30 court cases.
Keywords :
distributed databases; information resources; information retrieval; knowledge based systems; ontologies (artificial intelligence); patents; pattern matching; scientific information systems; biopatent domain; document identification; heterogeneous source; legal information; ontology standard; patent document retrieval; rule-based system; scientific information; similarity analysis; terminological inconsistency; Context; Databases; Ontologies; Patents; Semantics; Terminology; Court cases; Information Retrieval; Knowledgebase; Ontology; Patent;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
Conference_Location :
Palo Alto, CA
Print_ISBN :
978-1-4577-1648-5
Electronic_ISBN :
978-0-7695-4492-2
Type :
conf
DOI :
10.1109/ICSC.2011.34
Filename :
6061369