Title :
Using semantic and structural similarities for indexing and searching scientific papers
Author :
Rizvi, Syed Raza Ali ; Wang, Shawn Xiong
Author_Institution :
Dept. of Comput. Sci., California State Univ., Fullerton, CA, USA
Abstract :
Finding relevant scientific documents in a huge set of academic papers is a challenging task. With the tremendous growth in electronic publication, locating the most relevant and related documents while reading a new research paper is becoming even more challenging. In this paper, we present a new way of indexing and searching scientific documents to assist researchers in finding relevant documents when they come across a new research document. In particular, we explored how the DT-Tree (DocumentTerm-Tree), a new structure for representing scientific documents, can be used to create an index of scientific documents. We used an MVP-Tree to create the index from the DT-Tree representations of the documents. We then performed search experiments, using new scientific documents as queries, to show that relevant documents are retrieved when DT-Tree structures are used to build the MVP-Tree.
Keywords :
document handling; electronic publishing; indexing; query processing; scientific information systems; tree data structures; DT-Tree representation; DocumentTerm-tree; MVP-Tree; academic paper; electronic publication; research paper; scientific document representation; scientific paper indexing; scientific paper searching; semantic similarities; structural similarities; Algorithm design and analysis; Arrays; Indexing; Physics; Semantics; Silicon; Document clustering; Semantic Analysis; Similarity Measure; dimension reduction; index structures; k-means; key term extraction; search; sparsity; text mining
Conference_Title :
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-8727-1
DOI :
10.1109/CSAE.2011.5952818