Conference record number:
4820
Article title:
Effective retrieval of related documents based on spelling correction to improve information retrieval system
Authors:
Houtinezhad, Maryam (m.houtinezhad@srbiau.ac.ir), Islamic Azad University-Ferdows Branch; Ghaffary, Hamid Reza (hamidghaffary53@yahoo.com), Islamic Azad University-Ferdows Branch
Keywords:
information retrieval, vector space model, statistical language model, crawler, similarity criterion, search engine
Conference title:
Third National Conference on Evolutionary Computing and Swarm Intelligence
Persian abstract:
The growing volume of documents that users publish on the global communications network has made managing and controlling the information it contains a challenge. Extracting useful knowledge and putting that information to use is therefore essential. Information retrieval techniques can satisfy user needs across large amounts of data, and search engines are users' first choice for finding information. Web crawlers play a key role in search engines: a web crawler is a script that navigates the web regularly as part of an automated process. This paper presents a query expansion method that combines the vector space model with a statistical language model to improve the retrieval of related documents. In the first approach, the concept vector of terms is extracted according to an ontology, and the conceptual similarity between the user's query and the documents is then calculated. In the second approach, the probabilistic similarity of query terms that may be misspelled is estimated, and each misspelled term is replaced with the correct term. The document database for the proposed method was compiled by the web crawler from various Wikipedia pages. The conceptual retrieval results show 84% accuracy, 88% recall, and an average precision of 56%, an improvement compared to other methods.
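The two steps named in the abstract, correcting possibly misspelled query terms and then ranking documents by vector similarity, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a unigram count model built from the document collection, candidate corrections within one edit of the query term, and plain term-frequency vectors in place of the ontology-based concept vectors. The function names (tokenize, edits1, correct, cosine, search) and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def edits1(word):
    """All strings one edit away: delete, transpose, replace, or insert."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(term, unigram_counts):
    """Replace an unknown term with its most frequent one-edit neighbour.

    Assumption: the unigram count stands in for the language-model
    probability; a term already in the vocabulary is left unchanged.
    """
    if term in unigram_counts:
        return term
    candidates = [w for w in edits1(term) if w in unigram_counts]
    if not candidates:
        return term
    # Ties on count are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (unigram_counts[w], w))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Correct the query, then rank documents by cosine similarity."""
    doc_tokens = [tokenize(d) for d in docs]
    counts = Counter(t for toks in doc_tokens for t in toks)
    corrected = [correct(t, counts) for t in tokenize(query)]
    query_vec = Counter(corrected)
    scored = [(cosine(query_vec, Counter(toks)), i)
              for i, toks in enumerate(doc_tokens)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return corrected, scored


if __name__ == "__main__":
    sample_docs = [
        "web crawler navigates pages",
        "vector space model for retrieval",
        "statistical language model",
    ]
    corrected_terms, ranking = search("vectr modl", sample_docs)
    print(corrected_terms)
    print(ranking)
```

In this sketch the misspelled query is corrected before any similarity is computed, so a typo does not reduce the match to zero. A fuller version would weight terms (for example with TF-IDF) and map them to ontology concepts as the abstract describes.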