Title :
P-top-k queries in probabilistic framework from information extraction models
Author :
He, Ming ; Du, Yong-ping
Author_Institution :
Coll. of Comput. Sci., Beijing Univ. of Technol., Beijing, China
Abstract :
Many applications today, such as information extraction (IE), data integration, sensor RFID networks, and scientific experiments, need to manage uncertain data. Top-k queries are often natural and useful for analyzing uncertain data in those applications. In this paper, we study the problem of answering top-k queries in a probabilistic framework from a state-of-the-art statistical IE model, semi-Conditional Random Fields (CRFs), in the setting of probabilistic databases that treat statistical models as first-class data objects. We investigate the problem of ranking the answers to probabilistic database queries. We present an efficient algorithm for finding the best approximating parameters in such a framework, so that the top-k ranked results can be retrieved efficiently. An empirical study using real data sets demonstrates the effectiveness of probabilistic top-k queries and the efficiency of our method.
Keywords :
probability; query processing; conditional random fields; data integration; information extraction; p-top-k queries; probabilistic databases query; probabilistic framework; scientific experiments; sensor RFID networks; Computational modeling; Data mining; Data models; Databases; Probabilistic logic; Training; Uncertainty; probabilistic databases; uncertain data
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
DOI :
10.1109/FSKD.2010.5569526