DocumentCode :
234856
Title :
On Modeling and Querying of Text Corpora
Author :
Dingjia Liu ; Guohua Liu ; Yuanyuan Liu
Author_Institution :
Sch. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
fYear :
2014
fDate :
15-16 Nov. 2014
Firstpage :
299
Lastpage :
303
Abstract :
This article proposes a novel data model for text corpora and discusses issues in corpus querying. First, a formal definition of corpus data is presented. Second, a data model is proposed in terms of the relational model and is proved to be complete. On this basis, we extend the query semantics of the traditional corpus query, which generates KWIC (Keyword in Context) concordances, and define the resulting query problems. Finally, we investigate the data complexity of these query problems and present an experiment. These conclusions lay a theoretical foundation for the study of the modeling and querying of text corpora.
Keywords :
query processing; text analysis; KWIC; corpus data; corpus query; formalized definition; keyword in context; query problems; query semantics; text corpora modeling; text corpora querying; Calculus; Complexity theory; Data models; Databases; Educational institutions; Pragmatics; Semantics; corpus; data complexity; data model; query; relational model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Security (CIS), 2014 Tenth International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4799-7433-7
Type :
conf
DOI :
10.1109/CIS.2014.37
Filename :
7016904