DocumentCode
234856
Title
On Modeling and Querying of Text Corpora
Author
Dingjia Liu ; Guohua Liu ; Yuanyuan Liu
Author_Institution
Sch. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
fYear
2014
fDate
15-16 Nov. 2014
Firstpage
299
Lastpage
303
Abstract
This article proposes a novel data model for text corpora and discusses issues in corpus querying. First, a formal definition of corpus data is presented. Second, a data model is proposed in terms of the relational model and is proved to be complete. On this basis, we extend the query semantics of the traditional corpus query that generates KWIC (Keyword in Context) concordances and define the corresponding query problems. Finally, we investigate the data complexity of these query problems and present an experiment. These conclusions lay a theoretical foundation for the study of the modeling and querying of text corpora.
Keywords
query processing; text analysis; KWIC; corpus data; corpus query; formalized definition; keyword in context; query problems; query semantics; text corpora modeling; text corpora querying; Calculus; Complexity theory; Data models; Databases; Educational institutions; Pragmatics; Semantics; corpus; data complexity; data model; query; relational model
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Security (CIS), 2014 Tenth International Conference on
Conference_Location
Kunming
Print_ISBN
978-1-4799-7433-7
Type
conf
DOI
10.1109/CIS.2014.37
Filename
7016904