• DocumentCode
    234856
  • Title
    On Modeling and Querying of Text Corpora
  • Author
    Dingjia Liu ; Guohua Liu ; Yuanyuan Liu
  • Author_Institution
    Sch. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
  • fYear
    2014
  • fDate
    15-16 Nov. 2014
  • Firstpage
    299
  • Lastpage
    303
  • Abstract
    This article proposes a novel data model for text corpora and discusses issues in corpus querying. First, a formal definition of corpus data is presented. Second, a data model is proposed in terms of the relational model and is proved to be complete. On this basis, we extend the query semantics of traditional corpus queries, which generate KWIC (Keyword in Context) concordances, and define the corresponding query problems. Finally, we investigate the data complexity of these query problems and present an experiment. These results lay a theoretical foundation for the study of modeling and querying text corpora.
  • Keywords
    query processing; text analysis; KWIC; corpus data; corpus query; formalized definition; keyword in context; query problems; query semantics; text corpora modeling; text corpora querying; Calculus; Complexity theory; Data models; Databases; Educational institutions; Pragmatics; Semantics; corpus; data complexity; data model; query; relational model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Security (CIS), 2014 Tenth International Conference on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-4799-7433-7
  • Type
    conf
  • DOI
    10.1109/CIS.2014.37
  • Filename
    7016904