• DocumentCode
    901877
  • Title
    Extraction of salient textual patterns: synergy between lexical cohesion and contextual coherence
  • Author
    Chan, Samuel W K
  • Author_Institution
    Dept. of Decision Sci., Chinese Univ. of Hong Kong, China
  • Volume
    34
  • Issue
    2
  • fYear
    2004
  • fDate
    3/1/2004 12:00:00 AM
  • Firstpage
    205
  • Lastpage
    218
  • Abstract
    Most current information retrieval systems rely solely on the repetition of lexical items, an approach that is notoriously vulnerable. In this research, we propose a novel method for the extraction of salient textual patterns. One of our major objectives is to move away from keywords and their associated limitations in textual information retrieval. We identify how individual sentences in a text fit together to be perceived as a salient pattern. A text network that exhibits textual continuity, arising from a connectionist model, is described. The network facilitates the dynamic extraction of salient textual segments by capturing semantics from two different categories of natural language, namely lexical cohesion and contextual coherence. We also present the results of an empirical study designed to compare the performance of our model with that of human judges in the identification of salient textual patterns. The preliminary results show that our model has potential for the automatic discovery of salient patterns in text.
  • Keywords
    feature extraction; information retrieval systems; knowledge acquisition; natural languages; text analysis; connectionist model; contextual coherence; information retrieval system; knowledge extraction; lexical cohesion; natural language; pattern extraction; semantic relatedness; text continuity; textual information retrieval; textual patterns; Artificial intelligence; Coherence; Councils; Data mining; Humans; Information analysis; Information retrieval; Knowledge engineering; Natural languages; Psychology
  • fLanguage
    English
  • Journal_Title
    Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4427
  • Type
    jour
  • DOI
    10.1109/TSMCA.2003.820570
  • Filename
    1268156