• DocumentCode
    394139
  • Title

    An efficient decomposition of human-written summary sentence

  • Author

    Le, Nguyen Minh ; Horiguchi, Susumu

  • Author_Institution
    Graduate Sch. of Inf. Sci., Japan Adv. Inst. of Sci. & Technol., Ishikawa, Japan
  • Volume
    2
  • fYear
    2002
  • fDate
    18-22 Nov. 2002
  • Firstpage
    705
  • Abstract
    The task of decomposing human-written summaries takes a pair consisting of an original document and its summary and determines which components of each summary sentence come from which places in the document. The task addresses three questions: 1) Was the summary sentence constructed by cutting and pasting? 2) Which components in the sentence come from the original document? 3) Where in the document do the components come from? The output of a decomposition program serves as training and evaluation data for the reduction and combination steps of a cut-and-paste summarization system. We propose a method that improves the accuracy of the decomposition task by checking the position of, and a semantic measure for, each word in a summary sentence. The model used in this paper extends the hidden Markov model described by Hongyan Jing and K.R. McKeown (1999). Experimental results on the DUC data show that the proposed method is efficient.
  • Keywords
    hidden Markov models; natural languages; text analysis; cut and paste summarization system; data evaluation; decomposition program; hidden Markov model; human-written summary sentence decomposition; original document; semantic measure; summary sentence; training data; Dictionaries; Dynamic programming; Hidden Markov models; Humans; Information science; Large-scale systems; Mathematical model; Position measurement; Training data; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
  • Print_ISBN
    981-04-7524-1
  • Type

    conf

  • DOI
    10.1109/ICONIP.2002.1198149
  • Filename
    1198149