• DocumentCode
    2010840
  • Title
    Laplacian Eigenmaps for automatic news story segmentation
  • Author
    Liu, Zihan ; Xie, Lei ; Zheng, Lilei
  • Author_Institution
    Shaanxi Provincial Key Lab. of Speech & Image Inf. Process., Northwestern Polytech. Univ., Xi'an, China
  • fYear
    2010
  • fDate
    23-25 Nov. 2010
  • Firstpage
    419
  • Lastpage
    424
  • Abstract
    This paper presents a novel lexical-similarity-based approach to automatic story segmentation in broadcast news. When measuring the connection between a pair of sentences, we take two factors into consideration, i.e. the lexical similarity and the distance between them in the text stream. Further investigation of pairwise connections between sentences is based on the technique of Laplacian Eigenmaps (LE). Taking advantage of the LE algorithm, we construct a Euclidean space in which each sentence is mapped to a vector. The original connective strength between sentences is reflected by the Euclidean distances between the corresponding vectors in the target space of the map. Further analysis of the map leads to a straightforward criterion for optimal segmentation. Then we formalize story segmentation as a minimization problem and give a dynamic programming solution to it. Experimental results on the TDT2 corpus show that the proposed method outperforms several state-of-the-art lexical-similarity-based methods.
  • Keywords
    dynamic programming; eigenvalues and eigenfunctions; image segmentation; video signal processing; Euclidean space; Laplacian eigenmaps; TDT2 corpus; automatic news story segmentation; broadcast news; dynamic programming; lexical similarity; optimal segmentation; original connective strength; text stream; Dynamic programming; Eigenvalues and eigenfunctions; Laplace equations; Speech; Speech processing; Speech recognition; Symmetric matrices;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Audio Language and Image Processing (ICALIP), 2010 International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-5856-1
  • Type
    conf
  • DOI
    10.1109/ICALIP.2010.5684548
  • Filename
    5684548
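
  • Illustrative sketch (not part of the original record)
    The abstract describes three steps: weight each sentence pair by lexical similarity and by distance in the text stream, embed the sentences with Laplacian Eigenmaps, and find story boundaries by dynamic programming. The Python sketch below is one possible reading of that pipeline, not the authors' implementation. Everything specific in it is an assumption: the function names, the cosine similarity on term counts, the exponential `decay ** gap` distance penalty, the embedding dimension `dim`, the fixed number of stories `k`, and the within-segment squared-distance cost used as the minimization criterion.

```python
import numpy as np


def connection_matrix(X, decay=0.5):
    """Pairwise connection strength between sentences (assumed form).

    X is an (n_sentences, vocab_size) term-count matrix in stream order.
    Strength = cosine similarity * decay ** |i - j|, so sentences that are
    lexically similar AND close together in the stream are strongly linked.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                    # avoid division by zero
    U = X / norms
    cos = U @ U.T
    idx = np.arange(len(X))
    gap = np.abs(idx[:, None] - idx[None, :])
    W = cos * decay ** gap
    np.fill_diagonal(W, 0.0)                   # no self-loops
    return W


def laplacian_eigenmap(W, dim=2):
    """Embed each sentence as a dim-dimensional vector.

    Solves L y = lambda D y through the symmetric normalized Laplacian and
    keeps the eigenvectors after the trivial (constant) one.
    """
    d = W.sum(axis=1)
    d[d == 0] = 1.0                            # isolated sentences
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)            # eigenvalues in ascending order
    return vecs[:, 1:dim + 1] * d_inv_sqrt[:, None]


def segment(Y, k):
    """Split the rows of Y into k contiguous segments by dynamic programming.

    Minimizes the total squared Euclidean distance of each vector to the
    centroid of its segment (an assumed criterion). Returns the k - 1
    indices at which a new segment starts.
    """
    n = len(Y)
    # Prefix sums give the cost of any segment in O(dim) time.
    S1 = np.vstack([np.zeros(Y.shape[1]), np.cumsum(Y, axis=0)])
    S2 = np.concatenate([[0.0], np.cumsum((Y ** 2).sum(axis=1))])

    def cost(i, j):                            # sentences i .. j-1
        s = S1[j] - S1[i]
        return (S2[j] - S2[i]) - (s @ s) / (j - i)

    best = np.full((k + 1, n + 1), np.inf)     # best[m, j]: first j sentences, m segments
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1, i] + cost(i, j)
                if c < best[m, j]:
                    best[m, j] = c
                    back[m, j] = i

    starts, j = [], n
    for m in range(k, 0, -1):                  # walk the back-pointers
        j = back[m, j]
        starts.append(int(j))
    return sorted(starts)[1:]                  # drop the leading 0


if __name__ == "__main__":
    # Toy stream: three sentences on one topic, then three on another.
    X = np.array([[2, 1, 0, 0, 1], [1, 2, 0, 0, 1], [2, 2, 0, 0, 1],
                  [0, 0, 2, 1, 1], [0, 0, 1, 2, 1], [0, 0, 2, 2, 1]])
    Y = laplacian_eigenmap(connection_matrix(X), dim=1)
    print(segment(Y, 2))
```

    In practice the number of stories is unknown, so a real system would need a criterion that also chooses `k` (for example a per-boundary penalty); the fixed `k` here only keeps the sketch short.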