• Title of article

    Document Expansion for Speech Retrieval

  • Author/Authors

    Singhal, Amit (author); Pereira, Fernando (author)

  • Issue Information
    Periodical with serial issue number, year 1999
  • Pages
    -33
  • From page
    34
  • To page
    0
  • Abstract
    Advances in automatic speech recognition allow us to search large speech collections using traditional information retrieval methods. The problem of "aboutness" for documents - is a document about a certain concept - has been at the core of document indexing for the entire history of IR. This problem is more difficult for speech indexing since automatic speech transcriptions often contain mistakes. In this study we show that document expansion can be successfully used to alleviate the effect of transcription mistakes on speech retrieval. The loss of retrieval effectiveness due to automatic transcription errors can be reduced by document expansion from 15-27% relative to retrieval from human transcriptions to only about 7-13%, even for automatic transcriptions with word error rates as high as 65%. For good automatic transcriptions (25% word error rate), retrieval effectiveness with document expansion is indistinguishable from retrieval from human transcriptions. This makes speech retrieval from automatic transcriptions, even poor ones, competitive with retrieval from perfect transcriptions.
  • Keywords
    Digital library, archival documents
  • Journal title
    SIGIR FORUM
  • Serial Year
    1999
  • Record number

    16788