• DocumentCode
    336815
  • Title

    Improving the suitability of imperfect transcriptions for information retrieval from spoken documents

  • Author

    Siegler, Matthew ; Witbrock, M.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    505
  • Abstract
    There has been a considerable focus on information retrieval for multimedia databases. When speech is used as the source material for multimedia indexing, the effect of transcriber error on retrieval effectiveness must be considered. This paper describes a method for measuring the relevance of documents to queries when information about the probability of word transcription error is available. To support the use of this technique, a method is presented for estimating word error probability in speech recognition engines that use word graphs (lattices). An information retrieval experiment using this technique on a large corpus of spoken documents is discussed. The method was able to reduce the difference in retrieval effectiveness between reference texts and hypothesized texts by 13-38%, depending on the size of the document set.
  • Keywords
    error statistics; information retrieval; multimedia databases; natural languages; search engines; speech recognition; computer speech recognition; document set size; hypothesized texts; imperfect transcriptions; multimedia indexing; queries; reference texts; source material; speech recognition engines; spoken documents; word graphs; word lattices; word transcription error probability; Computer errors; Content based retrieval; Data engineering; Engines; Error probability; Frequency; Indexing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758173
  • Filename
    758173