• DocumentCode
    3403743
  • Title
    Audio retrieval by latent perceptual indexing
  • Author
    Sundaram, Shiva ; Narayanan, Shrikanth
  • Author_Institution
    Dept. of Electr. Eng.-Syst., Southern California Univ., Los Angeles, CA
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    49
  • Lastpage
    52
  • Abstract
    We present a query-by-example audio retrieval framework by indexing audio clips in a generic database as points in a latent perceptual space. First, feature vectors extracted from the clips in the database are grouped into reference clusters using an unsupervised clustering technique. An audio clip-to-cluster matrix is constructed by keeping count of the number of features that are quantized into each of the reference clusters. By singular-value decomposition of this matrix, each audio clip of the database is mapped into a point in the latent perceptual space. This is used for indexing the retrieval system. Since each of the initial reference clusters represents a specific perceptual quality in a perceptual space (similar to words that represent specific concepts in the semantic space), querying-by-example results in clips that have similar perceptual qualities. Subjective human evaluation indicates about 75% retrieval performance. Evaluation on semantic categories reveals that the system performance is comparable to other proposed methods.
  • Keywords
    audio signal processing; indexing; information retrieval systems; query processing; audio clip-to-cluster matrix; audio clips indexing; feature vector extraction; latent perceptual indexing; latent perceptual space; query-by-example audio retrieval; semantic category; singular value decomposition; unsupervised clustering; Acoustic measurements; Audio databases; Content based retrieval; Feature extraction; Humans; Indexing; Information retrieval; Labeling; Matrix decomposition; Spatial databases; audio clustering; audio indexing; audio representation; audio retrieval; query by example; similarity measure;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4517543
  • Filename
    4517543
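
The indexing steps summarized in the abstract (cluster the feature vectors, count how many of each clip's features fall into each reference cluster, apply singular-value decomposition to the clip-to-cluster matrix, then query by example) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the clustering method, any weighting of the counts, or the similarity measure, so plain k-means, raw counts, and cosine similarity are assumptions made here. The function names (`kmeans`, `quantize`, `build_index`, `embed`, `query`), the parameter values, and the synthetic feature data are likewise hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (an assumed choice of unsupervised clustering)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def quantize(frames, centers):
    """Assign each feature vector to its nearest reference cluster."""
    d = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def build_index(clips, k=8, rank=3, seed=0):
    """Return cluster centers, the truncated right singular vectors,
    and one latent-space point per database clip."""
    centers = kmeans(np.vstack(clips), k, seed=seed)
    # Clip-to-cluster matrix: row i counts clip i's features per cluster.
    counts = np.stack(
        [np.bincount(quantize(c, centers), minlength=k) for c in clips]
    ).astype(float)
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    rank = min(rank, len(s))
    # Rows of U*S equal counts @ Vt[:rank].T, i.e. the clips' latent coordinates.
    return centers, Vt[:rank], U[:, :rank] * s[:rank]

def embed(clip, centers, Vt_r):
    """Project a (possibly unseen) clip into the same latent space."""
    counts = np.bincount(quantize(clip, centers), minlength=len(centers))
    return Vt_r @ counts.astype(float)

def query(clip, centers, Vt_r, index, top=3):
    """Rank database clips by cosine similarity (assumed measure) to the query."""
    q = embed(clip, centers, Vt_r)
    norms = np.linalg.norm(index, axis=1) * np.linalg.norm(q) + 1e-12
    return np.argsort(-(index @ q) / norms)[:top]

# Synthetic stand-in for per-clip feature vectors (6 clips, 50 frames, 4 dims).
rng = np.random.default_rng(0)
clips = [rng.normal(loc=(i % 2) * 3.0, size=(50, 4)) for i in range(6)]
centers, Vt_r, index = build_index(clips, k=5, rank=2)
hits = query(clips[0], centers, Vt_r, index, top=3)
print(hits)
```

Projecting a query through the same right singular vectors places it in the space in which the database clips were indexed, so embedding a database clip again reproduces its stored point. In practice the feature vectors would come from an audio front end rather than random draws.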