• DocumentCode
    1757602
  • Title
    Improving Query-by-Singing/Humming by Combining Melody and Lyric Information
  • Author
    Chung-Che Wang; Jyh-Shing Roger Jang
  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • Volume
    23
  • Issue
    4
  • fYear
    2015
  • fDate
    April 2015
  • Firstpage
    798
  • Lastpage
    806
  • Abstract
    This paper proposes a novel method for improving query-by-singing/humming systems by using both melody and lyric information. First, singing/humming discrimination is performed to distinguish between singing and humming queries; this is achieved by considering the similarity between acoustic models. For humming queries, a pitch-only melody recognition method that ranked first among the MIREX (Music Information Retrieval Evaluation eXchange) query-by-singing/humming task submissions is applied. For singing queries, lyric similarity is computed using speech recognition techniques; the computed similarity is then combined with the melody distance to exploit the additional information in the lyrics. Several methods for combining melody distance and lyric similarity are investigated. Under the optimal experimental settings, the proposed query-by-singing/humming system achieves a 51.19% error rate reduction for the top-10 retrieved results, indicating the feasibility of the proposed method.
  • Keywords
    music; query processing; speech recognition; MIREX query-by-humming task submissions; MIREX query-by-singing task submissions; acoustic models; lyric information; melody information; music information retrieval evaluation exchange; pitch-only melody recognition method; query-by-humming system; query-by-singing system; singing-humming discrimination; Accuracy; Acoustics; Databases; IEEE transactions; Speech; Speech recognition; Vectors; Combined melody distance and lyric similarity; query-by-singing/humming (QBSH); singing voice recognition; singing/humming discrimination (SHD)
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2409735
  • Filename
    7055864
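
The retrieval flow summarized in the abstract (discriminate singing from humming, use melody alone for humming, fuse melody distance with lyric similarity for singing) can be sketched as follows. This is only an illustrative sketch: the abstract says several combination methods were investigated but does not specify them, so the min-max normalization, the weighted linear fusion, and the names `min_max_normalize`, `rank_songs`, and `weight` are all assumptions, not the authors' method. The singing/humming decision is taken as a given boolean rather than computed from acoustic models.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span > 0 else 0.0 for k, v in scores.items()}


def rank_songs(
    melody_distance: Dict[str, float],
    lyric_similarity: Dict[str, float],
    is_singing: bool,
    weight: float = 0.5,
) -> List[str]:
    """Return song IDs sorted best-first.

    melody_distance: smaller means a closer melodic match.
    lyric_similarity: larger means a closer lyric match.
    weight: hypothetical share given to the melody term (assumption).
    """
    melody = min_max_normalize(melody_distance)
    if not is_singing:
        # Humming queries carry no lyrics, so rank on melody alone.
        return sorted(melody, key=melody.get)

    lyric = min_max_normalize(lyric_similarity)
    # Turn similarity into a distance so both terms are "smaller is better".
    fused = {
        song: weight * melody[song] + (1.0 - weight) * (1.0 - lyric[song])
        for song in melody
    }
    return sorted(fused, key=fused.get)


if __name__ == "__main__":
    melody = {"a": 1.0, "b": 2.0, "c": 3.0}
    lyrics = {"a": 0.1, "b": 0.9, "c": 0.2}
    print(rank_songs(melody, lyrics, is_singing=False))
    print(rank_songs(melody, lyrics, is_singing=True))
```

In this toy example the lyric term can promote a song whose melody match is only second best, which is the kind of additional information the abstract attributes to the lyrics.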