• DocumentCode
    730665
  • Title
    Analysis and automatic recognition of Human BeatBox sounds: A comparative study
  • Author
    Picart, Benjamin ; Brognaux, Sandrine ; Dupont, Stephane
  • Author_Institution
    TCTS Lab., Univ. of Mons, Mons, Belgium
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4255
  • Lastpage
    4259
  • Abstract
    “Human BeatBox” (HBB) is a newly expanding contemporary singing style in which the vocalist imitates percussive drum-beat sounds as well as pitched musical instrument sounds. Drum sounds typically use a notation based on plosives and fricatives, and instrument sounds cover vocalisations that go beyond spoken-language vowels. HBB hence constitutes an interesting use case for extending techniques initially developed for speech processing, with the goal of automatically annotating performances as well as developing new sound effects dedicated to HBB performers. In this paper, we investigate three complementary aspects of HBB analysis: pitch tracking, onset detection, and automatic recognition of sounds and instruments. As a first step, a new high-quality HBB audio database has been recorded, carefully segmented, and manually annotated to obtain a ground-truth reference. Various pitch tracking and onset detection methods are then compared and assessed against this reference. Finally, Hidden Markov Models are evaluated, together with an exploration of their parameter space, for the automatic recognition of different types of sounds. The experimental results of this study are very encouraging.
  • Keywords
    speech recognition; HBB; HBB audio database; Hidden Markov Models; automatic recognition; drum beats percussive sounds; expanding techniques; ground truth reference; human BeatBox sounds; musical instrument sounds; onset detection methods; pitch tracking; sound effects; speech processing; spoken language vowels; Databases; Error analysis; Feature extraction; Hidden Markov models; Instruments; Speech; Speech processing; Hidden Markov Model; Human beatbox; automatic speech recognition; onset detection; pitch tracking;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178773
  • Filename
    7178773