• DocumentCode
    730880
  • Title

    Micbots: Collecting large realistic datasets for speech and audio research using mobile robots

  • Author

    Le Roux, Jonathan ; Vincent, Emmanuel ; Hershey, John R. ; Ellis, Daniel P. W.

  • Author_Institution
    MERL, Cambridge, MA, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    5635
  • Lastpage
    5639
  • Abstract
    Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. Large benchmark datasets for automatic speech recognition (ASR) have been instrumental in the advancement of speech recognition technologies. However, when it comes to robust ASR, source separation, and localization, especially using microphone arrays, the perfect dataset is out of reach, and many different data collection efforts have each made different compromises between the conflicting factors of realism, ground truth, and cost. Our goal here is to escape some of the most difficult trade-offs by proposing MICbots, a low-cost method of collecting large amounts of realistic data where annotations and ground truth are readily available. Our key idea is to use freely moving robots equipped with microphones and loudspeakers, playing recorded utterances from existing (already annotated) speech datasets. We give an overview of previous data collection efforts and the trade-offs they make, and describe the benefits of using our robot-based approach. We finally explain the use of this method to collect room impulse response measurements.
  • Keywords
    audio signal processing; loudspeakers; microphone arrays; mobile robots; source separation; speech processing; speech recognition; transient response; ASR; MICbots; automatic speech recognition; data collection; room impulse response measurement; speech datasets; speech recognition technologies; speech signal processing; Acoustics; Microphones; Noise; Robots; Speech; resources; robust ASR; room acoustics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type

    conf

  • DOI
    10.1109/ICASSP.2015.7179050
  • Filename
    7179050