• DocumentCode
    258034
  • Title
    Human and machine annotation in the Orchive, a large scale bioacoustic archive
  • Author
    Ness, Steven ; Tzanetakis, George
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Victoria, Victoria, BC, Canada
  • fYear
    2014
  • fDate
    3-5 Dec. 2014
  • Firstpage
    1136
  • Lastpage
    1140
  • Abstract
    Advances in computer technology have enabled the collection, digitization, and automated processing of huge archives of bioacoustic sound. Many of the tools previously used in bioacoustics research work well with small to medium-sized audio collections, but are challenged when processing large collections ranging from tens of terabytes to petabyte size. The Orchive is a system that helps researchers listen to, view, and annotate large bioacoustic archives, and run advanced audio feature extraction and machine learning algorithms on them. Annotation is one of the biggest challenges in our work. In this paper, we describe our efforts to involve experts as well as citizen scientists in the process of annotating recordings. The Orchive contains over 23,000 hours of orca vocalizations collected over the course of 30 years, and represents one of the largest continuous collections of bioacoustic recordings in the world. Manual annotation is practically impossible, and we therefore investigate the effectiveness of a semi-automatic approach for extracting information from these recordings, and show various experimental results. Finally, we have been able to apply our automatic analysis over a large portion of the archive, and we describe the computational resources required. To the best of our knowledge, this is the largest archive of bioacoustic data that has ever been automatically analyzed.
  • Keywords
    audio signal processing; bioacoustics; data analysis; Orchive; advanced audio feature extraction algorithm; annotating recordings; automatic analysis; bioacoustic data; bioacoustic recordings; bioacoustic sound; human annotation; large scale bioacoustic archive; machine annotation; machine learning algorithm; manual annotation; medium-sized audio collections; orca vocalization; Accuracy; Kernel; Logistics; Machine learning algorithms; Support vector machines; Whales;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on
  • Conference_Location
    Atlanta, GA
  • Type
    conf
  • DOI
    10.1109/GlobalSIP.2014.7032299
  • Filename
    7032299