• DocumentCode
    2915662
  • Title
    Robust broad-scale benthic habitat mapping when training data is scarce
  • Author
    Ahsan, Nasir ; Williams, Stefan B. ; Pizarro, Oscar
  • Author_Institution
    Australian Centre for Field Robotics, Univ. of Sydney, Sydney, NSW, Australia
  • fYear
    2012
  • fDate
    21-24 May 2012
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Understanding the distribution of habitat classes at broad scales is of interest in marine park conservation and planning. Typically, sites of interest can extend up to many hundreds of square kilometers. However, collecting ground truth data (optical imagery, towed video, grab samples, etc.) over such broad scales is impractical, and only a small fraction of a site can be sampled, depending on budget constraints. Benthic habitat mapping involves learning the correlations between habitat classes derived from limited ground truth sampling of the seabed and its corresponding morphology, and extrapolating these correlations to the entire site. One important issue with such approaches is that the correlations are learned from limited data, which motivates the need to investigate robust techniques for learning the correlations and extrapolating them. In this paper we motivate the use of a generative classifier, Gaussian Mixture Models (GMMs), for the task of benthic habitat mapping instead of discriminative models such as Classification Trees (CTs, popular in the benthic habitat mapping literature) and Support Vector Machines (SVMs, generally popular in a variety of fields), based on the idea that generative classifiers take into account more information about the underlying data distribution than discriminative classifiers, yielding more robust extrapolations. Using holdout validation, we show that GMMs consistently perform comparably to, or outperform, the best classifier for all training set sizes (small and large), and that this is not the case for CTs and SVMs. We also show that GMMs are more certain about their predictions over the broad scale than the other classifiers.
  • Keywords
    environmental factors; environmental science computing; geophysics computing; learning (artificial intelligence); oceanography; pattern classification; GMM classifier; Gaussian mixture models; correlation extrapolation; correlation learning; generative classifiers; ground truth data; habitat class correlations; habitat class distribution; marine park conservation; marine park planning; robust broad scale benthic habitat mapping; seabed ground truth sampling; seabed morphology; training data; Biological system modeling; Correlation; Data models; Entropy; Support vector machines; Training; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    OCEANS, 2012 - Yeosu
  • Conference_Location
    Yeosu
  • Print_ISBN
    978-1-4577-2089-5
  • Type
    conf
  • DOI
    10.1109/OCEANS-Yeosu.2012.6263540
  • Filename
    6263540