• Title of article

    Image Parsing: Unifying Segmentation, Detection, and Recognition

  • Author/Authors

    ZHUOWEN TU AND XIANGRONG CHEN، نويسنده , , ALAN L. YUILLE، نويسنده , , SONG-CHUN ZHU، نويسنده ,

  • Issue Information
    روزنامه با شماره پیاپی سال 2005
  • Pages
    28
  • From page
    113
  • To page
    140
  • Abstract
    In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation as a “parsing graph”, in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and re-configures it dynamically using a set of moves, which are mostly reversible Markov chain jumps. This computational framework integrates two popular inference approaches—generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters. In our Markov chain algorithm design, the posterior probability, defined by the generative models, is the invariant (target) probability for the Markov chain, and the discriminative probabilities are used to construct proposal probabilities to drive the Markov chain. Intuitively, the bottom-up discriminative probabilities activate top-down generative models. In this paper, we focus on two types of visual patterns—generic visual patterns, such as texture and shading, and object patterns including human faces and text. These types of patterns compete and cooperate to explain the image and so image parsing unifies image segmentation, object detection, and recognition (if we use generic visual patterns only then image parsing will correspond to image segmentation (Tu and Zhu, 2002. IEEE Trans. PAMI, 24(5):657-673). We illustrate our algorithm on natural images of complex city scenes and show examples where image segmentation can be improved by allowing object specific knowledge to disambiguate low-level segmentation cues, and conversely where object detection can be improved by using generic visual patterns to explain away shadows and occlusions.
  • Keywords
    image parsing , image segmentation , Object detection , Object recognition , AdaBoost , data driven Markov ChainMonte Carlo
  • Journal title
    INTERNATIONAL JOURNAL OF COMPUTER VISION
  • Serial Year
    2005
  • Journal title
    INTERNATIONAL JOURNAL OF COMPUTER VISION
  • Record number

    828129