DocumentCode
1766436
Title
Phrasal Recognition
Author
Farhadi, Alireza ; Sadeghi, Mohammad Amin
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of Washington, Seattle, WA, USA
Volume
35
Issue
12
fYear
2013
fDate
Dec. 2013
Firstpage
2854
Lastpage
2865
Abstract
In this paper, we introduce visual phrases, complex visual composites like "a person riding a horse." Visual phrases often display significantly reduced visual complexity compared to their component objects because the appearance of those objects can change profoundly when they participate in relations. We introduce a dataset suitable for phrasal recognition that uses familiar PASCAL object categories, and demonstrate significant experimental gains resulting from exploiting visual phrases. We show that a visual phrase detector significantly outperforms a baseline which detects component objects and reasons about relations, even though visual phrase training sets tend to be smaller than those for objects. We argue that any multiclass detection system must decode detector outputs to produce final results; this is usually done with nonmaximum suppression. We describe a novel decoding procedure that can account accurately for local context without solving difficult inference problems. We show this decoding procedure outperforms the state of the art. Finally, we show that decoding a combination of phrasal and object detectors produces real improvements in detector results.
Keywords
image coding; inference mechanisms; object detection; object recognition; PASCAL object categories; complex visual composites; detector output decoding; inference problems; local context; multiclass detection system; nonmaximum suppression; object appearance; object detectors; phrasal detectors; phrasal recognition; visual complexity; visual phrase detector; visual phrase training sets; Complexity theory; Data visualization; Decoding; Detectors; Image processing; Visual phrase; object interactions; object subcategories; scene understanding; single image activity recognition; visual composites
fLanguage
English
Journal_Title
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher
IEEE
ISSN
0162-8828
Type
jour
DOI
10.1109/TPAMI.2013.168
Filename
6587714