Author_Institution :
Grad. Sch. of Sci. & Eng., Ritsumeikan Univ., Kusatsu, Japan
Abstract :
In recent years, unhealthy diets have increasingly threatened people's lives through associated risks such as heart stroke and liver trouble. Maintaining a healthy life has therefore attracted much attention, and managing one's dietary life is becoming more and more important. In this research, we aim to construct an automatic recognition system for food images that keeps daily food-log records, which will contribute to managing dietary life. With food images easily taken by mobile phone, our food recognition system is expected to give users insight into their daily diet. To achieve acceptable recognition performance on food images, we propose to apply a sparse model for coding the local descriptors extracted from the food images, together with various pooling methods for aggregating the coded descriptors. Sparse coding, an extension of the vector quantization of local descriptors that is popularly used in Bag-of-Features (BoF) image representation, can reconstruct the local descriptors more effectively and thus yield more discriminative features for food image representation. However, in order to emphasize the most strongly activated pattern, the widely applied aggregation strategy for sparse coded vectors retains only the maximum coefficient (named Max-pooling), which completely ignores the frequency of the activated patterns, an important signature for identifying different types of images. Therefore, we explore a hybrid aggregation strategy named top-ranked average pooling (TRAP), which integrates not only the maximum activation magnitude but also the number of strongly activated patterns for image representation. Experiments validate that the proposed hybrid aggregation strategy combined with the sparse model greatly improves the recognition rates compared with the conventional BoF model and the state-of-the-art methods on two databases: our constructed RFID and the public PFID.
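The contrast between Max-pooling and a top-ranked average can be illustrated with a minimal sketch. The abstract does not give the exact TRAP formula, so the version below is an assumption: it averages the k largest absolute coefficients of each codeword over all local descriptors. The function names, the parameter `k`, and the toy `codes` matrix are hypothetical and serve only as illustration.

```python
def max_pooling(codes):
    """Keep the largest absolute coefficient of each codeword."""
    n_atoms = len(codes[0])
    return [max(abs(row[j]) for row in codes) for j in range(n_atoms)]


def top_ranked_average_pooling(codes, k):
    """Average the k largest absolute coefficients of each codeword.

    Assumed form of TRAP; the abstract does not state the formula.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n_atoms = len(codes[0])
    pooled = []
    for j in range(n_atoms):
        # Rank this codeword's activations over all descriptors.
        ranked = sorted((abs(row[j]) for row in codes), reverse=True)
        top = ranked[:k]
        pooled.append(sum(top) / len(top))
    return pooled


# Hypothetical sparse codes: 4 local descriptors (rows), 3 codewords (columns).
codes = [
    [0.9, 0.0, 0.0],
    [0.0, 0.8, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.7, 0.1],
]

max_pooled = max_pooling(codes)
trap_pooled = top_ranked_average_pooling(codes, k=2)
print("max :", max_pooled)
print("trap:", trap_pooled)
```

In this toy example the first two codewords receive the same Max-pooled value even though the second is activated by three descriptors and the first by only one. The top-ranked average gives the second codeword a higher value, so under this assumed form the pooled feature reflects both the activation magnitude and how often the pattern is strongly activated.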
Keywords :
food processing industry; image coding; image representation; radiofrequency identification; vectors; RFID; bag-of-features; dietary life; food image representation; food recognition; hybrid aggregation; max-pooling; sparse coded descriptors; sparse coded vector; top-ranked average pooling; vector quantization; Encoding; Feature extraction; Image coding; Image recognition; Image representation; Vector quantization; Vectors; food image; image recognition; pooling; sparse coding;