Mining Discriminative Food Regions for Accurate Food Recognition
- URL: http://arxiv.org/abs/2207.03692v1
- Date: Fri, 8 Jul 2022 05:09:24 GMT
- Title: Mining Discriminative Food Regions for Accurate Food Recognition
- Authors: Jianing Qiu, Frank P.-W. Lo, Yingnan Sun, Siyao Wang, Benny Lo
- Abstract summary: We propose a novel network architecture in which a primary network maintains the base accuracy of classifying an input image.
An auxiliary network adversarially mines discriminative food regions, and a region network classifies the resulting mined regions.
The proposed architecture, denoted PAR-Net, is end-to-end trainable and highlights discriminative regions in an online fashion.
- Score: 16.78437844398436
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automatic food recognition is the very first step towards passive dietary
monitoring. In this paper, we address the problem of food recognition by mining
discriminative food regions. Taking inspiration from Adversarial Erasing, a
strategy that progressively discovers discriminative object regions for weakly
supervised semantic segmentation, we propose a novel network architecture in
which a primary network maintains the base accuracy of classifying an input
image, an auxiliary network adversarially mines discriminative food regions,
and a region network classifies the resulting mined regions. The global (the
original input image) and the local (the mined regions) representations are
then integrated for the final prediction. The proposed architecture, denoted
PAR-Net, is end-to-end trainable and highlights discriminative regions in an
online fashion. In addition, we introduce a new fine-grained food dataset named
Sushi-50, which consists of 50 different sushi categories. Extensive
experiments have been conducted to evaluate the proposed approach. On the three
food datasets chosen (Food-101, Vireo-172, and Sushi-50), our approach performs
consistently and achieves state-of-the-art results (top-1 testing accuracy of
$90.4\%$, $90.2\%$, and $92.0\%$, respectively) compared with other existing
approaches. Dataset and code are available at
https://github.com/Jianing-Qiu/PARNet
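The mine, erase, and fuse idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`mine_region`, `erase`, `fuse`), the thresholding rule, and the probability averaging are all assumptions made for illustration, and the toy activation map stands in for whatever the primary and auxiliary networks would actually produce. In particular, the abstract says the global and local representations are integrated but does not say how; averaging class probabilities is only one simple possibility.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mine_region(activation, ratio=0.5):
    """Return the inclusive box (top, left, bottom, right) around all cells
    whose activation is at least ratio * max. Hypothetical mining rule;
    assumes non-negative activations."""
    peak = max(max(row) for row in activation)
    cells = [
        (r, c)
        for r, row in enumerate(activation)
        for c, value in enumerate(row)
        if value >= ratio * peak
    ]
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


def erase(activation, box):
    """Zero out the mined box so a second pass must look elsewhere."""
    top, left, bottom, right = box
    return [
        [0.0 if top <= r <= bottom and left <= c <= right else value
         for c, value in enumerate(row)]
        for r, row in enumerate(activation)
    ]


def fuse(global_logits, local_logits, weight=0.5):
    """Blend global (whole image) and local (mined region) predictions.
    Averaging probabilities is an assumption, not the paper's fusion."""
    g = softmax(global_logits)
    l = softmax(local_logits)
    return [weight * a + (1.0 - weight) * b for a, b in zip(g, l)]


# Toy 4x4 activation map (made-up numbers, not real network output).
activation = [
    [0.1, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.1, 0.0],
    [0.0, 0.0, 0.3, 0.4],
]

first_box = mine_region(activation)
# A real pipeline would recompute activations on the erased image;
# here the erased map itself is reused as a stand-in.
second_box = mine_region(erase(activation, first_box))

fused = fuse(global_logits=[2.0, 1.0, 0.0], local_logits=[0.0, 3.0, 0.0])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this toy run the first pass boxes the strongest responses, and erasing that box forces the second pass onto a different, weaker region, which is the intuition behind adversarial mining. The fused prediction can also differ from the global one when the mined region gives strong evidence for another class.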
Related papers
- RAFA-Net: Region Attention Network For Food Items And Agricultural Stress Recognition [18.77864122988639]
This work proposes a region attention scheme for modelling long-range dependencies by building a correlation among different regions within an input image.
The proposed Region Attention network for Food items and Agricultural stress recognition method, dubbed RAFA-Net, has been evaluated on three public food datasets.
The highest top-1 accuracies of RAFA-Net are 91.69%, 91.56%, and 96.97% on the UECFood-100, UECFood-256, and MAFood-121 datasets, respectively.
arXiv Detail & Related papers (2024-10-16T16:28:08Z)
- Spatial Entity Resolution between Restaurant Locations and Transportation Destinations in Southeast Asia [0.054390204258189995]
This paper attempts to recognize identical place entities from databases of Points-of-Interest (POI) and GrabFood restaurants.
POI-restaurant matching was conducted separately for Singapore, Philippines, Indonesia, and Malaysia.
Experimental estimates demonstrate that a matching POI can be found for over 35% of restaurants in these countries.
arXiv Detail & Related papers (2024-01-16T17:59:54Z)
- Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food.
One challenge is that food items can overlap and mix, making them difficult to distinguish.
Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional Encoder representation from Image Transformers (BEiT).
The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
- Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
- Food Ingredients Recognition through Multi-label Learning [0.0]
The ability to recognize various food items on a generic food plate is a key determinant for an automated diet assessment system.
We employ a deep multi-label learning approach and evaluate several state-of-the-art neural networks for their ability to detect an arbitrary number of ingredients in a dish image.
arXiv Detail & Related papers (2022-10-24T10:18:26Z)
- Cross-lingual Adaptation for Recipe Retrieval with Mixup [56.79360103639741]
Cross-modal recipe retrieval has attracted research attention in recent years, thanks to the availability of large-scale paired data for training.
This paper studies unsupervised domain adaptation for image-to-recipe retrieval, where recipes in source and target domains are in different languages.
A novel recipe mixup method is proposed to learn transferable embedding features between the two domains.
arXiv Detail & Related papers (2022-05-08T15:04:39Z)
- FoodLogoDet-1500: A Dataset for Large-Scale Food Logo Detection via Multi-Scale Feature Decoupling Network [55.49022825759331]
A large-scale food logo dataset is urgently needed for developing advanced food logo detection algorithms.
FoodLogoDet-1500 is a new large-scale publicly available food logo dataset with 1,500 categories, about 100,000 images and about 150,000 manually annotated food logo objects.
We propose a novel food logo detection method, the Multi-scale Feature Decoupling Network (MFDNet), to solve the problem of distinguishing multiple food logo categories.
arXiv Detail & Related papers (2021-08-10T12:47:04Z)
- Visual Aware Hierarchy Based Food Recognition [10.194167945992938]
We propose a new two-step food recognition system using Convolutional Neural Networks (CNNs) as the backbone architecture.
The food localization step is based on an implementation of the Faster R-CNN method to identify food regions.
In the food classification step, visually similar food categories can be clustered together automatically to generate a hierarchical structure.
arXiv Detail & Related papers (2020-12-06T20:25:31Z)
- Collaborative Training between Region Proposal Localization and Classification for Domain Adaptive Object Detection [121.28769542994664]
Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.
In this paper, we are the first to reveal that the region proposal network (RPN) and region proposal classifier (RPC) demonstrate significantly different transferability when facing a large domain gap.
arXiv Detail & Related papers (2020-09-17T07:39:52Z)
- ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network [50.7720194859196]
We introduce the dataset ISIA Food-500, with 500 categories from the list in Wikipedia and 399,726 images.
This dataset surpasses existing popular benchmark datasets by category coverage and data volume.
We propose a stacked global-local attention network, which consists of two sub-networks for food recognition.
arXiv Detail & Related papers (2020-08-13T02:48:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.