Related papers: Artificial Intelligence in the Food Industry: Food Waste Estimation based on Computer Vision, a Brief Case Study in a University Dining Hall

URL: http://arxiv.org/abs/2507.14662v1
Date: Sat, 19 Jul 2025 15:21:29 GMT
Title: Artificial Intelligence in the Food Industry: Food Waste Estimation based on Computer Vision, a Brief Case Study in a University Dining Hall
Authors: Shayan Rokhva, Babak Teimourpour
Abstract summary: This study presents a cost-effective computer vision framework that estimates plate-level food waste. Four fully supervised models were trained using a capped dynamic inverse-frequency loss and the AdamW optimizer. All models achieved satisfying performance, and for each food type, at least one model approached or surpassed 90% DPA.
Score: 1.864621482724548
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quantifying post-consumer food waste in institutional dining settings is essential for supporting data-driven sustainability strategies. This study presents a cost-effective computer vision framework that estimates plate-level food waste by utilizing semantic segmentation of RGB images taken before and after meal consumption across five Iranian dishes. Four fully supervised models (U-Net, U-Net++, and their lightweight variants) were trained using a capped dynamic inverse-frequency loss and AdamW optimizer, then evaluated through a comprehensive set of metrics, including Pixel Accuracy, Dice, IoU, and a custom-defined Distributional Pixel Agreement (DPA) metric tailored to the task. All models achieved satisfying performance, and for each food type, at least one model approached or surpassed 90% DPA, demonstrating strong alignment in pixel-wise proportion estimates. Lighter models with reduced parameter counts offered faster inference, achieving real-time throughput on an NVIDIA T4 GPU. Further analysis showed superior segmentation performance for dry and more rigid components (e.g., rice and fries), while more complex, fragmented, or viscous dishes, such as stews, showed reduced performance, specifically post-consumption. Despite limitations such as reliance on 2D imaging, constrained food variety, and manual data collection, the proposed framework is pioneering and represents a scalable, contactless solution for continuous monitoring of food consumption. This research lays foundational groundwork for automated, real-time waste tracking systems in large-scale food service environments and offers actionable insights and outlines feasible future directions for dining hall management and policymakers aiming to reduce institutional food waste.
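The abstract describes segmenting RGB images taken before and after a meal and comparing pixel-wise proportions, with IoU and Dice among the reported metrics. The following is a minimal sketch of that idea, not the authors' implementation: it assumes masks are small 2D lists of integer class labels, that both photos share the same camera geometry, and that 2D pixel area is an acceptable proxy for food quantity (the abstract itself lists reliance on 2D imaging as a limitation). The function names `class_pixels`, `iou`, `dice`, and `leftover_fraction` are hypothetical, and the paper's custom DPA metric is not reproduced here because the abstract does not define it.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2D grid of integer class labels (assumed format)


def class_pixels(mask: Mask, cls: int) -> int:
    """Count the pixels labelled `cls` in a 2D label mask."""
    return sum(1 for row in mask for label in row if label == cls)


def _intersection(pred: Mask, truth: Mask, cls: int) -> int:
    """Count pixels labelled `cls` in both masks at the same position."""
    return sum(
        1
        for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
        if p == cls and t == cls
    )


def iou(pred: Mask, truth: Mask, cls: int) -> float:
    """Intersection over union for one class; 1.0 if the class is absent in both."""
    inter = _intersection(pred, truth, cls)
    union = class_pixels(pred, cls) + class_pixels(truth, cls) - inter
    return inter / union if union else 1.0


def dice(pred: Mask, truth: Mask, cls: int) -> float:
    """Dice coefficient for one class; 1.0 if the class is absent in both."""
    inter = _intersection(pred, truth, cls)
    total = class_pixels(pred, cls) + class_pixels(truth, cls)
    return 2 * inter / total if total else 1.0


def leftover_fraction(before: Mask, after: Mask, cls: int) -> float:
    """Share of the served pixel area of `cls` still on the plate afterwards.

    Assumption: waste is approximated by the ratio of segmented pixel areas.
    """
    served = class_pixels(before, cls)
    if served == 0:
        raise ValueError("class not present in the pre-consumption mask")
    return class_pixels(after, cls) / served


# Toy example: label 1 stands for rice, label 0 for background.
before_mask = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
after_mask = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
rice_left = leftover_fraction(before_mask, after_mask, cls=1)
```

In this toy case 2 of the 8 served rice pixels remain, so `rice_left` is 0.25. A real pipeline would obtain the masks from a trained segmentation model rather than hand-written lists.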

Related papers

VolE: A Point-cloud Framework for Food 3D Reconstruction and Volume Estimation [4.621139625109643]
We present VolE, a novel framework that leverages mobile device-driven 3D reconstruction to estimate food volume. VolE captures images and camera locations in free motion to generate precise 3D models, thanks to AR-capable mobile devices. Our experiments demonstrate that VolE outperforms the existing volume estimation techniques across multiple datasets by achieving 2.22% MAPE.
arXiv Detail & Related papers (2025-05-15T12:03:05Z)
IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition [16.32678094159896]
We propose IMRL (Integrated Multi-Dimensional Representation Learning), which integrates visual, physical, temporal, and geometric representations to enhance robustness and generalizability of IL for food acquisition. Our approach captures food types and physical properties, models temporal dynamics of acquisition actions, and introduces geometric information to determine optimal scooping points. IMRL enables IL to adaptively adjust scooping strategies based on context, improving the robot's capability to handle diverse food acquisition scenarios.
arXiv Detail & Related papers (2024-09-18T16:09:06Z)
MetaFood3D: 3D Food Dataset with Nutrition Values [52.16894900096017]
This dataset consists of 743 meticulously scanned and labeled 3D food objects across 131 categories. Our MetaFood3D dataset emphasizes intra-class diversity and includes rich modalities such as textured mesh files, RGB-D videos, and segmentation masks.
arXiv Detail & Related papers (2024-09-03T15:02:52Z)
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models [96.43285670458803]
Uni-Food is a unified food dataset that comprises over 100,000 images with various food labels. Uni-Food is designed to provide a more holistic approach to food data analysis. We introduce a novel Linear Rectification Mixture of Diverse Experts (RoDE) approach to address the inherent challenges of food-related multitasking.
arXiv Detail & Related papers (2024-07-17T16:49:34Z)
Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2 [1.6590638305972631]
This study employs the pretrained MobileNetV2 model, which is efficient and fast, for food recognition on the public Food11 dataset, comprising 16643 images. It also utilizes various techniques such as dataset understanding, transfer learning, data augmentation, regularization, dynamic learning rate, hyperparameter tuning, and consideration of images in different sizes to enhance performance and robustness. Despite employing a light model with a simpler structure and fewer trainable parameters compared to some deep and dense models in the deep learning area, it achieved commendable accuracy in a short time.
arXiv Detail & Related papers (2024-05-19T17:20:20Z)
NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images [63.314702537010355]
Self-reporting methods are often inaccurate and suffer from substantial bias. Recent work has explored using computer vision prediction systems to predict nutritional information from food images. This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures.
arXiv Detail & Related papers (2024-05-13T14:56:55Z)
How Much You Ate? Food Portion Estimation on Spoons [63.611551981684244]
Current image-based food portion estimation algorithms assume that users take images of their meals one or two times. We introduce an innovative solution that utilizes stationary user-facing cameras to track food items on utensils. The system is reliable for estimation of nutritional content of liquid-solid heterogeneous mixtures such as soups and stews.
arXiv Detail & Related papers (2024-05-12T00:16:02Z)
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images. We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information. We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
An End-to-end Food Portion Estimation Framework Based on Shape Reconstruction from Monocular Image [7.380382380564532]
We propose an end-to-end deep learning framework for food energy estimation from a monocular image through 3D shape reconstruction. Our method is evaluated on a publicly available food image dataset, Nutrition5k, resulting in a Mean Absolute Error (MAE) of 40.05 kCal and a Mean Absolute Percentage Error (MAPE) of 11.47% for food energy estimation.
arXiv Detail & Related papers (2023-08-03T15:17:24Z)
Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions [65.50975507723827]
Food image segmentation is an important task that has ubiquitous applications, such as estimating the nutritional value of a plate of food. One challenge is that food items can overlap and mix, making them difficult to distinguish. Two models are trained and compared, one based on convolutional neural networks and the other on Bidirectional representation for Image Transformers (BEiT). The BEiT model outperforms the previous state-of-the-art model by achieving a mean intersection over union of 49.4 on FoodSeg103.
arXiv Detail & Related papers (2023-06-15T15:38:10Z)
Dish detection in food platters: A framework for automated diet logging and nutrition management [1.7855867849530096]
Dish detection from food platters is a challenging problem due to a visually complex food layout. We present an end-to-end computational framework for diet management, from data compilation, annotation, and state-of-the-art model identification. We implement the framework in the context of Indian food platters known for their complex presentation.
arXiv Detail & Related papers (2023-05-12T15:25:58Z)
