Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2
- URL: http://arxiv.org/abs/2405.11621v1
- Date: Sun, 19 May 2024 17:20:20 GMT
- Title: Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2
- Authors: Shayan Rokhva, Babak Teimourpour, Amir Hossein Soltani
- Abstract summary: This study employs the pretrained MobileNetV2 model, which is efficient and fast, for food recognition on the public Food11 dataset, comprising 16643 images.
It also utilizes various techniques such as dataset understanding, transfer learning, data augmentation, regularization, dynamic learning rate, hyperparameter tuning, and consideration of images in different sizes to enhance performance and robustness.
Despite employing a light model with a simpler structure and fewer trainable parameters compared to some deep and dense models in the deep learning area, it achieved commendable accuracy in a short time.
- Score: 1.6590638305972631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In contemporary society, the application of artificial intelligence for automatic food recognition offers substantial potential for nutrition tracking, reducing food waste, and enhancing productivity in food production and consumption scenarios. Modern technologies such as Computer Vision and Deep Learning are highly beneficial, enabling machines to learn automatically, thereby facilitating automatic visual recognition. Despite some research in this field, the challenge of achieving accurate automatic food recognition quickly remains a significant research gap. Some models have been developed and implemented, but maintaining high performance swiftly, with low computational cost and low access to expensive hardware accelerators, still needs further exploration and research. This study employs the pretrained MobileNetV2 model, which is efficient and fast, for food recognition on the public Food11 dataset, comprising 16643 images. It also utilizes various techniques such as dataset understanding, transfer learning, data augmentation, regularization, dynamic learning rate, hyperparameter tuning, and consideration of images in different sizes to enhance performance and robustness. These techniques aid in choosing appropriate metrics, achieving better performance, avoiding overfitting and accuracy fluctuations, speeding up the model, and increasing the generalization of findings, making the study and its results applicable to practical applications. Despite employing a light model with a simpler structure and fewer trainable parameters compared to some deep and dense models in the deep learning area, it achieved commendable accuracy in a short time. This underscores the potential for practical implementation, which is the main intention of this study.
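The abstract lists a "dynamic learning rate" among its techniques but does not say which policy is used. As a minimal sketch of one common option, the block below lowers the learning rate when the validation loss stops improving. The class name `PlateauScheduler`, its default values, and the validation losses are all hypothetical and chosen for illustration only; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class PlateauScheduler:
    """Reduce the learning rate when validation loss stops improving.

    Hypothetical helper: the paper does not state which dynamic
    learning rate policy it uses, so this is only one plausible choice.
    """

    lr: float = 1e-3        # assumed starting learning rate
    factor: float = 0.5     # assumed multiplicative decay per reduction
    patience: int = 2       # epochs without improvement before reducing
    min_lr: float = 1e-6    # lower bound on the learning rate
    best_loss: float = float("inf")
    stale_epochs: int = 0

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best_loss:
            # Improvement: remember it and reset the patience counter.
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                # Plateau detected: shrink the rate, but not below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.stale_epochs = 0
        return self.lr


# Made-up validation losses for illustration; not results from the paper.
val_losses = [0.90, 0.70, 0.71, 0.72, 0.60, 0.61, 0.62, 0.63]
scheduler = PlateauScheduler()
history = [scheduler.step(loss) for loss in val_losses]
print(history)
```

With these made-up losses the rate is halved after each run of two non-improving epochs. In a real training loop, the returned value would be assigned to the optimizer before the next epoch.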
Related papers
- A Novel Method for Accurate & Real-time Food Classification: The Synergistic Integration of EfficientNetB7, CBAM, Transfer Learning, and Data Augmentation [1.864621482724548]
This study employs the state-of-the-art EfficientNetB7 architecture, enhanced through transfer learning, data augmentation, and the CBAM attention module.
The proposed methodology, bolstered by various deep learning techniques, consistently achieves an impressive average accuracy of 96.40%.
Notably, it can classify over 60 images within one second during inference on unseen data, demonstrating its ability to deliver high accuracy promptly.
arXiv Detail & Related papers (2024-10-03T08:39:06Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing the computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- Food Image Classification and Segmentation with Attention-based Multiple Instance Learning [51.279800092581844]
The paper presents a weakly supervised methodology for training food image classification and semantic segmentation models.
The proposed methodology is based on a multiple instance learning approach in combination with an attention-based mechanism.
We conduct experiments on two meta-classes within the FoodSeg103 data set to verify the feasibility of the proposed approach.
arXiv Detail & Related papers (2023-08-22T13:59:47Z)
- Rethinking Cooking State Recognition with Vision Transformers [0.0]
The self-attention mechanism of the Vision Transformer (ViT) architecture is proposed for the Cooking State Recognition task.
The proposed approach encapsulates the globally salient features from images, while also exploiting the weights learned from a larger dataset.
Our framework has an accuracy of 94.3%, which significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2022-12-16T17:06:28Z)
- Deep Active Learning for Computer Vision: Past and Future [50.19394935978135]
Despite its indispensable role for developing AI models, research on active learning is not as intensive as other research directions.
By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies.
arXiv Detail & Related papers (2022-11-27T13:07:14Z)
- Vision Paper: Causal Inference for Interpretable and Robust Machine Learning in Mobility Analysis [71.2468615993246]
Building intelligent transportation systems requires an intricate combination of artificial intelligence and mobility analysis.
The past few years have seen rapid development in transportation applications using advanced deep neural networks.
This vision paper emphasizes research challenges in deep learning-based mobility analysis that require interpretability and robustness.
arXiv Detail & Related papers (2022-10-18T17:28:58Z)
- Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z)
- An Integrated System for Mobile Image-Based Dietary Assessment [7.352044746821543]
We present the design and development of a mobile, image-based dietary assessment system to capture and analyze dietary intake.
Our system is capable of collecting high-quality food images in naturalistic settings and provides ground-truth annotations for developing new computational approaches.
arXiv Detail & Related papers (2021-10-05T00:04:19Z)
- Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better [0.0]
With the progressive improvements in deep learning models, their number of parameters, latency, resources required to train, etc. have increased significantly.
We present and motivate the problem of efficiency in deep learning, followed by a thorough survey of the five core areas of model efficiency.
We believe this is the first comprehensive survey in the efficient deep learning space that covers the landscape of model efficiency from modeling techniques to hardware support.
arXiv Detail & Related papers (2021-06-16T17:31:38Z)
- Deep Learning and Machine Vision for Food Processing: A Survey [5.53479503648814]
The quality and safety of food is an important issue to the whole society, since it is at the basis of human health, social development and stability.
The development of machine vision can greatly assist researchers and industries in improving the efficiency of food processing.
We provide an overview of traditional machine learning and deep learning methods, as well as the machine vision techniques that can be applied to the field of food processing.
arXiv Detail & Related papers (2021-03-30T06:40:19Z)
- Knowledge Distillation: A Survey [87.51063304509067]
Deep neural networks have been successful in both industry and academia, especially for computer vision tasks.
It is a challenge to deploy these cumbersome deep models on devices with limited resources.
Knowledge distillation effectively learns a small student model from a large teacher model.
arXiv Detail & Related papers (2020-06-09T21:47:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.