Related papers: MetaFruit Meets Foundation Models: Leveraging a Comprehensive Multi-Fruit Dataset for Advancing Agricultural Foundation Models

URL: http://arxiv.org/abs/2407.04711v1
Date: Tue, 14 May 2024 00:13:47 GMT
Title: MetaFruit Meets Foundation Models: Leveraging a Comprehensive Multi-Fruit Dataset for Advancing Agricultural Foundation Models
Authors: Jiajia Li, Kyle Lammers, Xunyuan Yin, Xiang Yin, Long He, Renfu Lu, Zhaojian Li
Abstract summary: We introduce MetaFruit, the largest publicly available multi-class fruit dataset, comprising 4,248 images and 248,015 manually labeled instances. This study proposes an innovative open-set fruit detection system leveraging advanced Vision Foundation Models (VFMs).
Score: 10.11552909915055
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Fruit harvesting poses a significant labor and financial burden for the industry, highlighting the critical need for advancements in robotic harvesting solutions. Machine vision-based fruit detection has been recognized as a crucial component for robust identification of fruits to guide robotic manipulation. Despite considerable progress in leveraging deep learning and machine learning techniques for fruit detection, a common shortfall is the inability to swiftly extend the developed models across different orchards and/or various fruit species. Additionally, the limited availability of pertinent data further compounds these challenges. In this work, we introduce MetaFruit, the largest publicly available multi-class fruit dataset, comprising 4,248 images and 248,015 manually labeled instances across diverse U.S. orchards. Furthermore, this study proposes an innovative open-set fruit detection system leveraging advanced Vision Foundation Models (VFMs) that can adeptly identify a wide array of fruit types under varying orchard conditions. This system not only demonstrates remarkable adaptability in learning from minimal data through few-shot learning but also shows the ability to interpret human instructions for subtle detection tasks. The developed foundation model is comprehensively evaluated using several metrics and outperforms the existing state-of-the-art algorithms on both our MetaFruit dataset and other open-source fruit datasets, thereby setting a new benchmark in the field of agricultural technology and robotic harvesting. The MetaFruit dataset and detection framework are open-sourced to foster future research in vision-based fruit harvesting, marking a significant stride toward addressing the urgent needs of the agricultural sector.

Related papers

AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock [77.95897723270453]
Crops, fisheries and livestock form the backbone of global food production, essential to feed the ever-growing global population. Addressing these issues requires efficient, accurate, and scalable technological solutions, highlighting the importance of artificial intelligence (AI). This survey presents a systematic and thorough review of more than 200 research works covering conventional machine learning approaches, advanced deep learning techniques, and recent vision-language foundation models.
arXiv Detail & Related papers (2025-07-29T17:59:48Z)
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting [8.944833667187913]
This study introduces a novel foundation model-based framework for efficient apple ripeness and size estimation. We curated two public RGBD-based Fuji apple image datasets, integrating expanded annotations for ripeness ("Ripe" vs. "Unripe") based on fruit color and image capture dates. The resulting comprehensive dataset, Fuji-Ripeness-Size dataset, includes 4,027 images and 16,257 annotated apples with ripeness and size labels.
arXiv Detail & Related papers (2025-02-03T21:55:02Z)
Horticultural Temporal Fruit Monitoring via 3D Instance Segmentation and Re-Identification using Point Clouds [29.23207854514898]
We present a novel approach for temporal fruit monitoring that addresses point clouds collected in a greenhouse over time. Our method segments fruits using a learning-based instance segmentation approach directly on the point cloud. Experimental results on a real dataset of strawberries demonstrate that our approach outperforms other methods for fruit re-identification over time.
arXiv Detail & Related papers (2024-11-12T13:53:22Z)
Raspberry PhenoSet: A Phenology-based Dataset for Automated Growth Detection and Yield Estimation [1.2661567777618703]
We introduce Raspberry PhenoSet, a phenology-based dataset for detecting and segmenting raspberry fruit across seven developmental stages. This dataset contains 1,853 high-resolution images, the highest quality in the literature, captured under controlled artificial lighting in a vertical farm. We benchmarked several state-of-the-art deep learning models, including YOLOv8, YOLOv10, RT-DETR, and Mask R-CNN, to provide a comprehensive evaluation of their performance on the dataset.
arXiv Detail & Related papers (2024-11-01T18:34:26Z)
MetaFood3D: Large 3D Food Object Dataset with Nutrition Values [53.24500333363066]
This dataset consists of 637 meticulously labeled 3D food objects across 108 categories, featuring detailed nutrition information, weight, and food codes linked to a comprehensive nutrition database. Experimental results demonstrate our dataset's significant potential for improving algorithm performance, highlight the challenging gap between video captures and 3D scanned data, and show the strength of the MetaFood3D dataset in high-quality data generation, simulation, and augmentation.
arXiv Detail & Related papers (2024-09-03T15:02:52Z)
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models [96.43285670458803]
Uni-Food is a unified food dataset that comprises over 100,000 images with various food labels. Uni-Food is designed to provide a more holistic approach to food data analysis. We introduce a novel Linear Rectification Mixture of Diverse Experts (RoDE) approach to address the inherent challenges of food-related multitasking.
arXiv Detail & Related papers (2024-07-17T16:49:34Z)
Few-Shot Fruit Segmentation via Transfer Learning [4.616529139444651]
We develop a few-shot semantic segmentation framework for infield fruits using transfer learning. Motivated by similar success in urban scene parsing, we propose specialized pre-training. We show that models with pre-training learn to distinguish between fruit still on the trees and fruit that have fallen on the ground.
arXiv Detail & Related papers (2024-05-04T04:05:59Z)
Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
Fusion-Driven Tree Reconstruction and Fruit Localization: Advancing Precision in Agriculture [2.338903291171288]
This study introduces an innovative methodology that harnesses the synergy of RGB imagery, LiDAR, and IMU data to achieve intricate tree reconstructions. Experiments have been carried out in both a controlled environment and an actual peach orchard.
arXiv Detail & Related papers (2023-10-23T17:44:59Z)
Food Image Classification and Segmentation with Attention-based Multiple Instance Learning [51.279800092581844]
The paper presents a weakly supervised methodology for training food image classification and semantic segmentation models. The proposed methodology is based on a multiple instance learning approach in combination with an attention-based mechanism. We conduct experiments on two meta-classes within the FoodSeg103 data set to verify the feasibility of the proposed approach.
arXiv Detail & Related papers (2023-08-22T13:59:47Z)
Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities [86.89427012495457]
We review how AI techniques can transform agrifood systems and contribute to the modern agrifood industry. We present a progress review of AI methods in agrifood systems, specifically in agriculture, animal husbandry, and fishery. We highlight potential challenges and promising research opportunities for transforming modern agrifood systems with AI.
arXiv Detail & Related papers (2023-05-03T05:16:54Z)
Fruit Ripeness Classification: a Survey [59.11160990637616]
Many automatic methods have been proposed that employ a variety of feature descriptors for the food item to be graded. Machine learning and deep learning techniques dominate the top-performing methods. Deep learning can operate on raw data and thus relieve the users from having to compute complex engineered features.
arXiv Detail & Related papers (2022-12-29T19:32:20Z)
Semantic Image Segmentation with Deep Learning for Vine Leaf Phenotyping [59.0626764544669]
In this study, we use Deep Learning methods to semantically segment grapevine leaf images in order to develop an automated object detection system for leaf phenotyping. Our work contributes to plant lifecycle monitoring through which dynamic traits such as growth and development can be captured and quantified.
arXiv Detail & Related papers (2022-10-24T14:37:09Z)
Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review [5.639656362091594]
The generative adversarial network (GAN), invented in 2014 in the computer vision community, provides a suite of novel approaches that can learn good data representations. This paper presents an overview of the evolution of GAN architectures followed by a systematic review of their application to agriculture.
arXiv Detail & Related papers (2022-04-10T15:33:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.