Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce
- URL: http://arxiv.org/abs/2411.16219v2
- Date: Fri, 11 Apr 2025 19:30:28 GMT
- Title: Weakly Supervised Panoptic Segmentation for Defect-Based Grading of Fresh Produce
- Authors: Manuel Knott, Divinefavour Odion, Sameer Sontakke, Anup Karwa, Thijs Defraeye,
- Abstract summary: Visual inspection for defect grading in agricultural supply chains is crucial but traditionally labor-intensive and error-prone. We address this challenge by evaluating the Segment Anything Model to generate dense panoptic segmentation masks from sparse annotations. Our approach offers practical estimates of defect number and relative size from panoptic masks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual inspection for defect grading in agricultural supply chains is crucial but traditionally labor-intensive and error-prone. Automated computer vision methods typically require extensively annotated datasets, which are often unavailable in decentralized supply chains. We address this challenge by evaluating the Segment Anything Model (SAM) to generate dense panoptic segmentation masks from sparse annotations. These dense predictions are then used to train a supervised panoptic segmentation model. Focusing on banana surface defects (bruises and scars), we validate our approach using 476 field images annotated with 1440 defects. While SAM-generated masks generally align with human annotations, substantially reducing annotation effort, we explicitly identify failure cases associated with specific defect sizes and shapes. Despite these limitations, our approach offers practical estimates of defect number and relative size from panoptic masks, underscoring the potential and current boundaries of foundation models for defect quantification in low-data agricultural scenarios. GitHub: https://github.com/manuelknott/banana-defect-segmentation
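The abstract mentions estimating defect number and relative size from panoptic masks. A minimal sketch of how such estimates could be derived is given here; it is not taken from the authors' repository. It assumes a panoptic prediction stored as two integer arrays of the same shape (a semantic class map and an instance id map), and the class ids (0 = background, 1 = banana, 2 = bruise, 3 = scar) and the function name `defect_stats` are hypothetical.

```python
import numpy as np


def defect_stats(semantic, instance, fruit_class=1, defect_classes=(2, 3)):
    """Count defect instances and compute each one's area relative to the fruit.

    Class ids are illustrative assumptions, not the paper's label scheme.
    """
    # Treat the fruit surface as fruit pixels plus all defect pixels.
    surface = np.isin(semantic, (fruit_class, *defect_classes))
    surface_area = int(surface.sum())

    stats = []
    for cls in defect_classes:
        cls_mask = semantic == cls
        # Each distinct instance id inside this class is one defect.
        for inst_id in np.unique(instance[cls_mask]):
            area = int((cls_mask & (instance == inst_id)).sum())
            stats.append({
                "class": int(cls),
                "instance": int(inst_id),
                "area_px": area,
                "relative_area": area / surface_area if surface_area else 0.0,
            })
    return stats


if __name__ == "__main__":
    # Toy 4x5 example: column 0 is background, the rest is fruit surface.
    semantic = np.zeros((4, 5), dtype=int)
    semantic[:, 1:] = 1
    instance = np.zeros((4, 5), dtype=int)
    semantic[0, 1:3] = 2   # one bruise covering two pixels
    instance[0, 1:3] = 1
    semantic[3, 4] = 3     # one scar covering one pixel
    instance[3, 4] = 2
    for row in defect_stats(semantic, instance):
        print(row)
```

Dividing by the fruit surface area rather than the image area keeps the relative size independent of how much background is in the frame, which seems the more useful quantity for grading; that choice is an assumption of this sketch.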
Related papers
- Saccadic Vision for Fine-Grained Visual Classification [10.681604440788854]
Fine-grained visual classification (FGVC) requires distinguishing between visually similar categories through subtle, localized features.
Existing part-based methods rely on complex localization networks that learn mappings from pixel to sample space.
We propose a two-stage process that first extracts peripheral features and generates a sample map.
We employ contextualized selective attention to weigh the impact of each fixation patch before fusing peripheral and focus representations.
arXiv Detail & Related papers (2025-09-19T07:03:37Z)
- Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods.
We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework.
GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
- LR-IAD: Mask-Free Industrial Anomaly Detection with Logical Reasoning [1.3124513975412255]
Industrial Anomaly Detection (IAD) is critical for ensuring product quality by identifying defects.
Existing vision-language models (VLMs) and Multimodal Large Language Models (MLLMs) address some limitations but rely on mask annotations.
We propose a reward function that dynamically prioritizes rare defect patterns during training to handle class imbalance.
arXiv Detail & Related papers (2025-04-28T06:52:35Z)
- PaveSAM Segment Anything for Pavement Distress [4.671701998390791]
Pavement monitoring using computer vision can analyze pavement conditions more efficiently and accurately than manual methods.
Deep learning-based segmentation models are, however, often supervised and require pixel-level annotations.
This research proposes a zero-shot segmentation model, PaveSAM, that can segment pavement distresses using bounding box prompts.
arXiv Detail & Related papers (2024-09-11T14:24:29Z)
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM [55.93697196726016]
We propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM).
We show that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas.
Our proposed method achieves the best unsupervised performance in crowd counting, while also being comparable to some supervised methods.
arXiv Detail & Related papers (2024-02-27T13:55:17Z)
- UGainS: Uncertainty Guided Anomaly Instance Segmentation [80.12253291709673]
A single unexpected object on the road can cause an accident or lead to injuries.
Current approaches tackle anomaly segmentation by assigning an anomaly score to each pixel.
We propose an approach that produces accurate anomaly instance masks.
arXiv Detail & Related papers (2023-08-03T20:55:37Z)
- Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations [86.47908754383198]
Open-Vocabulary (OV) methods leverage large-scale image-caption pairs and vision-language models to learn novel categories.
Our method generates pseudo-mask annotations by leveraging the localization ability of a pre-trained vision-language model for objects present in image-caption pairs.
Our method trained with just pseudo-masks significantly improves the mAP scores on the MS-COCO dataset and OpenImages dataset.
arXiv Detail & Related papers (2023-03-29T17:58:39Z)
- Which Pixel to Annotate: a Label-Efficient Nuclei Segmentation Framework [70.18084425770091]
Deep neural networks have been widely applied in nuclei instance segmentation of H&E stained pathology images.
It is inefficient and unnecessary to label all pixels for a dataset of nuclei images which usually contain similar and redundant patterns.
We propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner.
arXiv Detail & Related papers (2022-12-20T14:53:26Z)
- Learning to Annotate Part Segmentation with Gradient Matching [58.100715754135685]
This paper focuses on tackling semi-supervised part segmentation tasks by generating high-quality images with a pre-trained GAN.
In particular, we formulate the annotator learning as a learning-to-learn problem.
We show that our method can learn annotators from a broad range of labelled images including real images, generated images, and even analytically rendered images.
arXiv Detail & Related papers (2022-11-06T01:29:22Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
- Coarse Retinal Lesion Annotations Refinement via Prototypical Learning [3.464871689508835]
Coarse annotations such as circles or ellipses for outlining the lesion area can be six times more efficient than pixel-level annotation.
This paper proposes an annotation refinement network to convert a coarse annotation into a pixel-level segmentation mask.
We also introduce a prototype weighing module to handle challenging cases where the lesion is overly small.
arXiv Detail & Related papers (2022-08-30T14:22:47Z)
- Deep Open-Set Recognition for Silicon Wafer Production Monitoring [7.7977112365916]
We propose a comprehensive pipeline for wafer monitoring based on a Submanifold Sparse Convolutional Network.
Our experiments show that directly processing full-resolution WDMs by Submanifold Sparse Convolutions yields superior classification performance on known classes.
Our solution outperforms state-of-the-art open-set recognition solutions in detecting novelties.
arXiv Detail & Related papers (2022-08-30T08:39:52Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Automatic segmentation of meniscus based on MAE self-supervision and point-line weak supervision paradigm [2.445445375557563]
We introduce the self-supervised method MAE (Masked Autoencoders) into knee joint images to provide a good initial weight for the segmentation model.
Secondly, we propose a weakly supervised paradigm for meniscus segmentation based on the combination of point and line to reduce the time of labeling.
arXiv Detail & Related papers (2022-05-07T02:57:50Z)
- Automated Seed Quality Testing System using GAN & Active Learning [2.8978926857710263]
We build a novel seed image acquisition setup, which captures both the top and bottom views.
We are able to get accuracies of up to 91.6% for testing the physical purity of seed samples.
arXiv Detail & Related papers (2021-10-02T10:28:25Z)
- Enforcing Mutual Consistency of Hard Regions for Semi-supervised Medical Image Segmentation [68.9233942579956]
We propose a novel mutual consistency network (MC-Net+) to exploit the unlabeled hard regions for semi-supervised medical image segmentation.
The MC-Net+ model is motivated by the observation that deep models trained with limited annotations are prone to output highly uncertain and easily mis-classified predictions.
We compare the segmentation results of the MC-Net+ with five state-of-the-art semi-supervised approaches on three public medical datasets.
arXiv Detail & Related papers (2021-09-21T04:47:42Z)
- Learning from Partially Overlapping Labels: Image Segmentation under Annotation Shift [68.6874404805223]
We propose several strategies for learning from partially overlapping labels in the context of abdominal organ segmentation.
We find that combining a semi-supervised approach with an adaptive cross entropy loss can successfully exploit heterogeneously annotated data.
arXiv Detail & Related papers (2021-07-13T09:22:24Z)
- Weakly-supervised High-resolution Segmentation of Mammography Images for Breast Cancer Diagnosis [17.936019428281586]
In cancer diagnosis, interpretability can be achieved by localizing the region of the input image responsible for the output.
We introduce a novel neural network architecture to perform weakly-supervised segmentation of high-resolution images.
We apply this model to breast cancer diagnosis with screening mammography, and validate it on a large clinically-realistic dataset.
arXiv Detail & Related papers (2021-06-13T17:25:21Z)
- More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval [112.1756171062067]
We introduce a novel semi-supervised framework for cross-modal retrieval.
At the centre of our design is a sequential photo-to-sketch generation model.
We also introduce a discriminator guided mechanism to guide against unfaithful generation.
arXiv Detail & Related papers (2021-03-25T17:27:08Z)
- Uncertainty guided semi-supervised segmentation of retinal layers in OCT images [4.046207281399144]
We propose a novel uncertainty-guided semi-supervised learning based on a student-teacher approach for training the segmentation network.
The proposed framework is a key contribution and applicable for biomedical image segmentation across various imaging modalities.
arXiv Detail & Related papers (2021-03-02T23:14:25Z)
- Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning [66.36706284671291]
We propose a data-driven framework supervised by only image-level labels to support unbiased lesion localisation.
The framework can explicitly separate potential lesions from original images, with the help of a generative adversarial network and a lesion-specific decoder.
arXiv Detail & Related papers (2021-03-01T06:05:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.