LVLMs as inspectors: an agentic framework for category-level structural defect annotation
- URL: http://arxiv.org/abs/2510.00603v1
- Date: Wed, 01 Oct 2025 07:31:42 GMT
- Title: LVLMs as inspectors: an agentic framework for category-level structural defect annotation
- Authors: Sheng Jiang, Yuanmin Ning, Bingxi Huang, Peiyin Chen, Zhaohui Chen,
- Abstract summary: A novel agentic annotation framework, Agent-based Defect Pattern Tagger, is introduced.<n>It integrates Large Vision-Language Models (LVLMs) with a semantic pattern matching module and an iterative self-questioning refinement mechanism.<n>It transforms raw visual data into high-quality, semantically labeled defect datasets without any manual supervision.
- Score: 3.2445985501669434
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Automated structural defect annotation is essential for ensuring infrastructure safety while minimizing the high costs and inefficiencies of manual labeling. A novel agentic annotation framework, Agent-based Defect Pattern Tagger (ADPT), is introduced that integrates Large Vision-Language Models (LVLMs) with a semantic pattern matching module and an iterative self-questioning refinement mechanism. By leveraging optimized domain-specific prompting and a recursive verification process, ADPT transforms raw visual data into high-quality, semantically labeled defect datasets without any manual supervision. Experimental results demonstrate that ADPT achieves up to 98% accuracy in distinguishing defective from non-defective images, and 85%-98% annotation accuracy across four defect categories under class-balanced settings, with 80%-92% accuracy on class-imbalanced datasets. The framework offers a scalable and cost-effective solution for high-fidelity dataset construction, providing strong support for downstream tasks such as transfer learning and domain adaptation in structural damage assessment.
Related papers
- Confusion-Aware Rubric Optimization for LLM-based Automated Grading [31.353360036776976]
We introduce Confusion-Aware Optimization (CARO), a novel framework that enhances accuracy and computational efficiency.<n>CARO decomposes monolithic error signals into distinct modes, allowing for unambiguous diagnosis and repair of specific misclassification patterns.<n>These results suggest that replacing mixed-error aggregation with surgical, mode-specific repair yields robust improvements in automated assessment scalability and precision.
arXiv Detail & Related papers (2026-02-28T04:17:12Z) - Confident Learning for Object Detection under Model Constraints [0.05131152350448099]
Agricultural weed detection on edge devices is subject to strict constraints on model capacity, computational resources, and real-time inference latency.<n>This paper proposes Model-Driven Data Correction (MDDC), a data-centric framework that enhances detection performance by iteratively diagnosing and correcting data quality deficiencies.
arXiv Detail & Related papers (2026-01-14T15:32:08Z) - PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review [54.141490756509306]
We introduce PaperAudit-Bench, which consists of two components: PaperAudit-Dataset, an error dataset, and PaperAudit-Review, an automated review framework.<n>Experiments on PaperAudit-Bench reveal large variability in error detectability across models and detection depths.<n>We show that the dataset supports training lightweight LLM detectors via SFT and RL, enabling effective error detection at reduced computational cost.
arXiv Detail & Related papers (2026-01-07T04:26:12Z) - Structured Uncertainty guided Clarification for LLM Agents [126.26213027785813]
LLM agents extend large language models with tool-calling capabilities, but ambiguous user instructions often lead to incorrect invocations and task failures.<n>We introduce a principled formulation of structured uncertainty over tool-call parameters, modeling joint tool-argument clarification as a POMDP with Expected Value of Perfect Information (EVPI) objective for optimal question selection and aspect-based cost modeling to prevent redundancy.<n>Our SAGE-Agent leverages this structured uncertainty to achieve superior efficiency: increasing coverage on ambiguous tasks by 7-39% while reducing clarification questions by 1.5-2.7$times$ compared to strong prompting and uncertainty-based baselines.
arXiv Detail & Related papers (2025-11-11T21:50:44Z) - Noise & pattern: identity-anchored Tikhonov regularization for robust structural anomaly detection [58.535473924035365]
Anomaly detection plays a pivotal role in automated industrial inspection, aiming to identify subtle or rare defects in otherwise uniform visual patterns.<n>We tackle structural anomaly detection using a self-supervised autoencoder that learns to repair corrupted inputs.<n>We introduce a corruption model that injects artificial disruptions into training images to mimic structural defects.
arXiv Detail & Related papers (2025-11-10T15:48:50Z) - A Contrastive Learning-Guided Confident Meta-learning for Zero Shot Anomaly Detection [17.73056562717683]
CoZAD is a novel zero-shot anomaly detection framework.<n>It integrates soft confident learning with meta-learning and contrastive feature representation.<n>We show it outperforms existing methods on 6 out of 7 industrial benchmarks.
arXiv Detail & Related papers (2025-08-25T09:27:31Z) - Automated Defect Identification and Categorization in NDE 4.0 with the Application of Artificial Intelligence [2.1458230584719247]
Investigation attempts to create an automated framework for fault detection and organization in radiography.<n>As its basic information source, the technique consists of compiling and categorizing 223 CR photographs of airplane welds.<n>Miniature a90/95 characteristics, which provide strong differentiating evidence of flaws, reveal that the suggested approach achieves exceptional awareness in defect detection.
arXiv Detail & Related papers (2025-06-26T06:25:36Z) - Stress-Testing ML Pipelines with Adversarial Data Corruption [11.91482648083998]
Regulators now demand evidence that high-stakes systems can withstand realistic, interdependent errors.<n>We introduce SAVAGE, a framework that formally models data-quality issues through dependency graphs and flexible corruption templates.<n>Savanage employs a bi-level optimization approach to efficiently identify vulnerable data subpopulations and fine-tune corruption severity.
arXiv Detail & Related papers (2025-06-02T00:41:24Z) - The Achilles Heel of AI: Fundamentals of Risk-Aware Training Data for High-Consequence Models [0.0]
AI systems in high-consequence domains must detect rare, high-impact events while operating under tight resource constraints.<n>Traditional annotation strategies that prioritize label volume over informational value introduce redundancy and noise.<n>This paper introduces smart-sizing, a training data strategy that emphasizes label diversity, model-guided selection, and marginal utility-based stopping.
arXiv Detail & Related papers (2025-05-20T22:57:35Z) - AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning [61.28113271728859]
RAG has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs)<n>Standard RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions.<n>In this work, we reinterpret RAG as Retrieval-Augmented Reasoning and identify a central but underexplored problem: textitReasoning Misalignment.
arXiv Detail & Related papers (2025-04-21T04:56:47Z) - Few-Shot Optimized Framework for Hallucination Detection in Resource-Limited NLP Systems [1.0124625066746595]
We introduce DeepSeek Few-shot optimization to enhance weak label generation through iterative prompt engineering.<n>We achieve high-quality annotations that considerably enhanced the performance of downstream models.<n>We further fine-tuned the Mistral-7B-Instruct-v0.3 model on these optimized annotations, enabling it to accurately detect hallucinations in resource-limited settings.
arXiv Detail & Related papers (2025-01-28T01:26:22Z) - Parameter-tuning-free data entry error unlearning with adaptive
selective synaptic dampening [51.34904967046097]
We introduce an extension to the selective synaptic dampening unlearning method that removes the need for parameter tuning.
We demonstrate the performance of this extension, adaptive selective synaptic dampening (ASSD) on various ResNet18 and Vision Transformer unlearning tasks.
The application of this approach is particularly compelling in industrial settings, such as supply chain management.
arXiv Detail & Related papers (2024-02-06T14:04:31Z) - Discover, Explanation, Improvement: An Automatic Slice Detection
Framework for Natural Language Processing [72.14557106085284]
slice detection models (SDM) automatically identify underperforming groups of datapoints.
This paper proposes a benchmark named "Discover, Explain, improve (DEIM)" for classification NLP tasks.
Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z) - Improved Adaptive Algorithm for Scalable Active Learning with Weak
Labeler [89.27610526884496]
Weak Labeler Active Cover (WL-AC) is able to robustly leverage the lower quality weak labelers to reduce the query complexity while retaining the desired level of accuracy.
We show its effectiveness on the corrupted-MNIST dataset by significantly reducing the number of labels while keeping the same accuracy as in passive learning.
arXiv Detail & Related papers (2022-11-04T02:52:54Z) - Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.