Related papers: VizInspect Pro -- Automated Optical Inspection (AOI) solution

URL: http://arxiv.org/abs/2205.13095v1
Date: Thu, 26 May 2022 00:38:48 GMT
Title: VizInspect Pro -- Automated Optical Inspection (AOI) solution
Authors: Faraz Waseem, Sanjit Menon, Haotian Xu, Debashis Mondal
Abstract summary: VizInspect Pro is a generic computer vision based AOI solution built on top of Leo, an edge AI platform. This paper shows how this solution and platform solved problems around model development, deployment, scaling multiple inferences and visualizations.
Score: 1.3190581566723916
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Traditional vision-based Automated Optical Inspection (referred to as AOI in this paper) systems present multiple challenges in factory settings, including inability to scale across multiple product lines, a requirement for vendor programming expertise, little tolerance to variations, and lack of cloud connectivity for aggregated insights. The lack of flexibility in these systems presents a unique opportunity for a deep learning based AOI system specifically for factory automation. The proposed solution, VizInspect Pro, is a generic computer vision based AOI solution built on top of Leo, an edge AI platform. Innovative features that overcome challenges of traditional vision systems include deep learning based image analysis, which combines the power of self-learning with high speed and accuracy; an intuitive user interface to configure inspection profiles in minutes without ML or vision expertise; and the ability to solve complex inspection challenges while being tolerant to deviations and unpredictable defects. This solution has been validated by multiple external enterprise customers with confirmed value propositions. In this paper we show how this solution and platform solved problems around model development, deployment, scaling multiple inferences and visualizations.

Related papers

Visual Analytics for Explainable and Trustworthy Artificial Intelligence [2.1212179660694104]
A key obstacle to AI adoption lies in the lack of transparency. Many automated systems function as "black boxes," providing predictions without revealing the underlying processes. Visual analytics (VA) provides a compelling solution by combining AI models with interactive visualizations.
arXiv Detail & Related papers (2025-07-14T13:03:17Z)
Can Multimodal Large Language Models be Guided to Improve Industrial Anomaly Detection? [5.979778557940213]
Traditional industrial anomaly detection models often struggle with flexibility and adaptability. Recent advancements in Multimodal Large Language Models (MLLMs) hold promise for overcoming these limitations. We propose Echo, a novel multi-expert framework designed to enhance MLLM performance for IAD.
arXiv Detail & Related papers (2025-01-27T05:41:10Z)
AI-based Wearable Vision Assistance System for the Visually Impaired: Integrating Real-Time Object Recognition and Contextual Understanding Using Large Vision-Language Models [0.0]
This paper introduces a novel wearable vision assistance system with artificial intelligence (AI) technology to deliver real-time feedback to a user through a sound beep mechanism. The system provides detailed descriptions of objects in the user's environment using a large vision language model (LVLM).
arXiv Detail & Related papers (2024-12-28T07:26:39Z)
VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use [74.39058448757645]
We present VipAct, an agent framework that enhances vision-language models (VLMs). VipAct consists of an orchestrator agent, which manages task requirement analysis, planning, and coordination, along with specialized agents that handle specific tasks. We evaluate VipAct on benchmarks featuring a diverse set of visual perception tasks, with experimental results demonstrating significant performance improvements.
arXiv Detail & Related papers (2024-10-21T18:10:26Z)
Visual Agents as Fast and Slow Thinkers [88.6691504568041]
We introduce FaST, which incorporates the Fast and Slow Thinking mechanism into visual agents. FaST employs a switch adapter to dynamically select between System 1/2 modes. It tackles uncertain and unseen objects by adjusting model confidence and integrating new contextual data.
arXiv Detail & Related papers (2024-08-16T17:44:02Z)
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models. BVS supports a large number of adjustable parameters at the scene level. We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System [22.779285672925425]
Embodied intelligence (EI) enables manufacturing systems to flexibly perceive, reason, adapt, and operate within dynamic shop floor environments. We propose LAECIPS, a large vision model-assisted adaptive edge-cloud collaboration framework for IoT-based embodied intelligence systems. LAECIPS decouples large vision models in the cloud from lightweight models on the edge, enabling plug-and-play model adaptation and continual learning.
arXiv Detail & Related papers (2024-04-16T12:12:06Z)
Reasoning Capacity in Multi-Agent Systems: Limitations, Challenges and Human-Centered Solutions [14.398238217358116]
We present a formal definition of reasoning capacity and illustrate its utility in identifying limitations within each component of the system. We then argue how these limitations can be addressed with a self-reflective process wherein human-feedback is used to alleviate shortcomings in reasoning.
arXiv Detail & Related papers (2024-02-02T02:53:11Z)
MouSi: Poly-Visual-Expert Vision-Language Models [132.58949014605477]
This paper proposes the use of an ensemble experts technique to synergize the capabilities of individual visual encoders. This technique introduces a fusion network to unify the processing of outputs from different visual experts. In our implementation, this technique significantly reduces the positional occupancy in models like SAM, from a substantial 4096 to a more efficient and manageable 64 or even down to 1.
arXiv Detail & Related papers (2024-01-30T18:09:11Z)
LAMBO: Large AI Model Empowered Edge Intelligence [71.56135386994119]
Next-generation edge intelligence is anticipated to benefit various applications via offloading techniques. Traditional offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability. We propose a Large AI Model-Based Offloading (LAMBO) framework with over one billion parameters for solving these problems.
arXiv Detail & Related papers (2023-08-29T07:25:42Z)
Don't Treat the Symptom, Find the Cause! Efficient Artificial-Intelligence Methods for (Interactive) Debugging [0.0]
In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication. In this thesis, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these issues.
arXiv Detail & Related papers (2023-06-22T12:44:49Z)
Engineering an Intelligent Essay Scoring and Feedback System: An Experience Report [1.5168188294440734]
We describe an exploratory system for assessing the quality of essays supplied by customers of a specialized recruitment support service. The problem domain is challenging because the open-ended customer-supplied source text has considerable scope for ambiguity and error. There is also a need to incorporate specialized business domain knowledge into the intelligent processing systems.
arXiv Detail & Related papers (2021-03-25T03:46:05Z)
Cognitive Visual Inspection Service for LCD Manufacturing Industry [80.63336968475889]
This paper discloses a novel visual inspection system for liquid crystal displays (LCDs), currently the dominant type in the FPD industry. The system is based on two cornerstones: a robust, high-performance defect recognition model and a cognitive visual inspection service architecture.
arXiv Detail & Related papers (2021-01-11T08:14:35Z)
DEEVA: A Deep Learning and IoT Based Computer Vision System to Address Safety and Security of Production Sites in Energy Industry [0.0]
This paper tackles various computer vision related problems such as scene classification, object detection in scenes, semantic segmentation, and scene captioning. We developed the Deep ExxonMobil Eye for Video Analysis (DEEVA) package to handle scene classification, object detection, semantic segmentation and captioning of scenes. The results reveal that transfer learning with the RetinaNet object detector is able to detect the presence of workers, different types of vehicles/construction equipment, and safety-related objects at a high level of accuracy (above 90%).
arXiv Detail & Related papers (2020-03-02T21:26:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.