Domain Invariant Siamese Attention Mask for Small Object Change
Detection via Everyday Indoor Robot Navigation
- URL: http://arxiv.org/abs/2203.15362v1
- Date: Tue, 29 Mar 2022 08:57:56 GMT
- Title: Domain Invariant Siamese Attention Mask for Small Object Change
Detection via Everyday Indoor Robot Navigation
- Authors: Koji Takeda, Kanji Tanaka, and Yoshimasa Nakamura
- Abstract summary: The problem of image change detection via everyday indoor robot navigation is explored from a novel perspective.
We propose a new self-attention technique with the ability to perform unsupervised on-the-fly domain adaptation.
Experiments show that our attention technique significantly boosts the state-of-the-art image change detection model.
- Score: 5.161531917413708
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of image change detection via everyday indoor robot navigation is
explored from a novel perspective of the self-attention technique. Detecting
semantically non-distinctive and visually small changes remains a key challenge
in the robotics community. Intuitively, these small non-distinctive changes may
be better handled by the recent paradigm of the attention mechanism, which is
the basic idea of this work. However, existing self-attention models require
significant retraining cost per domain, so it is not directly applicable to
robotics applications. We propose a new self-attention technique with an
ability of unsupervised on-the-fly domain adaptation, which introduces an
attention mask into the intermediate layer of an image change detection model,
without modifying the input and output layers of the model. Experiments, in
which an indoor robot aims to detect visually small changes in everyday
navigation, demonstrate that our attention technique significantly boosts the
state-of-the-art image change detection model.
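The paper does not ship code, but the mechanism the abstract describes, an attention mask injected into an intermediate layer of a Siamese change detector while the input and output layers stay untouched, can be sketched roughly as follows. The module name, the gating design, and all shapes are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class IntermediateAttentionMask(nn.Module):
    """Hypothetical mask module dropped between the encoder and decoder of
    a Siamese change detector; only this block would be adapted on the fly,
    e.g., with an unsupervised objective, leaving I/O layers unmodified."""

    def __init__(self, channels: int):
        super().__init__()
        # Small 1x1-conv gate that compares the two branches and emits a
        # per-pixel attention value in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels // 2, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_ref: torch.Tensor, feat_live: torch.Tensor):
        # Concatenate reference/live features to decide where to attend.
        mask = self.gate(torch.cat([feat_ref, feat_live], dim=1))  # (B,1,H,W)
        # Re-weight both branches with the shared mask.
        return feat_ref * mask, feat_live * mask, mask

# Dummy intermediate features: batch 1, 256 channels, 32x32 resolution.
atten = IntermediateAttentionMask(channels=256)
f_ref, f_live = torch.randn(1, 256, 32, 32), torch.randn(1, 256, 32, 32)
masked_ref, masked_live, mask = atten(f_ref, f_live)
print(mask.shape)  # torch.Size([1, 1, 32, 32])
```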
Related papers
- AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation [31.214318150001947]
Under-canopy agricultural robots can enable applications such as precise monitoring, spraying, weeding, and plant manipulation.
We propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundation model, a geometric prior, and pseudo labeling (a minimal sketch of such a pseudo-labeling loop follows this list).
This can enable fully autonomous row-following in under-canopy robots across fields and crops without requiring human intervention.
arXiv Detail & Related papers (2024-10-16T09:52:38Z)
- Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection [8.977792536037956]
In everyday indoor navigation, robots often need to detect non-distinctive small-change objects.
Existing techniques rely on high-quality class-specific object priors to regularize a change detector model.
In this study, we explore the concept of degree-of-ill-posedness (DoI) to improve both passive and active vision.
arXiv Detail & Related papers (2024-05-10T01:56:39Z)
- Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches [3.4673556247932225]
Deformable vision transformers significantly reduce the complexity of attention modeling.
Recent work has demonstrated adversarial attacks against conventional vision transformers.
We develop new collaborative attacks where a source patch manipulates attention to point to a target patch, which contains the adversarial noise that fools the model (a generic sketch of this attention-redirection objective follows this list).
arXiv Detail & Related papers (2023-11-21T17:55:46Z)
- Lifelong Change Detection: Continuous Domain Adaptation for Small Object Change Detection in Every Robot Navigation [5.8010446129208155]
Ground-view change detection suffers from ill-posedness because of visual uncertainty combined with complex nonlinear perspective projection.
To regularize this ill-posedness, commonly applied supervised learning methods rely on manually annotated, high-quality, object-class-specific priors.
The present approach adopts the powerful and versatile idea that object changes detected during everyday robot navigation can be reused as additional priors to improve future change detection (a minimal sketch of such a prior-recycling buffer follows this list).
arXiv Detail & Related papers (2023-06-28T10:34:59Z)
- AttentionViz: A Global View of Transformer Attention [60.82904477362676]
We present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers.
The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention.
We create an interactive visualization tool, AttentionViz, based on these joint query-key embeddings (a minimal sketch of such a joint embedding plot follows this list).
arXiv Detail & Related papers (2023-05-04T23:46:49Z)
- Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals (a loose sketch of the multi-scale idea follows this list).
arXiv Detail & Related papers (2023-01-06T18:52:12Z)
- Challenges in Visual Anomaly Detection for Mobile Robots [65.53820325712455]
We consider the task of detecting anomalies for autonomous mobile robots based on vision.
We categorize relevant types of visual anomalies and discuss how they can be detected by unsupervised deep learning methods.
arXiv Detail & Related papers (2022-09-22T13:26:46Z)
- Deep Active Visual Attention for Real-time Robot Motion Generation: Emergence of Tool-body Assimilation and Adaptive Tool-use [9.141661467673817]
This paper proposes a novel robot motion generation model, inspired by a human cognitive structure.
The model incorporates a state-driven, active, top-down visual attention module, which acquires attention that can actively change targets based on task states.
The results suggested improved flexibility in the model's visual perception, which sustained stable attention and motion even when the robot was provided with untrained tools or exposed to the experimenter's distractions.
arXiv Detail & Related papers (2022-06-29T10:55:32Z)
- Visualizing and Understanding Patch Interactions in Vision Transformer [96.70401478061076]
Vision Transformer (ViT) has become a leading tool in various computer vision tasks.
We propose a novel explainable visualization approach to analyze and interpret the crucial attention interactions among patches for vision transformer.
arXiv Detail & Related papers (2022-03-11T13:48:11Z)
- On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve attackers' ability to induce pixel misclassifications (a hedged sketch of such a patch objective follows this list).
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
- Morphology-Agnostic Visual Robotic Control [76.44045983428701]
MAVRIC is an approach that works with minimal prior knowledge of the robot's morphology.
We demonstrate our method on visually-guided 3D point reaching, trajectory following, and robot-to-robot imitation.
arXiv Detail & Related papers (2019-12-31T15:45:10Z)
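For the AdaCropFollow entry above, a minimal sketch of what one pseudo-label-driven online adaptation step could look like. The frozen teacher (standing in for the visual foundation model), the confidence threshold, and the heatmap shapes are assumptions, not the paper's method:

```python
import torch
import torch.nn.functional as F

def online_adapt_step(student, teacher, optimizer, frame, conf_thresh=0.8):
    """One self-supervised step: a frozen teacher predicts keypoint
    heatmaps, and its confident outputs become pseudo labels."""
    with torch.no_grad():
        pseudo = teacher(frame)                 # (B, K, H, W) heatmaps
        conf, _ = pseudo.flatten(2).max(dim=2)  # (B, K) peak confidence
        keep = conf > conf_thresh               # reliable pseudo labels only
    if not keep.any():
        return 0.0                              # nothing confident this frame
    pred = student(frame)
    loss = F.mse_loss(pred[keep], pseudo[keep]) # fit confident keypoints
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```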
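For the "Attention Deficit is Ordered!" entry, a generic sketch of an attention-redirection objective: the source patch is optimized so that queries pile attention mass onto the key tokens covered by the target patch, which carries the misclassification noise. The tensor layout and loss form are assumptions about the attack, not the paper's exact formulation:

```python
import torch

def attention_redirect_loss(attn: torch.Tensor, target_idx: torch.Tensor):
    """attn: (B, heads, Q, K) softmax attention weights captured with a
    forward hook; target_idx: indices of key tokens under the target patch.
    Minimizing this loss maximizes attention mass on the target patch."""
    mass = attn[..., target_idx].sum(dim=-1)  # (B, heads, Q)
    return -mass.mean()

# Example with dummy attention weights over 196 tokens; in an attack, this
# loss would be backpropagated into the source patch pixels.
attn = torch.softmax(torch.randn(1, 8, 196, 196), dim=-1)
print(attention_redirect_loss(attn, torch.arange(180, 196)))
```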
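For the Lifelong Change Detection entry, a minimal sketch of the prior-recycling idea: high-confidence change masks detected during navigation are banked and later replayed as pseudo labels for finetuning. The buffer size, thresholds, and binarization are assumptions:

```python
from collections import deque
import torch

class ChangePriorBuffer:
    """Banks confident past detections so they can be reused as extra
    training priors for future change detection (illustrative sketch)."""

    def __init__(self, maxlen: int = 1000, conf_thresh: float = 0.9):
        self.buffer = deque(maxlen=maxlen)
        self.conf_thresh = conf_thresh

    def add(self, image: torch.Tensor, change_prob: torch.Tensor):
        # Keep only detections confident enough to act as pseudo labels.
        if change_prob.max().item() > self.conf_thresh:
            self.buffer.append((image, (change_prob > 0.5).float()))

    def sample(self, k: int):
        # Random minibatch of (image, pseudo-mask) pairs for finetuning.
        idx = torch.randperm(len(self.buffer))[:k]
        return [self.buffer[int(i)] for i in idx]
```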
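For the AttentionViz entry, the stated core idea, plotting the queries and keys of one attention head in a shared embedding, is easy to sketch. TSNE stands in here for whatever projection the actual tool uses, and the random Q/K stand in for activations captured with a forward hook:

```python
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def joint_qk_plot(Q: torch.Tensor, K: torch.Tensor):
    """Embed query and key vectors of one head into a shared 2-D space."""
    pts = torch.cat([Q, K], dim=0).detach().cpu().numpy()  # (Nq+Nk, d)
    xy = TSNE(n_components=2, perplexity=15).fit_transform(pts)
    n_q = Q.shape[0]
    plt.scatter(xy[:n_q, 0], xy[:n_q, 1], s=8, label="queries")
    plt.scatter(xy[n_q:, 0], xy[n_q:, 1], s=8, label="keys")
    plt.legend()
    plt.show()

joint_qk_plot(torch.randn(100, 64), torch.randn(100, 64))
```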
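For the Hierarchical Point Attention entry, a loose sketch of the multi-scale token idea behind MS-A: pool a single-scale token sequence into a coarser scale and let the original tokens attend over both scales at once. The pooling choice and layer sizes are assumptions:

```python
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    """Single-scale tokens attend over a concatenation of fine and
    pooled (coarse) tokens used as keys/values (illustrative sketch)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pool = nn.AvgPool1d(kernel_size=2, stride=2)

    def forward(self, tokens: torch.Tensor):  # (B, N, D)
        coarse = self.pool(tokens.transpose(1, 2)).transpose(1, 2)  # (B, N/2, D)
        kv = torch.cat([tokens, coarse], dim=1)  # multi-scale keys/values
        out, _ = self.attn(tokens, kv, kv)
        return out

out = MultiScaleAttention(dim=64)(torch.randn(2, 128, 64))
print(out.shape)  # torch.Size([2, 128, 64])
```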
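For the real-world adversarial robustness entry, a hedged sketch of one patch optimization step against a segmentation model: paste the patch, then take a signed gradient step that increases the per-pixel cross-entropy. The paste location, step rule, and loss are generic stand-ins, not the paper's proposed loss:

```python
import torch
import torch.nn.functional as F

def patch_attack_step(model, image, patch, y_true, lr=0.01):
    """image: (B, 3, H, W); patch: (3, ph, pw) leaf tensor; y_true: (B, H, W)
    long labels. Ascends the CE loss so pixel predictions drift off-label."""
    patch.requires_grad_(True)
    x = image.clone()
    ph, pw = patch.shape[-2:]
    x[..., :ph, :pw] = patch                 # paste patch (top-left, fixed)
    logits = model(x)                        # (B, C, H, W) class scores
    loss = -F.cross_entropy(logits, y_true)  # negate to maximize CE
    loss.backward()
    with torch.no_grad():
        patch -= lr * patch.grad.sign()      # FGSM-style signed update
        patch.clamp_(0, 1)
        patch.grad = None
    return patch
```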