Towards Viewpoint-Invariant Visual Recognition via Adversarial Training
- URL: http://arxiv.org/abs/2307.10235v1
- Date: Sun, 16 Jul 2023 07:55:42 GMT
- Title: Towards Viewpoint-Invariant Visual Recognition via Adversarial Training
- Authors: Shouwei Ruan, Yinpeng Dong, Hang Su, Jianteng Peng, Ning Chen,
Xingxing Wei
- Abstract summary: We propose Viewpoint-Invariant Adversarial Training (VIAT) to improve the viewpoint robustness of common image classifiers.
VIAT is formulated as a minimax optimization problem, where the inner maximization characterizes diverse adversarial viewpoints.
To further improve the generalization performance, a distribution sharing strategy is introduced.
- Score: 28.424131496622497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual recognition models are not invariant to viewpoint changes in the 3D
world, as different viewing directions can dramatically affect the predictions
given the same object. Although many efforts have been devoted to making neural
networks invariant to 2D image translations and rotations, viewpoint invariance
is rarely investigated. As most models process images in the perspective view,
it is challenging to impose invariance to 3D viewpoint changes based only on 2D
inputs. Motivated by the success of adversarial training in promoting model
robustness, we propose Viewpoint-Invariant Adversarial Training (VIAT) to
improve viewpoint robustness of common image classifiers. By regarding
viewpoint transformation as an attack, VIAT is formulated as a minimax
optimization problem, where the inner maximization characterizes diverse
adversarial viewpoints by learning a Gaussian mixture distribution based on a
new attack GMVFool, while the outer minimization trains a viewpoint-invariant
classifier by minimizing the expected loss over the worst-case adversarial
viewpoint distributions. To further improve the generalization performance, a
distribution sharing strategy is introduced leveraging the transferability of
adversarial viewpoints across objects. Experiments validate the effectiveness
of VIAT in improving the viewpoint robustness of various image classifiers
based on the diversity of adversarial viewpoints generated by GMVFool.
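The abstract describes VIAT as a minimax game: an inner maximization that learns a distribution over adversarial viewpoints, and an outer minimization that trains the classifier against the expected loss under that worst-case distribution. The following is a minimal toy sketch of that structure, not the paper's method: the "renderer", the linear classifier, the grid of candidate Gaussian means, and all hyperparameters are invented for illustration, and the inner step uses a crude search over Gaussian means rather than the GMVFool attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "renderer": a viewpoint angle theta -> a 2-D feature of the object.
# The second feature is viewpoint-independent, so invariance is achievable.
def render(theta):
    theta = np.atleast_1d(theta)
    return np.stack([np.cos(theta), np.ones_like(theta)], axis=-1)

def loss(w, theta):
    # Logistic loss for the true label (+1) under a linear classifier w.
    logits = render(theta) @ w
    return np.log1p(np.exp(-logits)).mean()

def inner_max(w, mus, sigma=0.1, n=64):
    # Inner maximization (stand-in for GMVFool): choose the Gaussian mean
    # whose sampled viewpoints are most adversarial for the current w.
    losses = [loss(w, mu + sigma * rng.standard_normal(n)) for mu in mus]
    return mus[int(np.argmax(losses))]

def outer_step(w, mu_star, lr=0.5, sigma=0.1, n=64, eps=1e-4):
    # Outer minimization: numerical gradient of the expected loss under
    # the worst-case viewpoint distribution found by the inner step.
    thetas = mu_star + sigma * rng.standard_normal(n)
    grad = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        grad[i] = (loss(w + e, thetas) - loss(w - e, thetas)) / (2 * eps)
    return w - lr * grad

w = np.array([0.1, 0.0])
candidate_means = np.linspace(-np.pi, np.pi, 16)
for _ in range(50):
    mu_star = inner_max(w, candidate_means)
    w = outer_step(w, mu_star)

worst = max(loss(w, mu) for mu in candidate_means)
print(f"worst-case viewpoint loss after training: {worst:.3f}")
```

In this toy, training drives weight off the viewpoint-dependent feature and onto the invariant one, so the worst-case loss over all candidate viewpoints shrinks — the same qualitative effect the outer minimization aims for.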
Related papers
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation
Learning of Vision-based Autonomous Driving [73.3702076688159]
We propose a novel contrastive learning algorithm, Cohere3D, to learn coherent instance representations in a long-term input sequence.
We evaluate our algorithm by finetuning the pretrained model on various downstream perception, prediction, and planning tasks.
arXiv Detail & Related papers (2024-02-23T19:43:01Z) - Appearance Debiased Gaze Estimation via Stochastic Subject-Wise
Adversarial Learning [33.55397868171977]
Appearance-based gaze estimation has been attracting attention in computer vision, and remarkable improvements have been achieved using various deep learning techniques.
We propose a novel framework, stochastic subject-wise adversarial gaZE learning (SAZE), which trains a network to generalize across subjects' appearance.
Our experimental results verify the robustness of the method in that it yields state-of-the-art performance, achieving 3.89 and 4.42 degrees of angular error on the MPIIGaze and EyeDiap datasets, respectively.
arXiv Detail & Related papers (2024-01-25T00:23:21Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - Improving Viewpoint Robustness for Visual Recognition via Adversarial
Training [26.824940629150362]
We propose Viewpoint-Invariant Adversarial Training (VIAT) to improve the viewpoint robustness of image classifiers.
We show that VIAT significantly improves the viewpoint robustness of various image classifiers based on the diversity of adversarial viewpoints generated by GMVFool.
arXiv Detail & Related papers (2023-07-21T12:18:35Z) - Progressive Multi-view Human Mesh Recovery with Self-Supervision [68.60019434498703]
Existing solutions typically suffer from poor generalization performance to new settings.
We propose a novel simulation-based training pipeline for multi-view human mesh recovery.
arXiv Detail & Related papers (2022-12-10T06:28:29Z) - ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial
Viewpoints [42.64942578228025]
We propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models.
By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints.
arXiv Detail & Related papers (2022-10-08T03:06:49Z) - Unsupervised View-Invariant Human Posture Representation [28.840986167408037]
We present a novel unsupervised approach that learns to extract view-invariant 3D human pose representation from a 2D image.
Our model is trained by exploiting the intrinsic view-invariant properties of human pose between simultaneous frames.
We show improvements on the state-of-the-art unsupervised cross-view action classification accuracy on RGB and depth images.
arXiv Detail & Related papers (2021-09-17T19:23:31Z) - Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
arXiv Detail & Related papers (2020-09-18T17:52:34Z)
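The AdvBN entry above describes adversarial training that perturbs per-channel feature statistics (the "style") rather than image pixels. A minimal toy sketch of that idea follows; it is not the paper's layer: the features, the linear head, the perturbation bounds, and the random-search inner loop (standing in for gradient ascent on the statistics) are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy intermediate features (batch, channels) and a fixed linear "head".
feats = rng.standard_normal((32, 4))
w = np.array([1.0, -0.5, 0.25, 0.8])

def head_loss(f):
    # Logistic loss for label +1 on the features.
    z = f @ w
    return np.log1p(np.exp(-z)).mean()

def perturb_stats(f, scale, shift):
    # Normalize per channel, then apply an adversarial affine transform
    # to the channel statistics, leaving the normalized content alone.
    mu, sd = f.mean(0), f.std(0) + 1e-6
    return (f - mu) / sd * (sd * scale) + (mu + shift)

# Crude inner maximization: random search over bounded scale/shift
# perturbations of the feature statistics.
worst_loss = head_loss(feats)
for _ in range(200):
    scale = 1 + 0.5 * rng.uniform(-1, 1, 4)
    shift = 0.5 * rng.uniform(-1, 1, 4)
    cand = head_loss(perturb_stats(feats, scale, shift))
    worst_loss = max(worst_loss, cand)

print(f"clean loss {head_loss(feats):.3f}, "
      f"worst-case style loss {worst_loss:.3f}")
```

Training on such worst-case statistic perturbations (instead of pixel perturbations) is what encodes robustness to style shifts in the AdvBN formulation.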
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.