Robustness of Segment Anything Model (SAM) for Autonomous Driving in
Adverse Weather Conditions
- URL: http://arxiv.org/abs/2306.13290v1
- Date: Fri, 23 Jun 2023 04:56:47 GMT
- Title: Robustness of Segment Anything Model (SAM) for Autonomous Driving in
Adverse Weather Conditions
- Authors: Xinru Shan, Chaoning Zhang
- Abstract summary: Segment Anything Model (SAM) has emerged as a foundational model in computer vision.
There is a strong desire to apply SAM in autonomous driving to improve the performance of vision tasks.
This work aims to enhance understanding of SAM's robustness in challenging scenarios before integrating it into autonomous driving vision tasks.
- Score: 9.613468602635082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segment Anything Model (SAM) has gained considerable interest in recent times
for its remarkable performance and has emerged as a foundational model in
computer vision. It has been integrated in diverse downstream tasks, showcasing
its strong zero-shot transfer capabilities. Given its impressive performance,
there is a strong desire to apply SAM in autonomous driving to improve the
performance of vision tasks, particularly in challenging scenarios such as
driving under adverse weather conditions. However, its robustness under adverse
weather conditions remains uncertain. In this work, we investigate the
application of SAM in autonomous driving and specifically explore its
robustness under adverse weather conditions. Overall, this work aims to enhance
understanding of SAM's robustness in challenging scenarios before integrating
it into autonomous driving vision tasks, providing valuable insights for future
applications.
Related papers
- InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation [53.47253633654885]
InstaDrive is a novel framework that enhances driving video realism through two key advancements.
By incorporating these instance-aware mechanisms, InstaDrive achieves state-of-the-art video generation quality.
Our project page is https://shanpoyang654.io/InstaDrive/page.html.
arXiv Detail & Related papers (2026-02-03T08:22:13Z) - Enhancing Self-Driving Segmentation in Adverse Weather Conditions: A Dual Uncertainty-Aware Training Approach to SAM Optimization [2.784110090047074]
We investigate two approaches to enhance segmentation robustness for autonomous driving.
First, we introduce a multi-step finetuning procedure for SAM2 that incorporates uncertainty metrics directly into the loss function.
Second, we adapt the Uncertainty-Aware Adapter (UAT), originally designed for medical image segmentation, to driving contexts.
arXiv Detail & Related papers (2025-09-05T01:24:42Z) - SEAL: Vision-Language Model-Based Safe End-to-End Cooperative Autonomous Driving with Adaptive Long-Tail Modeling [13.81210267833274]
SEAL is a vision-language model-based framework with adaptive multimodal learning for robust cooperative autonomous driving under long-tail scenarios.
SEAL introduces three core innovations: (i) a prompt-driven long-tail scenario generation and evaluation pipeline that leverages foundation models to synthesize realistic long-tail conditions; (ii) a multi-scenario adaptive attention module that modulates the visual stream using scenario priors to recalibrate ambiguous or corrupted features; and (iii) a multi-task scenario-aware contrastive learning objective that improves multimodal alignment and promotes cross-scenario feature separability.
arXiv Detail & Related papers (2025-06-26T06:42:03Z) - LightEMMA: Lightweight End-to-End Multimodal Model for Autonomous Driving [9.447298958886265]
Vision-Language Models (VLMs) have demonstrated significant potential for end-to-end autonomous driving.
We introduce LightEMMA, a Lightweight End-to-End Multimodal Model for Autonomous driving.
We construct twelve autonomous driving agents using various VLMs and evaluate their performance on the nuScenes prediction task.
arXiv Detail & Related papers (2025-05-01T04:12:41Z) - NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving [10.41584658117874]
We propose NuScenes-SpatialQA, the first large-scale ground-truth-based Question-Answer (QA) benchmark designed to evaluate the spatial understanding and reasoning capabilities of Vision-Language Models (VLMs) in autonomous driving.
Built upon the NuScenes dataset, the benchmark is constructed through an automated 3D scene graph generation pipeline and a QA generation pipeline.
Using this benchmark, we conduct extensive experiments on diverse VLMs, including both general and spatial-enhanced models, providing the first comprehensive evaluation of their spatial capabilities in autonomous driving.
arXiv Detail & Related papers (2025-04-04T04:43:10Z) - Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes [63.966251473172036]
The foundational model SAM has influenced multiple fields within computer vision, and its upgraded version, SAM 2, enhances capabilities in video segmentation.
While SAMs have demonstrated excellent performance in segmenting context-independent concepts like people, cars, and roads, they overlook more challenging context-dependent (CD) concepts, such as visual saliency, camouflage, product defects, and medical lesions.
We conduct a thorough quantitative evaluation of SAMs on 11 CD concepts across 2D and 3D images and videos in various visual modalities within natural, medical, and industrial scenes.
arXiv Detail & Related papers (2024-12-02T08:03:56Z) - Generating Out-Of-Distribution Scenarios Using Language Models [58.47597351184034]
Large Language Models (LLMs) have shown promise in autonomous driving.
This paper introduces a framework for generating diverse Out-Of-Distribution (OOD) driving scenarios.
We evaluate our framework through extensive simulations and introduce a new "OOD-ness" metric.
arXiv Detail & Related papers (2024-11-25T16:38:17Z) - WATonoBus: Field-Tested All-Weather Autonomous Shuttle Technology [8.815412946998475]
All-weather autonomous vehicle operation poses significant challenges, encompassing modules from perception and decision-making to path planning and control.
We propose a multi-module and modular system architecture with considerations for adverse weather across the perception level.
We demonstrate our proposed approach is capable of addressing adverse weather conditions and provide valuable insights from edge cases observed during operation.
arXiv Detail & Related papers (2023-12-01T21:36:14Z) - Exploring the Potential of World Models for Anomaly Detection in
Autonomous Driving [11.091582432763738]
We show how world models can be leveraged to perform anomaly detection in the domain of autonomous driving.
We provide a characterization of world models and relate individual components to previous works in anomaly detection.
arXiv Detail & Related papers (2023-08-10T17:04:51Z) - Assurance for Autonomy -- JPL's past research, lessons learned, and
future directions [56.32768279109502]
Autonomy is required when a wide variation in circumstances precludes responses being pre-planned.
Mission assurance is a key contributor to providing confidence, yet assurance practices honed over decades of spaceflight have relatively little experience with autonomy.
Researchers in JPL's software assurance group have been involved in the development of techniques specific to the assurance of autonomy.
arXiv Detail & Related papers (2023-05-16T18:24:12Z) - SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain
Adaptation [152.60469768559878]
SHIFT is the largest multi-task synthetic dataset for autonomous driving.
It presents discrete and continuous shifts in cloudiness, rain and fog intensity, time of day, and vehicle and pedestrian density.
Our dataset and benchmark toolkit are publicly available at www.vis.xyz/shift.
arXiv Detail & Related papers (2022-06-16T17:59:52Z) - Differentiable Control Barrier Functions for Vision-based End-to-End
Autonomous Driving [100.57791628642624]
We introduce a safety guaranteed learning framework for vision-based end-to-end autonomous driving.
We design a learning system equipped with differentiable control barrier functions (dCBFs) that is trained end-to-end by gradient descent.
arXiv Detail & Related papers (2022-03-04T16:14:33Z) - Safety-aware Motion Prediction with Unseen Vehicles for Autonomous
Driving [104.32241082170044]
We study a new task, safety-aware motion prediction with unseen vehicles for autonomous driving.
Unlike the existing trajectory prediction task for seen vehicles, we aim at predicting an occupancy map.
Our approach is the first one that can predict the existence of unseen vehicles in most cases.
arXiv Detail & Related papers (2021-09-03T13:33:33Z) - Dynamic Autonomous Surface Vehicle Control and Applications in
Environmental Monitoring [1.6774978731594548]
This paper addresses the problem of robotic operations in the presence of adversarial forces.
The presence of wind and/or currents produces external forces acting on the vehicle which quite often divert it from its intended path.
By measuring these phenomena, wind and current, and modelling their impact on the vessel, actions can be taken to alleviate their effect.
arXiv Detail & Related papers (2021-03-29T20:55:52Z) - Worsening Perception: Real-time Degradation of Autonomous Vehicle
Perception Performance for Simulation of Adverse Weather Conditions [47.529411576737644]
This study explores the potential of using a simple, lightweight image augmentation system in an autonomous racing vehicle.
With minimal adjustment, the prototype system can replicate the effects of both water droplets on the camera lens, and fading light conditions.
arXiv Detail & Related papers (2021-03-03T23:49:02Z) - Probabilistic End-to-End Vehicle Navigation in Complex Dynamic
Environments with Multimodal Sensor Fusion [16.018962965273495]
All-day and all-weather navigation is a critical capability for autonomous driving.
We propose a probabilistic driving model with multi-perception capability utilizing the information from the camera, lidar and radar.
The results suggest that our proposed model outperforms baselines and achieves excellent generalization performance in unseen environments.
arXiv Detail & Related papers (2020-05-05T03:48:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.