Can Foundation Models Revolutionize Mobile AR Sparse Sensing?
- URL: http://arxiv.org/abs/2511.02215v1
- Date: Tue, 04 Nov 2025 03:06:51 GMT
- Title: Can Foundation Models Revolutionize Mobile AR Sparse Sensing?
- Authors: Yiqin Zhao, Tian Guo,
- Abstract summary: We investigate whether foundation models can change the landscape of mobile sparse sensing. Using real-world mobile AR data, our evaluations demonstrate that foundation models offer significant improvements in geometry-aware image warping. Our study demonstrates the scalability of foundation model-based sparse sensing and shows its leading performance in 3D scene reconstruction.
- Score: 2.984076446975729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mobile sensing systems have long faced a fundamental trade-off between sensing quality and efficiency due to constraints on computation, power, and other resources. Sparse sensing, which aims to acquire and process only a subset of sensor data, has been a key strategy for maintaining performance under such constraints. However, existing sparse sensing methods often suffer from reduced accuracy, as missing information across space and time introduces uncertainty into many sensing systems. In this work, we investigate whether foundation models can change the landscape of mobile sparse sensing. Using real-world mobile AR data, our evaluations demonstrate that foundation models offer significant improvements in geometry-aware image warping, a central technique for enabling accurate reuse of cross-frame information. Furthermore, our study demonstrates the scalability of foundation model-based sparse sensing and shows its leading performance in 3D scene reconstruction. Collectively, our study reveals critical aspects of the promises and the open challenges of integrating foundation models into mobile sparse sensing systems.
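The geometry-aware image warping that the abstract identifies as central to cross-frame reuse can be illustrated with a standard depth-based reprojection: back-project each source pixel to 3D using per-pixel depth, apply the relative camera pose, and project into the target view. This is a generic sketch, not the paper's implementation; `warp_points` and its shared-intrinsics assumption are ours.

```python
import numpy as np

def warp_points(depth, K, T_src_to_tgt):
    """Reproject every pixel of a source frame into a target frame.

    depth: (H, W) per-pixel depth of the source frame (assumed metric).
    K: (3, 3) camera intrinsics, assumed shared by both frames.
    T_src_to_tgt: (4, 4) relative camera pose (source -> target).
    Returns an (H, W, 2) array of pixel coordinates in the target frame.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project to 3D camera coordinates: X = depth * K^{-1} [u, v, 1]^T
    rays = pix @ np.linalg.inv(K).T
    pts = rays * depth[..., None]
    # Transform the 3D points into the target camera frame.
    pts_h = np.concatenate([pts, np.ones((H, W, 1))], axis=-1)
    pts_tgt = pts_h @ T_src_to_tgt.T
    # Project back to pixel coordinates.
    proj = pts_tgt[..., :3] @ K.T
    return proj[..., :2] / proj[..., 2:3]
```

With an identity pose the warp maps every pixel to itself, which makes a convenient sanity check; in the sparse-sensing setting, depth and pose errors in this mapping are exactly where foundation models are claimed to help.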
Related papers
- A Comprehensive Survey on Deep Learning-Based LiDAR Super-Resolution for Autonomous Driving [0.4078247440919472]
This paper presents the first comprehensive survey of LiDAR super-resolution methods for autonomous driving. We organize existing approaches into four categories: CNN-based architectures, model-based deep unrolling, implicit representation methods, and Transformer- and Mamba-based approaches. Current trends include the adoption of range-image representations for efficient processing, extreme model compression, and the development of resolution-flexible architectures.
arXiv Detail & Related papers (2026-02-15T22:34:28Z)
- MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM [12.158063913401575]
We propose MASt3R-Fusion, a multi-sensor-assisted visual SLAM framework that integrates feed-forward pointmap regression with complementary sensor information. A hierarchical factor graph design is developed, which allows both real-time sliding-window optimization and global optimization with aggressive loop closures. We evaluate our approach on both public benchmarks and self-collected datasets, demonstrating substantial improvements in accuracy and robustness.
arXiv Detail & Related papers (2025-09-25T05:26:28Z)
- Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation [96.1872246747684]
Depth estimation is a fundamental task in 3D computer vision, crucial for applications such as 3D reconstruction, free-viewpoint rendering, robotics, autonomous driving, and AR/VR technologies. Traditional methods relying on hardware sensors like LiDAR are often limited by high costs, low resolution, and environmental sensitivity, restricting their applicability in real-world scenarios. Recent advances in vision-based methods offer a promising alternative, yet they face challenges in generalization and stability due to either low-capacity model architectures or reliance on domain-specific and small-scale datasets.
arXiv Detail & Related papers (2025-07-15T17:59:59Z)
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
- InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images [11.916941756499435]
In this paper, we explore the intricate task of incremental few-shot object detection in remote sensing images.
We introduce a pioneering fine-tuning-based technique, termed InfRS, designed to facilitate the incremental learning of novel classes.
We develop a prototypical calibration strategy based on the Wasserstein distance to mitigate the catastrophic forgetting problem.
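The Wasserstein-based prototypical calibration mentioned above can be illustrated with the closed-form 2-Wasserstein distance between Gaussian class prototypes; for diagonal covariances it reduces to a simple expression. This is a generic sketch of that distance under a diagonal-Gaussian assumption, not InfRS's exact calibration procedure; `gaussian_w2` is our naming.

```python
import numpy as np

def gaussian_w2(mu1, var1, mu2, var2):
    """2-Wasserstein distance between two diagonal-covariance Gaussians.

    mu*, var*: (D,) per-dimension means and variances of two class
    prototypes. For diagonal covariances the Bures covariance term
    reduces to the elementwise squared difference of standard deviations.
    """
    mean_term = np.sum((mu1 - mu2) ** 2)
    cov_term = np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2)
    return float(np.sqrt(mean_term + cov_term))
```

Unlike a plain Euclidean distance between prototype means, this distance also accounts for how spread out each class is, which is what makes it useful for calibrating novel-class prototypes against base classes.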
arXiv Detail & Related papers (2024-05-18T13:39:50Z)
- DynST: Dynamic Sparse Training for Resource-Constrained Spatio-Temporal Forecasting [31.398965880415492]
Earth science systems rely heavily on the extensive deployment of sensors. Traditional approaches to sensor deployment utilize specific algorithms to design and deploy sensors. We introduce, for the first time, the concept of dynamic sparse training for spatio-temporal data, adaptively and dynamically filtering sensors according to the importance of their data distributions.
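The core filtering step in a dynamic sparse training scheme, keeping only the most informative sensors at each step, can be sketched as a top-k selection over per-sensor importance scores. The scoring function and keep ratio here are our assumptions, not DynST's actual criterion.

```python
import numpy as np

def dynamic_sensor_mask(importance, keep_ratio=0.25):
    """Select the most informative sensors for the next training step.

    importance: (N,) score per sensor (e.g. gradient magnitude or
    contribution to forecast loss; the exact score is an assumption).
    keep_ratio: fraction of sensors to retain.
    Returns a boolean mask over the N sensors.
    """
    k = max(1, int(round(keep_ratio * importance.size)))
    mask = np.zeros(importance.size, dtype=bool)
    mask[np.argsort(importance)[-k:]] = True  # keep the top-k scores
    return mask
```

Because the mask is recomputed as importance scores evolve during training, the retained sensor subset adapts over time rather than being fixed at deployment.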
arXiv Detail & Related papers (2024-03-05T12:31:24Z)
- Open World Object Detection in the Era of Foundation Models [53.683963161370585]
We introduce a new benchmark that includes five real-world application-driven datasets.
We introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.
arXiv Detail & Related papers (2023-12-10T03:56:06Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks [61.74608497496841]
Training on inaccurate or corrupt data induces model bias and hampers generalisation capabilities.
This paper investigates the effect of sensor errors for the dense 3D vision tasks of depth estimation and reconstruction.
arXiv Detail & Related papers (2023-03-26T22:32:44Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Improving Perception via Sensor Placement: Designing Multi-LiDAR Systems for Autonomous Vehicles [16.45799795374353]
We propose an easy-to-compute information-theoretic surrogate cost metric based on Probabilistic Occupancy Grids (POG) to optimize LiDAR placement for maximal sensing.
Our results confirm that sensor placement is an important factor in 3D point cloud-based object detection and can vary the performance of state-of-the-art perception algorithms by 10% to 20%.
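The information-theoretic surrogate over Probabilistic Occupancy Grids can be illustrated by the total Shannon entropy of the grid: cells with occupancy probability near 0.5 are the most uncertain, so a placement that minimizes post-observation entropy maximizes expected information gain. This is a generic entropy sketch, not the paper's full cost metric; `grid_entropy` is our naming.

```python
import numpy as np

def grid_entropy(p_occ):
    """Total Shannon entropy (bits) of a probabilistic occupancy grid.

    p_occ: array of per-cell occupancy probabilities in (0, 1).
    Cells near p = 0.5 contribute up to 1 bit each; near-certain
    cells (p close to 0 or 1) contribute almost nothing.
    """
    p = np.clip(p_occ, 1e-9, 1 - 1e-9)  # guard log(0)
    return float(np.sum(-p * np.log2(p) - (1 - p) * np.log2(1 - p)))
```

A fully uncertain grid scores one bit per cell, while a well-observed grid scores near zero, so comparing candidate LiDAR placements by this quantity gives a cheap proxy for sensing coverage.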
arXiv Detail & Related papers (2021-05-02T01:52:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.