Related papers: Focal Depth Estimation: A Calibration-Free, Subject- and Daytime Invariant Approach

Related papers

Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization [53.82400605816587]
Action Quality Assessment (AQA) quantifies human actions in videos, supporting applications in sports scoring, rehabilitation, and skill evaluation.<n>A major challenge lies in the non-stationary nature of quality distributions in real-world scenarios.<n>We introduce Continual AQA (CAQA), which equips with Continual Learning capabilities to handle evolving distributions.
arXiv Detail & Related papers (2025-10-08T10:09:47Z)
Analysis of Transferability Estimation Metrics for Surgical Phase Recognition [3.3285108719932555]
Fine-tuning pre-trained models has become a cornerstone of modern machine learning, allowing practitioners to achieve high performance with limited labeled data.<n>In surgical video analysis, where expert annotations are especially time-consuming and costly, identifying the most suitable pre-trained model for a downstream task is both critical and challenging.<n>We provide the first comprehensive benchmark of three representative metrics, LogME, H-Score, and TransRate, on two diverse datasets.
arXiv Detail & Related papers (2025-08-22T18:05:33Z)
Foundation Feature-Driven Online End-Effector Pose Estimation: A Marker-Free and Learning-Free Approach [4.132336580197184]
This work proposes a feature-driven online End-Effect-or Pose Estimation algorithm. It generalizes across robots and end-effectors in a training-free manner. Experiments demonstrate its superior flexibility, generalization, and performance.
arXiv Detail & Related papers (2025-03-18T09:12:49Z)
Active Alignments of Lens Systems with Reinforcement Learning [0.0]
We propose a reinforcement learning (RL) approach that learns exclusively in the pixel space of the sensor output. We conduct an extensive benchmark study and show that our approach surpasses other methods in speed, precision, and robustness.
arXiv Detail & Related papers (2025-03-03T21:57:08Z)
What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration. We identify the critical limitations of regression-based methods with the widely used data generation pipeline. We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z)
SONNET: Enhancing Time Delay Estimation by Leveraging Simulated Audio [17.811771707446926]
We show that learning based methods can, even based on synthetic data, significantly outperform GCC-PHAT on novel real world data. We provide our trained model, SONNET, which is runnable in real-time and works on novel data out of the box for many real data applications.
arXiv Detail & Related papers (2024-11-20T10:23:21Z)
Consistency Calibration: Improving Uncertainty Calibration via Consistency among Perturbed Neighbors [22.39558434131574]
We introduce the concept of consistency as an alternative perspective on model calibration. We propose a post-hoc calibration method called Consistency (CC) which adjusts confidence based on the model's consistency across inputs. We show that performing perturbations at the logit level significantly improves computational efficiency.
arXiv Detail & Related papers (2024-10-16T06:55:02Z)
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a. Federated Anomaly Detection framework named PeFAD with the increasing privacy concerns. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
Scalable Numerical Embeddings for Multivariate Time Series: Enhancing Healthcare Data Representation Learning [6.635084843592727]
We propose SCAlable Numerical Embedding (SCANE), a novel framework that treats each feature value as an independent token. SCANE regularizes the traits of distinct feature embeddings and enhances representational learning through a scalable embedding mechanism. We develop the nUMerical eMbeddIng Transformer (SUMMIT), which is engineered to deliver precise predictive outputs for MTS characterized by prevalent missing entries.
arXiv Detail & Related papers (2024-05-26T13:06:45Z)
PLATE: A perception-latency aware estimator, [0.46040036610482665]
Perception-LATency aware Estorimator (PLATE) PLATE uses different perception configurations in different moments of time in order to optimize a certain performance measure. Compared to other frame-skipping techniques, PLATE comes with a formal complexity and optimality analysis.
arXiv Detail & Related papers (2024-01-24T17:04:18Z)
Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation based stereo matching has achieved outstanding performance. Current methods with a fixed model do not work uniformly well across various datasets. This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
Deep Reinforcement Learning Based System for Intraoperative Hyperspectral Video Autofocusing [2.476200036182773]
This work integrates a focus-tunable liquid lens into a video HSI exoscope. A first-of-its-kind robotic focal-time scan was performed to create a realistic and reproducible testing dataset.
arXiv Detail & Related papers (2023-07-21T15:04:21Z)
EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration [49.90228618894857]
We introduce a new approach to hand-eye calibration called EasyHeC, which is markerless, white-box, and delivers superior accuracy and robustness. We propose to use two key technologies: differentiable rendering-based camera pose optimization and consistency-based joint space exploration. Our evaluation demonstrates superior performance in synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-02T03:49:54Z)
Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accurateness of predictions. Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
Aberration-Aware Depth-from-Focus [20.956132508261664]
We investigate the domain gap caused by off-axis aberrations that will affect the decision of the best-focused frame in a focal stack. We then explore bridging this domain gap through aberration-aware training (AAT) Our approach involves a lightweight network that models lens aberrations at different positions and focus distances, which is then integrated into the conventional network training pipeline.
arXiv Detail & Related papers (2023-03-08T15:21:33Z)
A Targeted Accuracy Diagnostic for Variational Approximations [8.969208467611896]
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC) Existing methods characterize the quality of the whole variational distribution. We propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA)
arXiv Detail & Related papers (2023-02-24T02:50:18Z)
Learned Monocular Depth Priors in Visual-Inertial Initialization [4.99761983273316]
Visual-inertial odometry (VIO) is the pose estimation backbone for most AR/VR and autonomous robotic systems today. We propose to circumvent the limitations of classical visual-inertial structure-from-motion (SfM) We leverage learned monocular depth images (mono-depth) to constrain the relative depth of features, and upgrade the mono-depth to metric scale by jointly optimizing for its scale and shift.
arXiv Detail & Related papers (2022-04-20T00:30:04Z)
An automatic differentiation system for the age of differential privacy [65.35244647521989]
Tritium is an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML) We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML)
arXiv Detail & Related papers (2021-09-22T08:07:42Z)
A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
coarse parametrisation in propagation distance, position errors and partial coherence frequently menaces the experiment viability. A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction. We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
Estimating Egocentric 3D Human Pose in Global Space [70.7272154474722]
We present a new method for egocentric global 3D body pose estimation using a single-mounted fisheye camera. Our approach outperforms state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-04-27T20:01:57Z)
Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution [51.274657266928315]
We propose a PSF aware plug-and-play deep network, which takes the aberrant image and PSF map as input and produces the latent high quality version via incorporating lens-specific deep priors. Specifically, we pre-train a base model from a set of diverse lenses and then adapt it to a given lens by quickly refining the parameters.
arXiv Detail & Related papers (2021-04-07T12:00:38Z)
Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In the recent years, many methods demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal. We show that incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and not relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift. We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness. The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness. We show that focal loss allows us to learn models that are already very well calibrated. We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.