DaFoEs: Mixing Datasets towards the generalization of vision-state
deep-learning Force Estimation in Minimally Invasive Robotic Surgery
- URL: http://arxiv.org/abs/2401.09239v1
- Date: Wed, 17 Jan 2024 14:39:55 GMT
- Authors: Mikel De Iturrate Reyzabal, Mingcong Chen, Wei Huang, Sebastien
Ourselin and Hongbin Liu
- Abstract summary: We present a new vision-haptic dataset (DaFoEs) with variable soft environments for the training of deep neural models.
We also present a variable encoder-decoder architecture to predict the forces exerted by the laparoscopic tool from a single input or a sequence of inputs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Precisely determining the contact force during safe interaction in Minimally
Invasive Robotic Surgery (MIRS) is still an open research challenge. Inspired
by post-operative qualitative analysis from surgical videos, the use of
cross-modality, data-driven deep neural network models has been one of the
newest approaches to predicting sensorless force trends. However, these methods
require large and variable datasets, which are not currently available. In
this paper, we present a new vision-haptic dataset (DaFoEs) with variable soft
environments for the training of deep neural models. In order to reduce the
bias from a single dataset, we present a pipeline to generalize different
vision and state data inputs for mixed-dataset training, using a previously
validated dataset with a different setup. Finally, we present a variable
encoder-decoder architecture to predict the forces exerted by the laparoscopic
tool from a single input or a sequence of inputs. For input sequences, we use a
recurrent decoder (denoted by the prefix R) and a new temporal sampling to
represent the acceleration of the tool. During training, we demonstrate that
single-dataset training tends to overfit to its training domain and struggles
to transfer across new domains. In contrast, dataset mixing transfers well,
with a mean relative estimated force error of 5% and 12% for the recurrent and
non-recurrent models, respectively.
Our method also marginally increases the effectiveness of transformers for
force estimation, by up to ~15%, as the volume of available data is increased
by 150%. In conclusion, we demonstrate that mixing experimental setups for
vision-state force estimation in MIRS is a feasible approach towards a general
solution of the problem.
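The headline 5% and 12% figures are mean relative estimated force errors. The paper's exact normalization is not stated here, so the following is a minimal stdlib sketch of one common reading of that metric (per-sample Euclidean error normalized by the ground-truth force magnitude), not the authors' definition:

```python
import math

def mean_relative_force_error(pred, true, eps=1e-8):
    """Mean over samples of ||f_pred - f_true|| / ||f_true||.

    One common reading of "mean relative estimated force error";
    the paper may normalize differently (e.g. by the force range).
    """
    errs = []
    for p, t in zip(pred, true):
        num = math.sqrt(sum((pi - ti) ** 2 for pi, ti in zip(p, t)))
        den = math.sqrt(sum(ti ** 2 for ti in t)) + eps
        errs.append(num / den)
    return sum(errs) / len(errs)

# A 3-DoF prediction 5% off in magnitude gives ~0.05 relative error.
pred = [(1.05, 0.0, 0.0)]
true = [(1.0, 0.0, 0.0)]
print(round(mean_relative_force_error(pred, true), 3))  # 0.05
```

Under this reading, the recurrent models' 5% error would mean predicted force vectors deviate from ground truth by about one twentieth of the true force magnitude on average.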
Related papers
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Group Distributionally Robust Dataset Distillation with Risk Minimization [18.07189444450016]
We introduce an algorithm that combines clustering with the minimization of a risk measure on the loss to conduct DD.
We demonstrate its effective generalization and robustness across subgroups through numerical experiments.
arXiv Detail & Related papers (2024-02-07T09:03:04Z)
- Enhancing Cross-Dataset Performance of Distracted Driving Detection With Score-Softmax Classifier [7.302402275736439]
Deep neural networks enable real-time monitoring of in-vehicle drivers, facilitating the timely prediction of distractions, fatigue, and potential hazards.
Recent research has exposed unreliable cross-dataset end-to-end driver behavior recognition due to overfitting.
We introduce the Score-Softmax classifier, which addresses this issue by enhancing inter-class independence and intra-class uncertainty.
arXiv Detail & Related papers (2023-10-08T15:28:01Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN)
CMMN consists of filtering the signals to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
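The filtering step above can be sketched in NumPy. This is a hedged reading of the CMMN idea, not the paper's implementation: the periodogram PSD estimator, the barycenter formula (squared mean of square-root PSDs, which is the Wasserstein-2 barycenter for centered stationary Gaussian processes), and the FFT filter length are all illustrative assumptions.

```python
import numpy as np

def psd(x, nfft=256):
    # Simple periodogram; the paper may use a different PSD estimator.
    return np.abs(np.fft.rfft(x, nfft)) ** 2 / len(x)

def cmmn_filter(signals, nfft=256):
    """Filter each signal so its PSD matches the barycenter of all PSDs."""
    psds = np.stack([psd(s, nfft) for s in signals])
    bary = np.mean(np.sqrt(psds), axis=0) ** 2  # barycenter PSD (assumed form)
    out = []
    for s, p in zip(signals, psds):
        # Magnitude response sqrt(bary / psd) maps this signal's PSD onto bary.
        h = np.sqrt(bary / np.maximum(p, 1e-12))
        out.append(np.fft.irfft(np.fft.rfft(s, nfft) * h, nfft)[: len(s)])
    return out
```

After filtering, every signal's periodogram coincides with the barycenter, which is the normalization effect that makes downstream models insensitive to per-recording spectral shifts.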
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
- Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging [23.08596805950814]
We present a robust and label-efficient self-supervised FL framework for medical image analysis.
Specifically, we introduce a novel distributed self-supervised pre-training paradigm into the existing FL pipeline.
We show that our self-supervised FL algorithm generalizes well to out-of-distribution data and learns federated models more effectively in limited label scenarios.
arXiv Detail & Related papers (2022-05-17T18:33:43Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- The Imaginative Generative Adversarial Network: Automatic Data Augmentation for Dynamic Skeleton-Based Hand Gesture and Human Action Recognition [27.795763107984286]
We present a novel automatic data augmentation model, which approximates the distribution of the input data and samples new data from this distribution.
Our results show that the augmentation strategy is fast to train and can improve classification accuracy for both neural networks and state-of-the-art methods.
arXiv Detail & Related papers (2021-05-27T11:07:09Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep the created dataset manageable, we apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.