DaFoEs: Mixing Datasets towards the generalization of vision-state
deep-learning Force Estimation in Minimally Invasive Robotic Surgery
- URL: http://arxiv.org/abs/2401.09239v1
- Date: Wed, 17 Jan 2024 14:39:55 GMT
- Title: DaFoEs: Mixing Datasets towards the generalization of vision-state
deep-learning Force Estimation in Minimally Invasive Robotic Surgery
- Authors: Mikel De Iturrate Reyzabal, Mingcong Chen, Wei Huang, Sebastien
Ourselin and Hongbin Liu
- Abstract summary: We present a new vision-haptic dataset (DaFoEs) with variable soft environments for the training of deep neural models.
We also present a variable encoder-decoder architecture to predict the forces exerted by the laparoscopic tool using a single input or a sequence of inputs.
- Score: 6.55111164866752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Precisely determining the contact force during safe interaction in Minimally
Invasive Robotic Surgery (MIRS) is still an open research challenge. Inspired
by post-operative qualitative analysis from surgical videos, the use of
cross-modality data driven deep neural network models has been one of the
newest approaches to predict sensorless force trends. However, these methods
require large and variable datasets, which are not currently available. In
this paper, we present a new vision-haptic dataset (DaFoEs) with variable soft
environments for the training of deep neural models. In order to reduce the
bias from a single dataset, we present a pipeline to generalize different
vision and state data inputs for mixed dataset training, using a previously
validated dataset with different setup. Finally, we present a variable
encoder-decoder architecture to predict the forces exerted by the laparoscopic
tool from a single input or a sequence of inputs. For input sequences, we use a
recurrent decoder, named with the prefix R, and a new temporal sampling to
represent the acceleration of the tool. During our training, we demonstrate
that single-dataset training tends to overfit to the training data domain and
has difficulty transferring its results to new domains. In contrast, dataset
mixing transfers well, achieving a mean relative estimated force error of 5%
and 12% for the recurrent and non-recurrent models, respectively.
Our method also marginally increases the effectiveness of transformers for
force estimation, by up to ~15%, as the volume of available data is increased
by 150%. In conclusion, we demonstrate that mixing experimental setups for
vision-state force estimation in MIRS is a viable approach towards a general
solution of the problem.
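The mixing pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`harmonize_state`, `temporal_sample`, `mix_datasets`) are hypothetical, and the paper's actual temporal sampling scheme is assumed here to reduce to finite-difference velocity and acceleration estimates over a sub-sampled trajectory window.

```python
import numpy as np

def harmonize_state(state, mean, std):
    """Map one dataset's state vector into a shared, normalized space
    so samples from different experimental setups can be mixed."""
    return (state - mean) / std

def temporal_sample(positions, dt, stride=2):
    """Sub-sample a tool-trajectory window, then estimate velocity and
    acceleration by finite differences over the widened time step."""
    p = positions[::stride]          # sub-sampled positions
    t = dt * stride                  # effective time step after sub-sampling
    v = np.diff(p, axis=0) / t       # first difference -> velocity
    a = np.diff(v, axis=0) / t       # second difference -> acceleration
    return p, v, a

def mix_datasets(batches_a, batches_b, rng):
    """Interleave mini-batches from two datasets so every training epoch
    sees both domains, reducing single-dataset bias."""
    mixed = list(batches_a) + list(batches_b)
    order = rng.permutation(len(mixed))
    return [mixed[i] for i in order]
```

For a trajectory with constant acceleration, the second difference recovers that acceleration exactly, which is the property the temporal sampling relies on to expose tool dynamics to the recurrent decoder.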
Related papers
- An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z) - Data-Augmented Predictive Deep Neural Network: Enhancing the extrapolation capabilities of non-intrusive surrogate models [0.5735035463793009]
We propose a new deep learning framework, where kernel dynamic mode decomposition (KDMD) is employed to evolve the dynamics of the latent space generated by the encoder part of a convolutional autoencoder (CAE)
After adding the KDMD-decoder-extrapolated data into the original data set, we train the CAE along with a feed-forward deep neural network using the augmented data.
The trained network can predict future states outside the training time interval at any out-of-training parameter samples.
arXiv Detail & Related papers (2024-10-17T09:26:14Z) - Self-Supervised Pre-training Tasks for an fMRI Time-series Transformer in Autism Detection [3.665816629105171]
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that encompasses a wide variety of symptoms and degrees of impairment.
We have developed a transformer-based self-supervised framework that directly analyzes time-series fMRI data without computing functional connectivity.
We show that randomly masking entire ROIs gives better model performance than randomly masking time points in the pre-training step.
arXiv Detail & Related papers (2024-09-18T20:29:23Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN)
CMMN consists of filtering the signals in order to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z) - Invariance Learning in Deep Neural Networks with Differentiable Laplace
Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z) - The Imaginative Generative Adversarial Network: Automatic Data
Augmentation for Dynamic Skeleton-Based Hand Gesture and Human Action
Recognition [27.795763107984286]
We present a novel automatic data augmentation model, which approximates the distribution of the input data and samples new data from this distribution.
Our results show that the augmentation strategy is fast to train and can improve classification accuracy for both neural networks and state-of-the-art methods.
arXiv Detail & Related papers (2021-05-27T11:07:09Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.