Revisiting Distillation for Continual Learning on Visual Question
Localized-Answering in Robotic Surgery
- URL: http://arxiv.org/abs/2307.12045v1
- Date: Sat, 22 Jul 2023 10:35:25 GMT
- Title: Revisiting Distillation for Continual Learning on Visual Question
Localized-Answering in Robotic Surgery
- Authors: Long Bai, Mobarakol Islam, Hongliang Ren
- Abstract summary: The visual-question localized-answering (VQLA) system can serve as a knowledgeable assistant in surgical education.
Deep neural networks (DNNs) suffer from catastrophic forgetting when learning new knowledge.
- Score: 20.509915509237818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The visual-question localized-answering (VQLA) system can serve as a
knowledgeable assistant in surgical education. Besides providing text-based
answers, the VQLA system can highlight the region of interest for better
surgical scene understanding. However, deep neural networks (DNNs) suffer from
catastrophic forgetting when learning new knowledge. Specifically, when DNNs
learn on incremental classes or tasks, their performance on old tasks drops
dramatically. Furthermore, due to medical data privacy and licensing issues, it
is often difficult to access old data when updating continual learning (CL)
models. Therefore, we develop a non-exemplar continual surgical VQLA framework,
to explore and balance the rigidity-plasticity trade-off of DNNs in a
sequential learning paradigm. We revisit the distillation loss in CL tasks, and
propose rigidity-plasticity-aware distillation (RP-Dist) and self-calibrated
heterogeneous distillation (SH-Dist) to preserve the old knowledge. The weight
aligning (WA) technique is also integrated to adjust the weight bias between
old and new tasks. We further establish a CL framework on three public surgical
datasets, in surgical settings where old and new VQLA tasks share overlapping
classes. With extensive experiments, we demonstrate that our proposed method
reconciles learning and forgetting on continual surgical VQLA better than
conventional CL methods. Our code is publicly accessible.
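The distillation idea at the core of the abstract can be sketched generically: a frozen copy of the old model provides soft targets that regularize the updated model, and weight aligning (WA) rescales new-task classifier weights to reduce bias toward new classes. The sketch below shows only the standard temperature-scaled knowledge-distillation loss and a minimal WA step; the paper's RP-Dist and SH-Dist variants modify this basic form, and their exact definitions are not reproduced here. All function names are illustrative, not the authors' code.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between the old model's (teacher) and the updated
    model's (student) softened output distributions -- the generic KD
    loss that distillation-based CL methods build on."""
    p = softmax(teacher_logits, T)  # soft targets from the frozen old model
    q = softmax(student_logits, T)
    # T^2 factor keeps gradient magnitudes comparable across temperatures
    return float(T * T * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

def weight_align(old_class_weights, new_class_weights):
    """Weight aligning (WA): rescale new-class classifier weight rows so
    their mean norm matches that of the old-class rows, counteracting
    the prediction bias toward newly learned classes."""
    old_norm = np.linalg.norm(old_class_weights, axis=1).mean()
    new_norm = np.linalg.norm(new_class_weights, axis=1).mean()
    return new_class_weights * (old_norm / new_norm)
```

When teacher and student agree, the distillation loss is zero, so the term only penalizes drift away from the old model's predictions; the total training objective would combine it with the usual task loss on new data.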
Related papers
- Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation [51.222684687924215]
Surgical video-language pretraining faces unique challenges due to the knowledge domain gap and the scarcity of multi-modal data.
We propose a hierarchical knowledge augmentation approach and a novel Procedure-Encoded Surgical Knowledge-Augmented Video-Language Pretraining framework to tackle these issues.
arXiv Detail & Related papers (2024-09-30T22:21:05Z)
- CLEO: Continual Learning of Evolving Ontologies [12.18795037817058]
Continual learning (CL) aims to instill the lifelong learning of humans in intelligent systems.
General learning processes are not limited to acquiring new information; they also involve refining existing information.
CLEO is motivated by the need for intelligent systems to adapt to real-world changes over time.
arXiv Detail & Related papers (2024-07-11T11:32:33Z)
- LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery [57.358568111574314]
Patient data privacy often restricts the availability of old data when updating the model.
Prior CL studies overlooked two vital problems in the surgical domain.
This paper proposes addressing these problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology.
arXiv Detail & Related papers (2024-02-26T15:35:24Z)
- Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning [64.14254712331116]
Non-exemplar class incremental learning aims to learn both the new and old tasks without accessing any training data from the past.
We propose a novel framework of fine-grained knowledge selection and restoration.
arXiv Detail & Related papers (2023-12-20T02:34:11Z)
- Automating Continual Learning [42.710124929514066]
General-purpose learning systems should improve themselves in an open-ended fashion in ever-changing environments.
We propose Automated Continual Learning (ACL) to train self-referential neural networks to meta-learn their own in-context continual (meta-)learning algorithms.
arXiv Detail & Related papers (2023-12-01T01:25:04Z)
- CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery [14.52406034300867]
A surgical Visual Question Localized-Answering (VQLA) system would be helpful for medical students and junior surgeons to learn and understand from recorded surgical videos.
We propose an end-to-end Transformer with the Co-Attention gaTed Vision-Language (CAT-ViL) embedding for VQLA in surgical scenarios.
The proposed method provides a promising solution for surgical scene understanding, and is a first step toward an Artificial Intelligence (AI)-based VQLA system for surgical training.
arXiv Detail & Related papers (2023-07-11T11:35:40Z)
- Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases [66.27889778566734]
Current intelligent diagnosis systems lack the ability to continually learn to diagnose new diseases once deployed.
In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.
An adapter-based Continual Learning framework called ACL is proposed to help effectively learn a set of new diseases.
arXiv Detail & Related papers (2023-04-18T15:01:45Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture long-range temporal dependency.
We validate our approach on a large surgical video dataset (Cholec80) by performing the surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.