Domain-Agnostic Neural Architecture for Class Incremental Continual
Learning in Document Processing Platform
- URL: http://arxiv.org/abs/2307.05399v1
- Date: Tue, 11 Jul 2023 16:01:44 GMT
- Title: Domain-Agnostic Neural Architecture for Class Incremental Continual
Learning in Document Processing Platform
- Authors: Mateusz Wójcik, Witold Kościukiewicz, Mateusz Baran, Tomasz Kajdanowicz, Adam Gonczarek
- Abstract summary: Recent methods based on stochastic gradient learning have been shown to struggle in such setups or to have limitations such as memory buffers.
We present a fully differentiable architecture based on the Mixture of Experts model that enables the training of high-performance classifiers when examples from each class are presented separately.
We conducted exhaustive experiments that demonstrate its applicability in various domains and its ability to learn online in production environments.
- Score: 3.630365560970225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Production deployments in complex systems require ML architectures to be
highly efficient and usable against multiple tasks. Particularly demanding are
classification problems in which data arrives in a streaming fashion and each
class is presented separately. Recent methods based on stochastic gradient
learning have been shown to struggle in such setups or to have limitations,
such as memory buffers or restriction to specific domains, that prevent their
use in real-world scenarios. For this reason, we present a fully differentiable
architecture based on the Mixture of Experts model that enables the training
of high-performance classifiers when examples from each class are presented
separately. We conducted exhaustive experiments that demonstrate its
applicability in various domains and its ability to learn online in production
environments. The proposed technique achieves SOTA results without a memory
buffer and clearly outperforms the reference methods.
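As a rough illustration of the general technique (not the authors' exact architecture), a fully differentiable Mixture-of-Experts classifier combines expert logits through a soft, learned gate, so the whole model trains with plain backpropagation even as classes arrive one at a time. All names and sizes below (GatedMoE, n_experts, the toy stream) are hypothetical:

```python
# Minimal sketch of a fully differentiable Mixture-of-Experts classifier.
# Illustrates the general technique only; names and sizes are hypothetical.
import torch
import torch.nn as nn

class GatedMoE(nn.Module):
    def __init__(self, in_dim: int, n_classes: int, n_experts: int = 4, hidden: int = 128):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(in_dim, n_experts)  # soft gating keeps everything differentiable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)           # (B, E)
        logits = torch.stack([e(x) for e in self.experts], 1)   # (B, E, C)
        return (weights.unsqueeze(-1) * logits).sum(dim=1)      # (B, C)

# Class-incremental toy stream: each class arrives separately, no memory buffer.
model = GatedMoE(in_dim=32, n_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for class_id in range(10):
    x = torch.randn(64, 32)            # toy batch containing only this class
    y = torch.full((64,), class_id)
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the gate is a softmax rather than a hard routing decision, training needs no replay buffer or non-differentiable selection step.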
Related papers
- MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning [12.737883740101438]
We present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP).
At the training stage we model the feature distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain.
At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, in which case the correct prompt is selected for the classification task, or to an unseen domain.
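A minimal sketch of the inference-time domain selection described above, assuming the per-domain feature distributions are modeled as Gaussians; the helper names (fit_domain_stats, select_domain) and the threshold value are our assumptions, not the paper's code:

```python
# Hedged sketch of inference-time domain selection over per-domain feature
# distributions. The Gaussian modeling choice and all names are assumptions.
import numpy as np
from scipy.stats import multivariate_normal

def fit_domain_stats(features_by_domain):
    """Fit a Gaussian (mean, covariance) to each training domain's features."""
    return {
        d: (f.mean(axis=0), np.cov(f, rowvar=False) + 1e-6 * np.eye(f.shape[1]))
        for d, f in features_by_domain.items()
    }

def select_domain(x, stats, threshold=-50.0):
    """Return the most likely known domain for feature x, or None if unseen."""
    scores = {d: multivariate_normal.logpdf(x, mean=m, cov=c)
              for d, (m, c) in stats.items()}
    best = max(scores, key=scores.get)
    # None signals an unseen domain, i.e. fall back to a generic prompt.
    return best if scores[best] > threshold else None
```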
arXiv Detail & Related papers (2023-07-11T18:17:50Z)
- Self-Paced Learning for Open-Set Domain Adaptation [50.620824701934]
Traditional domain adaptation methods presume that the classes in the source and target domains are identical.
Open-set domain adaptation (OSDA) addresses this limitation by allowing previously unseen classes in the target domain.
We propose a novel framework based on self-paced learning to distinguish common and unknown class samples.
arXiv Detail & Related papers (2023-03-10T14:11:09Z)
- Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention as a means to accomplish fine-grained scene understanding.
We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model.
Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
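A minimal sketch of a differentiable channel gate of the kind such a gated sub-network could use; the Gumbel-softmax relaxation and the ChannelGate name are our assumptions, not the paper's design:

```python
# Sketch of a differentiable per-channel gate for learning a gated sub-network
# alongside the dense network. The Gumbel-softmax relaxation is our choice.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Two logits per channel: keep vs. drop.
        self.logits = nn.Parameter(torch.zeros(channels, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        # hard=True gives discrete 0/1 gates with straight-through gradients,
        # so the gated sub-network stays trainable end to end.
        gates = F.gumbel_softmax(self.logits, tau=1.0, hard=True)[:, 0]  # (C,)
        return x * gates.view(1, -1, 1, 1)
```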
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Neural Architecture for Online Ensemble Continual Learning [6.241435193861262]
We present a fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime.
The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
arXiv Detail & Related papers (2022-11-27T23:17:08Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
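A generic sketch of training with attribution maps, using input-gradient saliency and a relevance mask as the attribution signal; this is one plausible instantiation of the idea, not the paper's implementation:

```python
# Generic sketch of attribution-guided training: penalize input-gradient
# saliency that falls outside a relevance mask. Not the paper's code.
import torch
import torch.nn.functional as F

def attribution_loss(model, x, y, mask, lam=0.1):
    """mask has the same shape as x; 1 marks relevant regions, 0 irrelevant."""
    x = x.clone().requires_grad_(True)
    task_loss = F.cross_entropy(model(x), y)
    # Input-gradient saliency as a simple attribution map; create_graph=True
    # lets the regularizer itself be backpropagated through.
    grads, = torch.autograd.grad(task_loss, x, create_graph=True)
    saliency = grads.abs()
    # Penalize attribution mass on irrelevant regions (mask == 0).
    reg = (saliency * (1.0 - mask)).mean()
    return task_loss + lam * reg
```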
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Hyperspherical embedding for novel class classification [1.5952956981784217]
We present a constraint-based approach applied to representations in the latent space under the normalized softmax loss.
We experimentally validate the proposed approach for the classification of unseen classes on different datasets using both metric learning and the normalized softmax loss.
Our results show that the proposed strategy not only can be trained efficiently on a larger set of classes, since it does not require pairwise learning, but also yields better classification results than the metric learning strategies.
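For reference, a minimal implementation of the normalized softmax loss mentioned above: cosine similarities between L2-normalized embeddings and class weights, scaled and fed to cross-entropy. The scale hyperparameter value is our assumption:

```python
# Minimal normalized softmax loss over L2-normalized embeddings and
# class-weight prototypes. The scale value is an assumption.
import torch
import torch.nn.functional as F

def normalized_softmax_loss(embeddings, weights, targets, scale=16.0):
    z = F.normalize(embeddings, dim=1)   # (B, D), unit-norm embeddings
    w = F.normalize(weights, dim=1)      # (C, D), unit-norm class prototypes
    logits = scale * z @ w.t()           # cosine-similarity logits, (B, C)
    return F.cross_entropy(logits, targets)
```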
arXiv Detail & Related papers (2021-02-05T15:42:13Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology across regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
- Multifaceted Context Representation using Dual Attention for Ontology Alignment [6.445605125467574]
Ontology alignment is an important research problem that finds application in various fields such as data integration, data transfer, and data preparation.
We propose VeeAlign, a Deep Learning based model that uses a dual-attention mechanism to compute the contextualized representation of a concept in order to learn alignments.
We validate our approach on various datasets from different domains and in multilingual settings, and show its superior performance over SOTA methods.
arXiv Detail & Related papers (2020-10-16T18:28:38Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
arXiv Detail & Related papers (2020-06-13T02:54:59Z)