Domain-Agnostic Neural Architecture for Class Incremental Continual
Learning in Document Processing Platform
- URL: http://arxiv.org/abs/2307.05399v1
- Date: Tue, 11 Jul 2023 16:01:44 GMT
- Title: Domain-Agnostic Neural Architecture for Class Incremental Continual
Learning in Document Processing Platform
- Authors: Mateusz Wójcik, Witold Kościukiewicz, Mateusz Baran, Tomasz Kajdanowicz, Adam Gonczarek
- Abstract summary: Recent methods based on stochastic gradient learning have been shown to struggle in such setups or to have limitations such as memory buffers.
We present a fully differentiable architecture based on the Mixture of Experts model that enables the training of high-performance classifiers when examples from each class are presented separately.
We conducted exhaustive experiments that demonstrate its applicability in various domains and its ability to learn online in production environments.
- Score: 3.630365560970225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Production deployments in complex systems require ML architectures to be
highly efficient and usable against multiple tasks. Particularly demanding are
classification problems in which data arrives in a streaming fashion and each
class is presented separately. Recent methods based on stochastic gradient
learning have been shown to struggle in such setups or to have limitations,
such as memory buffers or restriction to specific domains, that prevent their
use in real-world scenarios. For this reason, we present a fully differentiable
architecture based on the Mixture of Experts model that enables the training
of high-performance classifiers when examples from each class are presented
separately. We conducted exhaustive experiments that demonstrate its
applicability in various domains and its ability to learn online in production
environments. The proposed technique achieves SOTA results without a memory
buffer and clearly outperforms the reference methods.
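As a rough illustration of the general technique (not the authors' exact architecture), a fully differentiable Mixture-of-Experts classifier combines expert logits through a soft, learned gate, so the whole model trains with plain backpropagation even as classes arrive one at a time. All names and sizes below (GatedMoE, n_experts, the toy stream) are hypothetical:

```python
# Minimal sketch of a fully differentiable Mixture-of-Experts classifier.
# Illustrates the general technique only; names and sizes are hypothetical.
import torch
import torch.nn as nn

class GatedMoE(nn.Module):
    def __init__(self, in_dim: int, n_classes: int, n_experts: int = 4, hidden: int = 128):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(in_dim, n_experts)  # soft gating keeps everything differentiable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)           # (B, E)
        logits = torch.stack([e(x) for e in self.experts], 1)   # (B, E, C)
        return (weights.unsqueeze(-1) * logits).sum(dim=1)      # (B, C)

# Class-incremental toy stream: each class arrives separately, no memory buffer.
model = GatedMoE(in_dim=32, n_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for class_id in range(10):
    x = torch.randn(64, 32)            # toy batch containing only this class
    y = torch.full((64,), class_id)
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the gate is a softmax rather than a hard routing decision, training needs no replay buffer or non-differentiable selection step.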
Related papers
- MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning [12.737883740101438]
We present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP).
At the training stage we model the feature distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain.
At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, in which case the correct prompt is selected for the classification task, or to an unseen domain.
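A minimal sketch of the inference-time domain selection described above, assuming the per-domain feature distributions are modeled as Gaussians; the helper names (fit_domain_stats, select_domain) and the threshold value are our assumptions, not the paper's code:

```python
# Hedged sketch of inference-time domain selection over per-domain feature
# distributions. The Gaussian modeling choice and all names are assumptions.
import numpy as np
from scipy.stats import multivariate_normal

def fit_domain_stats(features_by_domain):
    """Fit a Gaussian (mean, covariance) to each training domain's features."""
    return {
        d: (f.mean(axis=0), np.cov(f, rowvar=False) + 1e-6 * np.eye(f.shape[1]))
        for d, f in features_by_domain.items()
    }

def select_domain(x, stats, threshold=-50.0):
    """Return the most likely known domain for feature x, or None if unseen."""
    scores = {d: multivariate_normal.logpdf(x, mean=m, cov=c)
              for d, (m, c) in stats.items()}
    best = max(scores, key=scores.get)
    # None signals an unseen domain, i.e. fall back to a generic prompt.
    return best if scores[best] > threshold else None
```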
arXiv Detail & Related papers (2023-07-11T18:17:50Z)
- Self-Paced Learning for Open-Set Domain Adaptation [50.620824701934]
Traditional domain adaptation methods presume that the classes in the source and target domains are identical.
Open-set domain adaptation (OSDA) addresses this limitation by allowing previously unseen classes in the target domain.
We propose a novel framework based on self-paced learning to distinguish common and unknown class samples.
arXiv Detail & Related papers (2023-03-10T14:11:09Z)
- Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention as a means to accomplish fine-grained scene understanding.
We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model.
Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
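A minimal sketch of a differentiable channel gate of the kind such a gated sub-network could use; the Gumbel-softmax relaxation and the ChannelGate name are our assumptions, not the paper's design:

```python
# Sketch of a differentiable per-channel gate for learning a gated sub-network
# alongside the dense network. The Gumbel-softmax relaxation is our choice.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Two logits per channel: keep vs. drop.
        self.logits = nn.Parameter(torch.zeros(channels, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        # hard=True gives discrete 0/1 gates with straight-through gradients,
        # so the gated sub-network stays trainable end to end.
        gates = F.gumbel_softmax(self.logits, tau=1.0, hard=True)[:, 0]  # (C,)
        return x * gates.view(1, -1, 1, 1)
```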
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Neural Architecture for Online Ensemble Continual Learning [6.241435193861262]
We present a fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime.
The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
arXiv Detail & Related papers (2022-11-27T23:17:08Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
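A generic sketch of training with attribution maps, using input-gradient saliency and a relevance mask as the attribution signal; this is one plausible instantiation of the idea, not the paper's implementation:

```python
# Generic sketch of attribution-guided training: penalize input-gradient
# saliency that falls outside a relevance mask. Not the paper's code.
import torch
import torch.nn.functional as F

def attribution_loss(model, x, y, mask, lam=0.1):
    """mask has the same shape as x; 1 marks relevant regions, 0 irrelevant."""
    x = x.clone().requires_grad_(True)
    task_loss = F.cross_entropy(model(x), y)
    # Input-gradient saliency as a simple attribution map; create_graph=True
    # lets the regularizer itself be backpropagated through.
    grads, = torch.autograd.grad(task_loss, x, create_graph=True)
    saliency = grads.abs()
    # Penalize attribution mass on irrelevant regions (mask == 0).
    reg = (saliency * (1.0 - mask)).mean()
    return task_loss + lam * reg
```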
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Hyperspherical embedding for novel class classification [1.5952956981784217]
We present a constraint-based approach applied to representations in the latent space under the normalized softmax loss.
We experimentally validate the proposed approach for the classification of unseen classes on different datasets using both metric learning and the normalized softmax loss.
Our results show that the proposed strategy not only can be trained efficiently on a larger set of classes, since it does not require pairwise learning, but also yields better classification results than the metric learning strategies.
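For reference, a minimal implementation of the normalized softmax loss mentioned above: cosine similarities between L2-normalized embeddings and class weights, scaled and fed to cross-entropy. The scale hyperparameter value is our assumption:

```python
# Minimal normalized softmax loss over L2-normalized embeddings and
# class-weight prototypes. The scale value is an assumption.
import torch
import torch.nn.functional as F

def normalized_softmax_loss(embeddings, weights, targets, scale=16.0):
    z = F.normalize(embeddings, dim=1)   # (B, D), unit-norm embeddings
    w = F.normalize(weights, dim=1)      # (C, D), unit-norm class prototypes
    logits = scale * z @ w.t()           # cosine-similarity logits, (B, C)
    return F.cross_entropy(logits, targets)
```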
arXiv Detail & Related papers (2021-02-05T15:42:13Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology across regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
- Multifaceted Context Representation using Dual Attention for Ontology Alignment [6.445605125467574]
Ontology alignment is an important research problem that finds application in various fields such as data integration, data transfer, and data preparation.
We propose VeeAlign, a Deep Learning based model that uses a dual-attention mechanism to compute the contextualized representation of a concept in order to learn alignments.
We validate our approach on various datasets from different domains and in multilingual settings, and show its superior performance over SOTA methods.
arXiv Detail & Related papers (2020-10-16T18:28:38Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
arXiv Detail & Related papers (2020-06-13T02:54:59Z)