Efficient Multi-domain Text Recognition Deep Neural Network
Parameterization with Residual Adapters
- URL: http://arxiv.org/abs/2401.00971v1
- Date: Mon, 1 Jan 2024 23:01:40 GMT
- Title: Efficient Multi-domain Text Recognition Deep Neural Network
Parameterization with Residual Adapters
- Authors: Jiayou Chao and Wei Zhu
- Abstract summary: This study presents a novel neural network model adept at optical character recognition (OCR) across diverse domains.
The model is designed to adapt rapidly to new domains, maintain a compact size that reduces computational resource demands, ensure high accuracy, retain knowledge from previous learning, and allow domain-specific performance improvements without complete retraining.
- Score: 4.454976752204893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in deep neural networks have markedly enhanced the
performance of computer vision tasks, yet the specialized nature of these
networks often necessitates extensive data and high computational power.
Addressing these requirements, this study presents a novel neural network model
adept at optical character recognition (OCR) across diverse domains, leveraging
the strengths of multi-task learning to improve efficiency and generalization.
The model is designed to adapt rapidly to new domains, maintain a compact
size that reduces computational resource demands, ensure high accuracy,
retain knowledge from previous learning, and allow domain-specific
performance improvements without complete retraining.
Rigorous evaluation on open datasets has validated the model's ability to
significantly lower the number of trainable parameters without sacrificing
performance, indicating its potential as a scalable and adaptable solution in
the field of computer vision, particularly for applications in optical text
recognition.
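The listing gives no implementation details, but the residual-adapter idea the abstract describes can be sketched roughly as below; the layer sizes, adapter placement, and module names are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of per-domain residual adapters, assuming a convolutional
# OCR backbone; everything below is illustrative, not the paper's design.
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Small 1x1-conv module added residually around a frozen shared layer."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.conv.weight)   # start as identity: y = x + 0
        nn.init.zeros_(self.conv.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.conv(x)

class AdaptedBlock(nn.Module):
    """Shared (frozen) layer wrapped with one trainable adapter per domain."""
    def __init__(self, shared: nn.Module, channels: int, num_domains: int):
        super().__init__()
        self.shared = shared
        for p in self.shared.parameters():
            p.requires_grad = False        # retain previously learned knowledge
        self.adapters = nn.ModuleList(
            [ResidualAdapter(channels) for _ in range(num_domains)]
        )

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        return self.adapters[domain](self.shared(x))

# Adapting to a new domain trains only its small adapter, not the backbone.
block = AdaptedBlock(nn.Conv2d(64, 64, 3, padding=1), channels=64, num_domains=3)
y = block(torch.randn(1, 64, 32, 128), domain=1)  # e.g. a 32x128 text-line crop
```

Because the shared backbone stays frozen, each new domain adds only the adapter's small parameter budget, which is how such a model stays compact and retains knowledge of earlier domains.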
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these constraints by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- RepAct: The Re-parameterizable Adaptive Activation Function [31.238011686165596]
RepAct is an adaptive activation function tailored to optimizing lightweight neural networks within the computational limits of edge devices.
When evaluated on tasks such as image classification and object detection, RepAct notably surpassed conventional activation functions.
arXiv Detail & Related papers (2024-06-28T08:25:45Z)
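As a rough illustration of an adaptive activation function, the sketch below uses a generic learnable mixture of candidate activations; RepAct's published formulation is not given in this listing, so this is only an assumed stand-in.

```python
# Generic adaptive activation: a learnable weighted mixture of candidate
# activations (illustrative only; not RepAct's actual formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.branches = [F.relu, torch.tanh, F.hardswish]
        self.weights = nn.Parameter(torch.zeros(len(self.branches)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.weights, dim=0)   # learned per-branch weights
        return sum(wi * branch(x) for wi, branch in zip(w, self.branches))

act = AdaptiveActivation()
print(act(torch.randn(2, 8)).shape)              # torch.Size([2, 8])
```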
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Power-Enhanced Residual Network for Function Approximation and Physics-Informed Inverse Problems [0.0]
This paper introduces a novel neural network structure called the Power-Enhancing residual network.
It improves the network's capabilities for approximating both smooth and non-smooth functions in 2D and 3D settings.
Results emphasize the exceptional accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions.
arXiv Detail & Related papers (2023-10-24T10:01:15Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- Deepening Neural Networks Implicitly and Locally via Recurrent Attention Strategy [6.39424542887036]
Recurrent Attention Strategy implicitly increases the depth of neural networks with lightweight attention modules by local parameter sharing.
Experiments on three widely used benchmark datasets demonstrate that RAS can improve the performance of neural networks with only a slight increase in parameter size and computation.
arXiv Detail & Related papers (2022-10-27T13:09:02Z)
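A minimal sketch of the recurrent-attention idea: the same lightweight attention module is applied several times with shared parameters, implicitly deepening the network at almost no extra parameter cost. The SE-style channel gate below is an assumption; the actual RAS module may differ.

```python
# One shared lightweight gate reused across steps (illustrative, not the
# paper's exact RAS design): depth grows, parameter count barely does.
import torch
import torch.nn as nn

class RecurrentChannelAttention(nn.Module):
    def __init__(self, channels: int, steps: int = 3):
        super().__init__()
        self.steps = steps
        self.gate = nn.Sequential(            # one shared SE-style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.steps):           # reuse the same weights each step
            x = x * self.gate(x)
        return x

m = RecurrentChannelAttention(32)
print(m(torch.randn(1, 32, 16, 16)).shape)    # torch.Size([1, 32, 16, 16])
```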
- A Proper Orthogonal Decomposition approach for parameters reduction of Single Shot Detector networks [0.0]
We propose a dimensionality reduction framework based on Proper Orthogonal Decomposition, a classical model order reduction technique.
We applied this framework to the SSD300 architecture on the PASCAL VOC dataset, demonstrating a reduction of the network dimension and a remarkable speedup in fine-tuning the network in a transfer-learning context.
arXiv Detail & Related papers (2022-07-27T14:43:14Z)
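A rough NumPy sketch of POD-style parameter reduction, assuming the classical snapshot/SVD construction; the dimensions and the choice of layer are made up for illustration.

```python
# POD modes of activation snapshots give a low-dimensional basis in which a
# wide dense layer can be re-expressed with far fewer parameters.
import numpy as np

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((4096, 500))   # 500 activation snapshots, dim 4096

# POD modes = left singular vectors of the snapshot matrix
U, S, _ = np.linalg.svd(snapshots, full_matrices=False)
k = 50                                         # keep the top-k energy modes
basis = U[:, :k]                               # (4096, k) projection basis

W = rng.standard_normal((4096, 4096))          # an original dense layer
W_reduced = basis.T @ W @ basis                # (k, k): far fewer parameters

x = rng.standard_normal(4096)
y_approx = basis @ (W_reduced @ (basis.T @ x)) # reduced-order evaluation
print(W.size, "->", W_reduced.size)            # 16777216 -> 2500
```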
- Contextual HyperNetworks for Novel Feature Adaptation [43.49619456740745]
A Contextual HyperNetwork (CHN) generates parameters for extending the base model to a new feature.
At prediction time, the CHN requires only a single forward pass through a neural network, yielding a significant speed-up.
We show that this system obtains improved few-shot learning performance for novel features over existing imputation and meta-learning baselines.
arXiv Detail & Related papers (2021-04-12T23:19:49Z)
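A hedged sketch of the contextual-hypernetwork pattern: a small network maps a context vector describing the new feature to the parameters that extend the base model, in one forward pass. All sizes and names here are assumptions, not the paper's exact CHN.

```python
# A tiny hypernetwork emits the weights for a new feature in one forward pass,
# avoiding retraining of the base model (illustrative sizes throughout).
import torch
import torch.nn as nn

class ContextualHyperNetwork(nn.Module):
    def __init__(self, context_dim: int, target_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, 64), nn.ReLU(),
            nn.Linear(64, target_dim),           # emits a weight vector
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.net(context)

hyper = ContextualHyperNetwork(context_dim=16, target_dim=128)
ctx = torch.randn(1, 16)                         # summary of the new feature
new_weights = hyper(ctx)                         # (1, 128) generated parameters
base_embedding = torch.randn(1, 128)             # an existing base-model state
score = (base_embedding * new_weights).sum()     # use the generated weights
```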
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations while the low-frequency part is assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
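A rough sketch of frequency-aware dynamic routing; an FFT-based low/high split is used here for brevity, whereas the paper works in the DCT domain, and both processing paths are invented for illustration.

```python
# Split the input into low- and high-frequency parts, then spend expensive
# computation only on the high-frequency detail (illustrative design).
import torch
import torch.nn as nn

def frequency_split(x: torch.Tensor, cutoff: int):
    """Low-pass / high-pass split of an image batch in the Fourier domain."""
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    h, w = x.shape[-2:]
    mask = torch.zeros_like(freq.real)
    mask[..., h//2-cutoff:h//2+cutoff, w//2-cutoff:w//2+cutoff] = 1.0
    low = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    return low, x - low

cheap = nn.Conv2d(3, 16, kernel_size=1)          # lightweight path
expensive = nn.Sequential(                       # heavy path for fine detail
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 16, 3, padding=1)
)

x = torch.randn(1, 3, 64, 64)
low, high = frequency_split(x, cutoff=8)
out = cheap(low) + expensive(high)               # recombine the two branches
```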
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
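As a generic illustration of the mean-field viewpoint (not necessarily the paper's exact construction), a layer's features in the continuous limit can be written as an expectation over a parameter measure:

\[
h_{l+1}(x) \;=\; \mathbb{E}_{\theta \sim \rho_l}\!\left[\varphi\big(h_l(x), \theta\big)\right],
\]

so that training evolves the measures \(\rho_l\) rather than a finite set of weights.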
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.