Efficient Multi-domain Text Recognition Deep Neural Network
Parameterization with Residual Adapters
- URL: http://arxiv.org/abs/2401.00971v1
- Date: Mon, 1 Jan 2024 23:01:40 GMT
- Title: Efficient Multi-domain Text Recognition Deep Neural Network
Parameterization with Residual Adapters
- Authors: Jiayou Chao and Wei Zhu
- Abstract summary: This study presents a novel neural network model adept at optical character recognition (OCR) across diverse domains.
The model is designed to adapt rapidly to new domains, maintain a compact size that reduces computational resource demands, ensure high accuracy, retain knowledge from previous learning, and allow domain-specific performance improvements without complete retraining.
- Score: 4.454976752204893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in deep neural networks have markedly enhanced the
performance of computer vision tasks, yet the specialized nature of these
networks often necessitates extensive data and high computational power.
Addressing these requirements, this study presents a novel neural network model
adept at optical character recognition (OCR) across diverse domains, leveraging
the strengths of multi-task learning to improve efficiency and generalization.
The model is designed to adapt rapidly to new domains, maintain a compact
size that reduces computational resource demands, ensure high accuracy,
retain knowledge from previous learning, and allow domain-specific
performance improvements without complete retraining.
Rigorous evaluation on open datasets has validated the model's ability to
significantly lower the number of trainable parameters without sacrificing
performance, indicating its potential as a scalable and adaptable solution in
the field of computer vision, particularly for applications in optical text
recognition.
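The listing gives no implementation details, but the residual-adapter idea the abstract describes can be sketched roughly as below; the layer sizes, adapter placement, and module names are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of per-domain residual adapters, assuming a convolutional
# OCR backbone; everything below is illustrative, not the paper's design.
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Small 1x1-conv module added residually around a frozen shared layer."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.conv.weight)   # start as identity: y = x + 0
        nn.init.zeros_(self.conv.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.conv(x)

class AdaptedBlock(nn.Module):
    """Shared (frozen) layer wrapped with one trainable adapter per domain."""
    def __init__(self, shared: nn.Module, channels: int, num_domains: int):
        super().__init__()
        self.shared = shared
        for p in self.shared.parameters():
            p.requires_grad = False        # retain previously learned knowledge
        self.adapters = nn.ModuleList(
            [ResidualAdapter(channels) for _ in range(num_domains)]
        )

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        return self.adapters[domain](self.shared(x))

# Adapting to a new domain trains only its small adapter, not the backbone.
block = AdaptedBlock(nn.Conv2d(64, 64, 3, padding=1), channels=64, num_domains=3)
y = block(torch.randn(1, 64, 32, 128), domain=1)  # e.g. a 32x128 text-line crop
```

Because the shared backbone stays frozen, each new domain adds only the adapter's small parameter budget, which is how such a model stays compact and retains knowledge of earlier domains.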
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these constraints by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- RepAct: The Re-parameterizable Adaptive Activation Function [31.238011686165596]
RepAct is an adaptive activation function tailored to optimizing lightweight neural networks within the computational limits of edge devices.
When evaluated on tasks such as image classification and object detection, RepAct notably surpassed conventional activation functions.
arXiv Detail & Related papers (2024-06-28T08:25:45Z)
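As a rough illustration of an adaptive activation function, the sketch below uses a generic learnable mixture of candidate activations; RepAct's published formulation is not given in this listing, so this is only an assumed stand-in.

```python
# Generic adaptive activation: a learnable weighted mixture of candidate
# activations (illustrative only; not RepAct's actual formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveActivation(nn.Module):
    def __init__(self):
        super().__init__()
        self.branches = [F.relu, torch.tanh, F.hardswish]
        self.weights = nn.Parameter(torch.zeros(len(self.branches)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.weights, dim=0)   # learned per-branch weights
        return sum(wi * branch(x) for wi, branch in zip(w, self.branches))

act = AdaptiveActivation()
print(act(torch.randn(2, 8)).shape)              # torch.Size([2, 8])
```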
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Power-Enhanced Residual Network for Function Approximation and Physics-Informed Inverse Problems [0.0]
This paper introduces a novel neural network structure called the Power-Enhancing residual network.
It improves the network's capabilities for approximating both smooth and non-smooth functions in 2D and 3D settings.
Results emphasize the exceptional accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions.
arXiv Detail & Related papers (2023-10-24T10:01:15Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- Deepening Neural Networks Implicitly and Locally via Recurrent Attention Strategy [6.39424542887036]
Recurrent Attention Strategy implicitly increases the depth of neural networks with lightweight attention modules by local parameter sharing.
Experiments on three widely used benchmark datasets demonstrate that RAS can improve the performance of neural networks with only a slight increase in parameter size and computation.
arXiv Detail & Related papers (2022-10-27T13:09:02Z)
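A minimal sketch of the recurrent-attention idea: the same lightweight attention module is applied several times with shared parameters, implicitly deepening the network at almost no extra parameter cost. The SE-style channel gate below is an assumption; the actual RAS module may differ.

```python
# One shared lightweight gate reused across steps (illustrative, not the
# paper's exact RAS design): depth grows, parameter count barely does.
import torch
import torch.nn as nn

class RecurrentChannelAttention(nn.Module):
    def __init__(self, channels: int, steps: int = 3):
        super().__init__()
        self.steps = steps
        self.gate = nn.Sequential(            # one shared SE-style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.steps):           # reuse the same weights each step
            x = x * self.gate(x)
        return x

m = RecurrentChannelAttention(32)
print(m(torch.randn(1, 32, 16, 16)).shape)    # torch.Size([1, 32, 16, 16])
```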
- A Proper Orthogonal Decomposition approach for parameters reduction of Single Shot Detector networks [0.0]
We propose a dimensionality reduction framework based on Proper Orthogonal Decomposition, a classical model order reduction technique.
We applied this framework to the SSD300 architecture on the PASCAL VOC dataset, demonstrating a reduction of the network dimension and a remarkable speedup in fine-tuning the network in a transfer-learning context.
arXiv Detail & Related papers (2022-07-27T14:43:14Z)
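A rough NumPy sketch of POD-style parameter reduction, assuming the classical snapshot/SVD construction; the dimensions and the choice of layer are made up for illustration.

```python
# POD modes of activation snapshots give a low-dimensional basis in which a
# wide dense layer can be re-expressed with far fewer parameters.
import numpy as np

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((4096, 500))   # 500 activation snapshots, dim 4096

# POD modes = left singular vectors of the snapshot matrix
U, S, _ = np.linalg.svd(snapshots, full_matrices=False)
k = 50                                         # keep the top-k energy modes
basis = U[:, :k]                               # (4096, k) projection basis

W = rng.standard_normal((4096, 4096))          # an original dense layer
W_reduced = basis.T @ W @ basis                # (k, k): far fewer parameters

x = rng.standard_normal(4096)
y_approx = basis @ (W_reduced @ (basis.T @ x)) # reduced-order evaluation
print(W.size, "->", W_reduced.size)            # 16777216 -> 2500
```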
- Contextual HyperNetworks for Novel Feature Adaptation [43.49619456740745]
A Contextual HyperNetwork (CHN) generates parameters for extending the base model to a new feature.
At prediction time, the CHN requires only a single forward pass through a neural network, yielding a significant speed-up.
We show that this system obtains improved few-shot learning performance for novel features over existing imputation and meta-learning baselines.
arXiv Detail & Related papers (2021-04-12T23:19:49Z)
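A hedged sketch of the contextual-hypernetwork pattern: a small network maps a context vector describing the new feature to the parameters that extend the base model, in one forward pass. All sizes and names here are assumptions, not the paper's exact CHN.

```python
# A tiny hypernetwork emits the weights for a new feature in one forward pass,
# avoiding retraining of the base model (illustrative sizes throughout).
import torch
import torch.nn as nn

class ContextualHyperNetwork(nn.Module):
    def __init__(self, context_dim: int, target_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, 64), nn.ReLU(),
            nn.Linear(64, target_dim),           # emits a weight vector
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.net(context)

hyper = ContextualHyperNetwork(context_dim=16, target_dim=128)
ctx = torch.randn(1, 16)                         # summary of the new feature
new_weights = hyper(ctx)                         # (1, 128) generated parameters
base_embedding = torch.randn(1, 128)             # an existing base-model state
score = (base_embedding * new_weights).sum()     # use the generated weights
```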
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations while the low-frequency part is assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
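A rough sketch of frequency-aware dynamic routing; an FFT-based low/high split is used here for brevity, whereas the paper works in the DCT domain, and both processing paths are invented for illustration.

```python
# Split the input into low- and high-frequency parts, then spend expensive
# computation only on the high-frequency detail (illustrative design).
import torch
import torch.nn as nn

def frequency_split(x: torch.Tensor, cutoff: int):
    """Low-pass / high-pass split of an image batch in the Fourier domain."""
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    h, w = x.shape[-2:]
    mask = torch.zeros_like(freq.real)
    mask[..., h//2-cutoff:h//2+cutoff, w//2-cutoff:w//2+cutoff] = 1.0
    low = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    return low, x - low

cheap = nn.Conv2d(3, 16, kernel_size=1)          # lightweight path
expensive = nn.Sequential(                       # heavy path for fine detail
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 16, 3, padding=1)
)

x = torch.randn(1, 3, 64, 64)
low, high = frequency_split(x, cutoff=8)
out = cheap(low) + expensive(high)               # recombine the two branches
```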
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
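As a generic illustration of the mean-field viewpoint (not necessarily the paper's exact construction), a layer's features in the continuous limit can be written as an expectation over a parameter measure:

\[
h_{l+1}(x) \;=\; \mathbb{E}_{\theta \sim \rho_l}\!\left[\varphi\big(h_l(x), \theta\big)\right],
\]

so that training evolves the measures \(\rho_l\) rather than a finite set of weights.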
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.