Tetra-AML: Automatic Machine Learning via Tensor Networks
- URL: http://arxiv.org/abs/2303.16214v1
- Date: Tue, 28 Mar 2023 12:56:54 GMT
- Title: Tetra-AML: Automatic Machine Learning via Tensor Networks
- Authors: A. Naumov, Ar. Melnikov, V. Abronin, F. Oxanichenko, K. Izmailov, M.
Pflitsch, A. Melnikov, M. Perelshtein
- Abstract summary: We introduce the Tetra-AML toolbox, which automates neural architecture search and hyperparameter optimization.
The toolbox also provides model compression through quantization and pruning, augmented by compression using tensor networks.
Here, we analyze a unified benchmark for optimizing neural networks in computer vision tasks and show the superior performance of our approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks have revolutionized many aspects of society but in the era of
huge models with billions of parameters, optimizing and deploying them for
commercial applications can require significant computational and financial
resources. To address these challenges, we introduce the Tetra-AML toolbox,
which automates neural architecture search and hyperparameter optimization via
a custom-developed black-box Tensor train Optimization algorithm, TetraOpt. The
toolbox also provides model compression through quantization and pruning,
augmented by compression using tensor networks. Here, we analyze a unified
benchmark for optimizing neural networks in computer vision tasks and show the
superior performance of our approach compared to Bayesian optimization on the
CIFAR-10 dataset. We also demonstrate the compression of ResNet-18 neural
networks, where we use 14.5 times less memory while losing just 3.2% of
accuracy. The presented framework is generic, not limited by computer vision
problems, supports hardware acceleration (such as with GPUs and TPUs) and can
be further extended to quantum hardware and to hybrid quantum machine learning
models.
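A minimal sketch of the tensor-network compression idea from the abstract, assuming the standard TT-SVD algorithm and illustrative shapes and ranks (the authors' ResNet-18 pipeline is not specified here): a weight matrix is reshaped into a high-order tensor and factorized into a train of small cores.

```python
# Sketch of tensor-train (TT-SVD) weight compression; the layer size, the
# 4x...x4 reshaping, and the rank cap are illustrative assumptions, not the
# paper's exact setup.
import numpy as np

def tt_svd(tensor, max_rank):
    """Factorize a d-way tensor into a train of 3-way cores via sequential SVD."""
    shape = tensor.shape
    cores, rank, mat = [], 1, tensor
    for mode in shape[:-1]:
        mat = mat.reshape(rank * mode, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, mode, r))
        mat = s[:r, None] * vt[:r]   # carry the remainder to the next core
        rank = r
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

W = np.random.randn(256, 256)        # stand-in for a layer's weight matrix
cores = tt_svd(W.reshape(4, 4, 4, 4, 4, 4, 4, 4), max_rank=8)
ratio = W.size / sum(c.size for c in cores)
print(f"{ratio:.1f}x fewer parameters")
```

Truncating the SVD ranks is lossy, which is where trade-offs such as the reported 14.5 times memory reduction at a 3.2% accuracy cost come from; the toy numbers above are illustrative only.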
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource constraints of IoVT devices by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
- A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z)
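As a point of reference for what "structured" pruning without sparse kernels looks like, here is a generic channel-pruning sketch using PyTorch's built-in pruning utilities (not the paper's augmentation/collapse algorithm): whole filters are zeroed and then physically removed, so the remaining kernels stay dense.

```python
# Generic structured-pruning sketch (assumed layer sizes, not the paper's method).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)

# Zero the 30% of output filters with the smallest L2 norm (dim=0 indexes filters).
prune.ln_structured(conv, name="weight", amount=0.3, n=2, dim=0)
prune.remove(conv, "weight")  # bake the mask into the weight tensor

# Drop the zeroed filters entirely: the slim layer needs no sparse operations.
keep = conv.weight.detach().abs().sum(dim=(1, 2, 3)) > 0
slim = nn.Conv2d(64, int(keep.sum()), kernel_size=3, padding=1)
slim.weight.data = conv.weight.data[keep]
slim.bias.data = conv.bias.data[keep]
print(slim)  # Conv2d(64, 90, kernel_size=(3, 3), ...)
```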
- Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs [0.0]
Training deep neural networks consumes increasing computational resource shares in many compute centers.
We introduce a novel second-order optimization method that requires only the action of the Hessian on a vector, never the full Hessian matrix.
We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems.
arXiv Detail & Related papers (2022-08-03T12:38:23Z)
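The "action of the Hessian on a vector" is the classic double-backprop (Pearlmutter) trick, which Newton-CG needs in place of the full Hessian. A minimal PyTorch sketch with a toy quadratic loss (the paper's multi-GPU solver is not reproduced here):

```python
# Hessian-vector product without materializing the Hessian.
import torch

def hvp(loss, params, vec):
    """Return H @ vec, where H is the Hessian of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

w = torch.randn(5, requires_grad=True)
loss = (w ** 2).sum()        # Hessian is 2 * I, so H @ ones = (2, ..., 2)
print(hvp(loss, [w], [torch.ones(5)]))
```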
- Neural Network Quantization with AI Model Efficiency Toolkit (AIMET) [15.439669159557253]
We present an overview of neural network quantization using the AI Model Efficiency Toolkit (AIMET).
AIMET is a library of state-of-the-art quantization and compression algorithms designed to ease the effort required for model optimization.
We provide a practical guide to quantization via AIMET, covering PTQ and QAT with code examples and practical tips.
arXiv Detail & Related papers (2022-01-20T20:35:37Z)
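AIMET's own interface is documented in the paper itself; as a neutral illustration of what post-training quantization does, here is PyTorch's built-in dynamic quantization applied to a toy model (a generic stand-in, not AIMET code):

```python
# Generic PTQ sketch: weights of Linear layers become int8, no retraining needed.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 784)
print(quantized(x).shape)  # same interface as the float model
```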
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and the compressing bits via soft policy iterations.
With a latency- and accuracy-aware reward design, the scheme adapts well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Parameter Prediction for Unseen Deep Architectures [23.79630072083828]
We study whether deep learning can directly predict a network's parameters by exploiting past knowledge from training other networks.
We propose a hypernetwork that can predict performant parameters in a single forward pass taking a fraction of a second, even on a CPU.
The proposed model achieves surprisingly good performance on unseen and diverse networks.
arXiv Detail & Related papers (2021-10-25T16:52:33Z)
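The single-forward-pass idea is easy to sketch. Below is a toy hypernetwork with assumed shapes; the real model conditions on a graph representation of the target architecture, for which the embedding here is only a stand-in:

```python
# Toy hypernetwork: one forward pass emits the weights of a small target layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNet(nn.Module):
    def __init__(self, embed_dim=16, target_in=8, target_out=4):
        super().__init__()
        self.target_shape = (target_out, target_in)
        self.generator = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(),
            nn.Linear(64, target_out * target_in),
        )

    def forward(self, arch_embedding, x):
        w = self.generator(arch_embedding).view(self.target_shape)
        return F.linear(x, w)  # run the target layer with predicted weights

hyper = HyperNet()
arch = torch.randn(16)       # stand-in for an encoding of the architecture
print(hyper(arch, torch.randn(2, 8)).shape)  # torch.Size([2, 4])
```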
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
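A minimal illustration of the slicing mechanism with assumed shapes (not the DS-Net++ implementation): easy inputs run a thin slice of the filters, hard ones the full set, and the sliced kernel stays dense.

```python
# Dynamic weight slicing sketch: select the first n filters at run time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SliceableConv(nn.Module):
    def __init__(self, in_ch=16, out_ch=64, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x, width=1.0):
        n = max(1, int(self.weight.shape[0] * width))
        return F.conv2d(x, self.weight[:n], self.bias[:n], padding=1)

conv = SliceableConv()
x = torch.randn(1, 16, 32, 32)
print(conv(x, width=0.25).shape)  # torch.Size([1, 16, 32, 32])
print(conv(x, width=1.0).shape)   # torch.Size([1, 64, 32, 32])
```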
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
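A generic early-exit sketch with an assumed confidence threshold and classification-style heads for brevity (MESS attaches segmentation heads and learns the exit policy, which is not reproduced here):

```python
# Early-exit sketch: confident samples stop computing at an earlier head.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, channels=32, num_classes=10, threshold=0.9):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(3)
        )
        self.exits = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(channels, num_classes))
            for _ in range(3)
        )
        self.threshold = threshold

    def forward(self, x):  # expects batch size 1 for per-sample exiting
        for stage, head in zip(self.stages, self.exits):
            x = stage(x)
            logits = head(x)
            if logits.softmax(dim=-1).max() > self.threshold:
                return logits    # confident enough: skip the remaining stages
        return logits

net = EarlyExitNet()
print(net(torch.randn(1, 32, 16, 16)).shape)  # torch.Size([1, 10])
```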
- Optimisation of a Siamese Neural Network for Real-Time Energy Efficient Object Tracking [0.0]
The optimisation of visual object tracking using a Siamese neural network for embedded vision systems is presented.
The solution is assumed to operate in real time, preferably on a high-resolution video stream.
arXiv Detail & Related papers (2020-07-01T13:49:56Z)
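For context, the core Siamese-tracking operation is a cross-correlation between template and search-region embeddings; a generic SiamFC-style sketch with assumed shapes (not the paper's optimised embedded design):

```python
# Siamese tracking core: the template embedding acts as a convolution kernel.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())  # shared backbone

template = embed(torch.randn(1, 3, 31, 31))    # exemplar of the tracked object
search = embed(torch.randn(1, 3, 127, 127))    # current frame's search window

heatmap = F.conv2d(search, template)           # similarity map; argmax = new location
print(heatmap.shape)  # torch.Size([1, 1, 97, 97])
```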
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
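As background on why QNN training needs such tricks, here is a minimal 1-bit weight layer with a straight-through estimator, a standard baseline rather than the widening-and-squeezing projection the paper proposes:

```python
# Binary-weight layer with a straight-through estimator (STE) for gradients.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return w.sign()                          # 1-bit weights going forward

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float() # pass gradients where |w| <= 1

class BinaryLinear(nn.Linear):
    def forward(self, x):
        return F.linear(x, BinarizeSTE.apply(self.weight), self.bias)

layer = BinaryLinear(8, 4)
layer(torch.randn(2, 8)).sum().backward()        # gradients flow via the STE
print(layer.weight.grad.shape)                   # torch.Size([4, 8])
```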