Related papers: VeriCompress: A Tool to Streamline the Synthesis of Verified Robust Compressed Neural Networks from Scratch

VeriCompress: A Tool to Streamline the Synthesis of Verified Robust Compressed Neural Networks from Scratch

URL: http://arxiv.org/abs/2211.09945v7
Date: Tue, 21 Nov 2023 18:03:06 GMT
Title: VeriCompress: A Tool to Streamline the Synthesis of Verified Robust Compressed Neural Networks from Scratch
Authors: Sawinder Kaur, Yi Xiao, Asif Salekin
Abstract summary: AI's widespread integration has led to neural networks (NNs) deployment on edge and similar limited-resource platforms for safety-critical scenarios. This study introduces VeriCompress, a tool that automates the search and training of compressed models with robustness guarantees. The method trains models 2-3 times faster than the state-of-the-art approaches, surpassing relevant baseline approaches by average accuracy and robustness gains of 15.1 and 9.8 percentage points, respectively.
Score: 10.061078548888567
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: AI's widespread integration has led to neural networks (NNs) deployment on edge and similar limited-resource platforms for safety-critical scenarios. Yet, NN's fragility raises concerns about reliable inference. Moreover, constrained platforms demand compact networks. This study introduces VeriCompress, a tool that automates the search and training of compressed models with robustness guarantees. These models are well-suited for safety-critical applications and adhere to predefined architecture and size limitations, making them deployable on resource-restricted platforms. The method trains models 2-3 times faster than the state-of-the-art approaches, surpassing relevant baseline approaches by average accuracy and robustness gains of 15.1 and 9.8 percentage points, respectively. When deployed on a resource-restricted generic platform, these models require 5-8 times less memory and 2-4 times less inference time than models used in verified robustness literature. Our comprehensive evaluation across various model architectures and datasets, including MNIST, CIFAR, SVHN, and a relevant pedestrian detection dataset, showcases VeriCompress's capacity to identify compressed verified robust models with reduced computation overhead compared to current standards. This underscores its potential as a valuable tool for end users, such as developers of safety-critical applications on edge or Internet of Things platforms, empowering them to create suitable models for safety-critical, resource-constrained platforms in their respective domains.

Related papers

Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models [21.88436406884943]
We introduce EED, which diversifies the compression of a single base model based on different pruning importance scores and enhances ensemble diversity to achieve high adversarial robustness and resource efficiency. EED demonstrated state-of-the-art performance compared to existing adversarial pruning techniques, along with an inference speed improvement of up to 1.86 times.
arXiv Detail & Related papers (2025-04-07T05:41:35Z)
Adaptable Embeddings Network (AEN) [49.1574468325115]
We introduce Adaptable Embeddings Networks (AEN), a novel dual-encoder architecture using Kernel Density Estimation (KDE) AEN allows for runtime adaptation of classification criteria without retraining and is non-autoregressive. The architecture's ability to preprocess and cache condition embeddings makes it ideal for edge computing applications and real-time monitoring systems.
arXiv Detail & Related papers (2024-11-21T02:15:52Z)
Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability [2.6708879445664584]
This paper introduces a novel approach for assessing a newly trained model's performance based on another known model. The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output.
arXiv Detail & Related papers (2024-08-15T22:57:39Z)
Computer Vision Model Compression Techniques for Embedded Systems: A Survey [75.38606213726906]
This paper covers the main model compression techniques applied for computer vision tasks. We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique. We also share codes to assist researchers and new practitioners in overcoming initial implementation challenges.
arXiv Detail & Related papers (2024-08-15T16:41:55Z)
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference [11.39873199479642]
Nesa introduces a model-agnostic sharding framework designed for decentralized AI inference. Our framework uses blockchain-based deep neural network sharding to distribute computational tasks across a diverse network of nodes. Our results highlight the potential to democratize access to cutting-edge AI technologies.
arXiv Detail & Related papers (2024-07-29T08:18:48Z)
PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data. The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates. We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models. We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity. Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
Compact CNN Structure Learning by Knowledge Distillation [34.36242082055978]
We propose a framework that leverages knowledge distillation along with customizable block-wise optimization to learn a lightweight CNN structure. Our method results in a state of the art network compression while being capable of achieving better inference accuracy. In particular, for the already compact network MobileNet_v2, our method offers up to 2x and 5.2x better model compression.
arXiv Detail & Related papers (2021-04-19T10:34:22Z)
Dataless Model Selection with the Deep Frame Potential [45.16941644841897]
We quantify networks by their intrinsic capacity for unique and robust representations. We propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure. We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures.
arXiv Detail & Related papers (2020-03-30T23:27:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.