Related papers: Certified Training with Branch-and-Bound: A Case Study on Lyapunov-stable Neural Control

URL: http://arxiv.org/abs/2411.18235v1
Date: Wed, 27 Nov 2024 11:12:46 GMT
Title: Certified Training with Branch-and-Bound: A Case Study on Lyapunov-stable Neural Control
Authors: Zhouxing Shi, Cho-Jui Hsieh, Huan Zhang
Abstract summary: We develop a new and generally formulated certified training framework named CT-BaB. In order to handle the relatively large region-of-interest, we propose a novel framework of training-time branch-and-bound. We demonstrate that our new training framework can produce models which can be more efficiently verified at test time.
Score: 64.58719561861079
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of learning Lyapunov-stable neural controllers which provably satisfy the Lyapunov asymptotic stability condition within a region-of-attraction. Compared to previous works which commonly used counterexample guided training on this task, we develop a new and generally formulated certified training framework named CT-BaB, and we optimize for differentiable verified bounds, to produce verification-friendly models. In order to handle the relatively large region-of-interest, we propose a novel framework of training-time branch-and-bound to dynamically maintain a training dataset of subregions throughout training, such that the hardest subregions are iteratively split into smaller ones whose verified bounds can be computed more tightly to ease the training. We demonstrate that our new training framework can produce models which can be more efficiently verified at test time. On the largest 2D quadrotor dynamical system, verification for our model is more than 5X faster compared to the baseline, while our size of region-of-attraction is 16X larger than the baseline.
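
The abstract's idea of keeping a dataset of subregions and repeatedly splitting the hardest one can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy function f(x) = sum_i x_i (1 - x_i) stands in for the quantity being bounded, plain interval arithmetic stands in for the verified bounds, splitting is done along the widest dimension, and the gradient step on the model is omitted entirely. The helper names (`upper_bound`, `split`, `branch_and_bound_rounds`) are hypothetical.

```python
from typing import List, Tuple

# A box is an axis-aligned subregion: (lower corner, upper corner).
Box = Tuple[List[float], List[float]]


def interval_mul(a: Tuple[float, float], b: Tuple[float, float]) -> Tuple[float, float]:
    """Interval product [a] * [b], returned as (lowest, highest) possible value."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(products), max(products)


def upper_bound(box: Box) -> float:
    """Sound but loose upper bound of the toy f(x) = sum_i x_i * (1 - x_i) on the box."""
    lower, upper = box
    total = 0.0
    for lo, hi in zip(lower, upper):
        _, term_hi = interval_mul((lo, hi), (1.0 - hi, 1.0 - lo))
        total += term_hi
    return total


def split(box: Box) -> List[Box]:
    """Split a box in half along its widest dimension (an assumed heuristic)."""
    lower, upper = box
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    d = widths.index(max(widths))
    mid = 0.5 * (lower[d] + upper[d])
    left_upper = list(upper)
    left_upper[d] = mid
    right_lower = list(lower)
    right_lower[d] = mid
    return [(list(lower), left_upper), (right_lower, list(upper))]


def branch_and_bound_rounds(region: Box, rounds: int) -> List[Box]:
    """Maintain a dataset of boxes; each round, replace the hardest box by its halves."""
    dataset = [region]
    for _ in range(rounds):
        # A real training loop would update model parameters here using the
        # differentiable bounds on the current dataset; this sketch skips that.
        hardest = max(range(len(dataset)), key=lambda i: upper_bound(dataset[i]))
        dataset.extend(split(dataset.pop(hardest)))
    return dataset


region: Box = ([0.0, 0.0], [1.0, 1.0])
initial = upper_bound(region)
dataset = branch_and_bound_rounds(region, rounds=20)
final = max(upper_bound(box) for box in dataset)
print(len(dataset), initial, final)
```

In this toy setting the worst-case bound over the dataset shrinks as boxes are split, because interval arithmetic is tighter on smaller boxes; that is the effect the abstract relies on when it says bounds on smaller subregions "can be computed more tightly".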

Related papers

Bagging-Based Model Merging for Robust General Text Embeddings [73.51674133699196]
General-purpose text embedding models underpin a wide range of NLP and information retrieval applications. We present a systematic study of multi-task training for text embeddings from two perspectives: data scheduling and model merging. We propose Bagging-based rObust mOdel Merging (BOOM), which trains multiple embedding models on sampled subsets and merges them into a single model.
arXiv Detail & Related papers (2026-02-05T15:45:08Z)
Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion [17.905596843865705]
We propose a novel two-stage training framework to jointly synthesize the controller and Lyapunov function for continuous-time systems. Unlike existing works on continuous-time systems that rely on an SMT solver to formally verify the Lyapunov condition, we extend state-of-the-art neural network verifier $\alpha,\!\beta$-CROWN.
arXiv Detail & Related papers (2025-06-02T06:20:09Z)
FORT: Forward-Only Regression Training of Normalizing Flows [85.66894616735752]
We revisit classical normalizing flows as one-step generative models with exact likelihoods. We propose a novel, scalable training objective that does not require computing the expensive change of variable formula used in conventional maximum likelihood training.
arXiv Detail & Related papers (2025-06-01T20:32:27Z)
Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network). After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference. We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
A Multi-Level Framework for Accelerating Training Transformer Models [5.268960238774481]
Training large-scale deep learning models poses an unprecedented demand for computing power. We propose a multi-level framework for training acceleration based on Coalescing, De-coalescing and Interpolation. We prove that the proposed framework reduces the computational cost by about 20% on training BERT/GPT-Base models and up to 51.6% on training the BERT-Large model.
arXiv Detail & Related papers (2024-04-07T03:04:34Z)
Efficient Stagewise Pretraining via Progressive Subnetworks [53.00045381931778]
The prevailing view suggests that stagewise dropping strategies, such as layer dropping, are ineffective when compared to stacking-based approaches. This paper challenges this notion by demonstrating that, with proper design, dropping strategies can be competitive, if not better, than stacking methods. We propose an instantiation of this framework - Random Part Training (RAPTR) - that selects and trains only a random subnetwork at each step, progressively increasing the size in stages.
arXiv Detail & Related papers (2024-02-08T18:49:09Z)
Always-Sparse Training by Growing Connections with Guided Stochastic Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models. We evaluate our method on CIFAR-10/100 and ImageNet using VGG, and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently-introduced transformer-based attentional copulas (TACTiS). We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z)
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks. We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
CURTAINs Flows For Flows: Constructing Unobserved Regions with Maximum Likelihood Estimation [0.0]
We introduce a major improvement to the CURTAINs method by training the conditional normalizing flow between two side-band regions. CURTAINsF4F requires substantially less computational resources to cover a large number of signal regions than other fully data driven approaches.
arXiv Detail & Related papers (2023-05-08T11:58:49Z)
Controlled Descent Training [0.0]
A novel and model-based artificial neural network (ANN) training method is developed supported by optimal control theory. The method augments training labels in order to robustly guarantee training loss convergence and improve training convergence rate. The applicability of the method is demonstrated on standard regression and classification problems.
arXiv Detail & Related papers (2023-03-16T10:45:24Z)
Structured State Space Models for In-Context Reinforcement Learning [30.189834820419446]
Structured state space sequence (S4) models have recently achieved state-of-the-art performance on long-range sequence modeling tasks. We propose a modification to a variant of S4 that enables us to initialise and reset the hidden state in parallel. We show that our modified architecture runs faster than Transformers in sequence length and performs better than RNNs on a simple memory-based task.
arXiv Detail & Related papers (2023-03-07T15:32:18Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free [115.81899803240758]
Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy. This paper asks how to quickly calibrate a trained model in-situ, to examine the achievable trade-offs between its standard and robust accuracies. Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework.
arXiv Detail & Related papers (2020-10-22T16:06:34Z)
Deep Ensembles for Low-Data Transfer Learning [21.578470914935938]
We study different ways of creating ensembles from pre-trained models. We show that the nature of pre-training itself is a performant source of diversity. We propose a practical algorithm that efficiently identifies a subset of pre-trained models for any downstream dataset.
arXiv Detail & Related papers (2020-10-14T07:59:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.