TUTOR: Training Neural Networks Using Decision Rules as Model Priors
- URL: http://arxiv.org/abs/2010.05429v3
- Date: Wed, 16 Feb 2022 01:19:53 GMT
- Title: TUTOR: Training Neural Networks Using Decision Rules as Model Priors
- Authors: Shayan Hassantabar, Prerit Terway, and Niraj K. Jha
- Abstract summary: Deep neural networks (DNNs) generally need large amounts of data and computational resources for training.
We propose the TUTOR framework to synthesize accurate DNN models with limited available data and reduced memory/computational requirements.
We show that, in comparison to fully connected DNNs, TUTOR on average reduces the need for data by 5.9x, improves accuracy by 3.4%, and reduces the number of parameters (FLOPs) by 4.7x (4.3x).
- Score: 4.0880509203447595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The human brain has the ability to carry out new tasks with limited
experience. It utilizes prior learning experiences to adapt the solution
strategy to new domains. On the other hand, deep neural networks (DNNs)
generally need large amounts of data and computational resources for training.
However, this requirement is not met in many settings. To address these
challenges, we propose the TUTOR DNN synthesis framework. TUTOR targets tabular
datasets. It synthesizes accurate DNN models with limited available data and
reduced memory/computational requirements. It consists of three sequential
steps. The first step involves generation, verification, and labeling of
synthetic data. The synthetic data generation module targets both the
categorical and continuous features. TUTOR generates the synthetic data from
the same probability distribution as the real data. It then verifies the
integrity of the generated synthetic data using a semantic integrity classifier
module. It labels the synthetic data based on a set of rules extracted from the
real dataset. Next, TUTOR uses two training schemes that combine synthetic and
training data to learn the parameters of the DNN model. These two schemes focus
on two different ways in which synthetic data can be used to derive a prior on
the model parameters and, hence, provide a better DNN initialization for
training with real data. In the third step, TUTOR employs a grow-and-prune
synthesis paradigm to learn both the weights and the architecture of the DNN to
reduce model size while ensuring its accuracy. We evaluate the performance of
TUTOR on nine datasets of various sizes. We show that in comparison to fully
connected DNNs, TUTOR on average reduces the need for data by 5.9x,
improves accuracy by 3.4%, and reduces the number of parameters (FLOPs) by
4.7x (4.3x). Thus, TUTOR enables a less data-hungry, more accurate, and more
compact DNN synthesis.
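Below is a minimal sketch of the first step described in the abstract (synthetic data generation and rule-based labeling), assuming an illustrative per-feature generative model (a Gaussian fit for continuous features, empirical frequencies for categorical ones) and a shallow decision tree as the rule extractor; the paper's actual generative model, semantic integrity classifier, and rule-extraction method may differ.

```python
# Hedged sketch of TUTOR's first step: sample synthetic tabular rows from a
# distribution fitted to the real data, then label them with decision rules
# extracted from the real dataset. The per-feature estimators and the
# decision-tree rule extractor are stand-ins, not the paper's exact method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def fit_and_sample(X_real, categorical_cols, n_samples):
    """Sample synthetic rows feature-by-feature from the real data's
    per-feature distributions (continuous: Gaussian; categorical: empirical)."""
    n, d = X_real.shape
    X_syn = np.empty((n_samples, d))
    for j in range(d):
        col = X_real[:, j]
        if j in categorical_cols:
            values, counts = np.unique(col, return_counts=True)
            X_syn[:, j] = rng.choice(values, size=n_samples, p=counts / counts.sum())
        else:
            X_syn[:, j] = rng.normal(col.mean(), col.std() + 1e-8, size=n_samples)
    return X_syn

def label_with_rules(X_real, y_real, X_syn, max_depth=4):
    """Extract shallow decision rules from the real data and apply them to
    label the synthetic rows (stand-in for TUTOR's rule-based labeling)."""
    rules = DecisionTreeClassifier(max_depth=max_depth).fit(X_real, y_real)
    return rules.predict(X_syn)
```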
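One way the second step can use synthetic data to derive a prior on the model parameters is sketched below under illustrative assumptions (a small PyTorch MLP, Adam, full-batch training): pretrain on the labeled synthetic data, then initialize training on the real data from the pretrained weights. The paper's two schemes differ in how exactly the prior is formed.

```python
# Hedged sketch of synthetic data acting as a prior / warm start for the DNN,
# as in TUTOR's second step. Architecture and hyperparameters are illustrative.
import torch
import torch.nn as nn

def make_mlp(in_dim, n_classes, hidden=64):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, n_classes))

def train(model, X, y, epochs=20, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return model

def tutor_style_training(X_syn, y_syn, X_real, y_real, n_classes):
    model = make_mlp(X_real.shape[1], n_classes)
    train(model, X_syn, y_syn)    # pretrain on synthetic data (the prior)
    train(model, X_real, y_real)  # fine-tune on the limited real data
    return model
```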
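The third step's grow-and-prune synthesis is only partially sketched here: the snippet shows magnitude-based pruning of Linear layers, which is one half of the paradigm; the actual framework also grows connections and neurons (e.g., based on gradient information) and retrains between iterations.

```python
# Hedged sketch of the pruning half of a grow-and-prune loop: zero out the
# smallest-magnitude weights in each Linear layer to shrink the model.
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.5):
    """Zero out the smallest |w| fraction of weights in every Linear layer."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = int(sparsity * w.numel())
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            mask = (w.abs() > threshold).float()
            module.weight.data.mul_(mask)
    return model
```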
Related papers
- Little Giants: Synthesizing High-Quality Embedding Data at Scale [71.352883755806]
We introduce SPEED, a framework that aligns open-source small models to efficiently generate large-scale embedding data.
SPEED uses less than 1/10 of the GPT API calls while outperforming the state-of-the-art embedding model E5_mistral when both are trained solely on their synthetic data.
arXiv Detail & Related papers (2024-10-24T10:47:30Z) - Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment [25.527665632625627]
It is important to reason about the trustworthiness of the model's predictions with unseen data during deployment.
Existing methods for specifying and verifying traditional software are insufficient for this task.
We propose a novel technique that uses rules derived from neural network computations to infer data preconditions.
arXiv Detail & Related papers (2024-01-26T03:47:18Z) - Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models [69.76066070227452]
*Data Synthesis* is a promising way to train a small model with very little labeled data.
We propose *Synthesis Step by Step* (**S3**), a data synthesis framework that shrinks this distribution gap.
Our approach improves the performance of a small model by reducing the gap between the synthetic dataset and the real data.
arXiv Detail & Related papers (2023-10-20T17:14:25Z) - AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration [69.21282992341007]
AutoSynth automatically generates 3D training data for point cloud registration.
We replace the point cloud registration network with a much smaller surrogate network, leading to a 4056.43x speedup.
Our results on TUD-L, LINEMOD and Occluded-LINEMOD evidence that a neural network trained on our searched dataset yields consistently better performance than the same one trained on the widely used ModelNet40 dataset.
arXiv Detail & Related papers (2023-09-20T09:29:44Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success with the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z) - Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three datasets on image classification in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models [0.0]
It is becoming common practice to utilize synthetic data to boost the performance of Machine Learning Models.
I used a Yelp pizza restaurant reviews dataset and transfer learning to fine-tune a pre-trained GPT-2 Transformer Model to generate synthetic pizza reviews data.
I then combined this synthetic data with the original genuine data to create a new joint dataset.
arXiv Detail & Related papers (2021-04-02T20:20:42Z) - STAN: Synthetic Network Traffic Generation with Generative Neural Models [10.54843182184416]
This paper presents STAN (Synthetic network Traffic generation with Autoregressive Neural models), a tool to generate realistic synthetic network traffic datasets.
Our novel neural architecture captures both temporal dependencies and dependence between attributes at any given time.
We evaluate the performance of STAN in terms of the quality of data generated, by training it on both a simulated dataset and a real network traffic data set.
arXiv Detail & Related papers (2020-09-27T04:20:02Z)