FFCV: Accelerating Training by Removing Data Bottlenecks
- URL: http://arxiv.org/abs/2306.12517v1
- Date: Wed, 21 Jun 2023 19:06:41 GMT
- Title: FFCV: Accelerating Training by Removing Data Bottlenecks
- Authors: Guillaume Leclerc, Andrew Ilyas, Logan Engstrom, Sung Min Park, Hadi
Salman, Aleksander Madry
- Abstract summary: We present FFCV, a library for easy and fast machine learning model training.
It speeds up model training by eliminating (often subtle) data bottlenecks from the training process.
Detailed installation instructions, documentation, and Slack support channel are available at https://ffcv.io/.
- Score: 84.89623507733963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present FFCV, a library for easy and fast machine learning model training.
FFCV speeds up model training by eliminating (often subtle) data bottlenecks
from the training process. In particular, we combine techniques such as an
efficient file storage format, caching, data pre-loading, asynchronous data
transfer, and just-in-time compilation to (a) make data loading and transfer
significantly more efficient, ensuring that GPUs can reach full utilization;
and (b) offload as much data processing as possible to the CPU asynchronously,
freeing GPU cycles for training. Using FFCV, we train ResNet-18 and ResNet-50
on the ImageNet dataset with a competitive tradeoff between accuracy and training
time. For example, we are able to train an ImageNet ResNet-50 model to 75% accuracy
in only 20 minutes on a single machine. We demonstrate FFCV's performance,
ease-of-use, extensibility, and ability to adapt to resource constraints
through several case studies. Detailed installation instructions,
documentation, and Slack support channel are available at https://ffcv.io/.
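To make the pipeline described above concrete, here is a minimal usage sketch modeled on FFCV's public quickstart: a one-time conversion of an indexed dataset into FFCV's .beton file format, followed by a Loader whose per-field pipelines FFCV JIT-compiles and overlaps with GPU work. The dataset, paths, batch size, and worker count are illustrative, and exact class names or defaults may vary between FFCV versions.

```python
# Sketch of the FFCV workflow (based on the library's quickstart; details may
# differ across versions). Step 1 serializes a dataset once; step 2 builds a
# fast loader whose decode/augment pipelines run asynchronously to the GPU.
import torch
import torchvision

from ffcv.writer import DatasetWriter
from ffcv.fields import RGBImageField, IntField
from ffcv.fields.decoders import SimpleRGBImageDecoder, IntDecoder
from ffcv.loader import Loader, OrderOption
from ffcv.transforms import ToTensor, ToDevice, ToTorchImage, Squeeze

# Step 1: one-time conversion of an indexed dataset into a single .beton file.
train_set = torchvision.datasets.CIFAR10('/tmp/cifar', train=True, download=True)
writer = DatasetWriter('/tmp/cifar_train.beton', {
    'image': RGBImageField(),   # image storage (optionally JPEG-compressed)
    'label': IntField(),
})
writer.from_indexed_dataset(train_set)

# Step 2: training-time loading; each field gets its own declarative pipeline.
device = torch.device('cuda:0')
loader = Loader(
    '/tmp/cifar_train.beton',
    batch_size=512,
    num_workers=8,
    order=OrderOption.RANDOM,   # QUASI_RANDOM trades shuffling for better caching
    pipelines={
        'image': [SimpleRGBImageDecoder(), ToTensor(),
                  ToDevice(device, non_blocking=True), ToTorchImage()],
        'label': [IntDecoder(), ToTensor(), ToDevice(device), Squeeze()],
    },
)

for images, labels in loader:   # drop-in replacement for a PyTorch DataLoader loop
    pass                        # model forward/backward pass goes here
```

Because decoding and augmentation are declared per field rather than written as arbitrary Python, FFCV can compile them and overlap data preparation with training, which is what keeps the GPU fully utilized.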
Related papers
- TensorSocket: Shared Data Loading for Deep Learning Training [0.0]
Deep learning training is a repetitive and resource-intensive process.
TensorSocket enables simultaneous training processes to share the same data loader.
Our evaluation shows that TensorSocket enables scenarios that are infeasible without data sharing and increases training throughput by up to 100%. A conceptual sketch of this sharing pattern is given after the list below.
arXiv Detail & Related papers (2024-09-27T13:39:47Z)
- Effective pruning of web-scale datasets based on complexity of concept clusters [48.125618324485195]
We present a method for pruning large-scale multimodal datasets for training CLIP-style models on ImageNet.
We find that training on a smaller set of high-quality data can lead to higher performance with significantly lower training costs.
We achieve a new state-of-the-art ImageNet zero-shot accuracy and a competitive average zero-shot accuracy on 38 evaluation tasks.
arXiv Detail & Related papers (2024-01-09T14:32:24Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting more and more attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which exploits edge devices to asynchronously participate in the training process by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- CiT: Curation in Training for Effective Vision-Language Data [84.77867625605053]
This paper presents Curation in Training (CiT), a vision-text learning algorithm that couples a data objective into training.
CiT automatically yields quality data to speed-up contrastive image-text training.
We observe that CiT can speed up training by over an order of magnitude, especially if the raw data size is large.
arXiv Detail & Related papers (2023-01-05T18:59:57Z)
- Towards Efficient and Data Agnostic Image Classification Training Pipeline for Embedded Systems [0.0]
This work focuses on reviewing the latest augmentation and regularization methods for image classification.
We can achieve reasonable performance on a variety of downstream image classification tasks without manually tuning parameters for each particular task.
Resulting models are computationally efficient and can be deployed to CPU using the OpenVINO toolkit.
arXiv Detail & Related papers (2021-08-16T12:38:05Z)
- ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table [23.264897780201316]
Various deep Click-Through Rate (CTR) models are deployed in commercial systems by industrial companies.
To achieve better performance, it is necessary to train deep CTR models on huge volumes of training data efficiently.
We propose ScaleFreeCTR, a MixCache-based distributed training system for CTR models.
arXiv Detail & Related papers (2021-04-17T13:36:19Z)
- EfficientNetV2: Smaller Models and Faster Training [91.77432224225221]
This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models.
We use a combination of training-aware neural architecture search and scaling, to jointly optimize training speed and parameter efficiency.
Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller.
arXiv Detail & Related papers (2021-04-01T07:08:36Z)
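As referenced in the TensorSocket entry above, the snippet below is a conceptual sketch of shared data loading in general, not TensorSocket's actual API: a single producer process decodes each batch once and broadcasts it to several co-located training processes, so collocated jobs avoid duplicating data-loading work. The dataset, batch size, and queue capacity are arbitrary placeholders.

```python
# Conceptual sketch of shared data loading across co-located training jobs
# (illustrative only; TensorSocket's real interface is not shown here).
import torch.multiprocessing as mp
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader


def producer(queues, epochs=1):
    """Decode batches once and broadcast each batch to every trainer."""
    ds = torchvision.datasets.CIFAR10('/tmp/cifar', train=True, download=True,
                                      transform=transforms.ToTensor())
    loader = DataLoader(ds, batch_size=256, shuffle=True, num_workers=4)
    for _ in range(epochs):
        for batch in loader:
            for q in queues:      # every training process sees the same batch
                q.put(batch)
    for q in queues:
        q.put(None)               # sentinel: no more data


def trainer(rank, queue):
    """Consume shared batches; the model-specific training step is omitted."""
    while (batch := queue.get()) is not None:
        images, labels = batch
        _ = images.mean()         # placeholder for a forward/backward pass


if __name__ == '__main__':
    mp.set_start_method('spawn')
    num_trainers = 2
    queues = [mp.Queue(maxsize=8) for _ in range(num_trainers)]
    trainers = [mp.Process(target=trainer, args=(i, q))
                for i, q in enumerate(queues)]
    for p in trainers:
        p.start()
    producer(queues)              # the main process acts as the shared loader
    for p in trainers:
        p.join()
```

A real system along these lines additionally has to handle trainers that run at different speeds, GPU placement, and fault tolerance; the sketch only shows the core producer/consumer split that lets one data pipeline feed several training processes.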
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.