Related papers: DAViD: Data-efficient and Accurate Vision Models from Synthetic Data

DAViD: Data-efficient and Accurate Vision Models from Synthetic Data

URL: http://arxiv.org/abs/2507.15365v1
Date: Mon, 21 Jul 2025 08:17:41 GMT
Title: DAViD: Data-efficient and Accurate Vision Models from Synthetic Data
Authors: Fatemeh Saleh, Sadegh Aliakbarian, Charlie Hewitt, Lohit Petikam, Xiao-Xian, Antonio Criminisi, Thomas J. Cashman, Tadas Baltrušaitis,
Abstract summary: We demonstrate that it is possible to train models on much smaller but high-fidelity synthetic datasets.<n>Our models require only a fraction of the cost of training and inference when compared with foundational models of similar accuracy.
Score: 6.829390872619486
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The state of the art in human-centric computer vision achieves high accuracy and robustness across a diverse range of tasks. The most effective models in this domain have billions of parameters, thus requiring extremely large datasets, expensive training regimes, and compute-intensive inference. In this paper, we demonstrate that it is possible to train models on much smaller but high-fidelity synthetic datasets, with no loss in accuracy and higher efficiency. Using synthetic training data provides us with excellent levels of detail and perfect labels, while providing strong guarantees for data provenance, usage rights, and user consent. Procedural data synthesis also provides us with explicit control on data diversity, that we can use to address unfairness in the models we train. Extensive quantitative assessment on real input images demonstrates accuracy of our models on three dense prediction tasks: depth estimation, surface normal estimation, and soft foreground segmentation. Our models require only a fraction of the cost of training and inference when compared with foundational models of similar accuracy. Our human-centric synthetic dataset and trained models are available at https://aka.ms/DAViD.

Related papers

Prototype-Guided Diffusion for Digital Pathology: Achieving Foundation Model Performance with Minimal Clinical Data [6.318463500874778]
We propose a prototype-guided diffusion model to generate high-fidelity synthetic pathology data at scale.<n>Our approach ensures biologically and diagnostically meaningful variations in the generated data.<n>We demonstrate that self-supervised features trained on our synthetic dataset achieve competitive performance despite using 60x-760x less data than models trained on large real-world datasets.
arXiv Detail & Related papers (2025-04-15T21:17:39Z)
Scaling Laws of Synthetic Data for Language Models [132.67350443447611]
We introduce SynthLLM, a scalable framework that transforms pre-training corpora into diverse, high-quality synthetic datasets.<n>Our approach achieves this by automatically extracting and recombining high-level concepts across multiple documents using a graph algorithm.
arXiv Detail & Related papers (2025-03-25T11:07:12Z)
Enhancing Object Detection Accuracy in Autonomous Vehicles Using Synthetic Data [0.8267034114134277]
Performance of machine learning models depends on the nature and size of the training data sets. High-quality, diverse, relevant and representative training data is essential to build accurate and reliable machine learning models. It is hypothesised that well-designed synthetic data can improve the performance of a machine learning algorithm.
arXiv Detail & Related papers (2024-11-23T16:38:02Z)
Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset. We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding. Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks. Such models tend to be large and require commensurate volumes of training data. It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs. Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
How Good Are Synthetic Medical Images? An Empirical Study with Lung Ultrasound [0.3312417881789094]
Adding synthetic training data using generative models offers a low-cost method to deal with the data scarcity challenge. We show that training with both synthetic and real data outperforms training with real data alone.
arXiv Detail & Related papers (2023-10-05T15:42:53Z)
On the Stability of Iterative Retraining of Generative Models on their own Data [56.153542044045224]
We study the impact of training generative models on mixed datasets. We first prove the stability of iterative training under the condition that the initial generative models approximate the data distribution well enough. We empirically validate our theory on both synthetic and natural images by iteratively training normalizing flows and state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-09-30T16:41:04Z)
Robust Category-Level 3D Pose Estimation from Synthetic Data [17.247607850702558]
We introduce SyntheticP3D, a new synthetic dataset for object pose estimation generated from CAD models. We propose a novel approach (CC3D) for training neural mesh models that perform pose estimation via inverse rendering.
arXiv Detail & Related papers (2023-05-25T14:56:03Z)
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data [78.21197488065177]
Recent success in fine-tuning large models, that are pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning. This paper proposes a new task-agnostic framework, textitSynBench, to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z)
Dataset Distillation by Matching Training Trajectories [75.9031209877651]
We propose a new formulation that optimize our distilled data to guide networks to a similar state as those trained on real data. Given a network, we train it for several iterations on our distilled data and optimize the distilled data with respect to the distance between the synthetically trained parameters and the parameters trained on real data. Our method handily outperforms existing methods and also allows us to distill higher-resolution visual data.
arXiv Detail & Related papers (2022-03-22T17:58:59Z)
Data Impressions: Mining Deep Models to Extract Samples for Data-free Applications [26.48630545028405]
"Data Impressions" act as proxy to the training data and can be used to realize a variety of tasks. We show the applicability of data impressions in solving several computer vision tasks.
arXiv Detail & Related papers (2021-01-15T11:37:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.