Big Learning
- URL: http://arxiv.org/abs/2207.03899v4
- Date: Sun, 21 May 2023 04:05:46 GMT
- Title: Big Learning
- Authors: Yulai Cong, Miaoyun Zhao
- Abstract summary: Big learning exploits the information inherent in its large-scale complete/incomplete training data.
Big learning is capable of delivering all joint/conditional/marginal data capabilities with one universal model.
Diverse experiments are carried out to validate the effectiveness of the presented big learning.
- Score: 15.733273085085763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in big/foundation models reveal a promising path for deep
learning, where the roadmap steadily moves from big data to big models to (the
newly introduced) big learning. Specifically, big learning exhaustively
exploits the information inherent in its large-scale complete/incomplete
training data, by simultaneously modeling many/all joint/conditional/marginal
data distributions across potentially diverse domains, with one universal
foundation model. We reveal that big learning ($i$) underlies most existing
foundation models, ($ii$) is equipped with extraordinary flexibility for
complete/incomplete training data and trustworthy data tasks, ($iii$) is
capable of delivering all joint/conditional/marginal data capabilities with one
universal model, and ($iv$) unifies conventional machine learning paradigms and
enables their flexible cooperation, manifested as a universal learning
paradigm. Diverse experiments are carried out to validate the effectiveness of
the presented big learning.
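The abstract's central claim, that one universal model can deliver all joint/conditional/marginal data capabilities, can be illustrated with a toy sketch of subset masking: each training step conditions on a random "source" subset of dimensions and predicts a random "target" subset. The mask-sampling scheme below is an assumption for illustration, not the paper's exact procedure.

```python
import random

def sample_mask_pair(d, rng):
    """Draw a random (source, target) split of the d data dimensions.

    Training one model to predict the 'target' subset given the
    'source' subset covers joint factors (source empty), marginals
    (target a strict subset), and conditionals with a single
    objective, matching the one-universal-model claim.
    """
    roles = [rng.randrange(3) for _ in range(d)]  # 0: source, 1: target, 2: unused
    source = [r == 0 for r in roles]
    target = [r == 1 for r in roles]
    if not any(target):                 # always leave something to predict
        i = rng.randrange(d)
        target[i] = True
        source[i] = False
    return source, target

rng = random.Random(0)
source, target = sample_mask_pair(8, rng)
assert not any(s and t for s, t in zip(source, target))  # roles are disjoint
assert any(target)
```

A model trained under many such draws is, in effect, fit to many conditional factorizations of the same data distribution at once.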
Related papers
- Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization [1.6958018695660049]
We show that generalization only emerges when training data is diversified enough across semantic domains.
We extend our analysis to real-world scenarios, including fine-tuning of specialist and generalist models.
arXiv Detail & Related papers (2024-10-07T03:15:11Z)
- Big Cooperative Learning [7.958840888809145]
We show that the training of foundation models can be interpreted as a form of big cooperative learning.
We propose the BigLearn-GAN, which is a novel adversarially-trained foundation model with versatile data sampling capabilities.
arXiv Detail & Related papers (2024-07-31T03:59:14Z) - Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.112203072394648]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
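The instance-wise idea can be sketched as weighting each expert by how close the test point lies to a summary of that expert's training data. Using per-expert training means as that summary is an assumption for illustration, not the paper's actual method.

```python
import math

def instance_weighted_ensemble(x, expert_preds, train_means, tau=1.0):
    """Combine expert predictions for a single test point x.

    Each expert is weighted by the (squared) distance from x to its
    training-set mean, a stand-in for the 'limited information' about
    each expert's training data; nearer experts get larger weights.
    """
    d2 = [sum((m - xi) ** 2 for m, xi in zip(mean, x)) for mean in train_means]
    w = [math.exp(-d / tau) for d in d2]
    z = sum(w)
    w = [wi / z for wi in w]
    return sum(wi * p for wi, p in zip(w, expert_preds))
```

For a test point at the first expert's training mean and far from the second's, the combined prediction follows the first expert almost exactly.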
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - X-Learner: Learning Cross Sources and Tasks for Universal Visual
Representation [71.51719469058666]
We propose a representation learning framework called X-Learner.
X-Learner learns the universal feature of multiple vision tasks supervised by various sources.
X-Learner achieves strong performance on different tasks without extra annotations, modalities, or computational cost.
arXiv Detail & Related papers (2022-03-16T17:23:26Z) - Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
arXiv Detail & Related papers (2021-10-24T22:33:52Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z) - Relating by Contrasting: A Data-efficient Framework for Multimodal
Generative Models [86.9292779620645]
We develop a contrastive framework for generative model learning, allowing us to train the model not just by the commonality between modalities, but by the distinction between "related" and "unrelated" multimodal data.
Under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data.
arXiv Detail & Related papers (2020-07-02T15:08:11Z)
- Learning from Imperfect Annotations [15.306536555936692]
Many machine learning systems today are trained on large amounts of human-annotated data.
We propose a new end-to-end framework that enables us to merge the aggregation step with model training.
We show accuracy gains of up to 25% over the current state-of-the-art approaches for aggregating annotations.
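As a point of reference, the aggregation step that this framework merges into training is, in its simplest form, a per-item majority vote over annotators. The sketch below shows only that baseline, not the paper's end-to-end method.

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate each item's annotator labels by plain majority vote.

    This is the separate aggregation step the end-to-end framework
    folds into model training; it discards annotator-level information
    that joint training can exploit.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in annotations]

labels = majority_vote([["cat", "cat", "dog"], ["dog", "dog", "bird"]])
assert labels == ["cat", "dog"]
```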
arXiv Detail & Related papers (2020-04-07T15:21:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.