A Survey of Deep Learning: From Activations to Transformers
- URL: http://arxiv.org/abs/2302.00722v3
- Date: Sat, 10 Feb 2024 17:48:25 GMT
- Title: A Survey of Deep Learning: From Activations to Transformers
- Authors: Johannes Schneider and Michalis Vlachos
- Abstract summary: We provide a comprehensive overview of the most important, recent works in deep learning.
We identify and discuss patterns that summarize the key strategies for many of the successful innovations over the last decade.
We also include a discussion on recent commercially built, closed-source models such as OpenAI's GPT-4 and Google's PaLM 2.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning has made tremendous progress in the last decade. A key success
factor is the wide variety of architectures, layers, objectives, and
optimization techniques. They include a myriad of variants related to
attention, normalization, skip connections, transformers and self-supervised
learning schemes -- to name a few. We provide a comprehensive overview of the
most important recent works in these areas for readers who already have a basic
understanding of deep learning. We hope that a holistic and unified treatment
of influential recent works helps researchers form new connections between
diverse areas of deep learning. We identify and discuss multiple patterns that
summarize the key strategies for many of the successful innovations over the
last decade as well as works that can be seen as rising stars. We also include
a discussion on recent commercially built, closed-source models such as
OpenAI's GPT-4 and Google's PaLM 2.
Related papers
- Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective [77.94874338927492]
OpenAI has claimed that the main technique behind o1 is reinforcement learning.
This paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning.
arXiv Detail & Related papers (2024-12-18T18:24:47Z) - A Decade of Deep Learning: A Survey on The Magnificent Seven [19.444198085817543]
Deep learning has fundamentally reshaped the landscape of artificial intelligence over the past decade.
We present a comprehensive overview of the most influential deep learning algorithms selected through a broad-based survey of the field.
Our discussion centers on pivotal architectures, including Residual Networks, Transformers, Generative Adversarial Networks, Variational Autoencoders, Graph Neural Networks, Contrastive Language-Image Pre-training, and Diffusion models.
arXiv Detail & Related papers (2024-12-13T17:55:39Z) - O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z) - Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
A crucial factor in this success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z) - What comes after transformers? -- A selective survey connecting ideas in deep learning [1.8592384822257952]
Transformers have become the de facto standard model in artificial intelligence since 2017.
For researchers, it is difficult to keep track of these developments at a broader level.
We provide a comprehensive overview of the many important recent works in these areas for readers who already have a basic understanding of deep learning.
arXiv Detail & Related papers (2024-08-01T08:50:25Z) - A Survey on Vision-Language-Action Models for Embodied AI [71.16123093739932]
Vision-language-action models (VLAs) have become a foundational element in robot learning.
Various methods have been proposed to enhance traits such as versatility, dexterity, and generalizability.
VLAs serve as high-level task planners capable of decomposing long-horizon tasks into executable subtasks.
arXiv Detail & Related papers (2024-05-23T01:43:54Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and the associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
We show theoretically that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - Deep Learning for Face Anti-Spoofing: A Survey [74.42603610773931]
Face anti-spoofing (FAS) has lately attracted increasing attention due to its vital role in securing face recognition systems from presentation attacks (PAs).
arXiv Detail & Related papers (2021-06-28T19:12:00Z) - A Deep Learning Framework for Lifelong Machine Learning [6.662800021628275]
We propose a simple yet powerful unified deep learning framework.
Our framework supports almost all of these properties and approaches through one central mechanism.
We hope that this unified lifelong learning framework inspires new work towards large-scale experiments and understanding human learning in general.
arXiv Detail & Related papers (2021-05-01T03:43:25Z) - Meta-Learning in Neural Networks: A Survey [4.588028371034406]
This survey describes the contemporary meta-learning landscape.
We first discuss definitions of meta-learning and position it with respect to related fields.
We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods.
arXiv Detail & Related papers (2020-04-11T16:34:24Z) - A Survey of Deep Learning for Scientific Discovery [13.372738220280317]
We have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks.
The amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity.
This suggests many exciting opportunities for deep learning applications in scientific settings.
arXiv Detail & Related papers (2020-03-26T06:16:08Z)