Model Complexity of Deep Learning: A Survey
- URL: http://arxiv.org/abs/2103.05127v1
- Date: Mon, 8 Mar 2021 22:39:32 GMT
- Title: Model Complexity of Deep Learning: A Survey
- Authors: Xia Hu, Lingyang Chu, Jian Pei, Weiqing Liu and Jiang Bian
- Abstract summary: We conduct a systematic overview of the latest studies on model complexity in deep learning.
We review the existing studies on those two categories along four important factors, including model framework, model size, optimization process and data complexity.
- Score: 79.20117679251766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model complexity is a fundamental problem in deep learning. In this paper we
conduct a systematic overview of the latest studies on model complexity in deep
learning. Model complexity of deep learning can be categorized into expressive
capacity and effective model complexity. We review the existing studies on
those two categories along four important factors, including model framework,
model size, optimization process and data complexity. We also discuss the
applications of deep learning model complexity including understanding model
generalization capability, model optimization, and model selection and design.
We conclude by proposing several interesting future directions.
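The survey names model size as one of the four factors governing effective model complexity. As a hedged illustration (not from the paper itself), the simplest model-size measure is the trainable parameter count; the helper below is a hypothetical sketch that computes it for a fully connected network from its layer widths alone.

```python
# Hypothetical illustration: parameter count as the simplest proxy for the
# "model size" factor of model complexity discussed in the survey.
def dense_param_count(layer_widths):
    """Trainable parameters (weights + biases) of a fully connected
    network whose layer widths are listed in order, input to output."""
    return sum(w_in * w_out + w_out          # weight matrix + bias vector
               for w_in, w_out in zip(layer_widths, layer_widths[1:]))

# A small MLP for MNIST-sized inputs: 784 -> 128 -> 10.
print(dense_param_count([784, 128, 10]))  # 784*128+128 + 128*10+10 = 101770
```

Note that parameter count captures only expressive capacity in a crude way; the survey's point is that effective complexity also depends on the model framework, the optimization process, and data complexity.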
Related papers
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- A Survey on State-of-the-art Deep Learning Applications and Challenges [0.0]
Building a deep learning model is challenging due to the algorithm's complexity and the dynamic nature of real-world problems.
This study aims to comprehensively review the state-of-the-art deep learning models in computer vision, natural language processing, time series analysis and pervasive computing.
arXiv Detail & Related papers (2024-03-26T10:10:53Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more valuable to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Redefining Neural Architecture Search of Heterogeneous Multi-Network Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model, and the models themselves, relying on diverse metrics that estimate the quality of their component parts.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
- Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An Information-Theoretic Framework [20.28063653485698]
We provide a comprehensive framework for deep learning model design and information-theoretic analysis.
First, we develop and demonstrate a novel interactively-connected deep recurrent neural network (I$^2$DRNN) model.
Second, to theoretically prove that our designed model can learn multi-scale spatio-temporal dependency in PSTA tasks, we provide an information-theoretic analysis.
arXiv Detail & Related papers (2020-09-14T10:05:14Z)
- Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey [1.2031796234206134]
Model-based reinforcement learning creates an explicit model of the environment dynamics to reduce the need for environment samples.
A challenge for deep model-based methods is to achieve high predictive power while maintaining low sample complexity.
We propose a taxonomy based on three approaches: using explicit planning on given transitions, using explicit planning on learned transitions, and end-to-end learning of both planning and transitions.
arXiv Detail & Related papers (2020-08-11T08:49:04Z)
- Revealing the Invisible with Model and Data Shrinking for Composite-database Micro-expression Recognition [49.463864096615254]
We analyze the influence of learning complexity, including the input complexity and model complexity.
We propose a recurrent convolutional network (RCN) to explore the shallower-architecture and lower-resolution input data.
We develop three parameter-free modules to integrate with RCN without increasing any learnable parameters.
arXiv Detail & Related papers (2020-06-17T06:19:24Z)
- PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes [31.83144400718369]
We consider the problem of batch multi-task reinforcement learning with observed context descriptors, motivated by its application to personalized medical treatment.
We study two general classes of learning algorithms: direct policy learning (DPL), an imitation-learning based approach which learns from expert trajectories, and model-based learning.
arXiv Detail & Related papers (2020-06-11T11:57:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.