Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators
- URL: http://arxiv.org/abs/2108.08891v1
- Date: Thu, 19 Aug 2021 19:54:04 GMT
- Title: Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators
- Authors: Zihang Meng, Vikas Singh, Sathya N. Ravi
- Abstract summary: We study how stochastic differential equation (SDE) based ideas can inspire new modifications to existing algorithms for a set of problems in computer vision.
We show promising experiments on a number of vision tasks, including few-shot learning, point cloud transformers and deep variational segmentation.
- Score: 37.92379202320938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study how stochastic differential equation (SDE) based ideas can inspire
new modifications to existing algorithms for a set of problems in computer
vision. Loosely speaking, our formulation is related to both explicit and
implicit strategies for data augmentation and group equivariance, but is
derived from new results in the SDE literature on estimating infinitesimal
generators of a class of stochastic processes. If and when there is nominal
agreement between the needs of an application/task and the inherent properties
and behavior of the types of processes that we can efficiently handle, we
obtain a very simple and efficient plug-in layer that can be incorporated
within any existing network architecture, with minimal modification and only a
few additional parameters. We show promising experiments on a number of vision
tasks including few-shot learning, point cloud transformers and deep
variational segmentation obtaining efficiency or performance improvements.
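To make the abstract's claim concrete, below is a minimal sketch of what such a generator-based plug-in layer could look like, assuming a diffusion-map-style estimate of the infinitesimal generator built from pairwise kernel affinities within a batch. The class name, the bandwidth eps, and the learnable step size are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GeneratorPlugInLayer(nn.Module):
    """Hypothetical plug-in layer: one explicit Euler step of a feature
    flow x' = Lx, where L is a diffusion-map-style estimate of the
    infinitesimal generator built from pairwise kernel affinities."""

    def __init__(self, eps: float = 1.0):
        super().__init__()
        self.eps = eps                             # kernel bandwidth (assumed)
        self.dt = nn.Parameter(torch.tensor(0.1))  # learnable step size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) features from any backbone layer
        d2 = torch.cdist(x, x).pow(2)         # pairwise squared distances
        k = torch.exp(-d2 / self.eps)         # Gaussian kernel affinities
        p = k / k.sum(dim=1, keepdim=True)    # row-normalized transition matrix
        lx = (p @ x - x) / self.eps           # (P - I)/eps approximates the generator
        return x + self.dt * lx               # one Euler step along the flow
```

Inserted between two existing layers, a module of this form adds only a single learnable scalar, consistent with the abstract's claim of minimal modification and few additional parameters.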
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these demands by shifting data analysis to the edge.
However, existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Synergistic Learning with Multi-Task DeepONet for Efficient PDE Problem Solving [5.692133861249929]
Multi-task learning (MTL) is an inductive transfer mechanism designed to leverage useful information from multiple tasks to improve generalization performance.
In this work, we apply MTL to problems in science and engineering governed by partial differential equations (PDEs)
We present a multi-task deep operator network (MT-DeepONet) to learn solutions across various functional forms of source terms in a PDE and multiple geometries in a single concurrent training session.
arXiv Detail & Related papers (2024-08-05T02:50:58Z)
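For background on the entry above, here is a minimal single-task DeepONet sketch following the usual branch/trunk convention: the branch net encodes the source term sampled at fixed sensor locations, the trunk net encodes query coordinates, and the solution is their inner product. MT-DeepONet extends this template to multiple source-term forms and geometries trained concurrently; all names and layer sizes here are illustrative.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal DeepONet: solution u(y) ~ <branch(f), trunk(y)>."""

    def __init__(self, n_sensors: int = 100, coord_dim: int = 2, width: int = 64):
        super().__init__()
        # Branch net: encodes the input function f sampled at fixed sensors
        self.branch = nn.Sequential(
            nn.Linear(n_sensors, width), nn.Tanh(), nn.Linear(width, width))
        # Trunk net: encodes the query coordinate y
        self.trunk = nn.Sequential(
            nn.Linear(coord_dim, width), nn.Tanh(), nn.Linear(width, width))

    def forward(self, f: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # f: (batch, n_sensors) source-term samples; y: (batch, n_pts, coord_dim)
        b = self.branch(f)                       # (batch, width)
        t = self.trunk(y)                        # (batch, n_pts, width)
        return torch.einsum("bw,bpw->bp", b, t)  # inner product per query point
```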
- HAMLET: Graph Transformer Neural Operator for Partial Differential Equations [13.970458554623939]
We present a novel graph transformer framework, HAMLET, designed to address the challenges in solving partial differential equations (PDEs) using neural networks.
The framework uses graph transformers with modular input encoders to directly incorporate differential equation information into the solution process.
Notably, HAMLET scales effectively with increasing data complexity and noise, showcasing its robustness.
arXiv Detail & Related papers (2024-02-05T21:55:24Z)
- Deep Learning-based surrogate models for parametrized PDEs: handling geometric variability through graph neural networks [0.0]
This work explores the potential usage of graph neural networks (GNNs) for the simulation of time-dependent PDEs.
We propose a systematic strategy to build surrogate models based on a data-driven time-stepping scheme.
We show that GNNs can provide a valid alternative to traditional surrogate models in terms of computational efficiency and generalization to new scenarios.
arXiv Detail & Related papers (2023-08-03T08:14:28Z)
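A data-driven time-stepping scheme of the kind mentioned above can be sketched as a residual message-passing update applied autoregressively over the mesh graph. This is a generic illustration, not the paper's architecture; the message network, step size, and rollout helper are assumptions.

```python
import torch
import torch.nn as nn

class GNNTimeStepper(nn.Module):
    """Hypothetical residual time-stepper: u_{t+1} = u_t + dt * GNN(u_t),
    using one round of message passing over the mesh graph."""

    def __init__(self, dim: int = 1, hidden: int = 64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))

    def forward(self, u: torch.Tensor, edges: torch.Tensor, dt: float = 0.01):
        # u: (n_nodes, dim) field values; edges: (2, n_edges) long src/dst indices
        src, dst = edges
        m = self.msg(torch.cat([u[src], u[dst]], dim=-1))  # per-edge messages
        agg = torch.zeros_like(u).index_add_(0, dst, m)    # sum messages at each dst
        return u + dt * agg                                # explicit Euler update

def rollout(model, u0, edges, steps=10):
    # Autoregressive rollout: feed each prediction back in as the next input
    traj = [u0]
    for _ in range(steps):
        traj.append(model(traj[-1], edges))
    return torch.stack(traj)
```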
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
We propose adapter-ALBERT, an efficient model optimization that maximizes data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Towards Multi-spatiotemporal-scale Generalized PDE Modeling [4.924631198058705]
We compare various FNO- and U-Net-like approaches on fluid mechanics problems in both vorticity-stream and velocity function form.
We show promising results on generalization to different PDE parameters and time-scales with a single surrogate model.
arXiv Detail & Related papers (2022-09-30T17:40:05Z)
- Robust and Scalable SDE Learning: A Functional Perspective [5.642000444047032]
We propose importance-sampling estimators for the probabilities of observations of SDEs, for the purposes of learning.
The proposed method produces lower-variance estimates than algorithms based on direct SDE simulation.
This facilitates the effective use of large-scale parallel hardware for massive decreases in computation time.
arXiv Detail & Related papers (2021-10-11T11:36:50Z)
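To illustrate what an importance-sampling estimator for observation probabilities of an SDE can look like, the sketch below estimates log p(y | x0) for a toy Ornstein-Uhlenbeck process by steering Euler-Maruyama proposal paths toward the observation and reweighting each step by the ratio of SDE to proposal transition densities. The steering rule and all constants are illustrative assumptions, not the paper's estimator.

```python
import torch

def drift(x):                 # toy SDE: Ornstein-Uhlenbeck, dX = -X dt + dW
    return -x

def log_gauss(x, mean, var):  # log density of N(mean, var)
    return -0.5 * ((x - mean) ** 2 / var + torch.log(2 * torch.pi * var))

def log_p_obs(x0, y, obs_var=0.01, T=1.0, n_steps=50, n=4096):
    """Importance-sampling estimate of log p(y | x0), where y is a noisy
    Gaussian observation of X_T. Proposal paths share the SDE's diffusion
    but add a pull toward y; each Euler step contributes the log ratio of
    SDE to proposal transition densities to the weight."""
    dt = T / n_steps
    x = torch.full((n,), float(x0))
    logw = torch.zeros(n)
    for k in range(n_steps):
        t_left = T - k * dt
        prop_mean = x + (y - x) / t_left * dt    # proposal drifts toward y
        sde_mean = x + drift(x) * dt             # SDE's own Euler mean
        x = prop_mean + dt ** 0.5 * torch.randn(n)
        logw += log_gauss(x, sde_mean, torch.tensor(dt))
        logw -= log_gauss(x, prop_mean, torch.tensor(dt))
    logw += log_gauss(torch.tensor(float(y)), x, torch.tensor(obs_var))
    return torch.logsumexp(logw, 0) - torch.log(torch.tensor(float(n)))
```

Because the proposal concentrates paths near the observation, the weights vary far less than under naive simulation from the SDE prior, and the n sampled paths are trivially parallel, which matches the entry's point about large-scale parallel hardware.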
- Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so only a limited number of tasks is available to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
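The entry above describes adding lightweight task-specific feature-map transformations to a shared base network; a minimal sketch in that spirit is a per-task scale-and-shift (FiLM-style) applied to frozen backbone features. This is an illustrative stand-in, not the paper's exact parameterization, and the names are hypothetical.

```python
import torch
import torch.nn as nn

class TaskFeatureTransform(nn.Module):
    """Hypothetical per-task feature-map transform: a scale-and-shift on
    shared backbone features, adding only 2*channels parameters per task."""

    def __init__(self, channels: int, n_tasks: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(n_tasks, channels))  # per-task scale
        self.beta = nn.Parameter(torch.zeros(n_tasks, channels))  # per-task shift

    def forward(self, h: torch.Tensor, task_id: int) -> torch.Tensor:
        # h: (batch, channels, H, W) feature maps from the frozen shared backbone
        g = self.gamma[task_id].view(1, -1, 1, 1)
        b = self.beta[task_id].view(1, -1, 1, 1)
        return g * h + b
```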
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.