Regularized Optimal Transport Layers for Generalized Global Pooling
Operations
- URL: http://arxiv.org/abs/2212.06339v1
- Date: Tue, 13 Dec 2022 02:46:36 GMT
- Title: Regularized Optimal Transport Layers for Generalized Global Pooling
Operations
- Authors: Hongteng Xu and Minjie Cheng
- Abstract summary: We develop a novel and generalized global pooling framework through the lens of optimal transport.
Our framework is interpretable from the perspective of expectation-maximization.
Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling.
- Score: 25.309212446782684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global pooling is one of the most significant operations in many machine
learning models and tasks, which works for information fusion and structured
data (like sets and graphs) representation. However, without solid mathematical
fundamentals, its practical implementations often depend on empirical
mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In
this work, we develop a novel and generalized global pooling framework through
the lens of optimal transport. The proposed framework is interpretable from the
perspective of expectation-maximization. Essentially, it aims at learning an
optimal transport across sample indices and feature dimensions, making the
corresponding pooling operation maximize the conditional expectation of input
data. We demonstrate that most existing pooling methods are equivalent to
solving a regularized optimal transport (ROT) problem with different
specializations, and more sophisticated pooling operations can be implemented
by hierarchically solving multiple ROT problems. Making the parameters of the
ROT problem learnable, we develop a family of regularized optimal transport
pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep
implicit layer. Their model architectures correspond to different optimization
algorithms. We test our ROTP layers in several representative set-level machine
learning scenarios, including multi-instance learning (MIL), graph
classification, graph set representation, and image classification.
Experimental results show that applying our ROTP layers can reduce the
difficulty of the design and selection of global pooling -- our ROTP layers may
either imitate some existing global pooling methods or lead to some new pooling
layers fitting data better. The code is available at
https://github.com/SDS-Lab/ROT-Pooling.
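The abstract describes pooling as solving a regularized optimal transport (ROT) problem whose transport plan, defined across sample indices and feature dimensions, re-weights the input before aggregation. As a rough illustration only, below is a minimal log-domain Sinkhorn pooling sketch in PyTorch; the cost construction (negative activations), the uniform marginals, and the hyperparameters eps and num_iters are assumptions made for this sketch, not the authors' ROTP parameterization (see the linked repository for the actual layers).

import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class SinkhornROTPooling(nn.Module):
    # Illustrative regularized-OT pooling (a sketch, not the paper's ROTP layer):
    # an entropic OT plan between sample indices and feature dimensions re-weights
    # the input before summing over the sample axis.

    def __init__(self, eps: float = 0.1, num_iters: int = 20):
        super().__init__()
        # learnable regularization strength, kept positive via softplus in forward()
        self.log_eps = nn.Parameter(torch.tensor(math.log(eps)))
        self.num_iters = num_iters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_samples, n_features) -> pooled (batch, n_features)
        b, n, d = x.shape
        eps = F.softplus(self.log_eps) + 1e-6
        # use -x as the transport cost so that large activations attract more mass
        log_k = x / eps  # log of the Gibbs kernel exp(x / eps)
        log_mu = torch.full((b, n), -math.log(n), device=x.device)  # uniform row marginal
        log_nu = torch.full((b, d), -math.log(d), device=x.device)  # uniform column marginal
        log_u, log_v = torch.zeros_like(log_mu), torch.zeros_like(log_nu)
        # Sinkhorn iterations in log space for numerical stability
        for _ in range(self.num_iters):
            log_u = log_mu - torch.logsumexp(log_k + log_v.unsqueeze(1), dim=2)
            log_v = log_nu - torch.logsumexp(log_k + log_u.unsqueeze(2), dim=1)
        p = (log_u.unsqueeze(2) + log_k + log_v.unsqueeze(1)).exp()  # plan, shape (b, n, d)
        # weight inputs by the plan and sum over samples; the factor d makes a
        # uniform plan reproduce mean pooling exactly
        return (d * p * x).sum(dim=1)


# usage: pool a batch of 2 sets, each with 10 five-dimensional feature vectors
pool = SinkhornROTPooling()
pooled = pool(torch.randn(2, 10, 5))  # -> shape (2, 5)

In this sketch, a near-uniform plan recovers mean pooling while a sharply concentrated plan behaves like max pooling, which loosely mirrors the paper's claim that ROTP layers can imitate existing global pooling methods or learn new ones.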
Related papers
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
- Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models [16.16372459671255]
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget.
We propose a novel framework that integrates smaller auxiliary modules within each Feed-Forward Network layer of the LLM.
We show that trained routers operate differently from oracles and often yield suboptimal solutions.
arXiv Detail & Related papers (2024-10-01T16:10:21Z)
- Specularity Factorization for Low-Light Enhancement [2.7961648901433134]
We present a new additive image factorization technique that treats images as being composed of multiple latent components.
Our model-driven RSFNet estimates these factors by unrolling the optimization into network layers.
The resultant factors are interpretable by design and can be fused for different image enhancement tasks via a network or combined directly by the user.
arXiv Detail & Related papers (2024-04-02T14:41:42Z)
- Submodel Partitioning in Hierarchical Federated Learning: Algorithm Design and Convergence Analysis [15.311309249848739]
Hierarchical federated learning (HFL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture-based federated learning (FL).
In this paper, we propose hierarchical independent submodel training (HIST) for federated learning over resource-constrained Internet of Things (IoT) networks.
The key idea behind HIST is a hierarchical version of model partitioning, where we partition the global model into disjoint submodels in each round and distribute them across different cells.
arXiv Detail & Related papers (2023-10-27T04:42:59Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching)
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration [71.95914457415624]
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency.
We propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem.
Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines.
arXiv Detail & Related papers (2022-11-29T17:10:24Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical Place Recognition [0.09558392439655011]
We propose a tightly coupled learning (TCL) strategy to train triplet models.
It combines global and local descriptors for joint optimization.
Our lightweight unified model is better than several state-of-the-art methods.
arXiv Detail & Related papers (2022-02-14T03:20:39Z)
- Revisiting Pooling through the Lens of Optimal Transport [25.309212446782684]
We develop a novel and solid algorithmic pooling framework through the lens of optimal transport.
We make the parameters of the UOT problem learnable, and accordingly, propose a generalized pooling layer called UOT-Pooling for neural networks.
We test our UOT-Pooling layers in two application scenarios, including multi-instance learning (MIL) and graph embedding.
arXiv Detail & Related papers (2022-01-23T06:20:39Z)
- MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks [56.66010634895913]
The learning rate (LR) is one of the most important hyperparameters in stochastic gradient descent (SGD) training of deep neural networks (DNNs).
In this paper, we propose MLR-SNet to learn proper LR schedules.
The meta-learned MLR-SNet transfers to query tasks with different noises, architectures, data modalities, and sizes from the training tasks, achieving comparable or even better performance.
arXiv Detail & Related papers (2020-07-29T01:18:58Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
arXiv Detail & Related papers (2020-06-13T02:54:59Z)