Regularized Optimal Transport Layers for Generalized Global Pooling Operations
- URL: http://arxiv.org/abs/2212.06339v1
- Date: Tue, 13 Dec 2022 02:46:36 GMT
- Title: Regularized Optimal Transport Layers for Generalized Global Pooling Operations
- Authors: Hongteng Xu and Minjie Cheng
- Abstract summary: We develop a novel and generalized global pooling framework through the lens of optimal transport.
Our framework is interpretable from the perspective of expectation-maximization.
Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling.
- Score: 25.309212446782684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global pooling is one of the most significant operations in many machine
learning models and tasks, which works for information fusion and structured
data (like sets and graphs) representation. However, without solid mathematical
fundamentals, its practical implementations often depend on empirical
mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In
this work, we develop a novel and generalized global pooling framework through
the lens of optimal transport. The proposed framework is interpretable from the
perspective of expectation-maximization. Essentially, it aims at learning an
optimal transport across sample indices and feature dimensions, making the
corresponding pooling operation maximize the conditional expectation of input
data. We demonstrate that most existing pooling methods are equivalent to
solving a regularized optimal transport (ROT) problem with different
specializations, and more sophisticated pooling operations can be implemented
by hierarchically solving multiple ROT problems. Making the parameters of the
ROT problem learnable, we develop a family of regularized optimal transport
pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep
implicit layer. Their model architectures correspond to different optimization
algorithms. We test our ROTP layers in several representative set-level machine
learning scenarios, including multi-instance learning (MIL), graph
classification, graph set representation, and image classification.
Experimental results show that applying our ROTP layers can reduce the
difficulty of the design and selection of global pooling -- our ROTP layers may
either imitate some existing global pooling methods or lead to some new pooling
layers fitting data better. The code is available at
https://github.com/SDS-Lab/ROT-Pooling.
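The abstract frames pooling as solving a regularized optimal transport (ROT) problem across sample indices and feature dimensions. The following is a minimal sketch of that view, assuming entropic regularization solved by Sinkhorn scaling; it is not the released ROT-Pooling implementation, and the function name `rot_pooling`, the parameters `eps` and `n_iters`, and the uniform marginals are illustrative assumptions.

```python
# Minimal sketch: pooling as entropic regularized optimal transport (ROT).
# Not the authors' released code; rot_pooling, eps, n_iters and the uniform
# marginals are assumptions made for exposition.
import numpy as np

def rot_pooling(X, eps=0.1, n_iters=50):
    """Pool a (D, N) matrix of N D-dimensional samples into one D-vector.

    Sinkhorn scaling computes a transport plan P over (feature dimension,
    sample index) pairs that maximizes <P, X> minus an entropic regularizer,
    subject to (approximately) uniform marginals; the pooled vector is
    D * (P * X) summed over the sample axis.
    """
    D, N = X.shape
    a = np.full(D, 1.0 / D)   # marginal over feature dimensions
    b = np.full(N, 1.0 / N)   # marginal over sample indices
    # Row-wise shift keeps exp() numerically safe; it does not change the plan.
    K = np.exp((X - X.max(axis=1, keepdims=True)) / eps)
    u, v = np.ones(D), np.ones(N)
    for _ in range(n_iters):  # alternating Sinkhorn projections onto the marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]   # transport plan with ~uniform marginals
    return D * (P * X).sum(axis=1)    # fuse the N samples per feature dimension

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
print(rot_pooling(X, eps=50.0))   # large eps: plan ~ uniform, close to mean pooling
print(X.mean(axis=1))
print(rot_pooling(X, eps=0.05))   # small eps: sparse plan emphasizing large entries
```

In the paper's framing, common operators correspond to particular specializations of such an ROT problem; making the regularizers and marginal priors learnable, and stacking several ROT problems hierarchically, is what yields the family of ROTP layers implemented as implicit layers whose architectures mirror the optimization algorithm.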
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
- MINIMA: Modality Invariant Image Matching [52.505282811925454]
We present MINIMA, a unified image matching framework for multiple cross-modal cases.
We scale up the modalities from cheap but rich RGB-only matching data, by means of generative models.
With the resulting synthetic dataset, MD-syn, we can directly train any advanced matching pipeline on randomly selected modality pairs to obtain cross-modal ability.
arXiv Detail & Related papers (2024-12-27T02:39:50Z)
- Comparative Analysis of Pooling Mechanisms in LLMs: A Sentiment Analysis Perspective [0.0]
Transformer-based models like BERT and GPT rely on pooling layers to aggregate token-level embeddings into sentence-level representations.
Common pooling mechanisms such as Mean, Max, and Weighted Sum play a pivotal role in this aggregation process; a small sketch contrasting them appears after this list.
This paper investigates the effects of these pooling mechanisms on two prominent LLM families -- BERT and GPT, in the context of sentence-level sentiment analysis.
arXiv Detail & Related papers (2024-11-22T00:59:25Z)
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
- Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models [16.16372459671255]
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget.
We propose a novel framework that integrates smaller auxiliary modules within each Feed-Forward Network layer of the LLM.
We show that trained routers operate differently from oracles and often yield suboptimal solutions.
arXiv Detail & Related papers (2024-10-01T16:10:21Z)
- Specularity Factorization for Low-Light Enhancement [2.7961648901433134]
We present a new additive image factorization technique that treats images as being composed of multiple latent components.
Our model-driven RSFNet estimates these factors by unrolling the optimization into network layers.
The resultant factors are interpretable by design and can be fused for different image enhancement tasks via a network or combined directly by the user.
arXiv Detail & Related papers (2024-04-02T14:41:42Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical Place Recognition [0.09558392439655011]
We propose a tightly coupled learning (TCL) strategy to train triplet models.
It combines global and local descriptors for joint optimization.
Our lightweight unified model outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2022-02-14T03:20:39Z)
- Revisiting Pooling through the Lens of Optimal Transport [25.309212446782684]
We develop a novel and solid algorithmic pooling framework through the lens of optimal transport.
We make the parameters of the UOT problem learnable, and accordingly, propose a generalized pooling layer called UOT-Pooling for neural networks.
We test our UOT-Pooling layers in two application scenarios, including multi-instance learning (MIL) and graph embedding.
arXiv Detail & Related papers (2022-01-23T06:20:39Z)
- MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks [56.66010634895913]
The learning rate (LR) is one of the most important hyperparameters in stochastic gradient descent (SGD) training of deep neural networks (DNNs).
In this paper, we propose MLR-SNet to learn a proper LR schedule.
We also transfer the learned MLR-SNet to query tasks that differ from the training ones in noise, architecture, data modality, and size, and it achieves comparable or even better performance.
arXiv Detail & Related papers (2020-07-29T01:18:58Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
arXiv Detail & Related papers (2020-06-13T02:54:59Z)
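The "Comparative Analysis of Pooling Mechanisms" entry above contrasts Mean, Max, and Weighted Sum pooling of token-level embeddings. The sketch below (referenced in that entry) shows the three mechanisms side by side; the softmax-based weighting and all names are illustrative assumptions rather than any specific model's API.

```python
# Minimal sketch of Mean, Max, and Weighted Sum pooling over token embeddings.
# The softmax-scored weighting is one illustrative choice of weights.
import numpy as np

def mean_pool(H):
    """H: (T, d) token embeddings -> (d,) sentence vector, averaging over tokens."""
    return H.mean(axis=0)

def max_pool(H):
    """Keep the largest activation per dimension across the T tokens."""
    return H.max(axis=0)

def weighted_sum_pool(H, scores):
    """Convex combination of token embeddings; scores: (T,) unnormalized relevances."""
    w = np.exp(scores - scores.max())
    w /= w.sum()              # softmax weights over tokens
    return w @ H

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))       # 5 tokens, 8-dimensional embeddings
scores = rng.standard_normal(5)       # e.g., learned per-token relevance scores
print(mean_pool(H), max_pool(H), weighted_sum_pool(H, scores), sep="\n")
```

In the ROT view of the main paper, such fixed operators amount to particular specializations of the transport plan or its marginal priors, which is why a learnable ROTP layer can imitate them or interpolate between them.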
This list is automatically generated from the titles and abstracts of the papers in this site.