Regularized Optimal Transport Layers for Generalized Global Pooling Operations
- URL: http://arxiv.org/abs/2212.06339v1
- Date: Tue, 13 Dec 2022 02:46:36 GMT
- Title: Regularized Optimal Transport Layers for Generalized Global Pooling Operations
- Authors: Hongteng Xu and Minjie Cheng
- Abstract summary: We develop a novel and generalized global pooling framework through the lens of optimal transport.
Our framework is interpretable from the perspective of expectation-maximization.
Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling.
- Score: 25.309212446782684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global pooling is one of the most significant operations in many machine
learning models and tasks, which works for information fusion and structured
data (like sets and graphs) representation. However, without solid mathematical
fundamentals, its practical implementations often depend on empirical
mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In
this work, we develop a novel and generalized global pooling framework through
the lens of optimal transport. The proposed framework is interpretable from the
perspective of expectation-maximization. Essentially, it aims at learning an
optimal transport across sample indices and feature dimensions, making the
corresponding pooling operation maximize the conditional expectation of input
data. We demonstrate that most existing pooling methods are equivalent to
solving a regularized optimal transport (ROT) problem with different
specializations, and more sophisticated pooling operations can be implemented
by hierarchically solving multiple ROT problems. Making the parameters of the
ROT problem learnable, we develop a family of regularized optimal transport
pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep
implicit layer. Their model architectures correspond to different optimization
algorithms. We test our ROTP layers in several representative set-level machine
learning scenarios, including multi-instance learning (MIL), graph
classification, graph set representation, and image classification.
Experimental results show that applying our ROTP layers can reduce the
difficulty of the design and selection of global pooling -- our ROTP layers may
either imitate some existing global pooling methods or lead to some new pooling
layers fitting data better. The code is available at
https://github.com/SDS-Lab/ROT-Pooling.
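The abstract frames pooling as solving a regularized optimal transport (ROT) problem across sample indices and feature dimensions. The following is a minimal sketch of that view, assuming entropic regularization solved by Sinkhorn scaling; it is not the released ROT-Pooling implementation, and the function name `rot_pooling`, the parameters `eps` and `n_iters`, and the uniform marginals are illustrative assumptions.

```python
# Minimal sketch: pooling as entropic regularized optimal transport (ROT).
# Not the authors' released code; rot_pooling, eps, n_iters and the uniform
# marginals are assumptions made for exposition.
import numpy as np

def rot_pooling(X, eps=0.1, n_iters=50):
    """Pool a (D, N) matrix of N D-dimensional samples into one D-vector.

    Sinkhorn scaling computes a transport plan P over (feature dimension,
    sample index) pairs that maximizes <P, X> minus an entropic regularizer,
    subject to (approximately) uniform marginals; the pooled vector is
    D * (P * X) summed over the sample axis.
    """
    D, N = X.shape
    a = np.full(D, 1.0 / D)   # marginal over feature dimensions
    b = np.full(N, 1.0 / N)   # marginal over sample indices
    # Row-wise shift keeps exp() numerically safe; it does not change the plan.
    K = np.exp((X - X.max(axis=1, keepdims=True)) / eps)
    u, v = np.ones(D), np.ones(N)
    for _ in range(n_iters):  # alternating Sinkhorn projections onto the marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]   # transport plan with ~uniform marginals
    return D * (P * X).sum(axis=1)    # fuse the N samples per feature dimension

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
print(rot_pooling(X, eps=50.0))   # large eps: plan ~ uniform, close to mean pooling
print(X.mean(axis=1))
print(rot_pooling(X, eps=0.05))   # small eps: sparse plan emphasizing large entries
```

In the paper's framing, common operators correspond to particular specializations of such an ROT problem; making the regularizers and marginal priors learnable, and stacking several ROT problems hierarchically, is what yields the family of ROTP layers implemented as implicit layers whose architectures mirror the optimization algorithm.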
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
- MINIMA: Modality Invariant Image Matching [52.505282811925454]
We present MINIMA, a unified image matching framework for multiple cross-modal cases.
We scale up the modalities from cheap but rich RGB-only matching data, by means of generative models.
With the resulting synthetic dataset, MD-syn, we can directly train any advanced matching pipeline on randomly selected modality pairs to obtain cross-modal ability.
arXiv Detail & Related papers (2024-12-27T02:39:50Z)
- Comparative Analysis of Pooling Mechanisms in LLMs: A Sentiment Analysis Perspective [0.0]
Transformer-based models like BERT and GPT rely on pooling layers to aggregate token-level embeddings into sentence-level representations.
Common pooling mechanisms such as Mean, Max, and Weighted Sum play a pivotal role in this aggregation process; a small sketch contrasting them appears after this list.
This paper investigates the effects of these pooling mechanisms on two prominent LLM families -- BERT and GPT, in the context of sentence-level sentiment analysis.
arXiv Detail & Related papers (2024-11-22T00:59:25Z)
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
- Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models [16.16372459671255]
Large Language Models (LLMs) typically generate outputs token by token using a fixed compute budget.
We propose a novel framework that integrates smaller auxiliary modules within each Feed-Forward Network layer of the LLM.
We show that trained routers operate differently from oracles and often yield suboptimal solutions.
arXiv Detail & Related papers (2024-10-01T16:10:21Z)
- Specularity Factorization for Low-Light Enhancement [2.7961648901433134]
We present a new additive image factorization technique that treats images as being composed of multiple latent components.
Our model-driven RSFNet estimates these factors by unrolling the optimization into network layers.
The resultant factors are interpretable by design and can be fused for different image enhancement tasks via a network or combined directly by the user.
arXiv Detail & Related papers (2024-04-02T14:41:42Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical Place Recognition [0.09558392439655011]
We propose a tightly coupled learning (TCL) strategy to train triplet models.
It combines global and local descriptors for joint optimization.
Our lightweight unified model outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2022-02-14T03:20:39Z)
- Revisiting Pooling through the Lens of Optimal Transport [25.309212446782684]
We develop a novel and solid algorithmic pooling framework through the lens of optimal transport.
We make the parameters of the UOT problem learnable, and accordingly, propose a generalized pooling layer called UOT-Pooling for neural networks.
We test our UOT-Pooling layers in two application scenarios, including multi-instance learning (MIL) and graph embedding.
arXiv Detail & Related papers (2022-01-23T06:20:39Z)
- MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks [56.66010634895913]
The learning rate (LR) is one of the most important hyperparameters in stochastic gradient descent (SGD) training of deep neural networks (DNNs).
In this paper, we propose MLR-SNet to learn a proper LR schedule.
We also transfer the learned MLR-SNet to query tasks that differ from the training ones in noise, architecture, data modality, and size, and it achieves comparable or even better performance.
arXiv Detail & Related papers (2020-07-29T01:18:58Z)
- MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures [61.73533544385352]
We propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data.
As MetaPerturb is a set-function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures.
arXiv Detail & Related papers (2020-06-13T02:54:59Z)
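The "Comparative Analysis of Pooling Mechanisms" entry above contrasts Mean, Max, and Weighted Sum pooling of token-level embeddings. The sketch below (referenced in that entry) shows the three mechanisms side by side; the softmax-based weighting and all names are illustrative assumptions rather than any specific model's API.

```python
# Minimal sketch of Mean, Max, and Weighted Sum pooling over token embeddings.
# The softmax-scored weighting is one illustrative choice of weights.
import numpy as np

def mean_pool(H):
    """H: (T, d) token embeddings -> (d,) sentence vector, averaging over tokens."""
    return H.mean(axis=0)

def max_pool(H):
    """Keep the largest activation per dimension across the T tokens."""
    return H.max(axis=0)

def weighted_sum_pool(H, scores):
    """Convex combination of token embeddings; scores: (T,) unnormalized relevances."""
    w = np.exp(scores - scores.max())
    w /= w.sum()              # softmax weights over tokens
    return w @ H

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))       # 5 tokens, 8-dimensional embeddings
scores = rng.standard_normal(5)       # e.g., learned per-token relevance scores
print(mean_pool(H), max_pool(H), weighted_sum_pool(H, scores), sep="\n")
```

In the ROT view of the main paper, such fixed operators amount to particular specializations of the transport plan or its marginal priors, which is why a learnable ROTP layer can imitate them or interpolate between them.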
This list is automatically generated from the titles and abstracts of the papers in this site.