Related papers: Supervised Tree-Wasserstein Distance

URL: http://arxiv.org/abs/2101.11520v1
Date: Wed, 27 Jan 2021 16:24:51 GMT
Title: Supervised Tree-Wasserstein Distance
Authors: Yuki Takezawa, Ryoma Sato, Makoto Yamada
Abstract summary: We propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric. We show that the STW distance can be computed fast, and improves the accuracy of document classification tasks.
Score: 21.9998734051455
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: To measure the similarity of documents, the Wasserstein distance is a powerful tool, but it requires a high computational cost. Recently, for fast computation of the Wasserstein distance, methods for approximating the Wasserstein distance using a tree metric have been proposed. These tree-based methods allow fast comparisons of a large number of documents; however, they are unsupervised and do not learn task-specific distances. In this work, we propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric. Specifically, we rewrite the Wasserstein distance on the tree metric by the parent-child relationships of a tree, and formulate it as a continuous optimization problem using a contrastive loss. Experimentally, we show that the STW distance can be computed fast, and improves the accuracy of document classification tasks. Furthermore, the STW distance is formulated by matrix multiplications, runs on a GPU, and is suitable for batch processing. Therefore, we show that the STW distance is extremely efficient when comparing a large number of documents.
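As a rough illustration of why a tree metric makes the Wasserstein distance cheap: on a tree, the 1-Wasserstein distance has a well-known closed form, namely the sum over edges of the edge length times the absolute difference in probability mass that the two distributions place on the subtree below that edge. The sketch below computes only this quantity with a single bottom-up pass. It is not the STW method itself: it does no learning and uses a plain loop rather than the batched matrix multiplications the abstract describes. The function name, the `parent`/`weight` array encoding, and the assumption that nodes are numbered so that each parent precedes its children are all hypothetical choices made for this example.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """1-Wasserstein distance between mu and nu under a tree metric.

    parent[i] is the parent of node i (the root is node 0, with parent -1).
    Assumption: nodes are numbered so that parent[i] < i for every i > 0.
    weight[i] is the length of the edge from node i to its parent
    (the entry for the root is ignored).
    mu and nu are probability masses per node, each summing to 1.
    """
    n = len(parent)
    # diff[i] will hold (mu - nu) summed over the subtree rooted at node i.
    diff = [m - v for m, v in zip(mu, nu)]
    total = 0.0
    # Visit children before parents, pushing subtree mass upward.
    for i in range(n - 1, 0, -1):
        total += weight[i] * abs(diff[i])
        diff[parent[i]] += diff[i]
    return total


# Toy tree: root 0 has children 1 and 2; node 1 has children 3 and 4.
parent = [-1, 0, 0, 1, 1]
weight = [0.0, 1.0, 2.0, 0.5, 0.5]

mass_on_3 = [0.0, 0.0, 0.0, 1.0, 0.0]
mass_on_4 = [0.0, 0.0, 0.0, 0.0, 1.0]
mass_on_2 = [0.0, 0.0, 1.0, 0.0, 0.0]

d_34 = tree_wasserstein(parent, weight, mass_on_3, mass_on_4)
d_32 = tree_wasserstein(parent, weight, mass_on_3, mass_on_2)
```

For point masses the result reduces to the path length between the two nodes, so `d_34` is the length of the path 3-1-4 and `d_32` is the length of the path 3-1-0-2. The cost is linear in the number of nodes, in contrast to the much higher cost of solving a general optimal transport problem.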

Related papers

Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls [83.89771461061903]
Recent advancements in tree search algorithms guided by verifiers have significantly enhanced the reasoning capabilities of large language models (LLMs). We identify two key challenges contributing to this inefficiency: $\textit{over-exploration}$ due to redundant states with semantically equivalent content, and $\textit{under-exploration}$ caused by high variance in verifier scoring. We propose FETCH, a flexible, plug-and-play system compatible with various tree search algorithms.
arXiv Detail & Related papers (2025-02-16T16:12:01Z)
Fast unsupervised ground metric learning with tree-Wasserstein distance [14.235762519615175]
Unsupervised ground metric learning approaches have been introduced. We propose to augment the WSV method by embedding samples and features on trees, on which we compute the tree-Wasserstein distance (TWD). We demonstrate theoretically and empirically that the algorithm converges to a better approximation of the full WSV approach than the best known alternatives, and does so with $\mathcal{O}(n^3)$ complexity.
arXiv Detail & Related papers (2024-11-11T23:21:01Z)
Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$). We show that $RW_p$ distances are also real distance metrics defined on the quotient set $\mathcal{P}_p(\mathbb{R}^n)/\sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
A Class of Topological Pseudodistances for Fast Comparison of Persistence Diagrams [0.3968603035422276]
We introduce a class of pseudodistances called Extended Topological Pseudodistances (ETD). ETDs have tunable complexity, and can approximate Sliced and classical Wasserstein distances at the high-complexity extreme. We experimentally verify that ETDs outperform PSs in terms of accuracy and outperform Wasserstein and Sliced Wasserstein distances in terms of computational complexity.
arXiv Detail & Related papers (2024-02-22T12:27:35Z)
Tree Prompting: Efficient Task Adaptation without Fine-Tuning [112.71020326388029]
Tree Prompting builds a decision tree of prompts, linking multiple LM calls together to solve a task. Experiments on classification datasets show that Tree Prompting improves accuracy over competing methods and is competitive with fine-tuning.
arXiv Detail & Related papers (2023-10-21T15:18:22Z)
Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space. We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size. We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
Robust Multi-Object Tracking by Marginal Inference [92.48078680697311]
Multi-object tracking in videos requires solving a fundamental problem of one-to-one assignment between objects in adjacent frames. We present an efficient approach to compute a marginal probability for each pair of objects in real time. It achieves competitive results on MOT17 and MOT20 benchmarks.
arXiv Detail & Related papers (2022-08-07T14:04:45Z)
Approximating 1-Wasserstein Distance with Trees [41.77145868123863]
Wasserstein distance shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications. One of the challenges in estimating Wasserstein distance is that it is computationally expensive and does not scale well for many distribution comparison tasks. We propose a simple yet efficient L1-regularized approach to learning the weights of the edges in a tree.
arXiv Detail & Related papers (2022-06-24T07:19:50Z)
Wasserstein Distances, Geodesics and Barycenters of Merge Trees [9.149293243237778]
This paper presents a unified computational framework for the estimation of distances, geodesics and barycenters of merge trees. We introduce a new metric, called the Wasserstein distance between merge trees, which is purposely designed to enable efficient computations of geodesics and barycenters.
arXiv Detail & Related papers (2021-07-16T09:27:49Z)
Metric Learning for Ordered Labeled Trees with pq-grams [11.284638114256712]
We propose a new metric learning approach for tree-structured data with pq-grams. The pq-gram distance is a distance for ordered labeled trees, and has much lower computation cost than the tree edit distance. We empirically show that the proposed approach achieves competitive results with the state-of-the-art edit distance-based methods.
arXiv Detail & Related papers (2020-03-09T08:04:47Z)
Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations. Our main contribution is to propose a sweep line algorithm to compute AE exactly in log-quadratic time, where a naive implementation would be cubic. We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.