GausSetExpander: A Simple Approach for Entity Set Expansion
- URL: http://arxiv.org/abs/2202.13649v1
- Date: Mon, 28 Feb 2022 09:44:43 GMT
- Title: GausSetExpander: A Simple Approach for Entity Set Expansion
- Authors: A\"issatou Diallo and Johannes F\"urnkranz
- Abstract summary: We propose GausSetExpander, an unsupervised approach based on optimal transport techniques.
We demonstrate the validity of our approach by comparing it to state-of-the-art approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entity Set Expansion is an important NLP task that aims at expanding a small
set of entities into a larger one with items from a large pool of candidates.
In this paper, we propose GausSetExpander, an unsupervised approach based on
optimal transport techniques. We propose to re-frame the problem as choosing
the entity that best completes the seed set. For this, we interpret a set as an
elliptical distribution with a centroid which represents the mean and a spread
that is represented by the scale parameter. The best entity is the one that
increases the spread of the set the least. We demonstrate the validity of our
approach by comparing it to state-of-the-art approaches.
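The selection rule described above can be sketched in a few lines: fit a distribution to the seed-set embeddings and pick the candidate whose addition enlarges the spread the least. This is an illustrative approximation only, not the paper's method: it stands in a Gaussian log-determinant for the elliptical distribution's scale parameter and omits the optimal-transport machinery, and the function names are hypothetical.

```python
import numpy as np

def spread(embeddings):
    """Spread of a set, measured as the log-determinant of the
    (regularized) covariance of its member embeddings."""
    cov = np.cov(embeddings, rowvar=False) + 1e-6 * np.eye(embeddings.shape[1])
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def best_candidate(seed, candidates):
    """Index of the candidate whose addition increases the spread least."""
    scores = [spread(np.vstack([seed, c[None, :]])) for c in candidates]
    return int(np.argmin(scores))
```

A candidate lying inside the seed cluster barely changes the covariance, while an outlier inflates it, so the in-cluster candidate wins.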
Related papers
- Representative Action Selection for Large Action Space Meta-Bandits [49.386906771833274]
We study the problem of selecting a subset from a large action space shared by a family of bandits. We assume that similar actions tend to have related payoffs, modeled by a Gaussian process. We propose a simple epsilon-net algorithm to select a representative subset.
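The epsilon-net idea in this summary is simple enough to sketch: keep an action only if it lies at least epsilon away from every action already kept, so the retained subset covers the space at resolution epsilon. A minimal greedy version (illustrative only; the paper's exact algorithm and metric are assumptions here):

```python
import numpy as np

def epsilon_net(actions, eps):
    """Greedy epsilon-net: keep an action only if it is farther than
    eps (Euclidean distance) from every action already kept."""
    net = []
    for a in actions:
        if all(np.linalg.norm(a - b) > eps for b in net):
            net.append(a)
    return np.array(net)
```

Nearby actions collapse onto a single representative, shrinking the action space the bandit must explore.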
arXiv Detail & Related papers (2025-05-23T18:08:57Z) - Optimization Can Learn Johnson Lindenstrauss Embeddings [30.652854230884145]
Randomized methods like Johnson-Lindenstrauss (JL) provide unimprovable theoretical guarantees for achieving such representations.
We present a novel method motivated by diffusion models, that circumvents this fundamental challenge.
We show that by moving through this larger space, the objective converges to a deterministic (zero variance) solution, avoiding bad stationary points.
arXiv Detail & Related papers (2024-12-10T07:07:04Z) - Towards Scalable Semantic Representation for Recommendation [65.06144407288127]
Mixture-of-Codes is proposed to construct semantic IDs based on large language models (LLMs).
Our method achieves superior discriminability and dimension robustness scalability, leading to the best scale-up performance in recommendations.
arXiv Detail & Related papers (2024-10-12T15:10:56Z) - Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis [28.18800845199871]
We present a novel non-rigid point set registration method inspired by unsupervised clustering analysis.
Our method achieves high accuracy results across various scenarios and surpasses competitors by a significant margin.
arXiv Detail & Related papers (2024-06-27T01:16:44Z) - Focus on Query: Adversarial Mining Transformer for Few-Shot Segmentation [44.778713276910715]
Few-shot segmentation (FSS) aims to segment objects of new categories given only a handful of annotated samples.
We propose a new query-centric FSS model, the Adversarial Mining Transformer (AMFormer).
AMFormer achieves accurate query image segmentation with only rough support guidance or even weak support labels.
arXiv Detail & Related papers (2023-11-29T13:39:18Z) - Bayesian Optimization-based Combinatorial Assignment [10.73407470973258]
We study the assignment domain, which includes auctions and course allocation.
The main challenge in this domain is that the bundle space grows exponentially in the number of items.
arXiv Detail & Related papers (2022-08-31T08:47:02Z) - PIE: a Parameter and Inference Efficient Solution for Large Scale
Knowledge Graph Embedding Reasoning [24.29409958504209]
We propose PIE, a parameter- and inference-efficient solution.
Inspired by tensor decomposition methods, we find that decomposing the entity embedding matrix into low-rank matrices can reduce the parameters by more than half.
To accelerate model inference, we propose a self-supervised auxiliary task, which can be seen as fine-grained entity typing.
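The parameter saving claimed for the low-rank decomposition is easy to verify numerically. The sketch below uses a generic truncated SVD, not PIE's actual factorization, and the matrix sizes are made up for illustration:

```python
import numpy as np

n_entities, dim, rank = 2000, 100, 16
E = np.random.randn(n_entities, dim)          # full entity embedding matrix

# Factor E into two low-rank matrices via truncated SVD: E ~ U @ V.
U_full, S, Vt = np.linalg.svd(E, full_matrices=False)
U = U_full[:, :rank] * S[:rank]               # shape (n_entities, rank)
V = Vt[:rank]                                 # shape (rank, dim)

full_params = n_entities * dim                      # 200,000
low_rank_params = n_entities * rank + rank * dim    # 33,600
```

Since the factors store n·r + r·d values instead of n·d, any rank well below d·n/(n + d) cuts the parameter count by more than half.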
arXiv Detail & Related papers (2022-04-29T09:06:56Z) - HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric
Regularization [52.369435664689995]
We introduce Hyperbolic Regularization powered Collaborative Filtering (HRCF) and design a geometric-aware hyperbolic regularizer.
Specifically, the proposal boosts the optimization procedure via root alignment and an origin-aware penalty.
Our proposal tackles the over-smoothing problem caused by hyperbolic aggregation and also gives the models better discriminative ability.
arXiv Detail & Related papers (2022-04-18T06:11:44Z) - Contextual Bandits for Advertising Campaigns: A Diffusion-Model
Independent Approach (Extended Version) [73.59962178534361]
We study an influence problem in which little is assumed to be known about the diffusion network or about the model that determines how information may propagate.
In this setting, an explore-exploit approach could be used to learn the key underlying diffusion parameters, while running the campaign.
We describe and compare two methods of contextual multi-armed bandits, with upper-confidence bounds on the remaining potential of influencers.
arXiv Detail & Related papers (2022-01-13T22:06:10Z) - Dense Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
We propose a few-shot segmentation method based on dense Gaussian process (GP) regression.
We exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP.
Our approach sets a new state-of-the-art for both 1-shot and 5-shot FSS on the PASCAL-5$^i$ and COCO-20$^i$ benchmarks.
arXiv Detail & Related papers (2021-10-07T17:57:54Z) - Nearly Dimension-Independent Sparse Linear Bandit over Small Action
Spaces via Best Subset Selection [71.9765117768556]
We consider the contextual bandit problem under the high dimensional linear model.
This setting finds essential applications such as personalized recommendation, online advertisement, and personalized medicine.
We propose doubly growing epochs and estimating the parameter using the best subset selection method.
arXiv Detail & Related papers (2020-09-04T04:10:39Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
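Decomposing pre-trained weights in closed form can be sketched as follows: in the common formulation, the semantic directions are the right singular vectors of a generator's first projection weight matrix (equivalently, the top eigenvectors of A^T A). A hedged sketch with a random stand-in for the pre-trained weights:

```python
import numpy as np

def closed_form_directions(A, k):
    """Top-k latent directions: right singular vectors of the weight
    matrix A, i.e. eigenvectors of A^T A with the largest eigenvalues."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k]  # each row is a unit-norm direction in latent space

# Stand-in for a pre-trained generator's first affine layer (64 out, 16 latent).
A = np.random.randn(64, 16)
directions = closed_form_directions(A, 4)
```

No training or sampling is needed: the directions fall out of a single SVD of the frozen weights, which is what makes the factorization "closed-form".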
arXiv Detail & Related papers (2020-07-13T18:05:36Z) - Guiding Corpus-based Set Expansion by Auxiliary Sets Generation and
Co-Expansion [45.716171458483636]
Corpus-based set expansion algorithms bootstrap the given seeds by incorporating lexical patterns and distributional similarity.
Set-CoExpan automatically generates auxiliary sets as negative sets that are closely related to the target set of the user's interest.
We show that Set-CoExpan outperforms strong baseline methods significantly.
arXiv Detail & Related papers (2020-01-27T22:34:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.