Finding Influential Instances for Distantly Supervised Relation
Extraction
- URL: http://arxiv.org/abs/2009.09841v2
- Date: Tue, 25 Jan 2022 16:07:09 GMT
- Title: Finding Influential Instances for Distantly Supervised Relation
Extraction
- Authors: Zifeng Wang, Rui Wen, Xi Chen, Shao-Lun Huang, Ningyu Zhang, Yefeng
Zheng
- Abstract summary: This work proposes a novel model-agnostic instance sampling method for distant supervision (DS) based on the influence function (IF).
Our method identifies favorable/unfavorable instances in the bag based on IF, then performs dynamic instance sampling.
Experiments show that REIF outperforms a series of baselines that have complicated architectures.
- Score: 42.94953922808431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distant supervision (DS) is a strong way to expand the datasets for enhancing
relation extraction (RE) models but often suffers from high label noise.
Current works based on attention, reinforcement learning, or GAN are black-box
models, so they provide neither a meaningful interpretation of sample selection in
DS nor stability across different domains. In contrast, this work proposes a
novel model-agnostic instance sampling method for DS by influence function
(IF), namely REIF. Our method identifies favorable/unfavorable instances in the
bag based on IF, then performs dynamic instance sampling. We design a fast
influence sampling algorithm that reduces the computational complexity from
$\mathcal{O}(mn)$ to $\mathcal{O}(1)$, and analyze its robustness with respect to the
selected sampling function. Experiments show that by simply sampling the
favorable instances during training, REIF outperforms a series of
baselines that have complicated architectures. We also demonstrate that REIF
can support interpretable instance selection.
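The idea of scoring instances in a bag by influence and then sampling the favorable ones can be illustrated with a minimal sketch. This is not the authors' REIF implementation and it does not include the fast $\mathcal{O}(1)$ algorithm; it assumes a plain logistic-regression model, a small clean validation set, the standard influence-function form $\phi(z) = -\nabla L_{val}^{\top} H^{-1} \nabla L(z)$, and a sigmoid sampling function. All function names, the `damping` term, and the `alpha` scale are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad_per_example(theta, X, y):
    # Per-example gradient of the logistic loss: (sigmoid(x . theta) - y) * x
    return (sigmoid(X @ theta) - y)[:, None] * X

def hessian(theta, X, damping=1e-3):
    # Mean Hessian of the logistic loss, with damping (an assumption)
    # added so that the linear solve below is well conditioned.
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    H = (X * w[:, None]).T @ X / len(X)
    return H + damping * np.eye(X.shape[1])

def influence_on_validation(theta, X_train, y_train, X_val, y_val):
    # phi_i = -g_val^T H^{-1} g_i. A negative phi_i means that upweighting
    # instance i is predicted to lower the validation loss ("favorable").
    g_val = grad_per_example(theta, X_val, y_val).mean(axis=0)
    s = np.linalg.solve(hessian(theta, X_train), g_val)
    return -grad_per_example(theta, X_train, y_train) @ s

def sampling_probabilities(phi, alpha=1.0):
    # Hypothetical sampling function: sigmoid of the negated, rescaled
    # influence, so favorable instances get probability above 0.5.
    scale = np.max(np.abs(phi)) + 1e-12
    return sigmoid(-alpha * phi / scale)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] > 0).astype(float)
    X_val, y_val = X[:50], y[:50]            # clean held-out set
    X_bag, y_bag = X[50:], y[50:].copy()
    y_bag[:30] = 1.0 - y_bag[:30]            # simulate noisy DS labels

    theta = np.zeros(5)
    for _ in range(300):                     # plain gradient descent
        theta -= 0.5 * grad_per_example(theta, X_bag, y_bag).mean(axis=0)

    phi = influence_on_validation(theta, X_bag, y_bag, X_val, y_val)
    probs = sampling_probabilities(phi)
    keep = rng.random(len(probs)) < probs    # dynamic instance sampling
    print("kept", int(keep.sum()), "of", len(keep), "instances")
```

In this sketch the sampling step would be repeated during training as the parameters change, which is what makes the selection dynamic; the per-instance `phi` values can also be inspected directly to see which instances the model considers harmful.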
Related papers
- Improved Active Learning via Dependent Leverage Score Sampling [8.400581768343804]
We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting.
We propose an easily implemented method based on the pivotal sampling algorithm.
In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to 50%.
arXiv Detail & Related papers (2023-10-08T01:51:30Z) - Sample and Predict Your Latent: Modality-free Sequential Disentanglement
via Contrastive Estimation [2.7759072740347017]
We introduce a self-supervised sequential disentanglement framework based on contrastive estimation with no external signals.
In practice, we propose a unified, efficient, and easy-to-code sampling strategy for semantically similar and dissimilar views of the data.
Our method presents state-of-the-art results in comparison to existing techniques.
arXiv Detail & Related papers (2023-05-25T10:50:30Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Example-Based Sampling with Diffusion Models [7.943023838493658]
Diffusion models for image generation could be appropriate for learning how to generate point sets from examples.
We propose a generic way to produce 2-d point sets imitating existing samplers from observed point sets using a diffusion model.
We demonstrate how the differentiability of our approach can be used to optimize point sets to enforce properties.
arXiv Detail & Related papers (2023-02-10T08:35:17Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Online progressive instance-balanced sampling for weakly supervised
object detection [0.0]
An online progressive instance-balanced sampling (OPIS) algorithm based on hard sampling and soft sampling is proposed in this paper.
The proposed method significantly improves over the baseline and is comparable to many existing state-of-the-art results.
arXiv Detail & Related papers (2022-06-21T12:48:13Z) - CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator [60.799183326613395]
We propose an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples.
CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling.
We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline.
arXiv Detail & Related papers (2021-10-26T20:14:30Z) - Sensing Cox Processes via Posterior Sampling and Positive Bases [56.82162768921196]
We study adaptive sensing of point processes, a widely used model from spatial statistics.
We model the intensity function as a sample from a truncated Gaussian process, represented in a specially constructed positive basis.
Our adaptive sensing algorithms use Langevin dynamics and are based on posterior sampling (Cox-Thompson) and top-two posterior sampling (Top2) principles.
arXiv Detail & Related papers (2021-10-21T14:47:06Z) - ANL: Anti-Noise Learning for Cross-Domain Person Re-Identification [25.035093667770052]
We propose an Anti-Noise Learning (ANL) approach, which contains two modules.
The FDA module is designed to gather id-related samples and disperse id-unrelated samples through camera-wise contrastive learning and adversarial adaptation.
The Reliable Sample Selection (RSS) module uses an Auxiliary Model to correct noisy labels and select reliable samples for the Main Model.
arXiv Detail & Related papers (2020-12-27T02:38:45Z) - Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.