Construct Informative Triplet with Two-stage Hard-sample Generation
- URL: http://arxiv.org/abs/2112.02259v1
- Date: Sat, 4 Dec 2021 06:28:25 GMT
- Title: Construct Informative Triplet with Two-stage Hard-sample Generation
- Authors: Chuang Zhu, Zheng Hu, Huihui Dong, Gang He, Zekuan Yu, Shangshang
Zhang
- Abstract summary: We propose a two-stage synthesis framework that produces hard samples through effective positive and negative sample generators.
Our method outperforms existing hard-sample generation algorithms.
We also find that combining our hard-sample generation method with existing triplet mining strategies can further boost deep metric learning performance.
- Score: 6.361348748202731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a robust sample generation scheme to construct
informative triplets. The proposed hard sample generation is a two-stage
synthesis framework that produces hard samples through effective positive and
negative sample generators in two stages, respectively. The first stage
stretches the anchor-positive pairs with piecewise linear manipulation and
enhances the quality of generated samples by skillfully designing a conditional
generative adversarial network to lower the risk of mode collapse. The second
stage utilizes an adaptive reverse metric constraint to generate the final hard
samples. Extensive experiments on several benchmark datasets verify that our
method outperforms existing hard-sample generation algorithms. Moreover, we
find that combining our hard-sample generation method with existing triplet
mining strategies can further boost deep metric learning performance.
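The abstract names the two stages but not their equations. The sketch below illustrates only the general geometry of interpolation-based hard-triplet synthesis in embedding space; the single linear stretch, the fixed pull factor, and the margin-based clipping rule are illustrative assumptions standing in for the paper's piecewise linear manipulation, conditional GAN refinement, and adaptive reverse metric constraint.
```python
import torch
import torch.nn.functional as F

def synthesize_hard_triplet(anchor, positive, negative,
                            stretch=1.5, pull=0.5, margin=0.1):
    """Illustrative two-stage hard-triplet synthesis in embedding space.

    Stage 1: stretch the anchor-positive pair so the synthetic positive
    lies farther from the anchor (a stand-in for the paper's piecewise
    linear manipulation; no GAN refinement is modeled here).
    Stage 2: pull the negative toward the anchor, but reject the step if
    the candidate would violate a margin constraint (a stand-in for the
    adaptive reverse metric constraint).
    """
    # Stage 1: move the positive along the anchor->positive direction.
    hard_pos = anchor + stretch * (positive - anchor)

    # Stage 2: move the negative toward the anchor ...
    cand_neg = negative + pull * (anchor - negative)

    # ... but keep d(a, n') >= d(a, p') + margin so the triplet stays valid.
    d_pos = F.pairwise_distance(anchor, hard_pos, keepdim=True)
    d_neg = F.pairwise_distance(anchor, cand_neg, keepdim=True)
    too_close = d_neg < d_pos + margin
    hard_neg = torch.where(too_close, negative, cand_neg)

    return hard_pos, hard_neg

# Usage with random 128-d embeddings for a batch of 32 triplets.
a, p, n = (torch.randn(32, 128) for _ in range(3))
hp, hn = synthesize_hard_triplet(a, p, n)
```
In the paper itself, the first stage additionally refines the stretched samples with a conditional GAN to lower the risk of mode collapse; the sketch keeps only the geometric manipulation.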
Related papers
- CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs [5.89889361990138]
Large language models (LLMs) have demonstrated remarkable performance in diverse tasks using zero-shot and few-shot prompting.
In this work, we tackle the challenge of generating datasets with high diversity, upon which a student model is trained for downstream tasks.
Taking the route of decoding-time guidance-based approaches, we propose CorrSynth, which generates data that is more diverse and faithful to the input prompt using a correlated sampling strategy.
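The summary names a correlated sampling strategy but not its form. As one hedged illustration of decoding-time guidance, the toy step below anti-correlates K parallel generations by contrasting each sequence's logits against its siblings'; the contrast rule and the `gamma`/`temperature` parameters are assumptions, not CorrSynth's actual formulation.
```python
import torch

def correlated_sample_step(logits, gamma=0.3, temperature=1.0):
    """One decoding step for K parallel sequences (toy sketch).

    `logits` has shape (K, vocab). Each sequence's distribution is
    contrasted against the average of its siblings, discouraging all
    K generations from picking the same tokens.
    """
    K = logits.size(0)
    # Mean of the *other* sequences' logits, for each sequence.
    sibling_mean = (logits.sum(dim=0, keepdim=True) - logits) / (K - 1)
    guided = logits - gamma * sibling_mean
    probs = torch.softmax(guided / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)  # (K, 1) next tokens

# Toy usage: 4 parallel sequences over a 1000-token vocabulary.
next_tokens = correlated_sample_step(torch.randn(4, 1000))
```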
arXiv Detail & Related papers (2024-11-13T12:09:23Z)
- Generating Realistic Tabular Data with Large Language Models [49.03536886067729]
Large language models (LLMs) have been used for diverse tasks, but they do not capture the correct correlation between the features and the target variable.
We propose an LLM-based method with three important improvements to correctly capture the ground-truth feature-class correlation in the real data.
Our experiments show that our method significantly outperforms 10 SOTA baselines on 20 datasets in downstream tasks.
arXiv Detail & Related papers (2024-10-29T04:14:32Z)
- Hard Sample Aware Network for Contrastive Deep Graph Clustering [38.44763843990694]
We propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN).
In our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings.
Under the guidance of the carefully collected high-confidence clustering information, our proposed weight-modulating function first recognizes the positive and negative samples.
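A minimal sketch of the two ingredients named above, assuming a simple convex mix of attribute- and structure-embedding similarities and a hardness-based weight rule; both choices are illustrative assumptions rather than HSAN's learned formulation.
```python
import torch
import torch.nn.functional as F

def hsan_style_similarity(attr_emb, struct_emb, alpha=0.5):
    """Pairwise similarity from both attribute and structure embeddings."""
    sa = F.normalize(attr_emb, dim=1) @ F.normalize(attr_emb, dim=1).T
    ss = F.normalize(struct_emb, dim=1) @ F.normalize(struct_emb, dim=1).T
    return alpha * sa + (1 - alpha) * ss

def modulated_weights(sim, same_cluster):
    """Up-weight hard pairs: positives with low similarity and
    negatives with high similarity get larger training weights."""
    hardness = torch.where(same_cluster, 1 - sim, sim)
    return hardness.clamp(min=0)

# Toy usage: 8 nodes, 16-d embeddings, high-confidence pseudo-labels.
attr, struct = torch.randn(8, 16), torch.randn(8, 16)
labels = torch.randint(0, 2, (8,))
sim = hsan_style_similarity(attr, struct)
w = modulated_weights(sim, labels[:, None] == labels[None, :])
```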
arXiv Detail & Related papers (2022-12-16T16:57:37Z)
- Selectively increasing the diversity of GAN-generated samples [8.980453507536017]
We propose a novel method to selectively increase the diversity of GAN-generated samples.
We show the superiority of our method on a synthetic benchmark as well as a real-life scenario simulating data from the Zero Degree Calorimeter of the ALICE experiment at CERN.
arXiv Detail & Related papers (2022-07-04T16:27:06Z)
- A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method for generating diverse, higher-quality outputs in conditional generation.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
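The plan-then-generate control flow described above can be sketched as follows; `sample_plan` and `generate_from_plan` are hypothetical stand-ins for a trained plan-based model, with diversity injected only while sampling the composition (e.g., an entity chain).
```python
import random

def composition_sampling(source, sample_plan, generate_from_plan, k=5):
    """Plan-then-generate control flow (sketch).

    Diversity comes from stochastically sampling the composition;
    the final text is then decoded near-greedily conditioned on the
    plan and the source.
    """
    outputs = []
    for _ in range(k):
        plan = sample_plan(source)                        # stochastic step
        outputs.append(generate_from_plan(source, plan))  # near-greedy step
    return outputs

# Toy stand-ins so the sketch runs end to end.
entities = ["Alice", "Bob", "the council", "the river"]
sample_plan = lambda src: " | ".join(random.sample(entities, 2))
generate_from_plan = lambda src, plan: f"[{plan}] summary of: {src[:40]}"
print(composition_sampling("Long source document ...", sample_plan,
                           generate_from_plan, k=3))
```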
arXiv Detail & Related papers (2022-03-28T21:24:03Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses successfully constrain the clustering results of mini-batch samples at both the sample and class levels.
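The sample-view and class-view losses described above amount to running a contrastive objective over the rows and the columns of the soft assignment matrices from two augmentations; a minimal sketch follows, where the generic InfoNCE form and temperature are assumptions rather than DCDC's exact loss.
```python
import torch
import torch.nn.functional as F

def info_nce(rows_a, rows_b, tau=0.5):
    """Standard InfoNCE: matching rows of `rows_a`/`rows_b` are positives."""
    a = F.normalize(rows_a, dim=1)
    b = F.normalize(rows_b, dim=1)
    logits = a @ b.T / tau
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

def doubly_contrastive_loss(p1, p2, tau=0.5):
    """Dual-view contrastive loss over soft cluster assignments
    `p1`, `p2` (shape N x K) from two augmentations of the batch.

    Sample view: row i of p1 and row i of p2 (one sample's class
    distribution under two augmentations) form a positive pair.
    Class view: column j of p1 and column j of p2 (one class's
    distribution over the batch) form a positive pair.
    """
    sample_loss = info_nce(p1, p2, tau)
    class_loss = info_nce(p1.T, p2.T, tau)
    return sample_loss + class_loss

# Toy usage: batch of 64 samples, 10 clusters.
logits1, logits2 = torch.randn(64, 10), torch.randn(64, 10)
loss = doubly_contrastive_loss(logits1.softmax(1), logits2.softmax(1))
```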
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- Contrastive Triple Extraction with Generative Transformer [72.21467482853232]
We introduce a novel model, contrastive triple extraction with a generative transformer.
Specifically, we introduce a single shared transformer module for encoder-decoder-based generation.
To generate faithful results, we propose a novel triplet contrastive training objective.
arXiv Detail & Related papers (2020-09-14T05:29:24Z)
- Self-Adversarial Learning with Comparative Discrimination for Text Generation [111.18614166615968]
We propose a novel self-adversarial learning (SAL) paradigm for improving GANs' performance in text generation.
During training, SAL rewards the generator when its currently generated sentence is found to be better than its previously generated samples.
Experiments on text generation benchmark datasets show that our proposed approach substantially improves both the quality and the diversity of the generated text.
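A minimal sketch of the reward signal described above, assuming a hypothetical comparative discriminator `comparator` that scores a (current, previous) pair; the binary better-than threshold here is a simplification of SAL's comparative discrimination.
```python
import torch

def sal_reward(comparator, current, memory):
    """Sketch of a self-adversarial reward (SAL-style).

    `comparator` maps a (current, previous) pair of samples to a score
    in [0, 1], where > 0.5 means the current sample reads better than
    the previous one. The generator is rewarded only for beating its
    own earlier outputs.
    """
    scores = torch.stack([comparator(current, prev) for prev in memory])
    # Reward: fraction of previously generated samples the new one beats.
    return (scores > 0.5).float().mean()

# Toy usage with a stand-in comparator over feature vectors.
comparator = lambda cur, prev: torch.sigmoid((cur - prev).mean())
memory = [torch.randn(32) for _ in range(10)]   # earlier generations
reward = sal_reward(comparator, torch.randn(32), memory)
```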
arXiv Detail & Related papers (2020-01-31T07:50:25Z)
- The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling, called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.