Unsupervised Open-domain Keyphrase Generation
- URL: http://arxiv.org/abs/2306.10755v1
- Date: Mon, 19 Jun 2023 07:57:13 GMT
- Title: Unsupervised Open-domain Keyphrase Generation
- Authors: Lam Thanh Do, Pritom Saha Akash, Kevin Chen-Chuan Chang
- Abstract summary: We propose a seq2seq model that consists of two modules, namely a phraseness module and an informativeness module.
The phraseness module generates phrases, while the informativeness module guides the generation towards those that represent the core concepts of the text.
We thoroughly evaluate our proposed method using eight benchmark datasets from different domains.
- Score: 7.429941108199692
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we study the problem of unsupervised open-domain keyphrase
generation, where the objective is a keyphrase generation model that can be
built without using human-labeled data and can perform consistently across
domains. To solve this problem, we propose a seq2seq model that consists of two
modules, namely a phraseness and an informativeness module, both
of which can be built in an unsupervised and open-domain fashion. The
phraseness module generates phrases, while the informativeness module guides
the generation towards those that represent the core concepts of the text. We
thoroughly evaluate our proposed method using eight benchmark datasets from
different domains. Results on in-domain datasets show that our approach
achieves state-of-the-art results compared with existing unsupervised models,
and overall narrows the gap between supervised and unsupervised methods down to
about 16%. Furthermore, we demonstrate that our model performs consistently
across domains, as it overall surpasses the baselines on out-of-domain
datasets.
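The two-module decomposition described in the abstract can be sketched as a decoding-time score combination. The interpolation weight `alpha`, the function names, and the toy scores below are illustrative assumptions, not the paper's exact formulation:

```python
# Hypothetical sketch: a decoding step that combines a phraseness score
# (is this a well-formed phrase?) with an informativeness score
# (does it reflect the document's core concepts?).
def combined_score(phraseness_logp, informativeness_logp, alpha=0.5):
    # alpha is an illustrative interpolation weight, not from the paper.
    return alpha * phraseness_logp + (1 - alpha) * informativeness_logp

def best_next_token(candidates):
    # candidates maps token -> (phraseness log-prob, informativeness log-prob)
    return max(candidates, key=lambda t: combined_score(*candidates[t]))

# "the" is phrase-likely but uninformative, so the combined score rejects it.
cands = {"neural": (-0.4, -0.3), "the": (-0.2, -2.5), "keyphrase": (-0.6, -0.2)}
print(best_next_token(cands))  # prints: neural
```

The point of the sketch is that neither score alone suffices: the phraseness module alone would favor frequent function words, while the informativeness module alone could emit topical but ill-formed fragments.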
Related papers
- Unsupervised Domain Adaptation for Keyphrase Generation using Citation Contexts [33.04325179283727]
Adapting keyphrase generation models to new domains typically involves few-shot fine-tuning with in-domain labeled data.
This paper presents silk, an unsupervised method designed to address this issue by extracting silver-standard keyphrases from citation contexts to create synthetic labeled data for domain adaptation.
arXiv Detail & Related papers (2024-09-20T06:56:14Z) - StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain.
State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data.
We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z) - Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation [65.78246406460305]
Compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation.
We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world).
arXiv Detail & Related papers (2023-08-28T14:43:36Z) - DORIC: Domain Robust Fine-Tuning for Open Intent Clustering through Dependency Parsing [14.709084509818474]
DSTC11-Track2 aims to provide a benchmark for zero-shot, cross-domain, intent-set induction.
We leveraged a multi-domain dialogue dataset to fine-tune the language model and proposed extracting Verb-Object pairs.
Our approach achieved 3rd place in precision and showed higher accuracy and normalized mutual information (NMI) scores than the baseline model.
arXiv Detail & Related papers (2023-03-17T08:12:36Z) - Cross-Domain Ensemble Distillation for Domain Generalization [17.575016642108253]
We propose a simple yet effective method for domain generalization, named cross-domain ensemble distillation (XDED).
Our method generates an ensemble of the output logits from training data with the same label but from different domains and then penalizes each output for the mismatch with the ensemble.
We show that models learned by our method are robust against adversarial attacks and image corruptions.
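The XDED penalty summarized above (ensembling same-label logits across domains, then penalizing each prediction's mismatch with the ensemble) can be sketched as follows. The choice of KL divergence direction and the smoothing constant are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def xded_loss(logits_same_label):
    """Illustrative cross-domain ensemble distillation penalty.

    logits_same_label: (n_examples, n_classes) logits for examples that
    share one label but come from different domains.
    """
    probs = softmax(logits_same_label)   # per-example predicted distributions
    ensemble = probs.mean(axis=0)        # ensemble "teacher" distribution
    # KL(ensemble || each prediction), averaged over examples; the 1e-12
    # term guards against log(0).
    kl = (ensemble * (np.log(ensemble + 1e-12)
                      - np.log(probs + 1e-12))).sum(axis=-1)
    return float(kl.mean())

logits = np.array([[2.0, 0.1, -1.0], [1.5, 0.3, -0.5]])
print(xded_loss(logits))
```

When all same-label predictions agree across domains, the penalty is zero, so minimizing it pushes the model toward domain-invariant outputs.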
arXiv Detail & Related papers (2022-11-25T12:32:36Z) - Grounding Visual Representations with Texts for Domain Generalization [9.554646174100123]
Cross-modality supervision can be successfully used to ground domain-invariant visual representations.
Our proposed method achieves state-of-the-art results and ranks 1st in average performance for five multi-domain datasets.
arXiv Detail & Related papers (2022-07-21T03:43:38Z) - Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z) - Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z) - Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z) - Bi-Directional Generation for Unsupervised Domain Adaptation [61.73001005378002]
Unsupervised domain adaptation transfers well-established source-domain information to an unlabeled target domain.
Conventional methods that forcefully reduce the domain discrepancy in the latent space tend to destroy the intrinsic data structure.
We propose a Bi-Directional Generation domain adaptation model with consistent classifiers interpolating two intermediate domains to bridge source and target domains.
arXiv Detail & Related papers (2020-02-12T09:45:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.