Time-aware Prompting for Text Generation
- URL: http://arxiv.org/abs/2211.02162v1
- Date: Thu, 3 Nov 2022 22:10:25 GMT
- Title: Time-aware Prompting for Text Generation
- Authors: Shuyang Cao and Lu Wang
- Abstract summary: We study the effects of incorporating timestamps, such as document creation dates, into generation systems.
Two types of time-aware prompts are investigated: (1) textual prompts that encode document timestamps in natural language sentences; and (2) linear prompts that convert timestamps into continuous vectors.
- Score: 17.58231642569116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the effects of incorporating timestamps, such as
document creation dates, into generation systems. Two types of time-aware
prompts are investigated: (1) textual prompts that encode document timestamps
in natural language sentences; and (2) linear prompts that convert timestamps
into continuous vectors. To explore extrapolation to future data points, we
further introduce a new data-to-text generation dataset, TempWikiBio,
containing more than 4 million chronologically ordered revisions of
biographical articles from English Wikipedia, each paired with structured
personal profiles. Through data-to-text generation on TempWikiBio, text-to-text
generation on the content transfer dataset, and summarization on XSum, we show
that linear prompts on encoder and textual prompts improve the generation
quality on all datasets. Although they suffer a smaller performance drop when
tested on data drawn from a later time, linear prompts focus more on
non-temporal information and are less sensitive to the given timestamps,
according to human evaluations and sensitivity analyses. Meanwhile, textual
association between the given timestamps and the output dates, yielding more
factual temporal information in the output.
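The two prompt types described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: all names (`textual_prompt`, `LinearTimePrompt`, `embed_dim`, the timestamp normalization) are hypothetical, and the real linear prompt is a learned projection inside a pretrained encoder-decoder rather than a standalone NumPy layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def textual_prompt(timestamp: str, document: str) -> str:
    """Textual prompt: state the document timestamp as a natural-language
    sentence prepended to the input text."""
    return f"This article was written on {timestamp}. {document}"

class LinearTimePrompt:
    """Linear prompt: map a scalar timestamp to a continuous vector that
    can be prepended to the encoder's input embeddings."""

    def __init__(self, embed_dim: int):
        # These parameters would be learned jointly with the model.
        self.w = rng.standard_normal(embed_dim) * 0.02
        self.b = np.zeros(embed_dim)

    def __call__(self, year: float) -> np.ndarray:
        t = (year - 2000.0) / 100.0  # normalize the timestamp (assumed scheme)
        return t * self.w + self.b

prompt = LinearTimePrompt(embed_dim=8)
vec = prompt(2022.0)
token_embeddings = rng.standard_normal((5, 8))   # stand-in for encoder inputs
augmented = np.vstack([vec, token_embeddings])   # prepend the time vector
```

The sketch mirrors the abstract's distinction: the textual prompt changes the input text itself, while the linear prompt injects a timestamp-dependent vector directly into the embedding sequence.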
Related papers
- Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data [5.264562311559751]
We propose a method to generate domain-independent descriptive texts from time-series data.
By implementing a novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset.
Experimental results demonstrate that a contrastive learning based model trained using the TACO dataset is capable of generating descriptive texts for time-series data in novel domains.
arXiv Detail & Related papers (2024-09-25T06:04:03Z) - Evolving Text Data Stream Mining [2.28438857884398]
A massive amount of text data is generated by online social platforms every day.
Learning useful information from such streaming data under the constraint of limited time and memory has gained increasing attention.
New learning models are proposed for clustering and multi-label learning on text streams.
arXiv Detail & Related papers (2024-08-15T15:38:52Z) - Attribute First, then Generate: Locally-attributable Grounded Text Generation [33.371400233333326]
We introduce a locally-attributable text generation approach, prioritizing concise attributions.
Our method, named "Attribute First, then Generate", breaks down the conventional end-to-end generation process into three intuitive steps.
arXiv Detail & Related papers (2024-03-25T18:41:47Z) - Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge
Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references to enhance factuality, but they often struggle when irrelevant references are mixed in.
We present DKGen, which divides text generation into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z) - Sequentially Controlled Text Generation [97.22539956688443]
While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and fail to follow human-like writing structure.
We study the problem of imposing structure on long-range text.
We develop a sequential controlled text generation pipeline with generation and editing.
arXiv Detail & Related papers (2023-01-05T21:23:51Z) - CiteBench: A benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark for citation text generation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
arXiv Detail & Related papers (2022-12-19T16:10:56Z) - A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications [0.02578242050187029]
This paper presents two datasets of artificially generated research content.
In the first case, the content is completely generated by the GPT-2 model after a short prompt extracted from original papers.
The partial or hybrid dataset is created by replacing several sentences of abstracts with sentences that are generated by the Arxiv-NLP model.
We evaluate the quality of the datasets comparing the generated texts to aligned original texts using fluency metrics such as BLEU and ROUGE.
arXiv Detail & Related papers (2022-02-04T08:16:56Z) - Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step.
We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text.
We show that our Interactive Editor, a transformer-based model trained on this dataset, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z) - Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z) - Efficient text generation of user-defined topic using generative adversarial networks [0.32228025627337864]
We propose a User-Defined GAN (UD-GAN) with two-level discriminators to generate texts on user-defined topics.
The proposed method generates texts in less time than other methods.
arXiv Detail & Related papers (2020-06-22T04:49:47Z) - Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes partial content of the source recordset in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.