Reverse Engineering Configurations of Neural Text Generation Models
- URL: http://arxiv.org/abs/2004.06201v1
- Date: Mon, 13 Apr 2020 21:02:44 GMT
- Title: Reverse Engineering Configurations of Neural Text Generation Models
- Authors: Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins
- Abstract summary: The study of artifacts that emerge in machine generated text as a result of modeling choices is a nascent research area.
We conduct an extensive suite of diagnostic tests to observe whether modeling choices leave detectable artifacts in the text they generate.
Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by observing the generated text alone.
- Score: 86.9479386959155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper seeks to develop a deeper understanding of the fundamental
properties of neural text generation models. The study of artifacts that
emerge in machine generated text as a result of modeling choices is a nascent
research area. Previously, the extent and degree to which these artifacts
surface in generated text has not been well studied. In the spirit of better
understanding generative text models and their artifacts, we propose the new
task of distinguishing which of several variants of a given model generated a
piece of text, and we conduct an extensive suite of diagnostic tests to observe
whether modeling choices (e.g., sampling methods, top-$k$ probabilities, model
architectures, etc.) leave detectable artifacts in the text they generate. Our
key finding, which is backed by a rigorous set of experiments, is that such
artifacts are present and that different modeling choices can be inferred by
observing the generated text alone. This suggests that neural text generators
may be more sensitive to various modeling choices than previously thought.
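Among the modeling choices the paper probes, top-$k$ sampling is a concrete example of a decoding strategy that could plausibly leave artifacts: tokens outside the $k$ most probable candidates can never appear in the output. The sketch below is an illustrative implementation of top-$k$ sampling (it is not code from the paper; the function name and interface are assumptions for illustration):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the top-k truncated softmax of `logits`.

    Illustrative sketch: truncating to the k highest-scoring tokens and
    renormalizing is the kind of decoding choice whose statistical
    footprint the paper tries to detect in generated text.
    """
    # Keep the k highest-scoring (index, logit) pairs.
    top = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    # Softmax over the surviving logits only (renormalization),
    # subtracting the max logit for numerical stability.
    m = max(logit for _, logit in top)
    weights = [math.exp(logit - m) for _, logit in top]
    # Draw proportionally to the renormalized weights.
    r = rng.random() * sum(weights)
    for (idx, _), w in zip(top, weights):
        r -= w
        if r <= 0:
            return idx
    return top[-1][0]
```

With k=1 this reduces to greedy decoding; larger k admits more of the tail of the distribution, which is exactly the kind of variation a variant-detection classifier would try to pick up.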
Related papers
- A linguistic analysis of undesirable outcomes in the era of generative AI [4.841442157674423]
We present a comprehensive simulation framework built upon the chat version of LLama2, focusing on the linguistic aspects of the generated content.
Our results show that the model produces less lexical rich content across generations, reducing diversity.
We find that autophagy transforms the initial model into a more creative, doubtful and confused one, which might provide inaccurate answers.
arXiv Detail & Related papers (2024-10-16T08:02:48Z)
- ManiFPT: Defining and Analyzing Fingerprints of Generative Models [16.710998621718193]
We formalize the definition of artifact and fingerprint in generative models.
We propose an algorithm for computing them in practice.
We study the structure of the fingerprints and observe that it is very predictive of the effect of different design choices on the generative process.
arXiv Detail & Related papers (2024-02-16T01:58:35Z)
- RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to models that process text input and generate high-fidelity images from text descriptions.
Diffusion models are one prominent type of generative model used for image generation, based on the systematic introduction of noise over repeated steps.
In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
- Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z)
- Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA).
We empirically show that the features derived from the BERT model outperform count- and neural-based baselines up to 10% on three common datasets.
The probing analysis of the features reveals their sensitivity to the surface and syntactic properties.
arXiv Detail & Related papers (2021-09-10T12:13:45Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more valuable to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Topical Language Generation using Transformers [4.795530213347874]
This paper presents a novel approach for Topical Language Generation (TLG) by combining a pre-trained LM with topic modeling information.
We extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text.
Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.
arXiv Detail & Related papers (2021-03-11T03:45:24Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.