Guarding against artificial intelligence--hallucinated citations: the case for full-text reference deposit
- URL: http://arxiv.org/abs/2503.19848v1
- Date: Tue, 25 Mar 2025 17:12:38 GMT
- Title: Guarding against artificial intelligence--hallucinated citations: the case for full-text reference deposit
- Authors: Alex Glynn
- Abstract summary: Journals should require authors to submit the full text of each cited source along with their manuscripts. This solution requires limited additional work on the part of authors or editors while effectively immunizing journals against hallucinated references.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The tendency of generative artificial intelligence (AI) systems to "hallucinate" false information is well-known; AI-generated citations to non-existent sources have made their way into the reference lists of peer-reviewed publications. Here, I propose a solution to this problem, taking inspiration from the Transparency and Openness Promotion (TOP) data sharing guidelines, the clash of generative AI with the American judiciary, and the precedent set by submissions of prior art to the United States Patent and Trademark Office. Journals should require authors to submit the full text of each cited source along with their manuscripts, thereby preventing authors from citing any material whose full text they cannot produce. This solution requires limited additional work on the part of authors or editors while effectively immunizing journals against hallucinated references.
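The deposit requirement proposed above could be enforced mechanically at submission time. Below is a minimal sketch, assuming a hypothetical convention in which each deposited source is saved in a directory under its citation key (e.g. `smith2020.pdf`); the function name and naming scheme are illustrative, not part of the proposal:

```python
import os

def find_missing_deposits(citation_keys, deposit_dir):
    """Return the citation keys with no matching full-text deposit.

    Assumes (hypothetically) that each deposited source is stored in
    deposit_dir as "<citation-key>.<ext>", e.g. "smith2020.pdf".
    """
    # Collect the citation keys implied by the deposited file names.
    deposited = {os.path.splitext(name)[0] for name in os.listdir(deposit_dir)}
    # Any cited key without a corresponding file is flagged.
    return [key for key in citation_keys if key not in deposited]
```

An editorial system could run such a check before peer review begins, returning the manuscript to the authors if any cited source lacks a deposited full text.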
Related papers
- Suspected Undeclared Use of Artificial Intelligence in the Academic Literature: An Analysis of the Academ-AI Dataset
Academ-AI documents examples of suspected undeclared AI usage in the academic literature.
Suspected undeclared AI use appears more often in journals with higher citation metrics and higher article processing charges.
arXiv Detail & Related papers (2024-11-20T21:29:36Z) - Tackling GenAI Copyright Issues: Originality Estimation and Genericization
We propose a genericization method that modifies the outputs of a generative model to make them more generic and less likely to imitate copyrighted materials. As a practical implementation, we introduce PREGen, which combines our genericization method with an existing mitigation technique.
arXiv Detail & Related papers (2024-06-05T14:58:32Z) - Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
We report on the first preregistered empirical evaluation of AI-driven legal research tools.
We find that the AI research tools made by LexisNexis (Lexis+ AI) and Thomson Reuters (Westlaw AI-Assisted Research and Ask Practical Law AI) each hallucinate between 17% and 33% of the time.
The study provides evidence to inform the responsibilities of legal professionals in supervising and verifying AI outputs.
arXiv Detail & Related papers (2024-05-30T17:56:05Z) - A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works.
Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement.
We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z) - Cited Text Spans for Citation Text Generation [12.039469573641217]
An automatic citation generation system aims to concisely and accurately describe the relationship between two scientific articles.
Due to the length of scientific documents, existing abstractive approaches have conditioned only on cited paper abstracts.
We propose to condition instead on the cited text span (CTS) as an alternative to the abstract.
arXiv Detail & Related papers (2023-09-12T16:28:36Z) - Improving Wikipedia Verifiability with AI [116.69749668874493]
We develop a neural network based system, called Side, to identify Wikipedia citations that are unlikely to support their claims.
Side's first citation recommendation collects over 60% more preferences than existing Wikipedia citations for the same top 10% most likely unverifiable claims.
Our results indicate that an AI-based system could be used, in tandem with humans, to improve the verifiability of Wikipedia.
arXiv Detail & Related papers (2022-07-08T15:23:29Z) - Towards generating citation sentences for multiple references with
intent control [86.53829532976303]
We build a novel generation model with the Fusion-in-Decoder approach to cope with multiple long inputs.
Experiments demonstrate that the proposed approaches provide much more comprehensive features for generating citation sentences.
arXiv Detail & Related papers (2021-12-02T15:32:24Z) - Tortured phrases: A dubious writing style emerging in science. Evidence
of critical issues affecting established journals [69.76097138157816]
Probabilistic text generators have been used to produce fake scientific papers for more than a decade.
Complex AI-powered generation techniques produce texts indistinguishable from those written by humans.
Some websites offer to rewrite texts for free, generating gobbledegook full of tortured phrases.
arXiv Detail & Related papers (2021-07-12T20:47:08Z) - A Token-level Reference-free Hallucination Detection Benchmark for
Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes.
To create this dataset, we first perturb a large number of text segments extracted from English language Wikipedia, and then verify these with crowd-sourced annotations.
arXiv Detail & Related papers (2021-04-18T04:09:48Z) - Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.