Late-Binding Scholarship in the Age of AI: Navigating Legal and
Normative Challenges of a New Form of Knowledge Production
- URL: http://arxiv.org/abs/2305.11058v1
- Date: Thu, 4 May 2023 04:14:28 GMT
- Title: Late-Binding Scholarship in the Age of AI: Navigating Legal and
Normative Challenges of a New Form of Knowledge Production
- Authors: Bill Tomlinson, Andrew W. Torrance, Rebecca W. Black, Donald J.
Patterson
- Abstract summary: Artificial Intelligence (AI) is poised to enable a new leap in the creation of scholarly content.
This article articulates ways in which those artifacts can be written, distributed, read, organized, and stored.
- Score: 8.497410878853309
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Artificial Intelligence (AI) is poised to enable a new leap in the creation
of scholarly content. New forms of engagement with AI systems, such as
collaborations with large language models like GPT-3, offer affordances that
will change the nature of both the scholarly process and the artifacts it
produces. This article articulates ways in which those artifacts can be
written, distributed, read, organized, and stored that are more dynamic, and
potentially more effective, than current academic practices. Specifically,
rather than the current "early-binding" process (that is, one in which ideas
are fully reduced to a final written form before they leave an author's desk),
we propose that there are substantial benefits to a "late-binding" process, in
which ideas are written dynamically at the moment of reading. In fact, the
paradigm of "binding" knowledge may transition to a new model in which
scholarship remains ever "unbound" and evolving. An alternative form for a
scholarly work could be encapsulated via several key components: a text
abstract of the work's core arguments; hyperlinks to a bibliography of relevant
related work; novel data that had been collected and metadata describing those
data; algorithms or processes necessary for analyzing those data; a reference
to a particular AI model that would serve as a "renderer" of the canonical
version of the text; and specified parameters that would allow for a precise,
word-for-word reconstruction of the canonical version. Such a form would enable
both the rendering of the canonical version, and also the possibility of
dynamic AI reimaginings of the text in light of future findings, scholarship
unknown to the original authors, alternative theories, and precise tailoring to
specific audiences (e.g., children, adults, professionals, amateurs).
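To make the proposed encapsulation concrete, here is a minimal Python sketch of such an artifact. It is an illustration under stated assumptions: the field names, the pinned renderer identifier, and the `render` stub are invented for this example, since the abstract enumerates the components but prescribes no schema or API.

```python
from dataclasses import dataclass, field

@dataclass
class LateBindingArtifact:
    # All field names are illustrative; the abstract enumerates the
    # components but does not prescribe a concrete schema.
    abstract: str                # text abstract of the work's core arguments
    bibliography: list[str]      # hyperlinks to relevant related work
    data: dict                   # novel data collected by the authors
    metadata: dict               # metadata describing those data
    analysis_code: str           # algorithms/processes needed to analyze the data
    renderer_model: str          # reference to the AI model that renders the canonical text
    render_params: dict = field(default_factory=dict)  # parameters pinning a word-for-word rendering

def render(artifact: LateBindingArtifact, audience: str = "canonical") -> str:
    """Stand-in for invoking the pinned renderer model.

    With the stored model and parameters held fixed, the output is
    reproducible; passing a different audience stands in for a
    dynamic AI reimagining of the text.
    """
    header = (f"[model={artifact.renderer_model} "
              f"params={sorted(artifact.render_params.items())} "
              f"audience={audience}]")
    return f"{header}\n{artifact.abstract}"  # a real system would call the model here

paper = LateBindingArtifact(
    abstract="AI is poised to enable a new leap in the creation of scholarly content...",
    bibliography=["http://arxiv.org/abs/2107.13586"],
    data={"observations": [3, 1, 4]},
    metadata={"collected": "2023-04", "units": "counts"},
    analysis_code="mean = sum(xs) / len(xs)",
    renderer_model="gpt-3",                      # hypothetical pinned renderer
    render_params={"seed": 42, "temperature": 0.0},
)
print(render(paper))                       # canonical, word-for-word reconstruction
print(render(paper, audience="children"))  # tailored reimagining
```

Holding the renderer reference and its parameters fixed is what would make the canonical version reconstructible word for word; varying the audience parameter illustrates the dynamic reimaginings the abstract describes.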
Related papers
- Synthetic continued pretraining [29.6872772403251]
We propose synthetic continued pretraining on a small corpus of domain-specific documents.
We instantiate this proposal with EntiGraph, a synthetic data augmentation algorithm.
We show how synthetic data augmentation can "rearrange" knowledge to enable more data-efficient learning.
arXiv Detail & Related papers (2024-09-11T17:21:59Z)
- Distilling Vision-Language Foundation Models: A Data-Free Approach via Prompt Diversification [49.41632476658246]
We discuss the extension of Data-Free Knowledge Distillation (DFKD) to Vision-Language Foundation Models without access to billion-scale image-text datasets.
The objective is to customize a student model for distribution-agnostic downstream tasks with given category concepts.
We propose three novel Prompt Diversification methods to encourage image synthesis with diverse styles.
arXiv Detail & Related papers (2024-07-21T13:26:30Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- The Creative Frontier of Generative AI: Managing the Novelty-Usefulness Tradeoff [0.4873362301533825]
We explore the optimal balance between novelty and usefulness in generative Artificial Intelligence (AI) systems.
Overemphasizing either aspect can lead to limitations such as hallucinations and memorization.
arXiv Detail & Related papers (2023-06-06T11:44:57Z)
- Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media [0.0]
Foundation Models are pre-trained language models for Natural Language Processing.
They can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning.
This book provides a comprehensive overview of the state of the art in research and applications of Foundation Models.
arXiv Detail & Related papers (2023-02-16T20:42:04Z)
- Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z)
- Synthetic Books [0.0]
The article explores new forms of written language aided by AI technologies such as GPT-2 and GPT-3.
It introduces the new concept of synthetic books.
The paper emphasizes that artistic quality is an issue when it comes to AI-generated content.
arXiv Detail & Related papers (2022-01-24T08:26:28Z)
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-07-28T18:09:46Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this being integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request (a generic summarization sketch appears after this list).
arXiv Detail & Related papers (2020-08-21T20:59:34Z)
- Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach [36.248702416150124]
We design a new technique for distributional semantic modeling with a neural network-based approach to learn distributed term representations (or term embeddings).
Vec2graph is a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs.
arXiv Detail & Related papers (2020-03-06T18:27:39Z)
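As referenced in the prompt-based-learning entry above, here is a minimal sketch of that paradigm, assuming the Hugging Face transformers library. The cloze template and the candidate label words are illustrative choices, not taken from the survey: instead of training a classifier for P(y|x), a masked language model scores label words in a template directly.

```python
# Minimal prompt-based learning sketch (assumes: pip install transformers torch).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

review = "The plot was predictable and the acting was wooden."
prompt = f"{review} Overall, it was a [MASK] movie."  # cloze-style template

# Restrict predictions to the verbalizer words for each label; the
# higher-scoring word is read off as the classification.
for pred in fill(prompt, targets=["good", "bad"]):
    print(pred["token_str"], round(pred["score"], 4))
```

And for the abstractive-summarization entry: BERTSum is not bundled as a standard pipeline, so this sketch substitutes an off-the-shelf abstractive summarizer to show the same kind of on-request summarization of instructional content; the model choice and length limits are assumptions.

```python
# Generic abstractive summarization sketch, standing in for the paper's
# BERTSum setup (assumes: pip install transformers torch).
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

transcript = (
    "First, unplug the router and wait ten seconds. "
    "Then plug it back in and wait for the lights to stabilize. "
    "Finally, reconnect your devices and check the connection."
)
print(summarizer(transcript, max_length=30, min_length=10, do_sample=False)[0]["summary_text"])
```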