Related papers: Reducing the Scope of Language Models

URL: http://arxiv.org/abs/2410.21597v2
Date: Thu, 17 Apr 2025 19:17:21 GMT
Title: Reducing the Scope of Language Models
Authors: David Yunis, Siyu Huo, Chulaka Gunasekara, Danish Contractor,
Abstract summary: We show that it is possible to scope language models. We ablate diversity of irrelevant queries, layer different techniques, conduct adversarial evaluations. We intend our study to serve as a practitioner's guide to scoping language models.
Score: 7.464494269745494
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We now deploy language models in a wide variety of user-facing applications. Typically, these deployments have some specific purpose, like answering questions about documentation or acting as coding assistants, but they require general language understanding. Under these circumstances, these models should not be able to answer irrelevant requests such as poetry generation or questions about physics. Instead, we would like language models to answer only queries corresponding to desired behavior and refuse all other requests, which we refer to as scoping. We conduct a comprehensive empirical evaluation of potential methods from prompting to fine-tuning to preference learning to a recently proposed method for general alignment called Circuit Breakers (CB). Across three families of language models and a broad variety of tasks, we show that it is possible to scope language models. We examine scoping for multiple topics, and fine-grained topics. We ablate diversity of irrelevant queries, layer different techniques, conduct adversarial evaluations and more. Among other results, we find that, when diverse examples of irrelevant queries are available, simple supervised fine-tuning produces the best results, but when such diversity is low, Circuit Breakers perform quite well. One can often get the benefits of both methods by layering them in succession. We intend our study to serve as a practitioner's guide to scoping language models.

Related papers

Linguistic Nepotism: Trading-off Quality for Language Preference in Multilingual RAG [55.258582772528506]
We investigate whether the mixture of different document languages impacts generation and citation in unintended ways. Across eight languages and six open-weight models, we find that models preferentially cite English sources when queries are in English. We find that models sometimes trade off document relevance for language preference, indicating that citation choices are not always driven by informativeness alone.
arXiv Detail & Related papers (2025-09-17T12:58:18Z)
Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning [20.193445005516363]
This study presents an overview of different Large Language Models adaptation paradigms for the detection of hyperpartisan and fake news, harmful tweets, and political bias. We tested different strategies ranging from parameter efficient Fine-Tuning of language models to a variety of different In-Context Learning strategies and prompts.
arXiv Detail & Related papers (2025-09-09T14:01:15Z)
mFollowIR: a Multilingual Benchmark for Instruction Following in Retrieval [61.17793165194077]
We introduce mFollowIR, a benchmark for measuring instruction-following ability in retrieval models. We present results for both multilingual (XX-XX) and cross-lingual (En-XX) performance. We see strong cross-lingual performance with English-based retrievers that were trained using instructions, but find a notable drop in performance in the multilingual setting.
arXiv Detail & Related papers (2025-01-31T16:24:46Z)
Contextualized Evaluations: Taking the Guesswork Out of Language Model Evaluations [85.81295563405433]
Language model users often issue queries that lack specification, where the context under which a query was issued is not explicit. We present contextualized evaluations, a protocol that synthetically constructs context surrounding an under-specified query and provides it during evaluation. We find that the presence of context can 1) alter conclusions drawn from evaluation, even flipping win rates between model pairs, 2) nudge evaluators to make fewer judgments based on surface-level criteria, like style, and 3) provide new insights about model behavior across diverse contexts.
arXiv Detail & Related papers (2024-11-11T18:58:38Z)
The Art of Saying No: Contextual Noncompliance in Language Models [123.383993700586]
We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests. Our taxonomy spans a wide range of categories including incomplete, unsupported, indeterminate, and humanizing requests. To test noncompliance capabilities of language models, we use this taxonomy to develop a new evaluation suite of 1000 noncompliance prompts.
arXiv Detail & Related papers (2024-07-02T07:12:51Z)
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries [6.382667978271587]
Retrieval Augmented Generation (RAG) enriches the ability of language models to reason using external context to augment responses for a given user prompt. This approach has risen in popularity due to practical uses of language models in search, question answering, and chatbots. In this paper, we mechanistically examine the RAG pipeline to highlight that language models take a shortcut and have a strong bias towards utilizing only the context information to answer the question, while relying minimally on their parametric memory.
arXiv Detail & Related papers (2024-06-18T17:46:08Z)
Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings. An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts). This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
Eliciting Human Preferences with Language Models [56.68637202313052]
Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts. We propose to use *LMs themselves* to guide the task specification process. We study GATE in three domains: email validation, content recommendation, and moral reasoning.
arXiv Detail & Related papers (2023-10-17T21:11:21Z)
Language Models are Universal Embedders [48.12992614723464]
We show that pre-trained transformer decoders can embed universally when finetuned on limited English data. Our models achieve competitive performance on different embedding tasks with minimal training data. These results provide evidence of a promising path towards building powerful unified embedders.
arXiv Detail & Related papers (2023-10-12T11:25:46Z)
Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs is that retrieved information helps model performance when it is relevant. Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions [6.9076450524134145]
We propose a combination of multitask pretraining on up to 93 tasks designed to instill diverse reasoning abilities. We show that performance can be significantly improved by adding retrieval-augmented training datasets.
arXiv Detail & Related papers (2023-08-02T05:00:12Z)
Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist. One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity. We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation [0.04199844472131922]
We present and apply a GUI-assisted framework allowing us to easily probe a massive number of languages. Most of the regularities revealed in the mBERT model are typical for the western-European languages. Our framework can be integrated with the existing probing toolboxes, model cards, and leaderboards.
arXiv Detail & Related papers (2022-10-24T13:41:17Z)
Regularized Contrastive Learning of Semantic Search [0.0]
Transformer-based models are widely used as retrieval models due to their excellent ability to learn semantic representations. We propose a new regularization method: Regularized Contrastive Learning. It augments several different semantic representations for every sentence, then takes them into the contrastive objective as regulators.
arXiv Detail & Related papers (2022-09-27T08:25:19Z)
Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models. A collection of pretrained encoders perceive diverse modalities (such as vision and language). We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z)
Towards Best Practices for Training Multilingual Dense Retrieval Models [54.91016739123398]
We focus on the task of monolingual retrieval in a variety of typologically diverse languages using one such design. Our study is organized as a "best practices" guide for training multilingual dense retrieval models.
arXiv Detail & Related papers (2022-04-05T17:12:53Z)
Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks. For languages rarely or never seen by these models, directly using such models often results in suboptimal representation or use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm [0.0]
We discuss methods of prompt programming, emphasizing the usefulness of considering prompts through the lens of natural language. We introduce the idea of a metaprompt that seeds the model to generate its own natural language prompts for a range of tasks.
arXiv Detail & Related papers (2021-02-15T05:27:55Z)
Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers. We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.