Does More Inference-Time Compute Really Help Robustness?
- URL: http://arxiv.org/abs/2507.15974v1
- Date: Mon, 21 Jul 2025 18:08:38 GMT
- Title: Does More Inference-Time Compute Really Help Robustness?
- Authors: Tong Wu, Chong Xiang, Jiachen T. Wang, Weichen Yu, Chawin Sitawarin, Vikash Sehwag, Prateek Mittal
- Abstract summary: We show that small-scale, open-source models can benefit from inference-time scaling. We identify an important security risk, intuitively motivated and empirically verified as an inverse scaling law. We urge practitioners to carefully weigh these subtle trade-offs before applying inference-time scaling in security-sensitive, real-world applications.
- Score: 50.47666612618054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Zaremba et al. demonstrated that increasing inference-time computation improves robustness in large proprietary reasoning LLMs. In this paper, we first show that smaller-scale, open-source models (e.g., DeepSeek R1, Qwen3, Phi-reasoning) can also benefit from inference-time scaling using a simple budget forcing strategy. More importantly, we reveal and critically examine an implicit assumption in prior work: intermediate reasoning steps are hidden from adversaries. By relaxing this assumption, we identify an important security risk, intuitively motivated and empirically verified as an inverse scaling law: if intermediate reasoning steps become explicitly accessible, increased inference-time computation consistently reduces model robustness. Finally, we discuss practical scenarios where models with hidden reasoning chains are still vulnerable to attacks, such as models with tool-integrated reasoning and advanced reasoning extraction attacks. Our findings collectively demonstrate that the robustness benefits of inference-time scaling depend heavily on the adversarial setting and deployment context. We urge practitioners to carefully weigh these subtle trade-offs before applying inference-time scaling in security-sensitive, real-world applications.
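The abstract's "simple budget forcing strategy" is not specified in this listing; the sketch below is only a minimal illustration of the general budget-forcing idea (extending or truncating the reasoning phase to hit a target token budget), assuming a hypothetical `generate_until` API and a `</think>` marker, not the authors' actual implementation.

```python
# Minimal sketch of a budget-forcing decoding loop (illustrative only).
# `model.generate_until`, `model.generate`, and the "</think>" marker are
# hypothetical assumptions, not code from the paper.

def budget_forced_reasoning(model, prompt, budget_tokens, end_think="</think>"):
    """Force the reasoning phase to consume roughly `budget_tokens` tokens."""
    reasoning, used = "", 0
    while used < budget_tokens:
        # Generate until the model tries to close its reasoning or the budget runs out.
        chunk, stopped_early = model.generate_until(
            prompt + reasoning,
            stop=end_think,
            max_new_tokens=budget_tokens - used,
        )
        reasoning += chunk
        used += len(chunk.split())  # crude token count, for illustration only
        if stopped_early and used < budget_tokens:
            # The model wanted to stop early: append a continuation cue ("Wait,")
            # so it spends more inference-time compute.
            reasoning += "\nWait,"
        else:
            break
    # Close the reasoning phase and let the model produce its final answer.
    answer = model.generate(prompt + reasoning + end_think, max_new_tokens=512)
    return reasoning, answer
```

The same loop also provides the knob for the inverse-scaling observation: a larger `budget_tokens` yields a longer reasoning trace, which, if exposed to an adversary, offers a larger attack surface.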
Related papers
- ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments [18.198349215500183]
ReasoningGuard injects timely safety aha moments to steer the reasoning process toward outputs that are harmless yet still helpful. Our approach outperforms seven existing safeguards, achieving state-of-the-art safety defenses.
arXiv Detail & Related papers (2025-08-06T08:35:10Z)
- Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models [0.0]
Reasoning Language Models (RLMs) have gained traction for their ability to perform complex, multi-step reasoning tasks. While these capabilities promise improved reliability, their impact on robustness to social biases remains unclear. We leverage the CLEAR-Bias benchmark to investigate the adversarial robustness of RLMs to bias elicitation.
arXiv Detail & Related papers (2025-07-03T17:01:53Z)
- Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs [8.359909829007005]
We investigate whether efficient reasoning strategies introduce behavioral inconsistencies in large reasoning models (LRMs). ICBench is a benchmark designed to measure inconsistency in LRMs across three dimensions. We find that while larger models generally exhibit greater consistency than smaller ones, they all display widespread "scheming" behaviors.
arXiv Detail & Related papers (2025-06-24T10:25:28Z)
- Excessive Reasoning Attack on Reasoning LLMs [26.52688123765127]
In this work, we expose a novel threat: adversarial inputs can be crafted to exploit excessive reasoning behaviors. Our results demonstrate a 3x to 9x increase in reasoning length with comparable utility performance. Our crafted adversarial inputs exhibit transferability, inducing computational overhead in o3-mini, o1-mini, DeepSeek-R1, and QWQ models.
arXiv Detail & Related papers (2025-06-17T10:16:52Z)
- On Reasoning Strength Planning in Large Reasoning Models [50.61816666920207]
We find evidence that LRMs pre-plan their reasoning strength in their activations even before generation. We then uncover that LRMs encode this reasoning strength through a pre-allocated directional vector embedded in the model's activations. Our work provides new insights into the internal mechanisms of reasoning in LRMs and offers practical tools for controlling their reasoning behaviors.
arXiv Detail & Related papers (2025-06-10T02:55:13Z)
- Saffron-1: Safety Inference Scaling [69.61130284742353]
SAFFRON is a novel inference scaling paradigm tailored explicitly for safety assurance. Central to our approach is the introduction of a multifurcation reward model (MRM) that significantly reduces the required number of reward model evaluations. We publicly release our trained multifurcation reward model (Saffron-1) and the accompanying token-level safety reward dataset (Safety4M).
arXiv Detail & Related papers (2025-06-06T18:05:45Z)
- PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty. It learns to compress reasoning length in accordance with scene complexity and predictive confidence. Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
arXiv Detail & Related papers (2025-05-29T17:55:49Z)
- ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning [75.1101108949743]
Large Reasoning Models (LRMs) perform strongly in complex reasoning tasks via Chain-of-Thought (CoT) prompting. LRMs often suffer from verbose outputs caused by redundant content, which increases computational overhead and degrades user experience. We propose ConCISE, a framework that simplifies reasoning chains by reinforcing the model's confidence during inference.
arXiv Detail & Related papers (2025-05-08T01:40:40Z)
- Cannot See the Forest for the Trees: Invoking Heuristics and Biases to Elicit Irrational Choices of LLMs [83.11815479874447]
We propose a novel jailbreak attack framework, inspired by cognitive decomposition and biases in human cognition. We employ cognitive decomposition to reduce the complexity of malicious prompts and relevance bias to reorganize prompts. We also introduce a ranking-based harmfulness evaluation metric that surpasses the traditional binary success-or-failure paradigm.
arXiv Detail & Related papers (2025-05-03T05:28:11Z)
- Trading Inference-Time Compute for Adversarial Robustness [27.514612815314084]
We conduct experiments on the impact of increasing inference-time compute in reasoning models on their robustness to adversarial attacks. We find that across a variety of attacks, increased inference-time compute leads to improved robustness.
arXiv Detail & Related papers (2025-01-31T01:20:44Z)
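Several of the papers above, including the main paper and "Trading Inference-Time Compute for Adversarial Robustness," evaluate robustness as a function of the reasoning budget. A minimal evaluation loop for that kind of sweep might look like the sketch below; `budget_forced_reasoning` refers to the earlier sketch, and `attack_prompts` and `is_harmful` are hypothetical placeholders rather than artifacts from any cited paper.

```python
# Hypothetical sketch: attack success rate as a function of reasoning budget.
# All names are illustrative; none come from the cited papers.

def attack_success_rate(model, attack_prompts, budget_tokens, is_harmful):
    """Fraction of adversarial prompts that elicit a harmful final answer."""
    successes = 0
    for prompt in attack_prompts:
        _, answer = budget_forced_reasoning(model, prompt, budget_tokens)
        if is_harmful(answer):  # e.g., a safety classifier or keyword-based judge
            successes += 1
    return successes / len(attack_prompts)

# Example sweep (assuming `model`, `attack_prompts`, and `is_harmful` are defined):
# for budget in [256, 1024, 4096, 16384]:
#     rate = attack_success_rate(model, attack_prompts, budget, is_harmful)
#     print(f"budget={budget:6d}  attack_success_rate={rate:.3f}")
```

Plotting the resulting rates against the budget is one way to see whether the trend follows the helpful direction reported for hidden reasoning chains or the inverse scaling law reported when intermediate reasoning is exposed.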