Related papers: Wild Guesses and Mild Guesses in Active Concept Learning

URL: http://arxiv.org/abs/2602.06818v1
Date: Fri, 06 Feb 2026 16:04:44 GMT
Title: Wild Guesses and Mild Guesses in Active Concept Learning
Authors: Anirudh Chari, Neil Pattanaik,
Abstract summary: We study a trade-off in a neuro-symbolic Bayesian learner whose hypotheses are proposed by a large language model (LLM). We compare a Rational Active Learner that selects queries to maximize approximate expected information gain (EIG) and the human-like Positive Test Strategy (PTS). Our results suggest that "confirmation bias" may not be a cognitive error, but rather a rational adaptation for maintaining tractable inference in the sparse, open-ended hypothesis spaces characteristic of human thought.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human concept learning is typically active: learners choose which instances to query or test in order to reduce uncertainty about an underlying rule or category. Active concept learning must balance informativeness of queries against the stability of the learner that generates and scores hypotheses. We study this trade-off in a neuro-symbolic Bayesian learner whose hypotheses are executable programs proposed by a large language model (LLM) and reweighted by Bayesian updating. We compare a Rational Active Learner that selects queries to maximize approximate expected information gain (EIG) and the human-like Positive Test Strategy (PTS) that queries instances predicted to be positive under the current best hypothesis. Across concept-learning tasks in the classic Number Game, EIG is effective when falsification is necessary (e.g., compound or exception-laden rules), but underperforms on simple concepts. We trace this failure to a support mismatch between the EIG policy and the LLM proposal distribution: highly diagnostic boundary queries drive the posterior toward regions where the generator produces invalid or overly specific programs, yielding a support-mismatch trap in the particle approximation. PTS is information-suboptimal but tends to maintain proposal validity by selecting "safe" queries, leading to faster convergence on simple rules. Our results suggest that "confirmation bias" may not be a cognitive error, but rather a rational adaptation for maintaining tractable inference in the sparse, open-ended hypothesis spaces characteristic of human thought.
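The two query policies contrasted in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a small hand-written set of weighted hypotheses in place of LLM-proposed programs, noise-free yes/no labels, and a Number Game domain of 1 to 100. All names (`HYPOTHESES`, `eig_query`, `pts_query`) are hypothetical. Under these assumptions, the expected information gain of a membership query reduces to the binary entropy of the predicted label.

```python
import math

# Hypothetical particle set: (name, membership predicate, posterior weight).
# In the paper the hypotheses are programs proposed by an LLM; here they are fixed by hand.
HYPOTHESES = [
    ("even", lambda n: n % 2 == 0, 0.5),
    ("multiple_of_4", lambda n: n % 4 == 0, 0.3),
    ("power_of_2", lambda n: (n & (n - 1)) == 0, 0.2),
]
DOMAIN = range(1, 101)  # assumed Number Game domain


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) label."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(x, hypotheses):
    """EIG of asking "is x in the concept?".

    Assumes deterministic, noise-free hypotheses, so the EIG equals the
    entropy of the posterior-predictive label for x.
    """
    p_yes = sum(w for _, h, w in hypotheses if h(x))
    return binary_entropy(p_yes)


def eig_query(hypotheses, domain=DOMAIN):
    """Rational active learner: pick the query with maximal EIG."""
    return max(domain, key=lambda x: expected_information_gain(x, hypotheses))


def pts_query(hypotheses, domain=DOMAIN, asked=frozenset()):
    """Positive Test Strategy: pick an unasked instance that the
    highest-weight hypothesis predicts to be positive."""
    _, best, _ = max(hypotheses, key=lambda t: t[2])
    return next(x for x in domain if best(x) and x not in asked)


if __name__ == "__main__":
    print("EIG query:", eig_query(HYPOTHESES))
    print("PTS query:", pts_query(HYPOTHESES))
```

In this toy setting the EIG policy picks a number on which the hypotheses split evenly, while PTS picks a number the current best hypothesis already expects to be positive. The sketch does not model the proposal step, so it cannot reproduce the support-mismatch effect the abstract describes.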

Related papers

ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference [60.958331943869126]
ODAR-Expert is an adaptive routing framework that optimizes the accuracy-efficiency trade-off via principled resource allocation. We show strong and consistent gains, including 98.2% accuracy on MATH and 54.8% on Humanity's Last Exam.
arXiv Detail & Related papers (2026-02-27T05:22:01Z)
Towards Generalizable Reasoning: Group Causal Counterfactual Policy Optimization for LLM Reasoning [50.352417879912515]
Large language models (LLMs) excel at complex tasks with advances in reasoning capabilities. We propose Group Causal Counterfactual Policy Optimization to explicitly train LLMs to learn generalizable reasoning patterns. We then construct token-level advantages from this reward and optimize the policy, encouraging LLMs to favor reasoning patterns that are process-valid and counterfactually robust.
arXiv Detail & Related papers (2026-02-06T08:03:11Z)
The Silent Scholar Problem: A Probabilistic Framework for Breaking Epistemic Asymmetry in LLM Agents [0.6117371161379209]
We propose a formal probabilistic framework that provides agents with a non-altruistic motive for bidirectional knowledge exchange. We show how these accumulated belief states serve as verifiable reward signals for Reinforcement Learning from Human Feedback (RLHF) and high-quality data filters for Supervised Fine-Tuning (SFT). Simulation results validate that this uncertainty-driven strategy significantly outperforms random baselines in heterogeneous environments.
arXiv Detail & Related papers (2025-12-24T02:02:25Z)
Latent Chain-of-Thought for Visual Reasoning [53.541579327424046]
Chain-of-thought (CoT) reasoning is critical for improving the interpretability and reliability of Large Vision-Language Models (LVLMs). We reformulate reasoning in LVLMs as posterior inference and propose a scalable training algorithm based on amortized variational inference. We empirically demonstrate that the proposed method enhances the state-of-the-art LVLMs on seven reasoning benchmarks.
arXiv Detail & Related papers (2025-10-27T23:10:06Z)
The Consistency Hypothesis in Uncertainty Quantification for Large Language Models [22.60039074743706]
Black-box uncertainty quantification (UQ) methods, relying solely on model API access, have gained popularity due to their practical benefits. In this paper, we examine the implicit assumption behind several UQ methods, which use generation consistency as a proxy for confidence. We propose data-free black-box UQ methods that aggregate similarities between generations for confidence estimation.
arXiv Detail & Related papers (2025-06-27T01:53:15Z)
PredictaBoard: Benchmarking LLM Score Predictability [50.47497036981544]
Large Language Models (LLMs) often fail unpredictably. This poses a significant challenge to ensuring their safe deployment. We present PredictaBoard, a novel collaborative benchmarking framework.
arXiv Detail & Related papers (2025-02-20T10:52:38Z)
Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis. To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
Learnability, Sample Complexity, and Hypothesis Class Complexity for Regression Models [10.66048003460524]
This work is inspired by the foundation of PAC and is motivated by the existing regression learning issues. The proposed approach, denoted by epsilon-Confidence Approximately Correct (epsilon CoAC), utilizes Kullback-Leibler divergence (relative entropy). It enables the learner to compare hypothesis classes of different complexity orders and choose among them the optimum with the minimum epsilon.
arXiv Detail & Related papers (2023-03-28T15:59:12Z)
Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence [45.9949173746044]
We show that large-size pre-trained language models (PLMs) do not satisfy the logical negation property (LNP). We propose a novel intermediate training task, named meaning-matching, designed to directly learn a meaning-text correspondence. We find that the task enables PLMs to learn lexical semantic information.
arXiv Detail & Related papers (2022-05-08T08:37:36Z)
L2R2: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task ($\alpha$NLI) is proposed to evaluate the abductive reasoning ability of a learning system. A novel $L2R2$ approach is proposed under the learning-to-rank framework. Experiments on the ART dataset reach the state-of-the-art on the public leaderboard.
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
Stopping criterion for active learning based on deterministic generalization bounds [4.518012967046983]
We propose a criterion for automatically stopping active learning. The proposed stopping criterion is based on the difference in the expected generalization errors and hypothesis testing. We demonstrate the effectiveness of the proposed method via experiments with both artificial and real datasets.
arXiv Detail & Related papers (2020-05-15T08:15:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.