Parent-Guided Adaptive Reliability (PGAR): A Behavioural Meta-Learning Framework for Stable and Trustworthy AI
- URL: http://arxiv.org/abs/2601.06167v1
- Date: Wed, 07 Jan 2026 06:02:34 GMT
- Title: Parent-Guided Adaptive Reliability (PGAR): A Behavioural Meta-Learning Framework for Stable and Trustworthy AI
- Authors: Anshum Rankawat
- Abstract summary: Parent-Guided Adaptive Reliability (PGAR) is a lightweight behavioural meta-learning framework. It adds a supervisory "parent" layer on top of a standard learner to improve stability, calibration, and recovery under disturbances. PGAR functions as a plug-in reliability layer for existing optimization and learning pipelines, supporting interpretable traces in safety-relevant settings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parent-Guided Adaptive Reliability (PGAR) is a lightweight behavioural meta-learning framework that adds a supervisory "parent" layer on top of a standard learner to improve stability, calibration, and recovery under disturbances. PGAR computes three reflex-level signals (incident detection, overconfidence correction, and recovery memory) and fuses them into a bounded reliability index in [0,1]. This index continuously modulates the learner's effective learning rate, reducing update magnitude during instability and restoring it as reliability improves. We provide a Lyapunov-based proof sketch establishing bounded adaptation of the reliability dynamics under mild assumptions (smooth loss, descent direction, and bounded reflex outputs). Empirical evaluations on representative learning tasks show improved calibration, reduced loss variance, and faster recovery compared to standard optimizers, while retaining computational simplicity. PGAR functions as a plug-in reliability layer for existing optimization and learning pipelines, supporting interpretable reliability traces in safety-relevant settings.
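The mechanism described in the abstract can be illustrated with a minimal sketch. The class below is a hypothetical interpretation, not the paper's implementation: the three reflex signals (incident detection, overconfidence correction, recovery memory) are given simple illustrative definitions, fused into a reliability index clamped to [0,1], which then scales the base learning rate. All names and thresholds here are assumptions.

```python
class PGARModulator:
    """Illustrative sketch of a PGAR-style reliability layer (assumed design,
    not the paper's exact formulation).

    Fuses three reflex-level signals into a bounded reliability index
    r in [0, 1], which modulates the learner's effective learning rate.
    """

    def __init__(self, base_lr=0.1, spike_threshold=2.0, recovery=0.05):
        self.base_lr = base_lr
        self.spike_threshold = spike_threshold  # loss-spike ratio treated as an incident
        self.recovery = recovery                # per-step recovery-memory gain
        self.prev_loss = None
        self.r = 1.0                            # reliability index, kept in [0, 1]

    def step(self, loss, confidence=None, accuracy=None):
        # Reflex 1: incident detection -- a sudden loss spike lowers reliability.
        incident = 0.0
        if self.prev_loss is not None and loss > self.spike_threshold * self.prev_loss:
            incident = 0.5
        # Reflex 2: overconfidence correction -- penalize confidence above accuracy.
        overconf = 0.0
        if confidence is not None and accuracy is not None:
            overconf = max(0.0, confidence - accuracy)
        # Reflex 3: recovery memory -- reliability drifts back up while stable.
        self.r = min(1.0, max(0.0, self.r - incident - overconf + self.recovery))
        self.prev_loss = loss
        # The effective learning rate shrinks during instability and recovers after.
        return self.base_lr * self.r


mod = PGARModulator(base_lr=0.1)
lr_stable = mod.step(loss=1.0)  # no incident: reliability stays high
lr_spike = mod.step(loss=5.0)   # loss spike: reliability and lr drop
```

Under this reading, the index never leaves [0,1] because each update is clamped, which matches the bounded-adaptation property the paper's Lyapunov sketch is said to establish.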
Related papers
- VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction [55.04308051033549]
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a dominant paradigm for enhancing Large Language Model (LLM) reasoning. We introduce Verifier-Independent Curriculum Reinforcement Learning (VI-CuRL), a framework that leverages the model's intrinsic confidence to construct a curriculum independent of external verifiers.
arXiv Detail & Related papers (2026-02-13T03:40:52Z) - Uncertainty-aware Generative Recommendation [52.0751022792023]
Uncertainty-aware Generative Recommendation (UGR) is a unified framework that leverages uncertainty as a critical signal for adaptive optimization. UGR not only yields superior recommendation performance but also fundamentally stabilizes training, preventing the performance degradation often observed in standard methods.
arXiv Detail & Related papers (2026-02-12T08:48:51Z) - QuAIL: Quality-Aware Inertial Learning for Robust Training under Data Corruption [7.630511612007769]
We present QuAIL, a quality-informed training mechanism that incorporates feature reliability priors directly into the learning process. We show that QuAIL consistently improves average performance over neural baselines under both random and value-dependent corruption.
arXiv Detail & Related papers (2026-02-03T16:06:30Z) - Meta-Cognitive Reinforcement Learning with Self-Doubt and Recovery [25.522943543082363]
We propose a meta-cognitive reinforcement learning framework that enables an agent to assess, regulate, and recover its learning behavior. The proposed method introduces a meta-trust variable driven by Value Prediction Error Stability (VPES), which modulates learning dynamics via fail-safe regulation and gradual trust recovery.
arXiv Detail & Related papers (2026-01-28T02:43:03Z) - Adaptive Learning Guided by Bias-Noise-Alignment Diagnostics [0.7519872646378835]
This paper proposes a diagnostic-driven learning framework that explicitly models the adaptive evolution of error. These diagnostics are computed online from lightweight statistics of loss or temporal-difference (TD) error trajectories.
arXiv Detail & Related papers (2025-12-30T19:57:52Z) - SG-OIF: A Stability-Guided Online Influence Framework for Reliable Vision Data [6.4391040754741296]
In this paper, we introduce a Stability-Guided Online Influence Framework (SG-OIF) for approximating training-point influence on test predictions. We show that SG-OIF achieves 91.1% accuracy on the top 1% of prediction samples on CIFAR-10, and a 99.8% AUPR score on MNIST.
arXiv Detail & Related papers (2025-11-21T19:58:54Z) - MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics [72.00014675808228]
Instability in the evaluation process of Large Language Models obscures true learning dynamics. We introduce MaP, a framework that integrates Merging and the Pass@k metric. Experiments show that MaP yields significantly smoother performance curves, reduces inter-run variance, and ensures more consistent rankings.
arXiv Detail & Related papers (2025-10-10T11:40:27Z) - Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks [1.5374297736981706]
Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We introduce a trust-aware Deep Q-Network that integrates multi-signal evidence into client trust updates.
arXiv Detail & Related papers (2025-09-25T13:30:09Z) - Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations [67.35596444651037]
Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable. We propose a Reliable Test-time Adaptation (ReTA) method that enhances reliability from two perspectives.
arXiv Detail & Related papers (2025-07-13T05:37:33Z) - Aurora: Are Android Malware Classifiers Reliable and Stable under Distribution Shift? [51.12297424766236]
AURORA is a framework to evaluate malware classifiers based on their confidence quality and operational resilience. AURORA is complemented by a set of metrics designed to go beyond point-in-time performance. The fragility of SOTA frameworks across datasets of varying drift suggests the need for a return to the whiteboard.
arXiv Detail & Related papers (2025-05-28T20:22:43Z) - ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning [26.369237406972577]
Vision-centric semantic occupancy prediction plays a crucial role in autonomous driving.
There has still been little research into the reliability of predicting semantic occupancy from cameras.
We propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks.
arXiv Detail & Related papers (2024-09-26T16:33:16Z) - Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
arXiv Detail & Related papers (2022-01-28T16:53:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.