The Promotion Wall: Efficiency-Equity Trade-offs of Direct Promotion Regimes in Engineering Education
- URL: http://arxiv.org/abs/2511.17182v1
- Date: Fri, 21 Nov 2025 12:04:31 GMT
- Title: The Promotion Wall: Efficiency-Equity Trade-offs of Direct Promotion Regimes in Engineering Education
- Authors: H. R. Paz
- Abstract summary: This article uses a calibrated agent-based model to examine how alternative progression regimes reconfigure dropout, time-to-degree, equity, and students' psychological experience.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Progression and assessment rules are often treated as administrative details, yet they fundamentally shape who is allowed to remain in higher education, and on what terms. This article uses a calibrated agent-based model to examine how alternative progression regimes reconfigure dropout, time-to-degree, equity and students' psychological experience in a long, tightly sequenced engineering programme. Building on a leakage-aware longitudinal dataset of 1,343 students and a Kaplan-Meier survival analysis of time-to-dropout, we simulate three policy scenarios: (A) a historical "regularity + finals" regime, where students accumulate exam debt; (B) a direct-promotion regime that removes regularity and finals but requires full course completion each term; and (C) a direct-promotion regime complemented by a capacity-limited remedial "safety net" for marginal failures in bottleneck courses. The model is empirically calibrated to reproduce the observed dropout curve under Scenario A and then used to explore counterfactuals. Results show that direct promotion creates a "promotion wall": attrition becomes sharply front-loaded in the first two years, overall dropout rises, and equity gaps between low- and high-resilience students widen, even as exam debt disappears. The safety-net scenario partially dismantles this wall: it reduces dropout and equity gaps relative to pure direct promotion and yields the lowest final stress levels, at the cost of additional, targeted teaching capacity. These findings position progression rules as central objects of assessment policy rather than neutral background. The article argues that claims of improved efficiency are incomplete unless they are evaluated jointly with inclusion, equity and students' psychological wellbeing, and it illustrates how simulation-based decision support can help institutions rehearse assessment reforms before implementing them.
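To make the scenario comparison concrete, below is a minimal agent-based sketch of the three progression regimes. It is not the authors' calibrated model: the names (Student, pass_probability, simulate), the cohort size, and every numeric parameter are illustrative assumptions, whereas the paper calibrates its model to reproduce the observed Scenario A dropout curve from its 1,343-student dataset.

import random
from dataclasses import dataclass

random.seed(42)

N_STUDENTS = 1000        # illustrative cohort (the paper's dataset has 1,343 students)
N_TERMS = 12             # a long, tightly sequenced programme (~6 years of terms)
COURSES_PER_TERM = 4
SAFETY_NET_SEATS = 50    # capacity limit of the remedial track in Scenario C

@dataclass
class Student:
    resilience: float            # latent trait; low values model at-risk students
    exam_debt: int = 0           # pending finals accumulated under Scenario A
    stress: float = 0.0
    dropout_term: int | None = None

def pass_probability(s: Student) -> float:
    """Hypothetical per-course pass chance: resilience eroded by stress."""
    return max(0.05, min(0.95, s.resilience - 0.05 * s.stress))

def simulate(regime: str) -> list[Student]:
    cohort = [Student(resilience=random.gauss(0.75, 0.15)) for _ in range(N_STUDENTS)]
    for term in range(1, N_TERMS + 1):
        seats = SAFETY_NET_SEATS                      # remedial capacity resets each term
        for s in cohort:
            if s.dropout_term is not None:
                continue
            failures = sum(random.random() > pass_probability(s)
                           for _ in range(COURSES_PER_TERM))
            if regime == "A":
                s.exam_debt += failures               # failures defer into exam debt
                s.stress += 0.10 * s.exam_debt
                at_risk = s.exam_debt > 6             # attrition only once debt piles up
            else:
                at_risk = failures > 0                # direct promotion: clear everything now
                if regime == "C" and failures == 1 and seats > 0:
                    seats -= 1                        # marginal failure caught by safety net
                    at_risk = False
                s.stress += 0.30 * failures           # pressure is front-loaded
            if at_risk and random.random() < 0.15 * (1.5 - s.resilience):
                s.dropout_term = term
    return cohort

for regime in ("A", "B", "C"):
    cohort = simulate(regime)
    dropped = [s for s in cohort if s.dropout_term is not None]
    early = sum(s.dropout_term <= 4 for s in dropped)
    print(f"Scenario {regime}: dropout {len(dropped)/len(cohort):.1%}, "
          f"share within first two years {early/max(1, len(dropped)):.1%}")

The per-student dropout_term values (with still-enrolled students right-censored at the horizon) are exactly the durations a Kaplan-Meier estimator, for instance lifelines' KaplanMeierFitter, would consume to draw the survival curves the paper compares; the "promotion wall" would appear as a steep early drop in the Scenario B curve.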
Related papers
- A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization [58.116300485427764]
Reinforcement learning post-training can elicit reasoning behaviors in large language models.
Token-level correction often leads to unstable training dynamics when the degree of off-policyness is large.
We propose a simple yet effective objective, Minimum Prefix Ratio (MinPRO).
arXiv Detail & Related papers (2026-01-30T08:47:19Z) - Stable On-Policy Distillation through Adaptive Target Reformulation [7.361248172930405]
Veto is an objective-level reformulation that constructs a geometric bridge in the logit space.
Veto consistently outperforms supervised fine-tuning and existing on-policy baselines.
arXiv Detail & Related papers (2026-01-12T02:57:39Z) - CAPIRE Intervention Lab: An Agent-Based Policy Simulation Environment for Curriculum-Constrained Engineering Programmes [0.0]
Engineering programmes in Latin America produce dropout rates that remain stubbornly high despite increasingly accurate early-warning models.
Predictive learning analytics can identify students at risk, but they offer limited guidance on which concrete combinations of policies should be implemented.
This paper presents the CAPIRE Intervention Lab, an agent-based simulation environment designed to complement predictive models.
arXiv Detail & Related papers (2025-11-22T18:14:15Z) - An Agent-Based Simulation of Regularity-Driven Student Attrition: How Institutional Time-to-Live Constraints Create a Dropout Trap in Higher Education [0.0]
"The Regularity Trap" is a phenomenon where rigid assessment timelines decouple learning from accreditation.<n>We operationalize the CAPIRE framework into a calibrated Agent-Based Model (ABM) simulating 1,343 student trajectories across a 42-course Civil Engineering curriculum.<n>Results reveal that 86.4% of observed dropouts are driven by normative mechanisms (expiry cascades) rather than purely academic failure.
arXiv Detail & Related papers (2025-11-20T11:21:39Z) - Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training [76.12556589212666]
We show that curriculum post-training avoids the exponential complexity bottleneck.
Under outcome-only reward signals, reinforcement learning finetuning achieves high accuracy with sample complexity.
We establish guarantees for test-time scaling, where curriculum-aware querying reduces both reward oracle calls and sampling cost from exponential to order.
arXiv Detail & Related papers (2025-11-10T18:29:54Z) - Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning [77.92320830700797]
Reinforcement Learning has played a central role in enabling reasoning capabilities of Large Language Models.
We propose a tractable computational framework that tracks and leverages curvature information during policy updates.
The algorithm, Curvature-Aware Policy Optimization (CAPO), identifies samples that contribute to unstable updates and masks them out.
arXiv Detail & Related papers (2025-10-01T12:29:32Z) - Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling [41.834250664485666]
Large reasoning models generate excessively long reasoning paths without any performance benefit.
Existing solutions that penalize length often fail, inducing performance degradation.
We introduce a novel framework, DECS, built on our theoretical discovery of two previously unaddressed flaws in current length rewards.
arXiv Detail & Related papers (2025-09-30T06:04:43Z) - PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models [54.18605375476406]
We introduce Proportionate Credit Policy Optimization (PCPO), a framework that enforces proportional credit assignment through a stable objective reformulation and a principled reweighting of timesteps.
PCPO substantially outperforms existing policy gradient baselines on all fronts, including the state-of-the-art DanceGRPO.
arXiv Detail & Related papers (2025-09-30T04:43:58Z) - Hierarchical Decomposition of Prompt-Based Continual Learning:
Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice.
HiDe-Prompt is an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics.
Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z) - Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning [11.084321518414226]
We adapt existing importance-sampling ratio estimation techniques for off-policy evaluation to drastically improve the stability and efficiency of so-called hindsight policy methods.
Our hindsight distribution correction facilitates stable, efficient learning across a broad range of environments where credit assignment plagues baseline methods.
arXiv Detail & Related papers (2023-07-21T20:54:52Z) - CRISP: Curriculum Inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning [25.84621883831624]
CRISP is a curriculum-driven framework that tackles instability in hierarchical reinforcement learning.
It adaptively re-labels expert demonstrations so that the generated subgoals are always reachable by the current low-level primitive.
It improves success rates by more than 40% over strong hierarchical and flat baselines.
arXiv Detail & Related papers (2023-04-07T08:22:50Z) - Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement
Learning [0.0]
We revisit the estimation bias in policy gradients for the discounted episodic Markov decision process (MDP) from a Deep Reinforcement Learning perspective.
One major source of this bias is state distribution shift.
We show that, despite such state distribution shift, the policy gradient estimation bias can be reduced in the following three ways.
arXiv Detail & Related papers (2023-01-20T06:46:43Z) - DDPG++: Striving for Simplicity in Continuous-control Off-Policy
Reinforcement Learning [95.60782037764928]
First, we show that simple Deterministic Policy Gradient works remarkably well as long as the overestimation bias is controlled.
Second, we pinpoint training instabilities, typical of off-policy algorithms, to the greedy policy update step.
Third, we show that ideas in the propensity estimation literature can be used to importance-sample transitions from replay buffer and update policy to prevent deterioration of performance.
arXiv Detail & Related papers (2020-06-26T20:21:12Z) - Corruption-robust exploration in episodic reinforcement learning [76.19192549843727]
We study multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system.
Our framework yields efficient algorithms which attain near-optimal regret in the absence of corruptions.
Notably, our work provides the first sublinear regret guarantee which accommodates any deviation from purely i.i.d. transitions in the bandit-feedback model for episodic reinforcement learning.
arXiv Detail & Related papers (2019-11-20T03:49:13Z)