Arbor: A Framework for Reliable Navigation of Critical Conversation Flows
- URL: http://arxiv.org/abs/2602.14643v2
- Date: Tue, 17 Feb 2026 16:44:27 GMT
- Title: Arbor: A Framework for Reliable Navigation of Critical Conversation Flows
- Authors: Luís Silva, Diogo Gonçalves, Catarina Farinha, Clara Matos, Luís Ungaro,
- Abstract summary: We present Arbor, a framework that decomposes decision tree navigation into specialized, node-level tasks. Arbor improves mean turn accuracy by 29.4 percentage points, reduces per-turn latency by 57.1%, and achieves an average 14.4x reduction in per-turn cost.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models struggle to maintain strict adherence to structured workflows in high-stakes domains such as healthcare triage. Monolithic approaches that encode entire decision structures within a single prompt are prone to instruction-following degradation as prompt length increases, including lost-in-the-middle effects and context window overflow. To address this gap, we present Arbor, a framework that decomposes decision tree navigation into specialized, node-level tasks. Decision trees are standardized into an edge-list representation and stored for dynamic retrieval. At runtime, a directed acyclic graph (DAG)-based orchestration mechanism iteratively retrieves only the outgoing edges of the current node, evaluates valid transitions via a dedicated LLM call, and delegates response generation to a separate inference step. The framework is agnostic to the underlying decision logic and model provider. We evaluate Arbor against single-prompt baselines across 10 foundation models using annotated turns from real clinical triage conversations. Arbor improves mean turn accuracy by 29.4 percentage points, reduces per-turn latency by 57.1%, and achieves an average 14.4x reduction in per-turn cost. These results indicate that architectural decomposition reduces dependence on intrinsic model capability, enabling smaller models to match or exceed larger models operating under single-prompt baselines.
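The abstract's runtime loop (retrieve only the current node's outgoing edges, evaluate the transition with a dedicated LLM call, then generate the response in a separate step) can be sketched as follows. This is a minimal illustration based only on the abstract; all names (`Edge`, `evaluate_transition`, `generate_response`) are hypothetical, and the LLM calls are replaced with stand-in logic so the sketch runs without a model:

```python
# Hypothetical sketch of Arbor-style node-level decision-tree navigation.
# Not the authors' actual API: Edge, evaluate_transition, and
# generate_response are illustrative names invented for this example.
from dataclasses import dataclass

@dataclass
class Edge:
    source: str     # current node id
    target: str     # next node id
    condition: str  # natural-language transition criterion

# Decision tree standardized into an edge-list representation.
EDGES = [
    Edge("triage_start", "ask_symptoms", "conversation has just begun"),
    Edge("ask_symptoms", "emergency", "patient reports chest pain"),
    Edge("ask_symptoms", "self_care", "patient reports mild cold symptoms"),
]

def outgoing_edges(node: str) -> list[Edge]:
    """Dynamic retrieval: fetch only the edges leaving the current node,
    instead of placing the whole tree in one prompt."""
    return [e for e in EDGES if e.source == node]

def evaluate_transition(edges: list[Edge], user_turn: str) -> Edge:
    """Stand-in for the dedicated LLM call that selects a valid transition.
    Here: naive keyword matching against each edge's condition."""
    turn = user_turn.lower()
    for e in edges:
        if any(word in turn for word in e.condition.lower().split()[-2:]):
            return e
    return edges[0]  # fall back to the first edge

def generate_response(node: str) -> str:
    """Stand-in for the separate response-generation inference step."""
    return f"[response generated for node '{node}']"

def navigate_turn(current: str, user_turn: str) -> tuple[str, str]:
    edges = outgoing_edges(current)              # local context only
    chosen = evaluate_transition(edges, user_turn)
    return chosen.target, generate_response(chosen.target)

node, reply = navigate_turn("ask_symptoms", "I have chest pain")
```

The point of the decomposition is visible in `navigate_turn`: each turn's prompt context is bounded by the out-degree of the current node rather than the size of the whole tree, which is what the paper credits for the accuracy, latency, and cost gains over single-prompt baselines.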
Related papers
- Recursive Concept Evolution for Compositional Reasoning in Large Language Models [0.0]
Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning. We propose Recursive Concept Evolution (RCE), a framework that enables pretrained language models to modify their internal representation geometry during inference. RCE yields 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.
arXiv Detail & Related papers (2026-02-17T17:01:42Z) - Breaking the Static Graph: Context-Aware Traversal for Robust Retrieval-Augmented Generation [12.71443292660797]
We propose CatRAG, Context-Aware Traversal for robust RAG. CatRAG builds on the HippoRAG 2 architecture and transforms the static KG into a query-adaptive navigation structure. Experiments across four multi-hop benchmarks demonstrate that CatRAG consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-02T11:13:38Z) - Theoretical Foundations of Prompt Engineering: From Heuristics to Expressivity [0.0]
We study the family of functions obtainable by holding a Transformer backbone fixed as an executor and varying only the prompt. We prove a constructive existential result showing that a single fixed backbone can approximate a broad class of target behaviors via prompts alone.
arXiv Detail & Related papers (2025-12-14T13:42:20Z) - Unleashing Degradation-Carrying Features in Symmetric U-Net: Simpler and Stronger Baselines for All-in-One Image Restoration [52.82397287366076]
All-in-one image restoration aims to handle diverse degradations (e.g., noise, blur, adverse weather) within a unified framework. In this work, we reveal a critical insight: well-crafted feature extraction inherently encodes degradation-carrying information. Our symmetric design preserves intrinsic degradation signals robustly, rendering simple additive fusion in skip connections sufficient.
arXiv Detail & Related papers (2025-12-11T12:20:31Z) - Tiny Recursive Models on ARC-AGI-1: Inductive Biases, Identity Conditioning, and Test-Time Compute [0.0]
We empirically analyze the ARC Prize TRM checkpoint on ARC-AGI-1. We show that test-time augmentation and majority-vote ensembling account for a substantial fraction of reported performance. We also compare TRM with a naive QLoRA fine-tune of Llama 3 8B on canonical ARC-AGI-1.
arXiv Detail & Related papers (2025-12-04T06:20:44Z) - LLM-guided Hierarchical Retrieval [54.73080745446999]
LATTICE is a hierarchical retrieval framework that enables an LLM to reason over and navigate large corpora with logarithmic search complexity. A central challenge in such LLM-guided search is that the model's relevance judgments are noisy, context-dependent, and unaware of the hierarchy. Our framework achieves state-of-the-art zero-shot performance on the reasoning-intensive BRIGHT benchmark.
arXiv Detail & Related papers (2025-10-15T07:05:17Z) - Eigen-1: Adaptive Multi-Agent Refinement with Monitor-Based RAG for Scientific Reasoning [53.45095336430027]
We develop a unified framework that combines implicit retrieval and structured collaboration. On Humanity's Last Exam (HLE) Bio/Chem Gold, our framework achieves 48.3% accuracy. Results on SuperGPQA and TRQA confirm robustness across domains.
arXiv Detail & Related papers (2025-09-25T14:05:55Z) - TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling [65.46347858249295]
TreePO is a self-guided rollout algorithm that views sequence generation as a tree-structured searching process. TreePO essentially reduces the per-update compute burden while preserving or enhancing exploration diversity.
arXiv Detail & Related papers (2025-08-24T16:52:37Z) - COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z) - DepGraph: Towards Any Structural Pruning [68.40343338847664]
We study general structural pruning of arbitrary architecture like CNNs, RNNs, GNNs and Transformers.
We propose a general and fully automatic method, Dependency Graph (DepGraph), to explicitly model the dependency between layers and comprehensively group parameters for pruning.
In this work, we extensively evaluate our method on several architectures and tasks, including ResNe(X)t, DenseNet, MobileNet and Vision transformer for images, GAT for graph, DGCNN for 3D point cloud, alongside LSTM for language, and demonstrate that, even with a
arXiv Detail & Related papers (2023-01-30T14:02:33Z) - Neural Transition System for End-to-End Opinion Role Labeling [13.444895891262844]
Unified opinion role labeling (ORL) aims to detect all possible opinion structures of 'opinion-holder-target' in one shot, given a text.
We propose a novel solution by revisiting the transition architecture and augmenting it with a pointer network (PointNet).
The framework parses out all opinion structures in linear-time complexity and, with PointNet, breaks through the limitation on term length.
arXiv Detail & Related papers (2021-10-05T12:45:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.