Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data
- URL: http://arxiv.org/abs/2602.20152v1
- Date: Mon, 23 Feb 2026 18:59:04 GMT
- Title: Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data
- Authors: Zhenyao Ma, Yue Liang, Dongxu Li
- Abstract summary: Behavior Learning (BL) learns interpretable and identifiable optimization structures from data. BL unifies predictive performance, interpretability, and identifiability, with broad applicability to scientific domains involving optimization.
- Score: 17.826786061390962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by behavioral science, we propose Behavior Learning (BL), a novel general-purpose machine learning framework that learns interpretable and identifiable optimization structures from data, ranging from single optimization problems to hierarchical compositions. It unifies predictive performance, intrinsic interpretability, and identifiability, with broad applicability to scientific domains involving optimization. BL parameterizes a compositional utility function built from intrinsically interpretable modular blocks, which induces a data distribution for prediction and generation. Each block represents and can be written in symbolic form as a utility maximization problem (UMP), a foundational paradigm in behavioral science and a universal framework of optimization. BL supports architectures ranging from a single UMP to hierarchical compositions, the latter modeling hierarchical optimization structures. Its smooth and monotone variant (IBL) guarantees identifiability. Theoretically, we establish the universal approximation property of BL, and analyze the M-estimation properties of IBL. Empirically, BL demonstrates strong predictive performance, intrinsic interpretability and scalability to high-dimensional data. Code: https://github.com/MoonYLiang/Behavior-Learning ; install via pip install blnetwork.
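To make the central abstraction concrete, the sketch below implements a toy single-block version of the idea in plain NumPy: a parameterized utility whose argmax gives hard predictions, and whose softmax relaxation gives a smooth, differentiable choice distribution loosely mirroring the IBL variant. All class and method names here are hypothetical illustrations, not the blnetwork API.

```python
# Hypothetical toy sketch of a single utility maximization problem (UMP)
# block; none of these names come from the blnetwork package.
import numpy as np

class UMPBlock:
    """Predicts the alternative maximizing a linear utility u(x) = theta @ x."""

    def __init__(self, n_features: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.theta = rng.normal(size=n_features)  # interpretable utility weights

    def utility(self, alternatives: np.ndarray) -> np.ndarray:
        # alternatives: (n_alternatives, n_features) -> one utility per row
        return alternatives @ self.theta

    def predict(self, alternatives: np.ndarray) -> int:
        # Hard UMP: argmax of the utility, i.e. symbolic "choose the best" behavior.
        return int(np.argmax(self.utility(alternatives)))

    def choice_probs(self, alternatives: np.ndarray, temp: float = 1.0) -> np.ndarray:
        # Smooth, monotone relaxation (in the spirit of the IBL variant): a
        # softmax over utilities gives a differentiable choice distribution
        # that can be fit by maximum likelihood (an M-estimator).
        u = self.utility(alternatives) / temp
        e = np.exp(u - u.max())
        return e / e.sum()

# Usage: three alternatives described by two features each.
block = UMPBlock(n_features=2)
X = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
print(block.predict(X))       # index of the utility-maximizing alternative
print(block.choice_probs(X))  # smooth variant; blocks like this could be
                              # composed hierarchically, one block's output
                              # feeding another block's features
```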
Related papers
- Structural Compositional Function Networks: Interpretable Functional Compositions for Tabular Discovery [4.8369208007394215]
We propose Structural Compositional Function Networks (StructuralCFN), a novel architecture that imposes a Relation-Aware Inductive Bias via a differentiable structural prior. Our framework enables Structured Knowledge Integration, allowing domain-specific relational priors to be injected directly into the architecture to guide discovery. We evaluate StructuralCFN across a rigorous 10-fold cross-validation suite on 18 benchmarks, demonstrating statistically significant improvements.
arXiv Detail & Related papers (2026-01-27T20:20:07Z)
- Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models [48.83701310501069]
Large Language Models (LLMs) offer a transformative approach to Neural Architecture Search (NAS). We formulate the search as a sequence of conditional code generation tasks, where an LLM refines architectural specifications based on performance telemetry. We generate a vast corpus of valid, shape-consistent architectures via Abstract Syntax Tree (AST) mutations. Experimental results on CIFAR-100 validate the efficacy of this approach, demonstrating statistically significant improvements in accuracy.
arXiv Detail & Related papers (2026-01-13T13:00:30Z)
- Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression [55.51959317490934]
Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding. We argue that graphs inherently contain rich structural and semantic information, and that their effective exploitation can unlock potential gains in LLM reasoning performance. We propose Homophily-aware Structural and Semantic Compression for LLMs (HS2C), a framework centered on exploiting graph homophily.
arXiv Detail & Related papers (2026-01-13T03:35:18Z)
- LSRIF: Logic-Structured Reinforcement Learning for Instruction Following [56.517329105764475]
We propose LSRIF, a logic-structured training framework that explicitly models instruction logic. Experiments show that LSRIF brings significant improvements in instruction following and general reasoning.
arXiv Detail & Related papers (2026-01-10T05:11:38Z)
- MAGIC-Flow: Multiscale Adaptive Conditional Flows for Generation and Interpretable Classification [0.0]
MAGIC-Flow is a conditional multiscale normalizing flow architecture that performs generation and classification within a single framework. We show how this design ensures exact likelihood and stable optimization, while invertibility enables explicit visualization of sample likelihoods. We evaluate MAGIC-Flow against top baselines using metrics for similarity, fidelity, and diversity.
arXiv Detail & Related papers (2025-10-24T23:11:25Z)
- Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC. We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss. Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
- LOCAL: Learning with Orientation Matrix to Infer Causal Structure from Time Series Data [51.47827479376251]
LOCAL is a highly efficient, easy-to-implement, and constraint-free method for recovering dynamic causal structures. It introduces two adaptive modules, Asymptotic Causal Mask Learning (ACML) and Dynamic Graph Parameter Learning (DGPL). Experiments on synthetic and real-world datasets demonstrate that LOCAL significantly outperforms existing methods.
arXiv Detail & Related papers (2024-10-25T10:48:41Z)
- Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices [88.33936714942996]
We present a unifying framework that enables searching among all linear operators expressible via an Einstein summation.
We show that differences in the compute-optimal scaling laws are mostly governed by a small number of variables.
Our resulting Mixture-of-Experts (MoE) variant learns an MoE in every single linear layer of the model, including the projections in the attention blocks (see the einsum sketch below).
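As a rough illustration of what "a linear operator expressible via an Einstein summation" means (a sketch under our own assumptions, not the paper's code), the snippet below writes a Kronecker-structured layer as a single einsum and checks it against the equivalent dense matrix:

```python
# Hedged illustration of a structured linear operator written as an Einstein
# summation, in the spirit of the operator space this framework searches over.
import numpy as np

rng = np.random.default_rng(0)
d1, d2 = 4, 8                    # input reshaped to a (d1, d2) grid; d = d1 * d2

# Two small factors act along separate axes; total parameters d1^2 + d2^2,
# versus (d1 * d2)^2 for an unstructured dense matrix.
A = rng.normal(size=(d1, d1))
B = rng.normal(size=(d2, d2))

x = rng.normal(size=d1 * d2)
X = x.reshape(d1, d2)

# One einsum expresses the operator: Y[a, b] = sum_{i, j} A[a, i] B[b, j] X[i, j]
Y = np.einsum("ai,bj,ij->ab", A, B, X)

# Equivalent dense matrix: the Kronecker product (A kron B) applied to x.
dense = np.kron(A, B)
assert np.allclose(Y.reshape(-1), dense @ x)
```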
arXiv Detail & Related papers (2024-10-03T00:44:50Z)
- Comparative Code Structure Analysis using Deep Learning for Performance Prediction [18.226950022938954]
This paper assesses the feasibility of using purely static information about an application (e.g., its abstract syntax tree, or AST) to predict performance change based on a change in code structure; a sketch of such static AST features follows this entry.
Our evaluations of several deep embedding learning methods demonstrate that tree-based Long Short-Term Memory (LSTM) models can leverage the hierarchical structure of source code to discover latent representations, achieving up to 84% accuracy on individual problems and 73% on a combined dataset with multiple problems.
arXiv Detail & Related papers (2021-02-12T16:59:12Z)
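For a flavor of the purely static inputs this approach relies on, here is a minimal sketch using Python's standard ast module; note the actual paper feeds full trees to a tree-based LSTM rather than the flat summary counts shown here, which are our own illustrative choice.

```python
# Extract simple structural features from a Python AST: node-type counts
# plus tree depth. These are the kind of purely static signals a learned
# model can map to a predicted performance change.
import ast
from collections import Counter

def ast_features(source: str) -> dict:
    """Summarize code structure as node-type counts plus tree depth."""
    tree = ast.parse(source)
    counts = Counter(type(node).__name__ for node in ast.walk(tree))

    def depth(node: ast.AST) -> int:
        children = list(ast.iter_child_nodes(node))
        return 1 + (max(map(depth, children)) if children else 0)

    return {"depth": depth(tree), **counts}

# Two structurally different versions of the same computation.
before = "def f(xs):\n    return [x * x for x in xs]"
after = "def f(xs):\n    out = []\n    for x in xs:\n        out.append(x * x)\n    return out"

print(ast_features(before))
print(ast_features(after))
```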