Delphos: A reinforcement learning framework for assisting discrete choice model specification
- URL: http://arxiv.org/abs/2506.06410v2
- Date: Fri, 25 Jul 2025 13:23:22 GMT
- Title: Delphos: A reinforcement learning framework for assisting discrete choice model specification
- Authors: Gabriel Nova, Stephane Hess, Sander van Cranenburgh
- Abstract summary: We introduce Delphos, a deep reinforcement learning framework for assisting the discrete choice model specification process. In this setting, an agent learns to specify well-performing model candidates by choosing a sequence of modelling actions. We evaluate Delphos on both simulated and empirical datasets, varying the size of the modelling space and the reward function.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce Delphos, a deep reinforcement learning framework for assisting the discrete choice model specification process. Unlike traditional approaches that treat model specification as a static optimisation problem, Delphos represents a paradigm shift: it frames this specification challenge as a sequential decision-making problem, formalised as a Markov Decision Process. In this setting, an agent learns to specify well-performing model candidates by choosing a sequence of modelling actions - such as selecting variables, accommodating both generic and alternative-specific taste parameters, applying non-linear transformations, and including interactions with covariates - and interacting with a modelling environment that estimates each candidate and returns a reward signal. Specifically, Delphos uses a Deep Q-Network that receives delayed rewards based on modelling outcomes (e.g., log-likelihood) and behavioural expectations (e.g., parameter signs), and distributes rewards across the sequence of actions to learn which modelling decisions lead to well-performing candidates. We evaluate Delphos on both simulated and empirical datasets, varying the size of the modelling space and the reward function. To assess the agent's performance in navigating the model space, we analyse the learning curve, the distribution of Q-values, occupancy metrics, and Pareto fronts. Our results show that the agent learns to adaptively explore strategies to identify well-performing models across search spaces, even without prior domain knowledge. It efficiently explores large modelling spaces, concentrates its search in high-reward regions, and suggests candidates that define Pareto frontiers balancing model fit and behavioural plausibility. These findings highlight the potential of this novel adaptive, learning-based framework to assist in the model specification process.
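To make the sequential formulation above concrete, here is a minimal sketch of the kind of loop the abstract describes: a specification environment whose state records the modelling decisions taken so far, a delayed terminal reward mixing model fit (log-likelihood) with a behavioural check (parameter signs), and a small Deep Q-Network trained with one-step temporal-difference targets. The action list, the names (SpecEnv, estimate_candidate), and the reward weighting are illustrative assumptions rather than the authors' implementation, and the estimation step is a stub.

```python
# Illustrative sketch only (not the authors' code): a toy version of the sequential
# model-specification loop described in the abstract. The environment, action list,
# and reward weighting are hypothetical; the estimation step is a stub.
import random
import numpy as np
import torch
import torch.nn as nn

# Hypothetical modelling actions (add a variable, transform it, interact it, or stop).
ACTIONS = ["add_cost", "add_time", "log_transform", "income_interaction", "stop"]


def estimate_candidate(spec):
    """Stub standing in for estimating a discrete choice model for this specification."""
    fake_ll = -900.0 + 50.0 * spec[:4].sum() + random.gauss(0.0, 5.0)  # fake log-likelihood
    signs_ok = spec[0] > 0  # fake behavioural check (e.g., expected parameter sign)
    return fake_ll, signs_ok


class SpecEnv:
    """State = binary vector of modelling decisions taken so far; reward is delayed."""

    def __init__(self, max_steps=4):
        self.max_steps = max_steps

    def reset(self):
        self.state = np.zeros(len(ACTIONS), dtype=np.float32)
        self.steps = 0
        return self.state.copy()

    def step(self, action):
        self.state[action] = 1.0
        self.steps += 1
        done = ACTIONS[action] == "stop" or self.steps >= self.max_steps
        reward = 0.0
        if done:
            # Delayed reward: only the completed candidate is scored, mixing
            # model fit (log-likelihood) with behavioural expectations (signs).
            ll, signs_ok = estimate_candidate(self.state)
            reward = ll / 1000.0 + (1.0 if signs_ok else -1.0)
        return self.state.copy(), reward, done


# Minimal Deep Q-Network over the discrete action set.
qnet = nn.Sequential(nn.Linear(len(ACTIONS), 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))
optimizer = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma, epsilon = 0.95, 0.2
env = SpecEnv()

for _ in range(200):  # episodes
    state, done = env.reset(), False
    while not done:
        with torch.no_grad():
            q_values = qnet(torch.tensor(state))
        action = random.randrange(len(ACTIONS)) if random.random() < epsilon else int(q_values.argmax())
        next_state, reward, done = env.step(action)
        # One-step temporal-difference target: the terminal reward is propagated
        # backwards so earlier modelling decisions receive credit.
        with torch.no_grad():
            target = reward + (0.0 if done else gamma * qnet(torch.tensor(next_state)).max().item())
        loss = (qnet(torch.tensor(state))[action] - target) ** 2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        state = next_state
```

In Delphos itself the stub would be replaced by actually estimating each candidate discrete choice model, and the framework distributes the delayed reward across the whole action sequence; the one-step TD update here is simply the most compact way to show a terminal reward being propagated back through earlier modelling decisions.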
Related papers
- On the Reasoning Capacity of AI Models and How to Quantify It [0.0]
Large Language Models (LLMs) have intensified the debate surrounding the fundamental nature of their reasoning capabilities. While achieving high performance on benchmarks such as GPQA and MMLU, these models exhibit limitations in more complex reasoning tasks. We propose a novel phenomenological approach that goes beyond traditional accuracy metrics to probe the underlying mechanisms of model behavior.
arXiv Detail & Related papers (2025-01-23T16:58:18Z) - Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z) - FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction [26.26211464623954]
Federated Importance-Aware Submodel Extraction (FIARSE) is a novel approach that dynamically adjusts submodels based on the importance of model parameters. Compared to existing works, the proposed method offers a theoretical foundation for submodel extraction. Extensive experiments are conducted on various datasets to showcase the superior performance of the proposed FIARSE.
arXiv Detail & Related papers (2024-07-28T04:10:11Z) - Attitudes and Latent Class Choice Models using Machine learning [0.0]
We present a method for efficiently incorporating attitudinal indicators into the specification of Latent Class Choice Models (LCCM).
This formulation overcomes the limitations of structural equations in its capability of exploring the relationship between the attitudinal indicators and the choice decisions.
We test our proposed framework for estimating a Car-Sharing (CS) service subscription choice with stated preference data from Copenhagen, Denmark.
arXiv Detail & Related papers (2023-02-20T10:03:01Z) - When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structure information learning constraint to improve the quality of generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z) - Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z) - Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation [3.728946517493471]
MEEE is a model-ensemble method that consists of optimistic exploration and weighted exploitation.
Our approach outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.
arXiv Detail & Related papers (2021-07-05T07:18:20Z) - Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation [11.219641045667055]
We propose an end-to-end approach for model learning which directly optimizes the expected returns using implicit differentiation.
We provide theoretical and empirical evidence highlighting the benefits of our approach in the model misspecification regime compared to likelihood-based methods.
arXiv Detail & Related papers (2021-06-06T23:15:49Z) - An exact counterfactual-example-based approach to tree-ensemble models interpretability [0.0]
High-performance models do not exhibit the necessary transparency to make their decisions fully understandable.
We could derive an exact geometrical characterisation of their decision regions in the form of a collection of multidimensional intervals.
An adaptation to reasoning on regression problems is also envisaged.
arXiv Detail & Related papers (2021-05-31T09:32:46Z) - Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z) - Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints [16.48824312904122]
We develop a new framework for learning variational autoencoders and other deep generative models.
We show that these two contributions -- prediction constraints and consistency constraints -- lead to promising image classification performance.
arXiv Detail & Related papers (2020-12-12T04:18:50Z) - Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z) - On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)