Model-Free Counterfactual Subset Selection at Scale
- URL: http://arxiv.org/abs/2502.08326v1
- Date: Wed, 12 Feb 2025 11:48:15 GMT
- Title: Model-Free Counterfactual Subset Selection at Scale
- Authors: Minh Hieu Nguyen, Viet Hung Doan, Anh Tuan Nguyen, Jun Jo, Quoc Viet Hung Nguyen
- Abstract summary: Streaming explanations offer adaptive, real-time insights without requiring persistent storage of the entire dataset.
Our algorithm operates efficiently in streaming settings, maintaining $O(\log k)$ update complexity per item.
Empirical evaluations on both real-world and synthetic datasets demonstrate superior performance over baseline methods.
- Score: 11.646993755965006
- License:
- Abstract: Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical constraint in real-time environments where data flows continuously. In contrast, streaming explanations offer adaptive, real-time insights without requiring persistent storage of the entire dataset. This work introduces a scalable, model-free approach to selecting diverse and relevant counterfactual examples directly from observed data. Our algorithm operates efficiently in streaming settings, maintaining $O(\log k)$ update complexity per item while ensuring high-quality counterfactual selection. Empirical evaluations on both real-world and synthetic datasets demonstrate superior performance over baseline methods, with robust behavior even under adversarial conditions.
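The abstract does not spell out the data structure behind the $O(\log k)$ per-item update, so the following is only a minimal, hypothetical sketch of one way a streaming counterfactual selector with that cost could be organized: a bounded max-heap keeps the $k$ observed instances whose label differs from the query's and that score best under a simple Euclidean relevance measure, so each streamed item costs a single heap operation. The class name `StreamingCounterfactualSelector`, the distance-only score, and the absence of any diversity term are illustrative assumptions, not the paper's actual method.

```python
import heapq
import math
from typing import List, Tuple

class StreamingCounterfactualSelector:
    """Hypothetical sketch: keep the k closest opposite-label instances seen so far.

    Each call to update() costs O(log k) thanks to a bounded max-heap
    (emulated with negated distances on Python's min-heap).
    """

    def __init__(self, k: int, query: List[float], query_label: int):
        self.k = k
        self.query = query
        self.query_label = query_label
        # Heap entries are (-distance, instance); the farthest kept candidate
        # sits at the root and is the first to be evicted.
        self._heap: List[Tuple[float, List[float]]] = []

    def _distance(self, x: List[float]) -> float:
        # Plain Euclidean distance as a stand-in relevance score.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(self.query, x)))

    def update(self, x: List[float], label: int) -> None:
        """Process one streamed item in O(log k)."""
        if label == self.query_label:
            return  # same label as the query: not a counterfactual candidate
        d = self._distance(x)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (-d, x))      # O(log k)
        elif d < -self._heap[0][0]:
            heapq.heapreplace(self._heap, (-d, x))   # O(log k), evicts the worst

    def counterfactuals(self) -> List[List[float]]:
        # Return kept candidates ordered from closest to farthest.
        return [x for _, x in sorted(self._heap, reverse=True)]

# Toy usage: stream four labeled points past a query with label 0 and keep k=2.
if __name__ == "__main__":
    sel = StreamingCounterfactualSelector(k=2, query=[0.2, 0.7], query_label=0)
    for x, y in [([0.1, 0.8], 1), ([0.9, 0.1], 1), ([0.3, 0.6], 0), ([0.25, 0.65], 1)]:
        sel.update(x, y)
    print(sel.counterfactuals())
```

Because only a size-$k$ heap is retained, storage stays constant regardless of stream length, which matches the abstract's point about avoiding persistent storage of the entire dataset; how relevance and diversity are actually scored in the paper is not reproduced here.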
Related papers
- Testing Generalizability in Causal Inference [3.547529079746247]
There is no formal procedure for statistically evaluating generalizability in machine learning algorithms.
We propose a systematic and quantitative framework for evaluating model generalizability in causal inference settings.
By basing simulations on real data, our method ensures more realistic evaluations, which are often missing in current work.
arXiv Detail & Related papers (2024-11-05T11:44:00Z)
- How to Leverage Diverse Demonstrations in Offline Imitation Learning [39.24627312800116]
Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data.
We introduce a simple yet effective data selection method that identifies positive behaviors based on their resultant states.
We then devise a lightweight behavior cloning algorithm capable of leveraging the expert and selected data correctly.
arXiv Detail & Related papers (2024-05-24T04:56:39Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Model-based Offline Imitation Learning with Non-expert Data [7.615595533111191]
We propose a scalable model-based offline imitation learning algorithmic framework that leverages datasets collected by both suboptimal and optimal policies.
We show that the proposed method always outperforms Behavioral Cloning in the low-data regime on simulated continuous control domains.
arXiv Detail & Related papers (2022-06-11T13:08:08Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, showing better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Training Deep Normalizing Flow Models in Highly Incomplete Data Scenarios with Prior Regularization [13.985534521589257]
We propose a novel framework to facilitate the learning of data distributions in high paucity scenarios.
The proposed framework naturally stems from posing the process of learning from incomplete data as a joint optimization task.
arXiv Detail & Related papers (2021-04-03T20:57:57Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality, and efficiency of the designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)