A Computational Model for Logical Analysis of Data
- URL: http://arxiv.org/abs/2207.05664v1
- Date: Tue, 12 Jul 2022 16:47:59 GMT
- Title: A Computational Model for Logical Analysis of Data
- Authors: Danièle Gardy, Frédéric Lardeux, and Frédéric Saubion
- Abstract summary: LAD constitutes an interesting rule-based learning alternative to classic statistical learning techniques.
We propose several models for representing the data set of observations, according to the information we have on it.
Analytic Combinatorics allows us to express the desired probabilities as ratios of generating function coefficients.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Initially introduced by Peter Hammer, Logical Analysis of Data (LAD) is a
methodology that aims at computing a logical justification for dividing a set of
observations into two groups, usually called the positive and negative
groups. This partition into positive and negative groups can be viewed as the
description of a partially defined Boolean function; the data is then processed
to identify a subset of attributes whose values may be used to characterize
the observations of the positive group against those of the negative group.
LAD constitutes an interesting rule-based learning alternative to classic
statistical learning techniques and has many practical applications.
Nevertheless, the computation of group characterization may be costly,
depending on the properties of the data instances. A major aim of our work is
to provide effective tools for speeding up the computations, by computing an
a priori probability that a given set of attributes does characterize
the positive and negative groups. To this effect, we propose several models for
representing the data set of observations, according to the information we have
on it. These models, and the probabilities they allow us to compute, are also
helpful for quickly assessing some properties of the real data at hand;
furthermore, they may help us to better analyze and understand the computational
difficulties encountered by the solving methods.
Once our models have been established, the mathematical tools for computing
probabilities come from Analytic Combinatorics. They allow us to express the
desired probabilities as ratios of generating function coefficients, which
then provide a quick computation of their numerical values. A further,
long-range goal of this paper is to show that the methods of Analytic
Combinatorics can help in analyzing the performance of various algorithms in
LAD and related fields.
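As a minimal, illustrative sketch (not code from the paper), the snippet below makes the characterization task described in the abstract concrete: a candidate subset of attributes characterizes the two groups when no positive and no negative observation agree on all attributes in that subset. The function names and the toy data are hypothetical.

```python
from itertools import combinations

def characterizes(positive, negative, attrs):
    """True iff no positive and negative observation agree on every attribute in `attrs`."""
    project = lambda obs: tuple(obs[a] for a in attrs)
    return {project(p) for p in positive}.isdisjoint({project(n) for n in negative})

def smallest_characterizing_subset(positive, negative):
    """Brute-force search for a smallest characterizing attribute subset
    (exponential in the number of attributes; for illustration only)."""
    n_attrs = len(positive[0])
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            if characterizes(positive, negative, attrs):
                return attrs
    return None  # the two groups overlap: no subset can separate them

# Toy partially defined Boolean function: rows are 0/1 attribute vectors.
positive = [(1, 0, 1), (1, 1, 1)]
negative = [(0, 0, 1), (0, 1, 0)]
print(smallest_characterizing_subset(positive, negative))  # -> (0,)
```

The brute-force search is exponential in the number of attributes, which illustrates why an a priori probability that a subset of a given size already characterizes the groups can be useful for speeding up such computations.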
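The final step mentioned in the abstract, reading a probability off as a ratio of generating function coefficients, can be carried out mechanically with a computer algebra system. The sketch below uses standard textbook generating functions (binary strings avoiding the pattern "11" versus all binary strings) purely to illustrate the coefficient-extraction mechanics; the generating functions actually derived in the paper are not reproduced here.

```python
import sympy as sp

z = sp.symbols('z')

def coefficient(gf, n):
    """Extract [z^n] gf via a truncated series expansion around z = 0."""
    return sp.series(gf, z, 0, n + 1).removeO().coeff(z, n)

# Illustrative (textbook) generating functions, NOT the ones derived in the paper:
# binary strings of length n avoiding the pattern "11", versus all 2^n strings.
favourable = (1 + z) / (1 - z - z**2)   # ordinary GF of the "no 11" counts
total = 1 / (1 - 2 * z)                 # ordinary GF of 2^n

n = 10
probability = sp.Rational(coefficient(favourable, n), coefficient(total, n))
print(probability, float(probability))  # 9/64 0.140625
```

Once closed-form generating functions are available, this kind of coefficient extraction yields numerical values quickly, which matches the workflow the abstract describes.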
Related papers
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z)
- Generating collective counterfactual explanations in score-based classification via mathematical optimization [4.281723404774889]
A counterfactual explanation of an instance indicates how this instance should be minimally modified so that the perturbed instance is classified in the desired class.
Most of the Counterfactual Analysis literature focuses on the single-instance single-counterfactual setting.
By means of novel Mathematical Optimization models, we provide a counterfactual explanation for each instance in a group of interest.
arXiv Detail & Related papers (2023-10-19T15:18:42Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data [81.43750358586072]
We propose Data-IQ, a framework to systematically stratify examples into subgroups with respect to their outcomes.
We experimentally demonstrate the benefits of Data-IQ on four real-world medical datasets.
arXiv Detail & Related papers (2022-10-24T08:57:55Z)
- A Fresh Approach to Evaluate Performance in Distributed Parallel Genetic Algorithms [5.375634674639956]
This work proposes a novel approach to evaluate and analyze the behavior of multi-population parallel genetic algorithms (PGAs).
In particular, we deeply study their numerical and computational behavior by proposing a mathematical model representing the observed performance curves.
The conclusions based on the real figures and the numerical models fitting them represent a fresh way of understanding their speed-up, running time, and numerical effort.
arXiv Detail & Related papers (2021-06-18T05:07:14Z)
- Grouped Feature Importance and Combined Features Effect Plot [2.15867006052733]
Interpretable machine learning has become a very active area of research due to the rising popularity of machine learning algorithms.
We provide a comprehensive overview of how existing model-agnostic techniques can be defined for feature groups to assess the grouped feature importance.
We introduce the combined features effect plot, which is a technique to visualize the effect of a group of features based on a sparse, interpretable linear combination of features.
arXiv Detail & Related papers (2021-04-23T16:27:38Z)
- Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
- On the Estimation of Complex Circuits Functional Failure Rate by Machine Learning Techniques [0.16311150636417257]
De-Rating or Vulnerability Factors are a major feature of failure analysis efforts mandated by today's Functional Safety requirements.
A new approach is proposed that uses Machine Learning to estimate the Functional De-Rating of individual flip-flops.
arXiv Detail & Related papers (2020-02-18T15:18:31Z)