Obstructing Classification via Projection
- URL: http://arxiv.org/abs/2105.09047v1
- Date: Wed, 19 May 2021 10:28:15 GMT
- Title: Obstructing Classification via Projection
- Authors: Pantea Haghighatkhah, Wouter Meulemans, Bettina Speckmann, Jérôme
Urhausen, Kevin Verbeek
- Abstract summary: We study a geometric problem which models a possible approach for bias removal.
A priori we assume that it is "easy" to classify the data according to each property.
Our goal is to obstruct the classification according to one property by a suitable projection to a lower-dimensional Euclidean space R^m.
- Score: 2.456909016197174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning and data mining techniques are effective tools to classify
large amounts of data. But they tend to preserve any inherent bias in the data,
for example, with regard to gender or race. Removing such bias from data or
the learned representations is quite challenging. In this paper we study a
geometric problem which models a possible approach for bias removal. Our input
is a set of points P in Euclidean space R^d and each point is labeled with k
binary-valued properties. A priori we assume that it is "easy" to classify the
data according to each property. Our goal is to obstruct the classification
according to one property by a suitable projection to a lower-dimensional
Euclidean space R^m (m < d), while classification according to all other
properties remains easy.
What it means for classification to be easy depends on the classification
model used. We first consider classification by linear separability as employed
by support vector machines. We use Kirchberger's Theorem to show that, under
certain conditions, a simple projection to R^(d-1) suffices to eliminate the
linear separability of one of the properties whilst maintaining the linear
separability of the other properties. We also study the problem of maximizing
the linear "inseparability" of the chosen property. Second, we consider more
complex forms of separability and prove a connection between the number of
projections required to obstruct classification and the Helly-type properties
of such separabilities.
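As a rough, self-contained illustration of this setting (a minimal sketch on synthetic data, not the paper's Kirchberger-based construction), one can project out the normal vector of a linear SVM trained on the property to be obstructed and then re-test the separability of each property:

```python
# Minimal sketch (synthetic data; not the paper's construction):
# obstruct one binary property by projecting R^d onto the hyperplane
# orthogonal to the normal of an SVM trained on that property.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y_a = (X[:, 0] > 0).astype(int)  # property to obstruct (separable along axis 0)
y_b = (X[:, 1] > 0).astype(int)  # property to preserve (separable along axis 1)

def separable(X, y):
    # Proxy for linear separability: a near-hard-margin SVM fits perfectly.
    return LinearSVC(C=1e6, max_iter=50_000).fit(X, y).score(X, y) == 1.0

w = LinearSVC(C=1e6, max_iter=50_000).fit(X, y_a).coef_[0]
w /= np.linalg.norm(w)
X_proj = X - np.outer(X @ w, w)  # image lies in a (d-1)-dim subspace of R^d

print(separable(X, y_a), separable(X, y_b))            # True True
print(separable(X_proj, y_a), separable(X_proj, y_b))  # typically: False True
```

Here the projection maps R^5 onto a 4-dimensional subspace; the obstructed property typically loses its linear separability while the other keeps it, mirroring in spirit the guarantee the paper proves under its conditions.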
Related papers
- Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification [75.01389991485098]
We show that ridge regression has qualitatively different behavior depending on the scale of the cluster mean vector.
In regimes where the scale is very large, the conditions that allow for benign overfitting turn out to be the same as those for the regression task.
arXiv Detail & Related papers (2025-03-11T01:45:42Z)
- Distribution-Specific Agnostic Conditional Classification With Halfspaces [27.197384293699134]
We study "selective" or "conditional" classification problems under an agnostic setting.
We present the first PAC-learning algorithm for homogeneous halfspace selectors with error guarantee.
We prove that approximating conditional classification is at least as hard as approximating classification in both additive and multiplicative form.
arXiv Detail & Related papers (2025-01-31T21:29:20Z)
- Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with that of a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
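A heavily hedged sketch of one way such a combination could look (a logistic-regression Hessian stands in for the deep model's, and the top-k concatenation is our own illustrative choice, not necessarily the paper's):

```python
# Hypothetical sketch: build a projection basis from the leading
# eigenvectors of the data covariance and of a (logistic) loss Hessian.
import numpy as np
from sklearn.linear_model import LogisticRegression

def combined_basis(X, y, k=2):
    _, cov_vecs = np.linalg.eigh(np.cov(X, rowvar=False))  # ascending order
    p = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
    H = X.T @ (X * (p * (1 - p))[:, None])  # logistic-loss Hessian in weights
    _, hess_vecs = np.linalg.eigh(H)
    return np.hstack([cov_vecs[:, -k:], hess_vecs[:, -k:]])  # top-k of each
```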
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
- Relative intrinsic dimensionality is intrinsic to learning [49.5738281105287]
We introduce a new notion of the intrinsic dimension of a data distribution, which precisely captures the separability properties of the data.
For this intrinsic dimension, the rule of thumb above becomes a law: high intrinsic dimension guarantees highly separable data.
We show that this relative intrinsic dimension provides both upper and lower bounds on the probability of successfully learning and generalising in a binary classification problem.
arXiv Detail & Related papers (2023-10-10T10:41:45Z)
- Beyond Separability: Analyzing the Linear Transferability of Contrastive Representations to Related Subpopulations [50.33975968859988]
Contrastive learning is a highly effective method which uses unlabeled data to produce representations which are linearly separable for downstream classification tasks.
Recent works have shown that contrastive representations are not only useful when data come from a single domain, but are also effective for transferring across domains.
arXiv Detail & Related papers (2022-04-06T09:10:23Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
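A hedged sketch of the generic per-iteration class-dropping idea (the masking scheme and drop_frac below are our own illustrative choices, not necessarily the paper's):

```python
# Hypothetical sketch: cross-entropy that ignores a random subset of
# classes each training iteration, so features depend less on them.
import numpy as np

def class_dropped_cross_entropy(logits, labels, drop_frac=0.3, rng=None):
    rng = rng or np.random.default_rng()
    n, c = logits.shape
    keep = rng.random(c) >= drop_frac   # classes kept this iteration
    keep[labels] = True                 # never drop a sample's true class
    masked = np.where(keep, logits, -np.inf)
    z = masked - masked.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(p[np.arange(n), labels]).mean()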
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Classification of high-dimensional data with spiked covariance matrix structure [0.2741266294612775]
We study the classification problem for high-dimensional data with $n$ observations on $p$ features.
We propose an adaptive classifier that first performs dimension reduction on the feature vectors prior to classification in the dimensionally reduced space.
We show that the resulting classifier is Bayes optimal whenever $n \rightarrow \infty$ and $s \sqrt{n^{-1} \ln p} \rightarrow 0$.
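A minimal sketch of the generic reduce-then-classify pipeline this summary describes (PCA and LDA are stand-ins; the paper's adaptive estimator may differ):

```python
# Sketch: project onto estimated spiked directions, then classify there.
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

def reduced_space_classifier(n_components=2):
    # PCA estimates the low-dimensional spike; LDA classifies in that space.
    return make_pipeline(PCA(n_components=n_components),
                         LinearDiscriminantAnalysis())
```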
arXiv Detail & Related papers (2021-10-05T11:26:53Z)
- Multilevel orthogonal Bochner function subspaces with applications to robust machine learning [1.533771872970755]
We consider the data as instances of a random field within a relevant Bochner space.
Our key observation is that the classes can predominantly reside in two distinct subspaces.
arXiv Detail & Related papers (2021-10-04T22:01:01Z)
- Linear Classifiers in Mixed Constant Curvature Spaces [40.82908295137667]
We address the problem of linear classification in a product space form -- a mix of Euclidean, spherical, and hyperbolic spaces.
We prove that linear classifiers in $d$-dimensional constant curvature spaces can shatter exactly $d+1$ points.
We describe a novel perceptron classification algorithm, and establish rigorous convergence results.
arXiv Detail & Related papers (2021-02-19T23:29:03Z)
- Evaluating Fairness of Machine Learning Models Under Uncertain and Incomplete Information [25.739240011015923]
We show that the test accuracy of the attribute classifier is not always correlated with its effectiveness in bias estimation for a downstream model.
Our analysis has surprising and counter-intuitive implications where in certain regimes one might want to distribute the error of the attribute classifier as unevenly as possible.
arXiv Detail & Related papers (2021-02-16T19:02:55Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Signal classification using weighted orthogonal regression method [0.0]
This paper proposes a new classification method that exploits the intrinsic structure of each class through its eigencomponents.
The proposed method uses the eigenvectors obtained by SVD of the data from each class to select the basis of each class subspace.
It applies an efficient weighting to the decision criterion to discriminate between the two classes.
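A hedged sketch of a per-class SVD subspace classifier in this spirit (the unweighted residual rule below is the generic version; the paper's weighting is not reproduced here):

```python
# Sketch: keep each class's leading right-singular vectors as a basis,
# then assign a point to the class whose subspace reconstructs it best.
import numpy as np

def class_basis(X_class, rank=3):
    return np.linalg.svd(X_class, full_matrices=False)[2][:rank]  # (rank, d)

def predict(x, basis_0, basis_1):
    r0 = np.linalg.norm(x - basis_0.T @ (basis_0 @ x))  # residual to class 0
    r1 = np.linalg.norm(x - basis_1.T @ (basis_1 @ x))  # residual to class 1
    return int(r1 < r0)
```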
arXiv Detail & Related papers (2020-10-12T19:12:14Z)
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection [51.041763676948705]
Iterative Nullspace Projection (INLP) is a novel method for removing information from neural representations.
We show that our method is able to mitigate bias in word embeddings, as well as to increase fairness in a setting of multi-class classification.
arXiv Detail & Related papers (2020-04-16T14:02:50Z)
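A stripped-down sketch of the INLP iteration (the published method composes the successive projections more carefully; the probe choice below is illustrative):

```python
# Sketch: repeatedly fit a linear probe for the protected attribute z and
# project the representations onto the nullspace of its weight vector.
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, z, n_iters=10):
    X = X.copy()
    for _ in range(n_iters):
        w = LogisticRegression(max_iter=1000).fit(X, z).coef_[0]
        w /= np.linalg.norm(w)
        X = X - np.outer(X @ w, w)  # remove the direction the probe used
    return X
```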
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.