Max-Min Diversification with Fairness Constraints: Exact and
Approximation Algorithms
- URL: http://arxiv.org/abs/2301.02053v1
- Date: Thu, 5 Jan 2023 13:02:35 GMT
- Title: Max-Min Diversification with Fairness Constraints: Exact and
Approximation Algorithms
- Authors: Yanhao Wang and Michael Mathioudakis and Jia Li and Francesco Fabbri
- Abstract summary: We propose an exact algorithm that is suitable for small datasets as well as a $\frac{1-\varepsilon}{5}$-approximation algorithm for any $\varepsilon \in (0, 1)$ that scales to large datasets.
Experiments on real-world datasets demonstrate the superior performance of our proposed algorithms over existing ones.
- Score: 17.57585822765145
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diversity maximization aims to select a diverse and representative subset of
items from a large dataset. It is a fundamental optimization task that finds
applications in data summarization, feature selection, web search, recommender
systems, and elsewhere. However, in a setting where data items are associated
with different groups according to sensitive attributes like sex or race, it is
possible that algorithmic solutions for this task, if left unchecked, will
under- or over-represent some of the groups. Therefore, we are motivated to
address the problem of \emph{max-min diversification with fairness
constraints}, aiming to select $k$ items to maximize the minimum distance
between any pair of selected items while ensuring that the number of items
selected from each group falls within predefined lower and upper bounds. In
this work, we propose an exact algorithm based on integer linear programming
that is suitable for small datasets as well as a
$\frac{1-\varepsilon}{5}$-approximation algorithm for any $\varepsilon \in (0,
1)$ that scales to large datasets. Extensive experiments on real-world datasets
demonstrate the superior performance of our proposed algorithms over existing
ones.
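The objective and constraints in the abstract can be made concrete with a small sketch. The paper's exact method is an integer linear program and its scalable method is a $\frac{1-\varepsilon}{5}$-approximation; the brute-force enumerator below (its name and interface are illustrative assumptions, not the paper's) merely demonstrates the max-min objective and the per-group bounds on toy-sized inputs.

```python
from itertools import combinations

def fair_max_min(points, groups, k, lower, upper, dist):
    """Brute-force exact solver for max-min diversification with
    fairness constraints: choose k items maximizing the minimum
    pairwise distance, subject to the number of items picked from
    each group g lying in [lower[g], upper[g]].  Enumerates all
    k-subsets, so it is only usable on very small inputs."""
    best_set, best_val = None, float('-inf')
    for subset in combinations(range(len(points)), k):
        # Fairness check: per-group counts must stay within bounds.
        counts = {}
        for i in subset:
            counts[groups[i]] = counts.get(groups[i], 0) + 1
        if any(not (lower[g] <= counts.get(g, 0) <= upper[g]) for g in lower):
            continue
        # Objective: minimum pairwise distance within the subset.
        val = min((dist(points[i], points[j]) for i, j in combinations(subset, 2)),
                  default=float('inf'))
        if val > best_val:
            best_set, best_val = set(subset), val
    return best_set, best_val
```

Note how the bounds can override pure diversity: forcing both selections from one group makes the solver return a close pair from that group rather than the globally most distant pair.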
Related papers
- Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition.
Our experiments on the red fluorescent proteins show that our proposed method matches the baseline performance with 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z)
- Multi-objective Binary Coordinate Search for Feature Selection [0.24578723416255746]
We propose the binary multi-objective coordinate search (MOCS) algorithm to solve large-scale feature selection problems.
Results indicate the significant superiority of our method over NSGA-II on five large-scale real-world datasets.
arXiv Detail & Related papers (2024-02-20T00:50:26Z)
- Achieving Long-term Fairness in Submodular Maximization through Randomization [16.33001220320682]
It is important to implement fairness-aware algorithms when dealing with data items that may contain sensitive attributes like race or gender.
We investigate the problem of maximizing a monotone submodular function while meeting group fairness constraints.
arXiv Detail & Related papers (2023-04-10T16:39:19Z)
- Group Fairness in Non-monotone Submodular Maximization [19.29174615532181]
We study the non-monotone submodular problem subject to novel group fairness constraints.
We develop the first constant-factor approximation algorithms for this problem.
arXiv Detail & Related papers (2023-02-03T04:51:54Z)
- Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach [81.19737119343438]
Active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label.
We propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool.
Our strategy requires high-quality latent features which we obtain by unsupervised learning on the unlabeled pool.
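The core-set objective above is easiest to see on a toy case. Below is a minimal sketch of the discrete Wasserstein distance between two equally weighted point sets of the same size, computed by brute force over perfect matchings; the paper instead embeds this quantity in an integer optimization over the unlabeled pool, so the function here is an illustrative assumption, not their formulation.

```python
from itertools import permutations

def w1_bruteforce(xs, ys, dist):
    """Discrete 1-Wasserstein distance between two equally weighted
    point sets of the same size: the minimum average transport cost
    over all perfect matchings.  Brute force over permutations, so
    only usable on tiny inputs; practical implementations use
    optimal-transport or assignment solvers instead."""
    assert len(xs) == len(ys)
    n = len(xs)
    return min(
        sum(dist(xs[i], ys[p[i]]) for i in range(n)) / n
        for p in permutations(range(n))
    )
```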
arXiv Detail & Related papers (2021-06-05T21:25:03Z)
- Certifiably Polynomial Algorithm for Best Group Subset Selection [0.9667631210393929]
Best group subset selection aims to choose a small number of non-overlapping groups that achieve the best interpretability with respect to the response variable.
We propose a group-splicing algorithm that iteratively detects effective groups and excludes ineffective ones.
We demonstrate the efficiency and accuracy of our proposal by comparing against state-of-the-art algorithms on both synthetic and real-world datasets.
arXiv Detail & Related papers (2021-04-23T03:05:11Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume on turnstile streams that use space sublinear in $n$.
Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z)
- Optimizing Revenue while showing Relevant Assortments at Scale [1.0200170217746136]
Real-time assortment optimization has become essential in e-commerce operations.
We design fast and flexible algorithms that find the optimal assortment in difficult regimes.
Empirical validations using a real-world dataset show that our algorithms are competitive even when the number of items is $\sim 10^5$ ($10\times$ larger instances than previously studied).
arXiv Detail & Related papers (2020-03-06T20:16:49Z)
- Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equally reliable workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
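On a complete comparison graph, the least-squares quality estimate has a simple closed form that a short sketch can illustrate. The restriction to complete, skew-symmetric comparison data and the function names are simplifying assumptions made here; the paper's class of algorithms handles general comparison graphs.

```python
def lsq_qualities(Y):
    """Least-squares quality estimates from a complete, skew-symmetric
    matrix of pairwise comparisons Y, where Y[i][j] is a noisy estimate
    of q_i - q_j.  Minimizing sum_{i<j} (q_i - q_j - Y[i][j])^2 with the
    qualities centered (sum q = 0) gives the row-mean closed form
    q_i = (1/N) * sum_j Y[i][j].  On a general comparison graph this
    becomes a graph-Laplacian least-squares system instead."""
    n = len(Y)
    return [sum(row) / n for row in Y]

def ranking(q):
    """Indices of objects sorted from highest to lowest estimated quality."""
    return sorted(range(len(q)), key=lambda i: -q[i])
```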
arXiv Detail & Related papers (2020-02-26T16:19:09Z)
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.