Implicit Regularization for Multi-label Feature Selection
- URL: http://arxiv.org/abs/2411.11436v1
- Date: Mon, 18 Nov 2024 10:08:05 GMT
- Title: Implicit Regularization for Multi-label Feature Selection
- Authors: Dou El Kefel Mansouri, Khalid Benabdeslem, Seif-Eddine Benkabou,
- Abstract summary: We address the problem of feature selection in the context of multi-label learning by using a new estimator based on implicit regularization and label embedding.
Experimental results on some known benchmark datasets suggest that the proposed estimator suffers much less from extra bias, and may lead to benign overfitting.
- Score: 1.5771347525430772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of feature selection in the context of multi-label learning by using a new estimator based on implicit regularization and label embedding. Unlike sparse feature selection methods that use a penalized estimator with explicit regularization terms such as the $\ell_{2,1}$-norm, MCP, or SCAD, we propose a simple alternative based on Hadamard product parameterization. To guide the feature selection process, a latent semantics of multi-label information method is adopted as a label embedding. Experimental results on some known benchmark datasets suggest that the proposed estimator suffers much less from extra bias, and may lead to benign overfitting.
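The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of implicit regularization through a Hadamard product parameterization, not the authors' algorithm. It assumes a plain least-squares loss, a weight matrix written as W = U * V (elementwise), a small constant initialization for U with V starting at zero, and a synthetic real-valued Y standing in for an embedded label matrix. The function name `hadamard_feature_scores` and all hyperparameters are hypothetical.

```python
import numpy as np


def hadamard_feature_scores(X, Y, alpha=1e-3, lr=0.05, n_steps=2000):
    """Gradient descent on 0.5/n * ||X (U * V) - Y||_F^2 with W = U * V.

    No explicit penalty is used; any sparsity comes from the small
    initialization and the multiplicative structure of the updates.
    """
    n, d = X.shape
    k = Y.shape[1]
    # Assumed initialization: U small and constant, V zero, so W starts at 0.
    U = np.full((d, k), alpha)
    V = np.zeros((d, k))
    for _ in range(n_steps):
        residual = X @ (U * V) - Y      # shape (n, k)
        G = X.T @ residual / n          # gradient with respect to W
        # Chain rule through the Hadamard product, updated simultaneously.
        U, V = U - lr * G * V, V - lr * G * U
    W = U * V
    # Score each feature by the Euclidean norm of its row of W.
    return np.linalg.norm(W, axis=1)


# Synthetic data (an assumption for illustration): only the first
# five features carry signal for the three target columns.
rng = np.random.default_rng(0)
n, d, k, n_informative = 200, 30, 3, 5
X = rng.standard_normal((n, d))
W_true = np.zeros((d, k))
signs = rng.choice([-1.0, 1.0], size=(n_informative, k))
W_true[:n_informative] = signs * rng.uniform(1.0, 2.0, size=(n_informative, k))
Y = X @ W_true + 0.1 * rng.standard_normal((n, k))

scores = hadamard_feature_scores(X, Y)
top_features = np.argsort(scores)[::-1][:n_informative]
print(sorted(top_features.tolist()))
```

In this sketch, features are ranked by the row norms of the learned W, which mirrors how row-sparse penalties such as the $\ell_{2,1}$-norm are typically read out, but here no penalty term appears in the loss. The step size and number of iterations are tuned to this toy problem and would need adjusting for real data.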
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z) - Embedded Multi-label Feature Selection via Orthogonal Regression [45.55795914923279]
State-of-the-art embedded multi-label feature selection algorithms based on least square regression cannot preserve sufficient discriminative information in multi-label data.
A novel embedded multi-label feature selection method is proposed to facilitate the multi-label feature selection.
Extensive experimental results on ten multi-label data sets demonstrate the effectiveness of GRROOR.
arXiv Detail & Related papers (2024-03-01T06:18:40Z) - Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z) - Combining Self-labeling with Selective Sampling [2.0305676256390934]
This work combines self-labeling techniques with active learning in a selective sampling scenario.
We show that naive application of self-labeling can harm performance by introducing bias towards selected classes.
The proposed method matches current selective sampling methods or achieves better results.
arXiv Detail & Related papers (2023-01-11T11:58:45Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - Random Manifold Sampling and Joint Sparse Regularization for Multi-label Feature Selection [0.0]
The model proposed in this paper can obtain the most relevant few features by solving the joint constrained optimization problems of $\ell_{2,1}$ and $\ell_F$ regularization.
Comparative experiments on real-world data sets show that the proposed method outperforms other methods.
arXiv Detail & Related papers (2022-04-13T15:06:12Z) - Unbiased Loss Functions for Multilabel Classification with Missing Labels [2.1549398927094874]
Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks.
This paper derives the unique unbiased estimators for the different multilabel reductions.
arXiv Detail & Related papers (2021-09-23T10:39:02Z) - Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z) - SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning [87.27700889147144]
We propose to select a small subset of labels as landmarks that are easy to predict from the input (predictable) and can recover the other possible labels well (representative).
We employ the Alternating Direction Method (ADM) to solve our problem. Empirical studies on real-world datasets show that our method achieves superior classification performance over other state-of-the-art methods.
arXiv Detail & Related papers (2020-08-16T11:07:44Z) - A Compressive Classification Framework for High-Dimensional Data [12.284934135116515]
We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size.
The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis.
It has the ability to select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule.
arXiv Detail & Related papers (2020-05-09T06:55:00Z) - Saliency-based Weighted Multi-label Linear Discriminant Analysis [101.12909759844946]
We propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks.
The proposed method is based on a probabilistic model for defining the weights of individual samples.
The Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
arXiv Detail & Related papers (2020-04-08T19:40:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.