Active Selection of Classification Features
- URL: http://arxiv.org/abs/2102.13636v1
- Date: Fri, 26 Feb 2021 18:19:08 GMT
- Title: Active Selection of Classification Features
- Authors: Thomas T. Kok and Rachel M. Brouwer and Rene M. Mandl and Hugo G. Schnack and Georg Krempl
- Abstract summary: Auxiliary data, such as demographics, might help in selecting a smaller sample that comprises the individuals with the most informative MRI scans.
We propose two utility-based approaches for this problem, and evaluate their performance on three public real-world benchmark datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Some data analysis applications involve datasets in which explanatory
variables are expensive or tedious to acquire, but auxiliary data are readily
available and might help to construct an insightful training set. An example is
neuroimaging research on mental disorders, specifically learning a
diagnosis/prognosis model based on variables derived from expensive Magnetic
Resonance Imaging (MRI) scans, which often requires large sample sizes.
Auxiliary data, such as demographics, might help in selecting a smaller sample
that comprises the individuals with the most informative MRI scans. In the active
learning literature, this problem has not yet been studied, despite promising
results in related problem settings that concern the selection of instances or
instance-feature pairs.
Therefore, we formulate this complementary problem of Active Selection of
Classification Features (ASCF): The primary task is to learn a
model f: x -> y that explains or predicts the relationship between an
expensive-to-acquire set of variables x and a class label y. The
ASCF task is then to use a set of readily available selection variables z to select
those instances that will improve the primary task's performance most when
their expensive features x are acquired and added to the primary training
set.
We propose two utility-based approaches for this problem, and evaluate their
performance on three public real-world benchmark datasets. In addition, we
illustrate the use of these approaches to efficiently acquire MRI scans in the
context of neuroimaging research on mental disorders, based on a simulated
study design with real MRI data.
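The ASCF setting described in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's two utility-based approaches, which the abstract does not specify. The function names (`utility`, `select_batch`, `acquire_features`), the toy `pool_z` data, and the distance-based utility are all hypothetical stand-ins chosen only to show the data flow: score candidates using their cheap variables z, pick the most promising ones, and only then pay for their expensive features x.

```python
import math


def utility(z, acquired_z):
    """Hypothetical utility (an assumption, not the paper's): distance in
    z-space to the nearest instance whose expensive features we already have."""
    if not acquired_z:
        return math.inf
    return min(math.dist(z, a) for a in acquired_z)


def select_batch(pool_z, acquired_ids, budget):
    """Greedily choose up to `budget` instance ids from the pool by utility.

    pool_z maps an integer instance id to its cheap selection variables z.
    acquired_ids are the ids whose expensive features x are already known.
    """
    acquired = set(acquired_ids)
    chosen = []
    for _ in range(budget):
        acquired_z = [pool_z[i] for i in acquired]
        candidates = [i for i in pool_z if i not in acquired]
        if not candidates:
            break
        # Ties are broken in favour of the smaller id.
        best = max(candidates, key=lambda i: (utility(pool_z[i], acquired_z), -i))
        chosen.append(best)
        acquired.add(best)
    return chosen


def acquire_features(ids, oracle):
    """Pay for the expensive features x (e.g. an MRI scan) of the chosen ids."""
    return {i: oracle(i) for i in ids}


if __name__ == "__main__":
    # Toy selection variables z = (age, sex); unscaled, for illustration only.
    pool_z = {0: (25.0, 0.0), 1: (26.0, 0.0), 2: (60.0, 1.0), 3: (40.0, 1.0)}
    picked = select_batch(pool_z, acquired_ids=[0], budget=2)
    # Placeholder oracle standing in for the expensive measurement.
    new_x = acquire_features(picked, oracle=lambda i: [float(i)] * 3)
    print(picked, new_x)
```

In a real study the utility would be replaced by a criterion tied to the primary task's expected performance gain, and the newly acquired x would be added to the primary training set before the model f is refit.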
Related papers
- The Relevance Feature and Vector Machine for health applications [0.11538034264098687]
This paper presents a novel model that addresses the challenges of the fat-data problem when dealing with clinical prospective studies.
The model capabilities are tested against state-of-the-art models in several medical datasets with fat-data problems.
arXiv Detail & Related papers (2024-02-11T01:21:56Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
But acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Learn to Ignore: Domain Adaptation for Multi-Site MRI Analysis [1.3079444139643956]
We present a novel method that learns to ignore the scanner-related features present in the images, while learning features relevant for the classification task.
Our method outperforms state-of-the-art domain adaptation methods on a classification task between Multiple Sclerosis patients and healthy subjects.
arXiv Detail & Related papers (2021-10-13T15:40:50Z)
- Shared Space Transfer Learning for analyzing multi-site fMRI data [83.41324371491774]
Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data.
MVPA works best with a well-designed feature set and an adequate sample size.
Most fMRI datasets are noisy, high-dimensional, expensive to collect, and with small sample sizes.
This paper proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning approach.
arXiv Detail & Related papers (2020-10-24T08:50:26Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, which aims to identify subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)