Interpret3C: Interpretable Student Clustering Through Individualized Feature Selection
- URL: http://arxiv.org/abs/2407.11979v1
- Date: Tue, 28 May 2024 15:32:05 GMT
- Title: Interpret3C: Interpretable Student Clustering Through Individualized Feature Selection
- Authors: Isadora Salles, Paola Mejia-Domenzain, Vinitra Swamy, Julian Blackwell, Tanja Käser
- Abstract summary: Clustering in education, particularly in large-scale online environments like MOOCs, is essential for understanding and adapting to diverse student needs.
Existing clustering approaches often neglect individual differences in feature importance and rely on a homogenized feature set.
We introduce Interpret3C, a novel clustering pipeline that incorporates interpretable neural networks (NNs) in an unsupervised learning context.
- Score: 2.428031280449541
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering in education, particularly in large-scale online environments like MOOCs, is essential for understanding and adapting to diverse student needs. However, the effectiveness of clustering depends on its interpretability, which becomes challenging with high-dimensional data. Existing clustering approaches often neglect individual differences in feature importance and rely on a homogenized feature set. Addressing this gap, we introduce Interpret3C (Interpretable Conditional Computation Clustering), a novel clustering pipeline that incorporates interpretable neural networks (NNs) in an unsupervised learning context. This method leverages adaptive gating in NNs to select features for each student. Then, clustering is performed using the most relevant features per student, enhancing clusters' relevance and interpretability. We use Interpret3C to analyze the behavioral clusters considering individual feature importances in a MOOC with over 5,000 students. This research contributes to the field by offering a scalable, robust clustering methodology and an educational case study that respects individual student differences and improves interpretability for high-dimensional data.
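A rough sketch of the pipeline described above, per-student feature gating followed by clustering on each student's most relevant features, is given below. It is a minimal illustration rather than the authors' implementation: the sigmoid gate, the reconstruction objective used to train it, the sparsity weight, and the hard top-k selection are all assumptions standing in for the paper's adaptive gating mechanism.

```python
# Hypothetical sketch: per-student feature gating + clustering (not the authors' code).
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class FeatureGate(nn.Module):
    """Small network producing a per-student importance score for every input feature."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, n_features), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)  # gates in (0, 1), one per feature per student

def gated_clustering(X, n_clusters=4, top_k=5, epochs=200, lr=1e-2):
    """Train the gate with a simple reconstruction objective (an assumption),
    then cluster each student on their top-k gated features."""
    X_t = torch.tensor(X, dtype=torch.float32)
    gate, decoder = FeatureGate(X.shape[1]), nn.Linear(X.shape[1], X.shape[1])
    opt = torch.optim.Adam(list(gate.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        g = gate(X_t)
        recon = decoder(g * X_t)                                # reconstruct from gated features
        loss = ((recon - X_t) ** 2).mean() + 1e-2 * g.mean()    # sparsity pressure on the gates
        loss.backward()
        opt.step()
    with torch.no_grad():
        g = gate(X_t).numpy()
    mask = np.zeros_like(g)                                     # hard selection: keep each student's
    np.put_along_axis(mask, np.argsort(-g, axis=1)[:, :top_k], 1.0, axis=1)  # top-k features only
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X * mask)
    return labels, mask

# Toy usage with synthetic behavioral features (e.g., clicks, video time, quiz attempts):
X = np.random.rand(5000, 12).astype(np.float32)
labels, mask = gated_clustering(X)
```

The per-student mask is what carries the interpretability: inspecting which features survive the gate for the students in a cluster is what allows the clusters to be described in behavioral terms.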
Related papers
- Explaining Clustering of Ecological Momentary Assessment Data Through Temporal and Feature Attention [4.951599300340955]
Ecological Momentary Assessment (EMA) studies offer rich individual data on psychopathology-relevant variables in real-time.
This paper proposes an attention-based interpretable framework to identify the important time-points and variables that play primary roles in distinguishing between clusters.
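A generic sketch of such temporal and feature attention is shown below purely for illustration; the scoring layers, pooling, and dimensions are assumptions and are not claimed to match the paper's architecture.

```python
# Illustrative sketch of attention over time-points and variables (not the paper's model).
import torch
import torch.nn as nn

class TimeFeatureAttention(nn.Module):
    """Scores each time-point and each variable of an EMA sequence; the learned
    weights can be read as importance maps when characterizing a cluster."""
    def __init__(self, n_features):
        super().__init__()
        self.time_scorer = nn.Linear(n_features, 1)               # one score per time-point
        self.feat_scorer = nn.Parameter(torch.zeros(n_features))  # one score per variable

    def forward(self, x):                                              # x: (batch, time, features)
        time_w = torch.softmax(self.time_scorer(x).squeeze(-1), dim=1)  # (batch, time)
        feat_w = torch.softmax(self.feat_scorer, dim=0)                 # (features,)
        weighted = x * time_w.unsqueeze(-1) * feat_w                    # re-weight the inputs
        return weighted.sum(dim=1), time_w, feat_w                      # pooled: (batch, features)

# Toy usage: 8 EMA sequences, 20 time-points, 6 variables.
pooled, time_w, feat_w = TimeFeatureAttention(6)(torch.randn(8, 20, 6))
```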
arXiv Detail & Related papers (2024-05-08T07:09:43Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship between real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - XAI for Self-supervised Clustering of Wireless Spectrum Activity [0.5809784853115825]
We propose a methodology for explaining deep clustering architectures based on self-supervised learning.
For the representation learning part, our methodology employs Guided Backpropagation to interpret the regions of interest of the input data.
For the clustering part, the methodology relies on Shallow Trees to explain the clustering result.
Finally, a data-specific visualization component connects each cluster to the input data through the relevant features.
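The shallow-tree step is easy to illustrate: fit a depth-limited decision tree to predict the cluster labels from the input features and read off its rules. The clusterer, depth, and synthetic data below are placeholders, not the paper's setup.

```python
# Hypothetical illustration of explaining clusters with a shallow decision tree.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.random.rand(1000, 8)                                 # stand-in for spectrum-activity features
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)

tree = DecisionTreeClassifier(max_depth=3).fit(X, labels)   # a handful of readable rules
print(export_text(tree, feature_names=[f"f{i}" for i in range(X.shape[1])]))
print("agreement with the clusters:", tree.score(X, labels))
```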
arXiv Detail & Related papers (2023-05-17T08:56:43Z) - Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess the learnability of representations.
CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means.
We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes.
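Cluster Learnability as summarized here, KNN accuracy on K-means pseudo-labels, takes only a few lines of scikit-learn; the split size, k, and number of clusters below are arbitrary illustrative choices rather than the paper's settings.

```python
# Sketch of the Cluster Learnability (CL) measure under assumed hyperparameters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def cluster_learnability(Z, n_clusters=10, k=5):
    """How well a KNN recovers K-means labels of the representations Z (higher is better)."""
    pseudo = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Z)
    Z_tr, Z_te, y_tr, y_te = train_test_split(Z, pseudo, test_size=0.5, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=k).fit(Z_tr, y_tr)
    return knn.score(Z_te, y_te)

Z = np.random.randn(2000, 64)    # stand-in for learned representations
print("CL:", cluster_learnability(Z))
```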
arXiv Detail & Related papers (2022-06-02T19:05:13Z) - Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have proven effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z) - Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z) - Fast and Interpretable Consensus Clustering via Minipatch Learning [0.0]
We develop IMPACC: Interpretable MiniPatch Adaptive Consensus Clustering.
We develop adaptive sampling schemes for observations, which result in both improved reliability and computational savings.
Results show that our approach yields more accurate and interpretable cluster solutions.
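Consensus clustering over minipatches, i.e. random subsets of observations and features, can be sketched as below; uniform sampling stands in for the paper's adaptive sampling schemes, all sizes are placeholders, and the precomputed-metric agglomerative step assumes scikit-learn >= 1.2.

```python
# Simplified minipatch consensus clustering (uniform sampling; the paper samples adaptively).
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

def minipatch_consensus(X, n_clusters=3, n_patches=100, row_frac=0.5, col_frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    co = np.zeros((n, n))        # times two observations were clustered together
    seen = np.zeros((n, n))      # times two observations appeared in the same minipatch
    for _ in range(n_patches):
        rows = rng.choice(n, size=int(row_frac * n), replace=False)
        cols = rng.choice(p, size=int(col_frac * p), replace=False)
        labels = KMeans(n_clusters=n_clusters, n_init=5).fit_predict(X[np.ix_(rows, cols)])
        same = (labels[:, None] == labels[None, :]).astype(float)
        co[np.ix_(rows, rows)] += same
        seen[np.ix_(rows, rows)] += 1.0
    consensus = np.divide(co, seen, out=np.zeros_like(co), where=seen > 0)
    final = AgglomerativeClustering(n_clusters=n_clusters, metric="precomputed",
                                    linkage="average").fit_predict(1.0 - consensus)
    return final, consensus      # the consensus matrix itself is the interpretable artifact

labels, consensus = minipatch_consensus(np.random.rand(300, 20))
```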
arXiv Detail & Related papers (2021-10-05T22:39:28Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the proposed model (TCC) is trained end-to-end, requiring no alternating steps.
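One common way to reparametrize categorical assignment variables so that everything trains end-to-end is the Gumbel-softmax relaxation; the sketch below uses that trick purely for illustration and is not claimed to match TCC's exact formulation.

```python
# Illustrative reparameterized cluster assignment via Gumbel-softmax (an assumption, not TCC itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAssigner(nn.Module):
    """Maps a representation to a differentiable, relaxed one-hot cluster assignment."""
    def __init__(self, dim, n_clusters, tau=0.5):
        super().__init__()
        self.logits = nn.Linear(dim, n_clusters)
        self.tau = tau

    def forward(self, z):
        # Gradients flow through the relaxed sample, so no alternating
        # assignment/update steps are required during training.
        return F.gumbel_softmax(self.logits(z), tau=self.tau, hard=False)

assign = SoftAssigner(128, 10)(torch.randn(16, 128))   # (16, 10), rows sum to 1
```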
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation [0.0]
We propose a clustering-friendly representation learning method using instance discrimination and feature decorrelation.
In evaluations of image clustering using CIFAR-10 and ImageNet-10, our method achieves accuracy of 81.5% and 95.4%, respectively.
arXiv Detail & Related papers (2021-05-31T22:59:31Z) - Unsupervised Visual Representation Learning by Online Constrained K-Means [44.38989920488318]
Cluster discrimination is an effective pretext task for unsupervised representation learning.
We propose a novel clustering-based pretext task with online Constrained K-means (CoKe).
Our online assignment method has a theoretical guarantee to approach the global optimum.
arXiv Detail & Related papers (2021-05-24T20:38:32Z) - A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
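A minimal sketch of aggregating a variable-size set against a trainable reference through an entropic optimal transport (Sinkhorn) plan is given below; the cost normalization, regularization strength, and iteration count are illustrative assumptions, not the paper's exact construction.

```python
# Illustrative OT-based set aggregation with a trainable reference (not the paper's implementation).
import torch
import torch.nn as nn

class OTAggregation(nn.Module):
    """Embeds a set of n elements into a fixed-size output by transporting them
    onto m trainable reference points with a Sinkhorn transport plan."""
    def __init__(self, dim, n_refs, eps=0.1, n_iters=30):
        super().__init__()
        self.refs = nn.Parameter(torch.randn(n_refs, dim))
        self.eps, self.n_iters = eps, n_iters

    def forward(self, x):                                    # x: (n, dim), one input set
        cost = torch.cdist(x, self.refs) ** 2                # (n, m) squared distances
        cost = cost / cost.max().clamp(min=1e-8)             # normalize for numerical stability
        K = torch.exp(-cost / self.eps)
        a = torch.full((x.shape[0],), 1.0 / x.shape[0])      # uniform mass on the input set
        b = torch.full((self.refs.shape[0],), 1.0 / self.refs.shape[0])
        u = torch.ones_like(a)
        for _ in range(self.n_iters):                        # Sinkhorn fixed-point iterations
            v = b / (K.t() @ u)
            u = a / (K @ v)
        plan = u[:, None] * K * v[None, :]                   # (n, m) transport plan
        return plan.t() @ x                                  # (m, dim) fixed-size embedding

out = OTAggregation(32, n_refs=8)(torch.randn(50, 32))       # a set of 50 elements -> (8, 32)
```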
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.