Subgroup Discovery in MOOCs: A Big Data Application for Describing Different Types of Learners
- URL: http://arxiv.org/abs/2403.05555v1
- Date: Sat, 10 Feb 2024 16:07:38 GMT
- Title: Subgroup Discovery in MOOCs: A Big Data Application for Describing Different Types of Learners
- Authors: J. M. Luna, H. M. Fardoun, F. Padillo, C. Romero, S. Ventura
- Abstract summary: This paper aims to categorize and describe different types of learners in massive open online courses (MOOCs) by means of a subgroup discovery approach based on MapReduce.
The proposed subgroup discovery approach considers emerging parallel methodologies like MapReduce to be able to cope with extremely large datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The aim of this paper is to categorize and describe different types of learners in massive open online courses (MOOCs) by means of a subgroup discovery approach based on MapReduce. The final objective is to discover IF-THEN rules that appear in different MOOCs. The proposed subgroup discovery approach, which is an extension of the well-known FP-Growth algorithm, considers emerging parallel methodologies like MapReduce to be able to cope with extremely large datasets. As an additional feature, the proposal includes a threshold value to denote the number of courses that each discovered rule should satisfy. A post-processing step is also included so redundant subgroups can be removed. The experimental stage is carried out by considering de-identified data from the first year of 16 MITx and HarvardX courses on the edX platform. Experimental results demonstrate that the proposed MapReduce approach outperforms traditional sequential subgroup discovery approaches, achieving a runtime that is almost constant for different courses. Additionally, thanks to the final post-processing step, only interesting and non-redundant rules are discovered, hence reducing the number of subgroups by one or two orders of magnitude. Finally, the discovered subgroups are easily used by course instructors not only for descriptive purposes but also for additional tasks such as recommendation or personalization.
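The pipeline described in the abstract (per-course rule mining, a threshold on the number of courses a rule must hold in, and removal of redundant subgroups) can be illustrated with a small Python sketch. This is not the authors' implementation: it enumerates antecedents by brute force instead of extending FP-Growth, it simulates the map and reduce steps with plain in-process functions rather than a MapReduce cluster, and the quality measure (confidence), the redundancy criterion, the toy data, and all names (`map_course`, `reduce_rules`, `drop_redundant`, `min_courses`, and so on) are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def map_course(course_id, records, target, max_len=2):
    """Map step (assumed design): emit support and confidence of each
    candidate antecedent for a single course."""
    counts = defaultdict(lambda: [0, 0])  # antecedent -> [covered, positive]
    for features, label in records:
        items = sorted(features.items())
        for size in range(1, max_len + 1):
            for antecedent in combinations(items, size):
                counts[antecedent][0] += 1
                counts[antecedent][1] += int(label == target)
    total = len(records)
    for antecedent, (covered, positive) in counts.items():
        yield antecedent, (course_id, covered / total, positive / covered)


def reduce_rules(mapped, min_support, min_confidence, min_courses):
    """Reduce step: keep rules that pass both thresholds in at least
    min_courses different courses."""
    courses_per_rule = defaultdict(set)
    for antecedent, (course_id, support, confidence) in mapped:
        if support >= min_support and confidence >= min_confidence:
            courses_per_rule[antecedent].add(course_id)
    return {a: c for a, c in courses_per_rule.items() if len(c) >= min_courses}


def drop_redundant(rules):
    """Post-processing (assumed criterion): drop a rule when a strictly more
    general rule holds in at least the same set of courses."""
    kept = {}
    for antecedent, courses in rules.items():
        dominated = any(
            set(other) < set(antecedent) and courses <= other_courses
            for other, other_courses in rules.items()
        )
        if not dominated:
            kept[antecedent] = courses
    return kept


# Hypothetical toy data: (learner features, outcome) pairs per course.
courses = {
    "course_a": [
        ({"viewed": "yes", "explored": "yes"}, "certified"),
        ({"viewed": "yes", "explored": "no"}, "not_certified"),
        ({"viewed": "yes", "explored": "yes"}, "certified"),
        ({"viewed": "no", "explored": "no"}, "not_certified"),
    ],
    "course_b": [
        ({"viewed": "yes", "explored": "yes"}, "certified"),
        ({"viewed": "yes", "explored": "yes"}, "certified"),
        ({"viewed": "yes", "explored": "no"}, "not_certified"),
    ],
}

mapped = [
    pair
    for course_id, records in courses.items()
    for pair in map_course(course_id, records, target="certified")
]
frequent = reduce_rules(mapped, min_support=0.4, min_confidence=0.9, min_courses=2)
final_rules = drop_redundant(frequent)

for antecedent, found_in in final_rules.items():
    conditions = " AND ".join(f"{name} = {value}" for name, value in antecedent)
    print(f"IF {conditions} THEN certified (courses: {sorted(found_in)})")
```

In this toy run, the more specific rule that combines `explored = yes` with `viewed = yes` passes the thresholds in both courses but is discarded in post-processing, because the more general rule `explored = yes` already holds in the same courses.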
Related papers
- Discover and Mitigate Multiple Biased Subgroups in Image Classifiers [45.96784278814168]
Machine learning models can perform well on in-distribution data but often fail on biased subgroups that are underrepresented in the training data.
We propose Decomposition, Interpretation, and Mitigation (DIM) to address this problem.
Our approach decomposes the image features into multiple components that represent multiple subgroups.
arXiv Detail & Related papers (2024-03-19T14:44:54Z) - Towards More Practical Group Activity Detection: A New Benchmark and Model [61.39427407758131]
Group activity detection (GAD) is the task of identifying members of each group and classifying the activity of the group at the same time in a video.
We present a new dataset, dubbed Café, which presents more practical scenarios and metrics.
We also propose a new GAD model that deals with an unknown number of groups and latent group members efficiently and effectively.
arXiv Detail & Related papers (2023-12-05T16:48:17Z) - Agnostic Multi-Group Active Learning [24.97598179536084]
We consider a variant of this problem from the perspective of active learning, where the learner is endowed with the power to decide which examples are labeled from each distribution in the collection.
Our main challenge is that standard active learning techniques such as disagreement-based active learning do not directly apply to the multi-group learning objective.
We modify existing algorithms to provide a consistent active learning algorithm for an agnostic formulation of multi-group learning.
arXiv Detail & Related papers (2023-06-02T21:24:13Z) - The Minority Matters: A Diversity-Promoting Collaborative Metric
Learning Algorithm [154.47590401735323]
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems.
This paper focuses on a challenging scenario where a user has multiple categories of interests.
We propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML).
arXiv Detail & Related papers (2022-09-30T08:02:18Z) - XCon: Learning with Experts for Fine-grained Category Discovery [4.787507865427207]
We present a novel method called Expert-Contrastive Learning (XCon) to help the model to mine useful information from the images.
Experiments on fine-grained datasets show clearly improved performance over the previous best methods, indicating the effectiveness of our method.
arXiv Detail & Related papers (2022-08-03T08:03:12Z) - Many Field Packet Classification with Decomposition and Reinforcement Learning [2.0915988632142275]
We present a scalable learning-based packet classification engine by building an efficient data structure for different rulesets with many fields.
Our method consists of the decomposition of fields into subsets and building separate decision trees on those subsets using a deep reinforcement learning procedure.
The results show that the SD decomposition metric is 11.5% faster than the DI metric, 25% faster than random 2, and 40% faster than random 1.
arXiv Detail & Related papers (2022-05-16T20:24:37Z) - Spacing Loss for Discovering Novel Categories [72.52222295216062]
Novel Class Discovery (NCD) is a learning paradigm, where a machine learning model is tasked to semantically group instances from unlabeled data.
We first characterize existing NCD approaches into single-stage and two-stage methods based on whether they require access to labeled and unlabeled data together.
We devise a simple yet powerful loss function that enforces separability in the latent space using cues from multi-dimensional scaling.
arXiv Detail & Related papers (2022-04-22T09:37:11Z) - Learning What Not to Segment: A New Perspective on Few-Shot Segmentation [63.910211095033596]
Recently few-shot segmentation (FSS) has been extensively developed.
This paper proposes a fresh and straightforward insight to alleviate the problem.
In light of the unique nature of the proposed approach, we also extend it to a more realistic but challenging setting.
arXiv Detail & Related papers (2022-03-15T03:08:27Z) - Learning Multi-Attention Context Graph for Group-Based Re-Identification [214.84551361855443]
Learning to re-identify or retrieve a group of people across non-overlapped camera systems has important applications in video surveillance.
In this work, we consider employing context information for identifying groups of people, i.e., group re-id.
We propose a novel unified framework based on graph neural networks to simultaneously address the group-based re-id tasks.
arXiv Detail & Related papers (2021-04-29T09:57:47Z) - Robust subgroup discovery [0.2578242050187029]
We formalize the problem of optimal robust subgroup discovery using the Minimum Description Length principle.
We propose RSD, a greedy heuristic that finds good subgroup lists and guarantees that the most significant subgroup is added in each iteration.
We empirically show on 54 datasets that RSD outperforms previous subgroup set discovery methods in terms of quality and subgroup list size.
arXiv Detail & Related papers (2021-03-25T09:04:13Z) - Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.