DMCNet: Diversified Model Combination Network for Understanding
Engagement from Video Screengrabs
- URL: http://arxiv.org/abs/2204.06454v1
- Date: Wed, 13 Apr 2022 15:24:38 GMT
- Title: DMCNet: Diversified Model Combination Network for Understanding
Engagement from Video Screengrabs
- Authors: Sarthak Batra, Hewei Wang, Avishek Nag, Philippe Brodeur, Marianne
Checkley, Annette Klinkert, and Soumyabrata Dev
- Abstract summary: Engagement plays a major role in developing intelligent educational interfaces.
Non-deep learning models are based on the combination of popular algorithms such as Histogram of Oriented Gradients (HOG), Support Vector Machine (SVM), Scale Invariant Feature Transform (SIFT), and Speeded Up Robust Features (SURF).
The deep learning methods include Densely Connected Convolutional Networks (DenseNet-121), Residual Network (ResNet-18) and MobileNetV1.
- Score: 0.4397520291340695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Engagement is an essential indicator of the Quality-of-Learning Experience
(QoLE) and plays a major role in developing intelligent educational interfaces.
The number of people learning through Massive Open Online Courses (MOOCs) and
other online resources has been increasing rapidly because they provide us with
the flexibility to learn from anywhere at any time. This provides a good
learning experience for the students. However, such a learning interface requires
the ability to recognize the level of engagement of the students for a holistic
learning experience, which is useful for students and educators alike.
Understanding engagement, however, is a challenging task because of its
subjectivity and the difficulty of collecting data. In this paper, we propose a variety
of models that have been trained on an open-source dataset of video
screengrabs. Our non-deep learning models are based on the combination of
popular algorithms such as Histogram of Oriented Gradients (HOG), Support Vector
Machine (SVM), Scale Invariant Feature Transform (SIFT) and Speeded Up Robust
Features (SURF). The deep learning methods include Densely Connected
Convolutional Networks (DenseNet-121), Residual Network (ResNet-18) and
MobileNetV1. We show the performance of each model using a variety of metrics
such as the Gini Index, Adjusted F-Measure (AGF), and Area Under the Receiver
Operating Characteristic curve (AUC). We use various dimensionality reduction
techniques such as Principal Component Analysis (PCA) and t-Distributed
Stochastic Neighbor Embedding (t-SNE) to understand the distribution of data in
the feature sub-space. Our work will thereby assist educators and students
in obtaining a fruitful and efficient online learning experience.
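
To make the classical pipeline concrete, here is a minimal sketch of the non-deep-learning route described above (HOG descriptors fed to an SVM). It assumes a hypothetical load_dataset() helper returning screengrab paths and engagement labels; the image size and SVM hyperparameters are illustrative choices, not the paper's settings.

```python
# Minimal sketch (not the authors' exact setup): HOG features + SVM classifier.
# load_dataset() is a hypothetical helper for the open-source screengrab dataset.
import numpy as np
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def extract_hog(path, size=(128, 128)):
    """Load a screengrab in grayscale, resize it, and return its HOG descriptor."""
    image = resize(imread(path, as_gray=True), size)
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

paths, labels = load_dataset()  # hypothetical loader: image paths + engagement labels
X = np.stack([extract_hog(p) for p in paths])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=0)

clf = SVC(kernel="rbf", probability=True)  # SVM on top of HOG features
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```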
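A corresponding sketch for the deep-learning baselines: a torchvision ResNet-18 pretrained on ImageNet with its final layer replaced for engagement classes. The number of classes, optimizer, and learning rate are assumptions; DenseNet-121 would follow the same pattern with model.classifier in place of model.fc.

```python
# Minimal sketch of a deep baseline: fine-tune a pretrained ResNet-18.
# NUM_CLASSES and the training hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # assumed number of engagement levels
device = "cuda" if torch.cuda.is_available() else "cpu"

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new classification head
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_one_epoch(loader):
    """One pass over a DataLoader yielding (screengrab batch, label batch)."""
    model.train()
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images.to(device)), targets.to(device))
        loss.backward()
        optimizer.step()
```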
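Finally, a brief sketch of the evaluation and feature-space analysis, assuming the common relation Gini = 2*AUC - 1 and a PCA-then-t-SNE projection of the extracted features; AGF is omitted because scikit-learn has no built-in implementation.

```python
# Minimal sketch of the evaluation/analysis side. Gini = 2*AUC - 1 is the common
# definition and an assumption about the paper's exact usage.
from sklearn.metrics import roc_auc_score
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def gini_from_auc(y_true, y_score):
    """Gini index derived from the ROC AUC (binary case: positive-class scores)."""
    return 2.0 * roc_auc_score(y_true, y_score) - 1.0

def embed_2d(features, pca_dims=50):
    """Project a (n_samples, n_features) matrix to 2-D: PCA to compress, t-SNE to visualize."""
    compressed = PCA(n_components=min(pca_dims, features.shape[1])).fit_transform(features)
    return TSNE(n_components=2, init="pca", perplexity=30).fit_transform(compressed)
```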
Related papers
- Reinforcement Learning Based Multi-modal Feature Fusion Network for
Novel Class Discovery [47.28191501836041]
In this paper, we employ a Reinforcement Learning framework to simulate the cognitive processes of humans.
We also deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information.
We demonstrate the performance of our approach in both the 3D and 2D domains by employing the OS-MN40, OS-MN40-Miss, and CIFAR-10 datasets.
arXiv Detail & Related papers (2023-08-26T07:55:32Z)
- Pre-training Contextualized World Models with In-the-wild Videos for
Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z)
- Exploring Effective Factors for Improving Visual In-Context Learning [56.14208975380607]
In-Context Learning (ICL) refers to understanding a new task from a few demonstrations (a prompt) and predicting new inputs without tuning the model.
This paper shows that prompt selection and prompt fusion are two major factors that have a direct impact on the inference performance of visual in-context learning.
We propose prompt-SelF, a simple framework for visual in-context learning.
arXiv Detail & Related papers (2023-04-10T17:59:04Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated Learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Learnable Graph Convolutional Network and Feature Fusion for Multi-view
Learning [30.74535386745822]
This paper proposes a joint deep learning framework called Learnable Graph Convolutional Network and Feature Fusion (LGCN-FF).
It consists of two stages: a feature fusion network and a learnable graph convolutional network.
The proposed LGCN-FF is validated to be superior to various state-of-the-art methods in multi-view semi-supervised classification.
arXiv Detail & Related papers (2022-11-16T19:07:12Z)
- X-Learner: Learning Cross Sources and Tasks for Universal Visual
Representation [71.51719469058666]
We propose a representation learning framework called X-Learner.
X-Learner learns universal features for multiple vision tasks supervised by various sources.
X-Learner achieves strong performance on different tasks without extra annotations, modalities, or computational costs.
arXiv Detail & Related papers (2022-03-16T17:23:26Z)
- Knowledge Distillation By Sparse Representation Matching [107.87219371697063]
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Neural Network (CNN) to another by utilizing sparse representation.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
arXiv Detail & Related papers (2021-03-31T11:47:47Z)
- Student sentiment Analysis Using Classification With Feature Extraction
Techniques [0.0]
This paper describes web-based learning and its effectiveness for students.
We examine how machine learning techniques such as Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT) can be applied to student sentiment analysis.
arXiv Detail & Related papers (2021-02-01T18:48:06Z)
- Sense and Learn: Self-Supervision for Omnipresent Sensors [9.442811508809994]
We present a framework named Sense and Learn for representation or feature learning from raw sensory data.
It consists of several auxiliary tasks that can learn high-level and broadly useful features entirely from unannotated data without any human involvement in the tedious labeling process.
Our methodology achieves results that are competitive with supervised approaches and, in most cases, closes the gap by fine-tuning the network on the downstream tasks.
arXiv Detail & Related papers (2020-09-28T11:57:43Z)
- Analyzing Student Strategies In Blended Courses Using Clickstream Data [32.81171098036632]
We use pattern mining and models borrowed from Natural Language Processing to understand student interactions.
Fine-grained clickstream data is collected through Diderot, a non-commercial educational support system.
Our results suggest that the proposed hybrid NLP methods can provide valuable insights even in the low-data setting of blended courses.
arXiv Detail & Related papers (2020-05-31T03:01:00Z)
- Efficient Crowd Counting via Structured Knowledge Transfer [122.30417437707759]
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.
We propose a novel Structured Knowledge Transfer framework to generate a lightweight but still highly effective student network.
Our models obtain at least a 6.5× speed-up on an Nvidia 1080 GPU and even achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-03-23T08:05:41Z)