Related papers: Learning Behavior Recognition in Smart Classroom with Multiple Students Based on YOLOv5

Learning Behavior Recognition in Smart Classroom with Multiple Students Based on YOLOv5

URL: http://arxiv.org/abs/2303.10916v1
Date: Mon, 20 Mar 2023 07:16:58 GMT
Title: Learning Behavior Recognition in Smart Classroom with Multiple Students Based on YOLOv5
Authors: Zhifeng Wang, Jialong Yao, Chunyan Zeng, Wanxuan Wu, Hongmin Xu, Yang Yang
Abstract summary: We propose a YOLOv5s network structure based on you only look once (YOLO) algorithm to recognize and analyze students' classroom behavior. When compared with YOLOv4, the proposed method is able to improve the mAP performance by 11%.
Score: 4.239144309557045
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning-based computer vision technology has grown stronger in recent years, and cross-fertilization using computer vision technology has been a popular direction in recent years. The use of computer vision technology to identify students' learning behavior in the classroom can reduce the workload of traditional teachers in supervising students in the classroom, and ensure greater accuracy and comprehensiveness. However, existing student learning behavior detection systems are unable to track and detect multiple targets precisely, and the accuracy of learning behavior recognition is not high enough to meet the existing needs for the accurate recognition of student behavior in the classroom. To solve this problem, we propose a YOLOv5s network structure based on you only look once (YOLO) algorithm to recognize and analyze students' classroom behavior in this paper. Firstly, the input images taken in the smart classroom are pre-processed. Then, the pre-processed image is fed into the designed YOLOv5 networks to extract deep features through convolutional layers, and the Squeeze-and-Excitation (SE) attention detection mechanism is applied to reduce the weight of background information in the recognition process. Finally, the extracted features are classified by the Feature Pyramid Networks (FPN) and Path Aggregation Network (PAN) structures. Multiple groups of experiments were performed to compare with traditional learning behavior recognition methods to validate the effectiveness of the proposed method. When compared with YOLOv4, the proposed method is able to improve the mAP performance by 11%.

Related papers

Multi-Scale Deformable Transformers for Student Learning Behavior Detection in Smart Classroom [5.487296795434267]
We introduce the Student Learning Behavior Detection with Multi-Scale Deformable Transformers (SCB-DETR) This technique significantly improves the detection capabilities for multi-scale and occluded targets, offering a robust solution for analyzing student behavior. SCB-DETR achieves a mean Average Precision (mAP) of 0.626, which is a 1.5% improvement over the baseline model's mAP and a 6% increase in AP50.
arXiv Detail & Related papers (2024-10-10T11:51:57Z)
Student Activity Recognition in Classroom Environments using Transfer Learning [0.0]
This paper proposes a system for detecting and recognizing the activities of students in a classroom environment. Xception achieved an accuracy of 93%, on the novel classroom dataset.
arXiv Detail & Related papers (2023-12-01T04:51:57Z)
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes. We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method. We modernize the 3D convolutional backbone by introducing multi-head self-attention modules. In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z)
Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues. No human annotations are involved in our framework during the whole training process. Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing MPA systems. We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network. Our results show that contrastive-based methods are able to match and exceed SoTA performance for MPA regression tasks.
arXiv Detail & Related papers (2021-08-03T19:24:25Z)
Bridging the gap between Human Action Recognition and Online Action Detection [0.456877715768796]
Action recognition, early prediction, and online action detection are complementary disciplines that are often studied independently. We address the task-specific feature extraction with a teacher-student framework between the aforementioned disciplines. Our network embeds online early prediction and online temporal segment proposalworks in parallel.
arXiv Detail & Related papers (2021-01-21T21:01:46Z)
Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle catastrophic forgetting problem in traditional classification networks. We propose a novel class-incremental method for embedding network, named as zero-shot translation class-incremental method (ZSTCI) In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
Point Adversarial Self Mining: A Simple Method for Facial Expression Recognition [79.75964372862279]
We propose Point Adversarial Self Mining (PASM) to improve the recognition accuracy in facial expression recognition. PASM uses a point adversarial attack method and a trained teacher network to locate the most informative position related to the target task. The adaptive learning materials generation and teacher/student update can be conducted more than one time, improving the network capability iteratively.
arXiv Detail & Related papers (2020-08-26T06:39:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.