Modern Deep Learning Approaches for Cricket Shot Classification: A Comprehensive Baseline Study
- URL: http://arxiv.org/abs/2510.09187v1
- Date: Fri, 10 Oct 2025 09:32:29 GMT
- Title: Modern Deep Learning Approaches for Cricket Shot Classification: A Comprehensive Baseline Study
- Authors: Sungwoo Kang,
- Abstract summary: This paper presents the first comprehensive baseline study comparing seven different deep learning approaches for cricket shot classification.<n>We implement and evaluate traditional CNN-LSTM architectures, attention-based models, vision transformers, transfer learning approaches, and modern EfficientNet-GRU combinations.<n>Our modern SOTA approach, combining EfficientNet-B0 with a GRU-based temporal model, achieves 92.25% accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cricket shot classification from video sequences remains a challenging problem in sports video analysis, requiring effective modeling of both spatial and temporal features. This paper presents the first comprehensive baseline study comparing seven different deep learning approaches across four distinct research paradigms for cricket shot classification. We implement and systematically evaluate traditional CNN-LSTM architectures, attention-based models, vision transformers, transfer learning approaches, and modern EfficientNet-GRU combinations on a unified benchmark. A critical finding of our study is the significant performance gap between claims in academic literature and practical implementation results. While previous papers reported accuracies of 96\% (Balaji LRCN), 99.2\% (IJERCSE), and 93\% (Sensors), our standardized re-implementations achieve 46.0\%, 55.6\%, and 57.7\% respectively. Our modern SOTA approach, combining EfficientNet-B0 with a GRU-based temporal model, achieves 92.25\% accuracy, demonstrating that substantial improvements are possible with modern architectures and systematic optimization. All implementations follow modern MLOps practices with PyTorch Lightning, providing a reproducible research platform that exposes the critical importance of standardized evaluation protocols in sports video analysis research.
Related papers
- A Comprehensive Forecasting-Based Framework for Time Series Anomaly Detection: Benchmarking on the Numenta Anomaly Benchmark (NAB) [0.0]
Time series anomaly detection is critical for modern digital infrastructures.<n>We present a forecasting-based framework unifying classical methods with deep learning architectures.<n>We conduct the first complete evaluation on the Numenta Anomaly Benchmark.
arXiv Detail & Related papers (2025-10-13T08:31:42Z) - Advanced Multi-Architecture Deep Learning Framework for BIRADS-Based Mammographic Image Retrieval: Comprehensive Performance Analysis with Super-Ensemble Optimization [0.0]
mammographic image retrieval systems require exact BIRADS categorical matching across five distinct classes.<n>Current medical image retrieval studies suffer from methodological limitations.
arXiv Detail & Related papers (2025-08-06T18:05:18Z) - Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey [4.030910640265943]
Data Augmentation (DA) has become a critical approach in Time Series Classification (TSC)<n>The current landscape of DA in TSC is plagued with fragmented literature reviews, nebulous methodological challenges, and a dearth of accessible user-oriented tools.<n>This study addresses these challenges through a comprehensive examination of DA methodologies within the TSC domain.
arXiv Detail & Related papers (2023-10-16T04:49:51Z) - DiversiGATE: A Comprehensive Framework for Reliable Large Language
Models [2.616506436169964]
We introduce DiversiGATE, a unified framework that consolidates diverse methodologies for LLM verification.
We propose a novel SelfLearner' model that conforms to the DiversiGATE framework and refines its performance over time.
Our results demonstrate that our approach outperforms traditional LLMs, achieving a considerable 54.8% -> 61.8% improvement on the GSM8K benchmark.
arXiv Detail & Related papers (2023-06-22T22:29:40Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce textitCLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - SLCA: Slow Learner with Classifier Alignment for Continual Learning on a
Pre-trained Model [73.80068155830708]
We present an extensive analysis for continual learning on a pre-trained model (CLPM)
We propose a simple but extremely effective approach named Slow Learner with Alignment (SLCA)
Across a variety of scenarios, our proposal provides substantial improvements for CLPM.
arXiv Detail & Related papers (2023-03-09T08:57:01Z) - Fast Hierarchical Learning for Few-Shot Object Detection [57.024072600597464]
Transfer learning approaches have recently achieved promising results on the few-shot detection task.
These approaches suffer from catastrophic forgetting'' issue due to finetuning of base detector.
We tackle the aforementioned issues in this work.
arXiv Detail & Related papers (2022-10-10T20:31:19Z) - Training Strategies for Improved Lip-reading [61.661446956793604]
We investigate the performance of state-of-the-art data augmentation approaches, temporal models and other training strategies.
A combination of all the methods results in a classification accuracy of 93.4%, which is an absolute improvement of 4.6% over the current state-of-the-art performance.
An error analysis of the various training strategies reveals that the performance improves by increasing the classification accuracy of hard-to-recognise words.
arXiv Detail & Related papers (2022-09-03T09:38:11Z) - Contextualized Spatio-Temporal Contrastive Learning with
Self-Supervision [106.77639982059014]
We present ConST-CL framework to effectively learn-temporally fine-grained representations.
We first design a region-based self-supervised task which requires the model to learn to transform instance representations from one view to another guided by context features.
We then introduce a simple design that effectively reconciles the simultaneous learning of both holistic and local representations.
arXiv Detail & Related papers (2021-12-09T19:13:41Z) - RethinkCWS: Is Chinese Word Segmentation a Solved Task? [81.11161697133095]
The performance of the Chinese Word (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks.
In this paper, we take stock of what we have achieved and rethink what's left in the CWS task.
arXiv Detail & Related papers (2020-11-13T11:07:08Z) - Generalized Reinforcement Meta Learning for Few-Shot Optimization [3.7675996866306845]
We present a generic and flexible Reinforcement Learning (RL) based meta-learning framework for the problem of few-shot learning.
Our framework could be easily extended to do network architecture search.
arXiv Detail & Related papers (2020-05-04T03:21:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.