OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters
- URL: http://arxiv.org/abs/2409.00286v1
- Date: Fri, 30 Aug 2024 22:39:35 GMT
- Title: OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters
- Authors: Zexin Chen, Chengxi Li, Xiangyu Xie, Parijat Dube
- Abstract summary: This paper explores the potential of a small, domain-specific language model trained exclusively on sports-related data.
OnlySportsLM achieves a 37.62%/34.08% accuracy improvement over previous 135M/360M state-of-the-art models.
- Score: 3.2586293270380717
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper explores the potential of a small, domain-specific language model trained exclusively on sports-related data. We investigate whether extensive training data combined with a specially designed small model structure can overcome model size constraints. The study introduces the OnlySports collection, comprising OnlySportsLM, the OnlySports Dataset, and the OnlySports Benchmark. Our approach involves: 1) creating a massive 600-billion-token OnlySports Dataset from FineWeb, 2) optimizing the RWKV architecture for sports-related tasks, resulting in a 196M-parameter model with a 20-layer, 640-dimension structure, 3) training OnlySportsLM on part of the OnlySports Dataset, and 4) testing the resulting model on the OnlySports Benchmark. OnlySportsLM achieves a 37.62%/34.08% accuracy improvement over previous 135M/360M state-of-the-art models and matches the performance of larger models such as SmolLM 1.7B and Qwen 1.5B in the sports domain. Additionally, the OnlySports collection presents a comprehensive workflow for building high-quality, domain-specific language models, providing a replicable blueprint for efficient AI development across various specialized fields.
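For orientation, here is a minimal sketch of how a causal language model with the reported shape (20 layers, 640 hidden dimensions) could be instantiated using Hugging Face's RWKV-4 implementation. The vocabulary size and context length are assumptions, and this is not the authors' training code; their RWKV variant and hyperparameters may differ.

```python
# Minimal sketch, assuming the Hugging Face RWKV-4 implementation; the abstract
# only specifies 20 layers and a 640-dimension hidden size.
from transformers import RwkvConfig, RwkvForCausalLM

config = RwkvConfig(
    vocab_size=65536,       # assumed vocabulary size (not stated in the abstract)
    context_length=1024,    # assumed context window (not stated in the abstract)
    hidden_size=640,        # 640-dimension structure reported in the abstract
    num_hidden_layers=20,   # 20-layer structure reported in the abstract
)
model = RwkvForCausalLM(config)
print(f"parameters: {sum(p.numel() for p in model.parameters()) / 1e6:.0f}M")
```

The printed count depends on the assumed vocabulary size and embedding tying, so it will only approximate the paper's 196M figure.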
Related papers
- SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models [15.062299319625701]
SPORTU is a benchmark designed to assess Multimodal Large Language Models (MLLMs) across multi-level sports reasoning tasks.
SPORTU comprises two key components: SPORTU-text and SPORTU-video. SPORTU-text features 900 multiple-choice questions with human-annotated explanations for rule comprehension and strategy understanding.
SPORTU-video consists of 1,701 slow-motion video clips across 7 different sports and 12,048 QA pairs, designed to assess multi-level reasoning.
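The summary gives the benchmark's composition but not its data format. As a purely illustrative sketch, one way a multiple-choice item and an accuracy metric could be represented is shown below; the field names are hypothetical, not SPORTU's actual schema.

```python
# Hypothetical representation of a multiple-choice sports QA item and a simple
# accuracy metric; invented for illustration, not SPORTU's real format.
from dataclasses import dataclass
from typing import Callable

@dataclass
class MCQItem:
    question: str
    choices: list[str]
    answer_idx: int
    explanation: str  # SPORTU-text ships human-annotated explanations

def accuracy(items: list[MCQItem],
             predict: Callable[[str, list[str]], int]) -> float:
    """predict maps (question, choices) to a predicted choice index."""
    correct = sum(predict(it.question, it.choices) == it.answer_idx for it in items)
    return correct / len(items)

# Usage with a trivial baseline that always picks the first choice.
items = [MCQItem("How is a tennis tie-break won?",
                 ["First to 7 points with a 2-point margin",
                  "First to 6 points",
                  "Sudden death at 6-6"],
                 0,
                 "Standard tie-break rules require 7 points and a 2-point margin.")]
print(accuracy(items, lambda q, c: 0))
```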
arXiv Detail & Related papers (2024-10-11T02:58:38Z) - LRM: Large Reconstruction Model for Single Image to 3D [61.47357798633123]
We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds.
LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image.
We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects.
arXiv Detail & Related papers (2023-11-08T00:03:52Z) - Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching [63.88319217738223]
We present Matcher, a novel perception paradigm that utilizes off-the-shelf vision foundation models to address various perception tasks.
Matcher demonstrates impressive generalization performance across various segmentation tasks, all without training.
Our results further showcase the open-world generality and flexibility of Matcher when applied to images in the wild.
arXiv Detail & Related papers (2023-05-22T17:59:43Z) - eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models and to augment language models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
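To make the parameter-efficiency idea above concrete, here is a minimal PyTorch sketch: the backbone is frozen, and only a linear projection of the visual features plus a single learnable token are trained. Module names, dimensions, and the stand-in backbone are illustrative assumptions, not the eP-ALM implementation.

```python
# Minimal sketch of the "freeze the backbone, train a projection + one token"
# idea; not the eP-ALM code, and the backbone here is a tiny stand-in.
import torch
import torch.nn as nn

class FrozenLMWithPerception(nn.Module):
    def __init__(self, lm: nn.Module, vis_dim: int, lm_dim: int):
        super().__init__()
        self.lm = lm
        for p in self.lm.parameters():
            p.requires_grad = False                  # freeze the backbone
        self.proj = nn.Linear(vis_dim, lm_dim)       # the only trainable layer
        self.token = nn.Parameter(torch.zeros(1, 1, lm_dim))  # one trainable token

    def forward(self, vis_feats: torch.Tensor, text_embeds: torch.Tensor):
        b = text_embeds.size(0)
        vis = self.proj(vis_feats).unsqueeze(1)      # (B, 1, lm_dim)
        tok = self.token.expand(b, -1, -1)           # (B, 1, lm_dim)
        # Prepend the trainable token and projected visual feature, then run
        # the frozen backbone on the combined embedding sequence.
        return self.lm(torch.cat([tok, vis, text_embeds], dim=1))

# Usage with a tiny stand-in backbone, just to show which parameters train.
lm = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
model = FrozenLMWithPerception(lm, vis_dim=32, lm_dim=64)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
# With a billion-parameter backbone this fraction drops well below 1%.
print(f"trainable fraction: {trainable / total:.2%}")
```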
arXiv Detail & Related papers (2023-03-20T19:20:34Z) - LLaMA: Open and Efficient Foundation Language Models [62.94749698865241]
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.
We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively.
arXiv Detail & Related papers (2023-02-27T17:11:15Z) - ESTA: An Esports Trajectory and Action Dataset [0.0]
We use esports data to develop machine learning models for win prediction.
Awpy is an open-source library that can extract player trajectories and actions from game logs.
ESTA is one of the largest and most granular publicly available sports data sets to date.
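As a generic illustration of trajectory-based win prediction (neither ESTA's actual features nor the awpy API, which this summary does not detail), the following scikit-learn sketch fits a logistic-regression baseline on synthetic round-level features.

```python
# Generic win-prediction baseline on synthetic features; feature names are
# hypothetical and do not reflect ESTA's schema.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# e.g. [players_alive_diff, equipment_value_diff, objective_state]
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print(f"held-out win-prediction accuracy: {clf.score(X_te, y_te):.2f}")
```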
arXiv Detail & Related papers (2022-09-20T17:13:50Z) - Sports Video Analysis on Large-Scale Data [10.24207108909385]
This paper investigates automated machine description (captioning) of sports video.
We propose a novel large-scale NBA dataset for Sports Video Analysis (NSVA) with a focus on captioning.
arXiv Detail & Related papers (2022-08-09T16:59:24Z) - SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos [62.686484228479095]
We propose a novel dataset for multiple object tracking composed of 200 sequences of 30s each.
The dataset is fully annotated with bounding boxes and tracklet IDs.
Our analysis shows that multiple player, referee and ball tracking in soccer videos is far from being solved.
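Since the summary notes the dataset is fully annotated with bounding boxes and tracklet IDs, a hypothetical, minimal record type for such annotations is sketched below; the field names are illustrative and do not reflect SoccerNet-Tracking's real schema.

```python
# Hypothetical MOT-style annotation record: a per-frame bounding box tied to a
# tracklet ID; field names are invented for illustration only.
from dataclasses import dataclass

@dataclass
class TrackAnnotation:
    frame: int
    track_id: int     # tracklet ID, stable across frames for one player/ball
    x: float          # top-left corner of the bounding box (pixels)
    y: float
    w: float          # box width (pixels)
    h: float          # box height (pixels)
    category: str     # e.g. "player", "referee", or "ball"

ann = TrackAnnotation(frame=12, track_id=7, x=104.0, y=56.5, w=18.0, h=42.0,
                      category="player")
print(ann)
```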
arXiv Detail & Related papers (2022-04-14T12:22:12Z) - DeepSportLab: a Unified Framework for Ball Detection, Player Instance Segmentation and Pose Estimation in Team Sports Scenes [19.845244830593067]
This paper presents a unified framework to (i) locate the ball, (ii) predict the pose, and (iii) segment the instance mask of players in team sports scenes.
arXiv Detail & Related papers (2021-12-01T16:30:51Z) - MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions [39.27858380391081]
This paper presents a new multi-person dataset of spatio-temporally localized sports actions, coined MultiSports.
We build the dataset of MultiSports v1.0 by selecting 4 sports classes, collecting around 3200 video clips, and annotating around 37790 action instances with 907k bounding boxes.
arXiv Detail & Related papers (2021-05-16T10:40:30Z) - BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models [59.95091850331499]
We propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies.
Our discovered model family, BigNASModels, achieve top-1 accuracies ranging from 76.5% to 80.9%.
arXiv Detail & Related papers (2020-03-24T23:00:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.