An Empirical Study of On-Device Translation for Real-Time Live-Stream Chat on Mobile Devices
- URL: http://arxiv.org/abs/2601.02641v1
- Date: Tue, 06 Jan 2026 01:22:56 GMT
- Title: An Empirical Study of On-Device Translation for Real-Time Live-Stream Chat on Mobile Devices
- Authors: Jeiyoon Park, Daehwan Lee, Changmin Yeo, Yongshin Han, Minseop Kim,
- Abstract summary: We investigate two key issues that must be addressed to deploy on-device models in real-world services. We focus on a task of translating live-stream chat messages and manually construct LiveChatBench, a benchmark consisting of 1,000 Korean-English parallel sentence pairs. Our approach achieves performance comparable to commercial models such as GPT-5.1 on the well-targeted task.
- Score: 0.8699280339422538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite its efficiency, there has been little research on the practical aspects required for real-world deployment of on-device AI models, such as the device's CPU utilization and thermal conditions. In this paper, through extensive experiments, we investigate two key issues that must be addressed to deploy on-device models in real-world services: (i) the selection of on-device models and the resource consumption of each model, and (ii) the capability and potential of on-device models for domain adaptation. To this end, we focus on a task of translating live-stream chat messages and manually construct LiveChatBench, a benchmark consisting of 1,000 Korean-English parallel sentence pairs. Experiments on five mobile devices demonstrate that, although serving a large and heterogeneous user base requires careful consideration of highly constrained deployment settings and model selection, the proposed approach nevertheless achieves performance comparable to commercial models such as GPT-5.1 on the well-targeted task. We expect that our findings will provide meaningful insights to the on-device AI community.
Related papers
- TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models [56.94569090844015]
TokaMark is a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST). TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy.
arXiv Detail & Related papers (2026-02-05T16:49:44Z) - MobileLLM-Pro Technical Report [28.511762884727883]
MobileLLM-Pro is a 1-billion-parameter language model optimized for on-device deployment. It significantly outperforms Gemma 3-1B and Llama 3.2-1B on 11 standard benchmarks. It supports context windows of up to 128,000 tokens and shows only minor performance regressions at 4-bit quantization.
arXiv Detail & Related papers (2025-11-10T05:28:31Z) - PointArena: Probing Multimodal Grounding Through Language-Guided Pointing [79.80132157576978]
Pointing serves as a fundamental and intuitive mechanism for grounding language within visual contexts. We introduce PointArena, a comprehensive platform for evaluating multimodal pointing across diverse reasoning scenarios.
arXiv Detail & Related papers (2025-05-15T06:04:42Z) - AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results [55.33807002543901]
We present AIvaluateXR, a comprehensive evaluation framework for benchmarking large language models (LLMs) running on XR devices. We deploy 17 selected LLMs across four XR platforms: Magic Leap 2, Meta Quest 3, Vivo X100s Pro, and Apple Vision Pro, and conduct an extensive evaluation. We propose a unified evaluation method based on the 3D Optimality theory to select the optimal device-model pairs from quality and speed objectives.
arXiv Detail & Related papers (2025-02-13T20:55:48Z) - Foundations and Recent Trends in Multimodal Mobile Agents: A Survey [72.29426995154088]
Mobile agents are essential for automating tasks in complex and dynamic mobile environments. Recent advancements enhance real-time adaptability and multimodal interaction. We categorize these advancements into two main approaches: prompt-based methods and training-based methods.
arXiv Detail & Related papers (2024-11-04T11:50:58Z) - A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z) - On-Device Language Models: A Comprehensive Review [26.759861320845467]
Review examines the challenges of deploying computationally expensive large language models on resource-constrained devices.
Paper investigates on-device language models, their efficient architectures, as well as state-of-the-art compression techniques.
Case studies of on-device language models from major mobile manufacturers demonstrate real-world applications and potential benefits.
arXiv Detail & Related papers (2024-08-26T03:33:36Z) - Precision and Adaptability of YOLOv5 and YOLOv8 in Dynamic Robotic Environments [0.0]
This study provides a comparative analysis of YOLOv5 and YOLOv8 models.
Contrary to initial expectations, YOLOv5 models demonstrated comparable, and in some cases superior, precision in object detection tasks.
arXiv Detail & Related papers (2024-06-01T06:17:43Z) - Benchmarking Mobile Device Control Agents across Diverse Configurations [19.01954948183538]
B-MoCA is a benchmark for evaluating and developing mobile device control agents. We benchmark diverse agents, including agents employing large language models (LLMs) or multi-modal LLMs. While these agents demonstrate proficiency in executing straightforward tasks, their poor performance on complex tasks highlights significant opportunities for future research to improve effectiveness.
arXiv Detail & Related papers (2024-04-25T14:56:32Z) - On-device modeling of user's social context and familiar places from smartphone-embedded sensor data [7.310043452300736]
This paper proposes an unsupervised and lightweight approach to model the user's social context and locations directly on the mobile device.
For the social context, the approach utilizes data on physical and cyber social interactions among users and their devices.
The effectiveness of the proposed approach is demonstrated through three sets of experiments, employing five real-world datasets.
arXiv Detail & Related papers (2023-06-27T12:53:14Z) - U-TOE: Universal TinyML On-board Evaluation Toolkit for Low-Power IoT [3.981958767941474]
U-TOE is a universal toolkit designed to facilitate the task of IoT designers and researchers.
We provide an open source implementation of U-TOE and demonstrate its use to experimentally evaluate the performance of various models.
arXiv Detail & Related papers (2023-06-26T10:35:31Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.