Fugu-MT Paper Translation (Summary): Benchmarking On-Device Machine Learning on Apple Silicon with MLX

Paper summary: Benchmarking On-Device Machine Learning on Apple Silicon with MLX

arxiv url: http://arxiv.org/abs/2510.18921v1
Date: Tue, 21 Oct 2025 08:19:27 GMT
Status: translation complete
Last updated in system: 2025-10-25 03:08:14.319212
Title: Benchmarking On-Device Machine Learning on Apple Silicon with MLX
Title (reference translation): Benchmarking on-device machine learning on Apple Silicon with MLX
Authors: Oluwaseun A. Ajayi, Ogundepo Odunayo
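Abstract summary: MLX is a framework optimized for machine learning (ML) computation on Apple silicon devices. This paper evaluates the performance of MLX, focusing on the latency of transformer models. The results highlight MLX's potential to enable efficient and accessible on-device ML applications within Apple's ecosystem.
Reference score (independently computed attention metric): 0.0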
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The recent widespread adoption of Large Language Models (LLMs) and machine learning in general has sparked research interest in deploying these models on smaller devices such as laptops and mobile phones. This creates a need for frameworks and approaches that can take advantage of on-device hardware. The MLX framework was created to address this need. It is a framework optimized for machine learning (ML) computations on Apple silicon devices, facilitating easier research, experimentation, and prototyping. This paper presents a performance evaluation of MLX, focusing on the inference latency of transformer models. We compare the performance of different transformer architecture implementations in MLX with their PyTorch counterparts. For this research, we create a framework called MLX-Transformers, which includes different transformer implementations in MLX, downloads model checkpoints in PyTorch format, and converts them to the MLX format. By leveraging the advanced architecture and capabilities of Apple Silicon, MLX-Transformers enables seamless execution of transformer models sourced directly from Hugging Face, eliminating the need for the checkpoint conversion often required when porting models between frameworks. Our study benchmarks different transformer models on two Apple Silicon MacBook devices against an NVIDIA CUDA GPU. Specifically, we compare the inference latency of models with the same parameter sizes and checkpoints. We evaluate the performance of BERT, RoBERTa, and XLM-RoBERTa models, with the intention of extending future work to include models of different modalities, thus providing a more comprehensive assessment of MLX's capabilities. The results highlight MLX's potential in enabling efficient and more accessible on-device ML applications within Apple's ecosystem.
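Abstract (reference translation): The recent widespread adoption of Large Language Models (LLMs) and machine learning has drawn attention to research exploring the possibility of deploying these models on small devices such as laptops and mobile phones. This creates a need for frameworks and approaches that can take advantage of on-device hardware. The MLX framework was created to address this need. It is a framework optimized for machine learning (ML) computation on Apple silicon devices, and it makes research, experimentation, and prototyping easier. This paper evaluates the performance of MLX, focusing on the inference latency of transformer models. We compare the performance of different transformer architecture implementations in MLX with that of their PyTorch implementations. For this study, we create a framework called MLX-Transformers, which implements different transformers in MLX, downloads model checkpoints in PyTorch format, and converts them to the MLX format. By leveraging the advanced architecture and capabilities of Apple Silicon, MLX-Transformers enables seamless execution of transformer models sourced directly from Hugging Face and removes the need for the checkpoint conversion required when porting models between frameworks. Our study compares different transformer models on two Apple Silicon MacBook devices against an NVIDIA CUDA GPU. Specifically, we compare the inference latency of models with the same parameter sizes and checkpoints. We evaluate the performance of BERT, RoBERTa, and XLM-RoBERTa models, with the aim of extending future work to include models of various modalities. The results highlight MLX's potential to enable efficient and accessible on-device ML applications within Apple's ecosystem.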
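
Note on measuring inference latency: the abstract does not describe the timing procedure, so the following is only a minimal sketch of how such a comparison is commonly set up, not the authors' harness. The names `measure_latency` and `run_once`, and the warmup and repeat counts, are assumptions made for illustration. The sketch uses only the Python standard library; the callable passed in is expected to block until the result is really computed. MLX evaluates arrays lazily, so an MLX forward pass would typically be wrapped in `mx.eval(...)`, and CUDA kernels launch asynchronously, so a PyTorch CUDA forward pass would typically be followed by `torch.cuda.synchronize()`.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(
    run_once: Callable[[], object],
    warmup: int = 3,
    repeats: int = 20,
) -> Dict[str, float]:
    """Time repeated calls to `run_once`; return latency statistics in milliseconds.

    `run_once` is a hypothetical zero-argument callable that performs one
    forward pass and blocks until the output is materialized.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Warmup calls are executed but not timed, so one-off costs such as
    # kernel compilation or cache population do not skew the samples.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "runs": float(len(samples)),
    }


if __name__ == "__main__":
    # Stand-in workload (an assumption): replace with a real model call.
    def fake_forward() -> int:
        return sum(i * i for i in range(50_000))

    print(measure_latency(fake_forward))
```

Reporting the median alongside the mean is a design choice in this sketch: a single slow run (for example, from thermal throttling on a laptop) shifts the mean much more than the median.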

Related papers