Towards an Efficient, Customizable, and Accessible AI Tutor
- URL: http://arxiv.org/abs/2510.06255v1
- Date: Sat, 04 Oct 2025 13:33:40 GMT
- Title: Towards an Efficient, Customizable, and Accessible AI Tutor
- Authors: Juan Segundo Hevia, Facundo Arredondo, Vishesh Kumar,
- Abstract summary: We propose an offline Retrieval-Augmented Generation (RAG) pipeline that pairs a small language model (SLM) with a robust retrieval mechanism.<n>We evaluate the efficacy of this pipeline using domain-specific educational content, focusing on biology coursework.
- Score: 5.225254533678075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of large language models (LLMs) into education offers significant potential to enhance accessibility and engagement, yet their high computational demands limit usability in low-resource settings, exacerbating educational inequities. To address this, we propose an offline Retrieval-Augmented Generation (RAG) pipeline that pairs a small language model (SLM) with a robust retrieval mechanism, enabling factual, contextually relevant responses without internet connectivity. We evaluate the efficacy of this pipeline using domain-specific educational content, focusing on biology coursework. Our analysis highlights key challenges: smaller models, such as SmolLM, struggle to effectively leverage extended contexts provided by the RAG pipeline, particularly when noisy or irrelevant chunks are included. To improve performance, we propose exploring advanced chunking techniques, alternative small or quantized versions of larger models, and moving beyond traditional metrics like MMLU to a holistic evaluation framework assessing free-form response. This work demonstrates the feasibility of deploying AI tutors in constrained environments, laying the groundwork for equitable, offline, and device-based educational tools.
Related papers
- FutureMind: Equipping Small Language Models with Strategic Thinking-Pattern Priors via Adaptive Knowledge Distillation [13.855534865501369]
Small Language Models (SLMs) are attractive for cost-sensitive and resource-limited settings due to their efficient, low-latency inference.<n>We propose FutureMind, a modular reasoning framework that equips SLMs with strategic thinking-pattern priors.
arXiv Detail & Related papers (2026-02-01T13:26:04Z) - Grounded in Reality: Learning and Deploying Proactive LLM from Offline Logs [72.08224879435762]
textttLearn-to-Ask is a simulator-free framework for learning and deploying proactive dialogue agents.<n>Our approach culminates in the successful deployment of LLMs into a live, large-scale online AI service.
arXiv Detail & Related papers (2025-10-29T12:08:07Z) - Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments [70.42705564227548]
We propose an automated environment construction pipeline for large language models (LLMs)<n>This enables the creation of high-quality training environments that provide detailed and measurable feedback without relying on external tools.<n>We also introduce a verifiable reward mechanism that evaluates both the precision of tool use and the completeness of task execution.
arXiv Detail & Related papers (2025-08-12T09:45:19Z) - Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning [39.73152182572741]
This paper proposes a novel framework, named Quantized Split Federated Fine-Tuning Large AI Model (SFLAM)<n>By partitioning the training load between edge devices and servers, SFLAM can facilitate the operation of large models on devices.<n>SFLAM incorporates quantization management, power control, and bandwidth allocation strategies to enhance training efficiency.
arXiv Detail & Related papers (2025-04-12T07:55:11Z) - FamilyTool: A Multi-hop Personalized Tool Use Benchmark [93.80355496575281]
FamilyTool is a benchmark grounded in a family-based knowledge graph (KG) that simulates personalized, multi-hop tool use scenarios.<n> Experiments reveal significant performance gaps in state-of-the-art Large Language Models (LLMs)<n>FamilyTool serves as a critical resource for evaluating and advancing LLM agents' reasoning, adaptability, and scalability in complex, dynamic environments.
arXiv Detail & Related papers (2025-04-09T10:42:36Z) - Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge [3.1471494780647795]
Recent trends show a growing focus on compact models-typically under 10 billion parameters-enabled by techniques such as quantization.<n>This shift paves the way for LMs on edge devices, offering potential benefits such as enhanced privacy, reduced latency, and improved data sovereignty.<n>We present a comprehensive evaluation of generative LM inference on representative CPU-based and GPU-accelerated edge devices.
arXiv Detail & Related papers (2025-03-12T07:01:34Z) - Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks [2.362412515574206]
Large Language Models (LLMs) have proven immensely beneficial in education by capturing vast amounts of literature-based information.<n>We propose a generative AI-powered GATE question-answering framework that leverages LLMs to explain GATE solutions and support students in their exam preparation.
arXiv Detail & Related papers (2025-03-02T08:11:07Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards [4.334100270812517]
Large language models (LLMs) struggle with technical standards in telecommunications.<n>We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 small language model (SLM)<n>Our experiments demonstrate substantial improvements over existing question-answering approaches in the telecom domain.
arXiv Detail & Related papers (2024-08-21T17:00:05Z) - Benchmarking General-Purpose In-Context Learning [19.40952728849431]
In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly.
In this paper, we study extending ICL to address a broader range of tasks with an extended learning horizon and higher improvement potential.
We introduce two benchmarks specifically crafted to train and evaluate GPICL functionalities.
arXiv Detail & Related papers (2024-05-27T14:50:42Z) - CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.<n>Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs)<n>We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z) - RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.