TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain
- URL: http://arxiv.org/abs/2412.15891v1
- Date: Fri, 20 Dec 2024 13:47:02 GMT
- Title: TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain
- Authors: Camille Barboule, Viet-Phi Huynh, Adrien Bufort, Yoan Chabot, Géraldine Damnati, Gwénolé Lecorvé
- Abstract summary: Telecommunications (telco) is a particularly challenging domain due to the large number of lexical, semantic and conceptual peculiarities. This paper studies how Large Language Models can be adapted to the telco domain.
- Score: 1.1457130176786257
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite outstanding progress on many tasks, Large Language Models (LLMs) still lack accuracy when dealing with highly technical domains. In particular, telecommunications (telco) is a particularly challenging domain due to the large number of lexical, semantic and conceptual peculiarities. Yet, this domain holds many valuable use cases, directly linked to industrial needs. Hence, this paper studies how LLMs can be adapted to the telco domain. It reports our effort to (i) collect a massive corpus of domain-specific data (800M tokens, 80K instructions), (ii) perform adaptation using various methodologies, and (iii) benchmark the adapted models against larger generalist models on downstream tasks that require extensive knowledge of telecommunications. Our experiments on Llama-2-7b show that domain-adapted models can challenge large generalist models. They also suggest that adaptation can be restricted to a single instruction-tuning step, discarding the need for any fine-tuning on raw texts beforehand.
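The adaptation recipe described in the abstract (a domain instruction corpus followed by a single instruction-tuning step on Llama-2-7b) can be illustrated with a minimal sketch. The snippet below assumes a Hugging Face transformers + peft setup with a LoRA adapter; the file name telco_instructions.jsonl, the prompt template, and all hyperparameters are illustrative placeholders, not the authors' actual configuration.

```python
# Minimal sketch of a single instruction-tuning pass, assuming Hugging Face
# transformers + peft; file names and hyperparameters are illustrative only.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "meta-llama/Llama-2-7b-hf"           # base model adapted in the paper
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token   # Llama-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

# LoRA adapter so the 7B model can be tuned on a single GPU (an assumption,
# not necessarily the paper's exact training setup).
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# Hypothetical prompt/response pairs standing in for the 80K telco instructions.
raw = load_dataset("json", data_files="telco_instructions.jsonl")["train"]

def to_features(example):
    text = (f"### Instruction:\n{example['prompt']}\n\n"
            f"### Response:\n{example['response']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=1024)

train_ds = raw.map(to_features, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="telcolm-sft", num_train_epochs=1,
                           per_device_train_batch_size=4, learning_rate=2e-4,
                           bf16=True, logging_steps=50),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point suggested by the paper's findings is that this instruction-tuning step alone, without a prior continued-pretraining pass on raw telco text, can be sufficient.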
Related papers
- Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation [52.8352968531863]
Large language models (LLMs) have made significant progress in general-purpose natural language processing tasks.
This paper presents a novel framework that combines knowledge graph (KG) and retrieval-augmented generation (RAG) techniques to enhance LLM performance in the telecom domain; a minimal RAG sketch is given after this related-papers list.
arXiv Detail & Related papers (2025-03-31T15:58:08Z)
- On Domain-Specific Post-Training for Multimodal Large Language Models [72.67107077850939]
This paper systematically investigates domain adaptation of MLLMs through post-training.
We focus on data synthesis, training pipelines, and task evaluation.
We conduct experiments in high-impact domains such as biomedicine, food, and remote sensing.
arXiv Detail & Related papers (2024-11-29T18:42:28Z)
- TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text [5.523385345486362]
We have developed language models specifically designed for legal applications.
Our innovative approach significantly improves capabilities in legal tasks by using Large Language Models (LLMs) to convert raw training data into reading comprehension text.
arXiv Detail & Related papers (2024-10-28T19:32:18Z)
- Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to explain the performance gap empirically.
arXiv Detail & Related papers (2024-09-27T05:06:43Z)
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study the multi-source Domain Generalization of text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications [20.36003316123051]
We develop and open-source Tele-LLMs, the first series of language models ranging from 1B to 8B parameters, specifically tailored for telecommunications.
Our evaluations demonstrate that these models outperform their general-purpose counterparts on Tele-Eval while retaining their previously acquired capabilities.
arXiv Detail & Related papers (2024-09-09T03:58:51Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we show how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains faces many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
- Learning causal representations for robust domain adaptation [31.261956776418618]
In many real-world applications, target domain data may not always be available.
In this paper, we study the case where target-domain data is unavailable during training.
We propose a novel Causal AutoEncoder (CAE), which integrates deep autoencoder and causal structure learning into a unified model.
arXiv Detail & Related papers (2020-11-12T11:24:03Z)
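As a companion to the KG+RAG entry above, here is a minimal retrieval-augmented generation sketch: documents are embedded, the most relevant one is retrieved for a question, and the retrieved text is prepended to the generation prompt. The embedding model, toy corpus and generator below are placeholders chosen for illustration, not components taken from that paper.

```python
# Minimal retrieval-augmented generation sketch over a toy telecom corpus;
# embedding model, documents and generator are illustrative placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

docs = [
    "3GPP TS 38.300 describes the overall NR radio access network architecture.",
    "The AMF handles registration and mobility management in the 5G core.",
]  # stand-in for a real telecom document collection

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question, k=1):
    # Cosine similarity via dot product, since embeddings are normalized.
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return [docs[i] for i in np.argsort(-scores)[:k]]

generator = pipeline("text-generation", model="gpt2")

question = "Which network function manages registration in the 5G core?"
context = "\n".join(retrieve(question))
prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```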
This list is automatically generated from the titles and abstracts of the papers on this site.