Related papers: A Comprehensive Overview of Large Language Models

A Comprehensive Overview of Large Language Models

URL: http://arxiv.org/abs/2307.06435v9
Date: Tue, 9 Apr 2024 21:38:33 GMT
Title: A Comprehensive Overview of Large Language Models
Authors: Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, Ajmal Mian,
Abstract summary: Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks. This article provides an overview of the existing literature on a broad range of LLM-related concepts.
Score: 68.22178313875618
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.

Related papers

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment [291.03029298928857]
This paper introduces the concept of "full-stack" safety to systematically consider safety issues throughout the entire process of LLM training, deployment, and commercialization. Our research is grounded in an exhaustive review of over 800+ papers, ensuring comprehensive coverage and systematic organization of security issues. Our work identifies promising research directions, including safety in data generation, alignment techniques, model editing, and LLM-based agent systems.
arXiv Detail & Related papers (2025-04-22T05:02:49Z)
Bridging Language Models and Financial Analysis [49.361943182322385]
The rapid advancements in Large Language Models (LLMs) have unlocked transformative possibilities in natural language processing. Financial data is often embedded in intricate relationships across textual content, numerical tables, and visual charts. Despite the fast pace of innovation in LLM research, there remains a significant gap in their practical adoption within the finance industry.
arXiv Detail & Related papers (2025-03-14T01:35:20Z)
When Continue Learning Meets Multimodal Large Language Model: A Survey [7.250878248686215]
Fine-tuning MLLMs for specific tasks often causes performance degradation in the model's prior knowledge domain. This review paper presents an overview and analysis of 440 research papers in this area.
arXiv Detail & Related papers (2025-02-27T03:39:10Z)
When Text Embedding Meets Large Language Model: A Comprehensive Survey [17.263184207651072]
This survey focuses on the interplay between large language models (LLMs) and text embeddings. It offers a novel and systematic overview of contributions from various research and application domains. Building on this analysis, we outline prospective directions for the evolution of text embedding.
arXiv Detail & Related papers (2024-12-12T10:50:26Z)
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models [32.336273322481276]
Despite their diverse capabilities, Large Language Models (LLMs) exhibit varying strengths and weaknesses. To address these challenges, recent studies have explored collaborative strategies for LLMs. This paper provides a comprehensive overview of this emerging research area, highlighting the motivation behind such collaborations.
arXiv Detail & Related papers (2024-07-08T16:29:08Z)
Can LLMs Solve longer Math Word Problems Better? [47.227621867242]
Math Word Problems (MWPs) are crucial for evaluating the capability of Large Language Models (LLMs) This study pioneers the exploration of Context Length Generalizability (CoLeG) Two novel metrics are proposed to assess the efficacy and resilience of LLMs in solving these problems.
arXiv Detail & Related papers (2024-05-23T17:13:50Z)
ChatGPT Alternative Solutions: Large Language Models Survey [0.0]
Large Language Models (LLMs) have ignited a surge in research contributions within this domain. Recent years have witnessed a dynamic synergy between academia and industry, propelling the field of LLM research to new heights. This survey furnishes a well-rounded perspective on the current state of generative AI, shedding light on opportunities for further exploration, enhancement, and innovation.
arXiv Detail & Related papers (2024-03-21T15:16:50Z)
LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model. This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks. LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks. The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human. These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs [72.49064988035126]
We propose an approach called MKS2, aimed at enhancing multimodal large language models (MLLMs) Specifically, we introduce the Modular Visual Memory, a component integrated into the internal blocks of LLMs, designed to store open-world visual information efficiently. Our experiments demonstrate that MKS2 substantially augments the reasoning capabilities of LLMs in contexts necessitating physical or commonsense knowledge.
arXiv Detail & Related papers (2023-11-27T12:29:20Z)
A Survey on Multimodal Large Language Models [71.63375558033364]
Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot. This paper aims to trace and summarize the recent progress of MLLMs.
arXiv Detail & Related papers (2023-06-23T15:21:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.