A Comprehensive Overview of Large Language Models
- URL: http://arxiv.org/abs/2307.06435v9
- Date: Tue, 9 Apr 2024 21:38:33 GMT
- Title: A Comprehensive Overview of Large Language Models
- Authors: Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, Ajmal Mian,
- Abstract summary: Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
- Score: 68.22178313875618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.
Related papers
- Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models [32.336273322481276]
Despite their diverse capabilities, Large Language Models (LLMs) exhibit varying strengths and weaknesses.
To address these challenges, recent studies have explored collaborative strategies for LLMs.
This paper provides a comprehensive overview of this emerging research area, highlighting the motivation behind such collaborations.
arXiv Detail & Related papers (2024-07-08T16:29:08Z) - Can LLMs Solve longer Math Word Problems Better? [47.227621867242]
Math Word Problems (MWPs) are crucial for evaluating the capability of Large Language Models (LLMs)
This study pioneers the exploration of Context Length Generalizability (CoLeG)
Two novel metrics are proposed to assess the efficacy and resilience of LLMs in solving these problems.
arXiv Detail & Related papers (2024-05-23T17:13:50Z) - ChatGPT Alternative Solutions: Large Language Models Survey [0.0]
Large Language Models (LLMs) have ignited a surge in research contributions within this domain.
Recent years have witnessed a dynamic synergy between academia and industry, propelling the field of LLM research to new heights.
This survey furnishes a well-rounded perspective on the current state of generative AI, shedding light on opportunities for further exploration, enhancement, and innovation.
arXiv Detail & Related papers (2024-03-21T15:16:50Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage
and Sharing in LLMs [72.49064988035126]
We propose an approach called MKS2, aimed at enhancing multimodal large language models (MLLMs)
Specifically, we introduce the Modular Visual Memory, a component integrated into the internal blocks of LLMs, designed to store open-world visual information efficiently.
Our experiments demonstrate that MKS2 substantially augments the reasoning capabilities of LLMs in contexts necessitating physical or commonsense knowledge.
arXiv Detail & Related papers (2023-11-27T12:29:20Z) - A Survey on Multimodal Large Language Models [71.63375558033364]
Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot.
This paper aims to trace and summarize the recent progress of MLLMs.
arXiv Detail & Related papers (2023-06-23T15:21:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.