Several categories of Large Language Models (LLMs): A Short Survey
- URL: http://arxiv.org/abs/2307.10188v1
- Date: Wed, 5 Jul 2023 18:18:23 GMT
- Title: Several categories of Large Language Models (LLMs): A Short Survey
- Authors: Saurabh Pahune, Manoj Chandrasekharan
- Abstract summary: Large Language Models(LLMs)have become effective tools for natural language processing and have been used in many different fields.
The survey emphasizes recent developments and efforts made for various LLM kinds, including task-based financial LLMs, multilingual language LLMs, biomedical and clinical LLMs, vision language LLMs, and code language models.
- Score: 3.73538163699716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models(LLMs)have become effective tools for natural language
processing and have been used in many different fields. This essay offers a
succinct summary of various LLM subcategories. The survey emphasizes recent
developments and efforts made for various LLM kinds, including task-based
financial LLMs, multilingual language LLMs, biomedical and clinical LLMs,
vision language LLMs, and code language models. The survey gives a general
summary of the methods, attributes, datasets, transformer models, and
comparison metrics applied in each category of LLMs. Furthermore, it highlights
unresolved problems in the field of developing chatbots and virtual assistants,
such as boosting natural language processing, enhancing chatbot intelligence,
and resolving moral and legal dilemmas. The purpose of this study is to provide
readers, developers, academics, and users interested in LLM-based chatbots and
virtual intelligent assistant technologies with useful information and future
directions.
Related papers
- LLMs Meet Multimodal Generation and Editing: A Survey [89.76691959033323]
This survey elaborates on multimodal generation and editing across various domains, comprising image, video, 3D, and audio.
We summarize the notable advancements with milestone works in these fields and categorize these studies into LLM-based and CLIP/T5-based methods.
We dig into tool-augmented multimodal agents that can leverage existing generative models for human-computer interaction.
arXiv Detail & Related papers (2024-05-29T17:59:20Z) - Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions [2.5179515260542544]
Large Language Models (LLMs) have gained significant attention across academia and industry for their versatile applications in text generation, question answering, and text summarization.
To quantify the performance, it's crucial to have a comprehensive grasp of existing metrics.
This paper offers a comprehensive exploration of LLM evaluation from a metrics perspective, providing insights into the selection and interpretation of metrics currently in use.
arXiv Detail & Related papers (2024-04-14T03:54:00Z) - A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias [5.104497013562654]
We present an overview of MLLMs, covering their evolution, key techniques, and multilingual capacities.
We explore widely utilized multilingual corpora for MLLMs' training and multilingual datasets oriented for downstream tasks.
We discuss bias on MLLMs including its category and evaluation metrics, and summarize the existing debiasing techniques.
arXiv Detail & Related papers (2024-04-01T05:13:56Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z) - If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code)
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - A Survey on Multimodal Large Language Models [71.63375558033364]
Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot.
This paper aims to trace and summarize the recent progress of MLLMs.
arXiv Detail & Related papers (2023-06-23T15:21:52Z) - A Primer on Pretrained Multilingual Language Models [18.943173499882885]
Multilingual Language Models (MLLMs) have emerged as a viable option for bringing the power of pretraining to a large number of languages.
We review the existing literature covering the above broad areas of research pertaining to MLLMs.
arXiv Detail & Related papers (2021-07-01T18:01:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.