BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval
Augmented Long-Context Large Language Models
- URL: http://arxiv.org/abs/2402.11573v1
- Date: Sun, 18 Feb 2024 12:41:01 GMT
- Title: BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models
- Authors: Kun Luo and Zheng Liu and Shitao Xiao and Kang Liu
- Abstract summary: Large language models (LLMs) call for context extension to handle many critical applications.
Existing approaches suffer from high costs and inferior quality of context extension.
Extensible embedding stands as an enhancement of the typical token embedding.
- Score: 13.229325187638432
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) call for context extension to handle many
critical applications. However, the existing approaches suffer from high costs and
inferior quality of context extension. In this work, we propose Extensible Embedding,
which realizes high-quality extension of the LLM's context with strong flexibility and
cost-effectiveness. An extensible embedding stands as an enhancement of the typical
token embedding: it represents the information of an extensible scope of context
instead of a single token. By leveraging such compact input units of higher information
density, the LLM can access a vast scope of context even with a small context window.
Extensible embedding is systematically optimized in both architecture and training
method, which leads to multiple advantages. 1) High flexibility of context extension,
which supports ad-hoc extension to diverse context lengths. 2) Strong sample efficiency
of training, which enables the embedding model to be learned in a cost-effective way.
3) Superior compatibility with existing LLMs, where the extensible embedding can be
seamlessly introduced as a plug-in component. Comprehensive evaluations on long-context
language modeling and understanding tasks verify extensible embedding as an effective,
efficient, flexible, and compatible method to extend the LLM's context.
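A minimal, self-contained sketch of the idea described in the abstract: distant context is compressed chunk by chunk into single embedding vectors, so a small context window can hold many chunk-level embeddings plus the ordinary token embeddings of the recent text. The module, parameter names, and compression mechanism below are illustrative assumptions, not the paper's actual architecture or training recipe.

```python
# Conceptual sketch only; names such as ChunkCompressor, chunk_size and the
# attention-pooling compressor are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class ChunkCompressor(nn.Module):
    """Compress a chunk of token embeddings into a single context embedding."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, d_model))  # learned summary query
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, chunk_embs: torch.Tensor) -> torch.Tensor:
        # chunk_embs: (batch, chunk_len, d_model) -> (batch, 1, d_model)
        q = self.query.expand(chunk_embs.size(0), -1, -1)
        summary, _ = self.attn(q, chunk_embs, chunk_embs)
        return summary

def build_inputs(token_embs: torch.Tensor, compressor: ChunkCompressor,
                 chunk_size: int = 128, recent_tokens: int = 256) -> torch.Tensor:
    """Replace distant context with chunk-level embeddings; keep recent tokens verbatim."""
    distant, recent = token_embs[:, :-recent_tokens], token_embs[:, -recent_tokens:]
    summaries = [compressor(distant[:, i:i + chunk_size])
                 for i in range(0, distant.size(1), chunk_size)]
    # The effective window is now num_chunks + recent_tokens positions instead of the
    # full sequence length, which is how a small window can "see" a long context.
    return torch.cat(summaries + [recent], dim=1)

d_model = 64
compressor = ChunkCompressor(d_model)
long_context = torch.randn(1, 4096, d_model)        # token embeddings of a long input
inputs = build_inputs(long_context, compressor)      # (1, 30 + 256, d_model)
print(inputs.shape)
```

In this toy setup, 4096 token embeddings are reduced to 30 chunk embeddings plus 256 recent token embeddings, so the sequence actually fed to the LLM is far shorter than the original context.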
Related papers
- LLMs Can Evolve Continually on Modality for X-Modal Reasoning [62.2874638875554]
Existing methods rely heavily on modal-specific pretraining and joint-modal tuning, leading to significant computational burdens when expanding to new modalities.
We propose PathWeave, a flexible and scalable framework with modal-Path sWitching and ExpAnsion abilities.
PathWeave performs comparably to state-of-the-art MLLMs while concurrently reducing parameter training burdens by 98.73%.
arXiv Detail & Related papers (2024-10-26T13:19:57Z)
- ELICIT: LLM Augmentation via External In-Context Capability [16.237679215248196]
ELICIT is a framework consisting of two modules designed to effectively store and reuse task vectors.
ELICIT serves as a plug-and-play performance booster that enables adaptive elicitation of model capabilities.
arXiv Detail & Related papers (2024-10-12T03:19:06Z)
- SEGMENT+: Long Text Processing with Short-Context Language Models [53.40059130780192]
SEGMENT+ is a framework that enables LMs to handle extended inputs within limited context windows efficiently.
SEGMENT+ utilizes structured notes and a filtering module to manage information flow, resulting in a system that is both controllable and interpretable.
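The summary above outlines a segment-then-note-then-filter flow; the following is a rough sketch of how such a pipeline could look. The `llm` callable, prompt wording, and thresholds are placeholders, not the paper's implementation.

```python
# Rough sketch of a notes-and-filter pipeline for short-context models.
# `llm` is any short-context completion callable (prompt -> text); all prompts
# and the `keep` limit are placeholder assumptions.
from typing import Callable, List

def segment(text: str, max_chars: int = 2000) -> List[str]:
    """Split a long document into pieces that fit a short context window."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def answer_long(text: str, question: str, llm: Callable[[str], str], keep: int = 3) -> str:
    # 1) Structured notes: summarize each segment with respect to the question.
    notes = [llm(f"Question: {question}\nPassage: {seg}\nWrite a short factual note:")
             for seg in segment(text)]
    # 2) Filtering: keep only the notes the model judges relevant (a yes/no check here).
    kept = [n for n in notes
            if llm(f"Question: {question}\nNote: {n}\nIs this note relevant? yes/no:")
                   .strip().lower().startswith("yes")][:keep]
    # 3) Answer from the filtered notes only, so the final prompt stays short.
    return llm(f"Question: {question}\nNotes:\n" + "\n".join(kept) + "\nAnswer:")
```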
arXiv Detail & Related papers (2024-10-09T03:40:22Z)
- ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning [72.90823351726374]
We introduce the Unified framework for Large Language Model Embedding (ULLME), a flexible, plug-and-play implementation that enables bidirectional attention across various LLMs.
We also propose Generation-augmented Representation Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding tasks.
To showcase our framework's flexibility and effectiveness, we release three pre-trained models from ULLME with different backbone architectures.
arXiv Detail & Related papers (2024-08-06T18:53:54Z)
- Fine-tuning Multimodal Large Language Models for Product Bundling [53.01642741096356]
We introduce Bundle-MLLM, a novel framework that fine-tunes large language models (LLMs) through a hybrid item tokenization approach.
Specifically, we integrate textual, media, and relational data into a unified tokenization, introducing a soft separation token to distinguish between textual and non-textual tokens.
We propose a progressive optimization strategy that fine-tunes LLMs for disentangled objectives: 1) learning bundle patterns and 2) enhancing multimodal semantic understanding specific to product bundling.
arXiv Detail & Related papers (2024-07-16T13:30:14Z)
- Long Context Alignment with Short Instructions and Synthesized Positions [56.1267385315404]
This paper introduces Step-Skipping Alignment (SkipAlign), a new technique designed to enhance the long-context capabilities of Large Language Models (LLMs).
With a careful selection of the base model and alignment datasets, SkipAlign with only 6B parameters achieves its best performance, comparable with strong baselines such as GPT-3.5-Turbo-16K on LongBench.
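A toy sketch of the position-synthesis idea suggested by the summary: short training sequences are assigned widely spaced position ids so the model is exposed to long relative distances without long inputs. Where the jumps are placed and how large they are below are assumptions, not the paper's recipe.

```python
# Toy illustration of synthesized positions; jump placement and sizes are assumptions.
import random

def synthesize_positions(seq_len: int, target_len: int = 16384, n_skips: int = 4) -> list:
    """Map a short sequence onto sparse position ids within a long virtual context."""
    budget = target_len - seq_len                   # total extra distance to distribute
    cut_points = sorted(random.sample(range(1, seq_len), n_skips))
    positions, current = [], 0
    for i in range(seq_len):
        if i > 0:
            current += 1                            # normal step to the next position
        if i in cut_points:
            current += budget // n_skips            # large jump simulating skipped text
        positions.append(current)
    return positions

# Example: a 32-token instruction sample now spans positions up to 1023, and these
# ids would be fed to the model as its position_ids instead of range(32).
print(synthesize_positions(32, target_len=1024, n_skips=2))
```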
arXiv Detail & Related papers (2024-05-07T01:56:22Z)
- Extensible Embedding: A Flexible Multipler For LLM's Context Length [6.9004592877749005]
Large language models (LLMs) call for context extension to handle many critical applications.
Existing approaches suffer from high costs and inferior quality of context extension.
We propose Extensible Embedding, which realizes high-quality extension of LLM's context with strong flexibility and cost-effectiveness.
arXiv Detail & Related papers (2024-02-18T12:50:19Z)
- Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization [6.9004592877749005]
Large language models (LLMs) need sufficient context to handle many critical applications.
Although the context window can be extended by fine-tuning, this incurs substantial costs in both the training and inference stages.
We present Extensible Tokenization as an alternative method which realizes the flexible scaling of LLMs' context.
arXiv Detail & Related papers (2024-01-15T16:00:50Z)
- Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompts into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)