BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval
Augmented Long-Context Large Language Models
- URL: http://arxiv.org/abs/2402.11573v1
- Date: Sun, 18 Feb 2024 12:41:01 GMT
- Title: BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models
- Authors: Kun Luo and Zheng Liu and Shitao Xiao and Kang Liu
- Abstract summary: Large language models (LLMs) call for context extension to handle many critical applications.
Existing approaches suffer from high costs and inferior quality of context extension.
Extensible embedding stands as an enhancement of the typical token embedding.
- Score: 13.229325187638432
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) call for context extension to handle many
critical applications. However, the existing approaches suffer from high costs and
inferior quality of context extension. In this work, we propose Extensible Embedding,
which realizes high-quality extension of the LLM's context with strong flexibility and
cost-effectiveness. An extensible embedding stands as an enhancement of the typical
token embedding: it represents the information of an extensible scope of context
instead of a single token. By leveraging such compact input units of higher information
density, the LLM can access a vast scope of context even with a small context window.
Extensible embedding is systematically optimized in both architecture and training
method, which leads to multiple advantages. 1) High flexibility of context extension,
which supports ad-hoc extension to diverse context lengths. 2) Strong sample efficiency
of training, which enables the embedding model to be learned in a cost-effective way.
3) Superior compatibility with existing LLMs, where the extensible embedding can be
seamlessly introduced as a plug-in component. Comprehensive evaluations on long-context
language modeling and understanding tasks verify extensible embedding as an effective,
efficient, flexible, and compatible method to extend the LLM's context.
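A minimal, self-contained sketch of the idea described in the abstract: distant context is compressed chunk by chunk into single embedding vectors, so a small context window can hold many chunk-level embeddings plus the ordinary token embeddings of the recent text. The module, parameter names, and compression mechanism below are illustrative assumptions, not the paper's actual architecture or training recipe.

```python
# Conceptual sketch only; names such as ChunkCompressor, chunk_size and the
# attention-pooling compressor are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class ChunkCompressor(nn.Module):
    """Compress a chunk of token embeddings into a single context embedding."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, d_model))  # learned summary query
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, chunk_embs: torch.Tensor) -> torch.Tensor:
        # chunk_embs: (batch, chunk_len, d_model) -> (batch, 1, d_model)
        q = self.query.expand(chunk_embs.size(0), -1, -1)
        summary, _ = self.attn(q, chunk_embs, chunk_embs)
        return summary

def build_inputs(token_embs: torch.Tensor, compressor: ChunkCompressor,
                 chunk_size: int = 128, recent_tokens: int = 256) -> torch.Tensor:
    """Replace distant context with chunk-level embeddings; keep recent tokens verbatim."""
    distant, recent = token_embs[:, :-recent_tokens], token_embs[:, -recent_tokens:]
    summaries = [compressor(distant[:, i:i + chunk_size])
                 for i in range(0, distant.size(1), chunk_size)]
    # The effective window is now num_chunks + recent_tokens positions instead of the
    # full sequence length, which is how a small window can "see" a long context.
    return torch.cat(summaries + [recent], dim=1)

d_model = 64
compressor = ChunkCompressor(d_model)
long_context = torch.randn(1, 4096, d_model)        # token embeddings of a long input
inputs = build_inputs(long_context, compressor)      # (1, 30 + 256, d_model)
print(inputs.shape)
```

In this toy setup, 4096 token embeddings are reduced to 30 chunk embeddings plus 256 recent token embeddings, so the sequence actually fed to the LLM is far shorter than the original context.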
Related papers
- LLMs Can Evolve Continually on Modality for X-Modal Reasoning [62.2874638875554]
Existing methods rely heavily on modal-specific pretraining and joint-modal tuning, leading to significant computational burdens when expanding to new modalities.
We propose PathWeave, a flexible and scalable framework with modal-Path sWitching and ExpAnsion abilities.
PathWeave performs comparably to state-of-the-art MLLMs while concurrently reducing parameter training burdens by 98.73%.
arXiv Detail & Related papers (2024-10-26T13:19:57Z)
- ELICIT: LLM Augmentation via External In-Context Capability [16.237679215248196]
ELICIT is a framework consisting of two modules designed to effectively store and reuse task vectors.
ELICIT serves as a plug-and-play performance booster that enables adaptive elicitation of model capabilities.
arXiv Detail & Related papers (2024-10-12T03:19:06Z)
- SEGMENT+: Long Text Processing with Short-Context Language Models [53.40059130780192]
SEGMENT+ is a framework that enables LMs to handle extended inputs within limited context windows efficiently.
SEGMENT+ utilizes structured notes and a filtering module to manage information flow, resulting in a system that is both controllable and interpretable.
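The summary above outlines a segment-then-note-then-filter flow; the following is a rough sketch of how such a pipeline could look. The `llm` callable, prompt wording, and thresholds are placeholders, not the paper's implementation.

```python
# Rough sketch of a notes-and-filter pipeline for short-context models.
# `llm` is any short-context completion callable (prompt -> text); all prompts
# and the `keep` limit are placeholder assumptions.
from typing import Callable, List

def segment(text: str, max_chars: int = 2000) -> List[str]:
    """Split a long document into pieces that fit a short context window."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def answer_long(text: str, question: str, llm: Callable[[str], str], keep: int = 3) -> str:
    # 1) Structured notes: summarize each segment with respect to the question.
    notes = [llm(f"Question: {question}\nPassage: {seg}\nWrite a short factual note:")
             for seg in segment(text)]
    # 2) Filtering: keep only the notes the model judges relevant (a yes/no check here).
    kept = [n for n in notes
            if llm(f"Question: {question}\nNote: {n}\nIs this note relevant? yes/no:")
                   .strip().lower().startswith("yes")][:keep]
    # 3) Answer from the filtered notes only, so the final prompt stays short.
    return llm(f"Question: {question}\nNotes:\n" + "\n".join(kept) + "\nAnswer:")
```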
arXiv Detail & Related papers (2024-10-09T03:40:22Z)
- ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning [72.90823351726374]
We introduce the Unified framework for Large Language Model Embedding (ULLME), a flexible, plug-and-play implementation that enables bidirectional attention across various LLMs.
We also propose Generation-augmented Representation Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding tasks.
To showcase our framework's flexibility and effectiveness, we release three pre-trained models from ULLME with different backbone architectures.
arXiv Detail & Related papers (2024-08-06T18:53:54Z)
- Fine-tuning Multimodal Large Language Models for Product Bundling [53.01642741096356]
We introduce Bundle-MLLM, a novel framework that fine-tunes large language models (LLMs) through a hybrid item tokenization approach.
Specifically, we integrate textual, media, and relational data into a unified tokenization, introducing a soft separation token to distinguish between textual and non-textual tokens.
We propose a progressive optimization strategy that fine-tunes LLMs for disentangled objectives: 1) learning bundle patterns and 2) enhancing multimodal semantic understanding specific to product bundling.
arXiv Detail & Related papers (2024-07-16T13:30:14Z)
- Long Context Alignment with Short Instructions and Synthesized Positions [56.1267385315404]
This paper introduces Step-Skipping Alignment (SkipAlign), a new technique designed to enhance the long-context capabilities of Large Language Models (LLMs).
With a careful selection of the base model and alignment datasets, SkipAlign with only 6B parameters achieves its best performance, comparable with strong baselines such as GPT-3.5-Turbo-16K on LongBench.
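A toy sketch of the position-synthesis idea suggested by the summary: short training sequences are assigned widely spaced position ids so the model is exposed to long relative distances without long inputs. Where the jumps are placed and how large they are below are assumptions, not the paper's recipe.

```python
# Toy illustration of synthesized positions; jump placement and sizes are assumptions.
import random

def synthesize_positions(seq_len: int, target_len: int = 16384, n_skips: int = 4) -> list:
    """Map a short sequence onto sparse position ids within a long virtual context."""
    budget = target_len - seq_len                   # total extra distance to distribute
    cut_points = sorted(random.sample(range(1, seq_len), n_skips))
    positions, current = [], 0
    for i in range(seq_len):
        if i > 0:
            current += 1                            # normal step to the next position
        if i in cut_points:
            current += budget // n_skips            # large jump simulating skipped text
        positions.append(current)
    return positions

# Example: a 32-token instruction sample now spans positions up to 1023, and these
# ids would be fed to the model as its position_ids instead of range(32).
print(synthesize_positions(32, target_len=1024, n_skips=2))
```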
arXiv Detail & Related papers (2024-05-07T01:56:22Z)
- Extensible Embedding: A Flexible Multipler For LLM's Context Length [6.9004592877749005]
Large language models (LLMs) call for context extension to handle many critical applications.
Existing approaches suffer from high costs and inferior quality of context extension.
We propose Extensible Embedding, which realizes high-quality extension of LLM's context with strong flexibility and cost-effectiveness.
arXiv Detail & Related papers (2024-02-18T12:50:19Z)
- Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization [6.9004592877749005]
Large language models (LLMs) need sufficient context to handle many critical applications.
Although the context window can be extended by fine-tuning, this incurs substantial costs in both the training and inference stages.
We present Extensible Tokenization as an alternative method which realizes the flexible scaling of LLMs' context.
arXiv Detail & Related papers (2024-01-15T16:00:50Z)
- Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompts into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)