FreeChunker: A Cross-Granularity Chunking Framework
- URL: http://arxiv.org/abs/2510.20356v1
- Date: Thu, 23 Oct 2025 08:57:00 GMT
- Title: FreeChunker: A Cross-Granularity Chunking Framework
- Authors: Wenxuan Zhang, Yuan-Hao Jiang, Yonghe Wu
- Abstract summary: Chunking strategies significantly impact the effectiveness of Retrieval-Augmented Generation (RAG) systems. This paper presents FreeChunker, a Cross-Granularity Framework that transforms the traditional chunking paradigm.
- Score: 16.790630771624162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chunking strategies significantly impact the effectiveness of Retrieval-Augmented Generation (RAG) systems. Existing methods operate within fixed-granularity paradigms that rely on static boundary identification, limiting their adaptability to diverse query requirements. This paper presents FreeChunker, a Cross-Granularity Encoding Framework that fundamentally transforms the traditional chunking paradigm: the framework treats sentences as atomic units and shifts from static chunk segmentation to flexible retrieval supporting arbitrary sentence combinations. This paradigm shift not only significantly reduces the computational overhead required for semantic boundary detection but also enhances adaptability to complex queries. Experimental evaluation on LongBench V2 demonstrates that FreeChunker achieves superior retrieval performance compared to traditional chunking methods, while significantly outperforming existing approaches in computational efficiency.
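As an illustration of the sentence-as-atomic-unit idea, the sketch below indexes a document at sentence granularity and assembles a query-specific chunk from an arbitrary combination of sentences at retrieval time. The naive splitter and the TF-IDF embedder are illustrative stand-ins, not the paper's cross-granularity encoder.

```python
# A minimal sketch, assuming a generic sentence embedder: TF-IDF stands in
# for FreeChunker's actual encoder purely for illustration.
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def split_sentences(text):
    # Naive splitter; sentences are the atomic retrieval units.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def retrieve_chunk(document, query, k=3):
    sentences = split_sentences(document)
    vec = TfidfVectorizer().fit(sentences + [query])
    scores = cosine_similarity(vec.transform([query]),
                               vec.transform(sentences))[0]
    # Assemble a query-specific chunk from the top-k sentences, kept in
    # document order, instead of returning a precomputed fixed chunk.
    top = sorted(int(i) for i in np.argsort(scores)[-k:])
    return " ".join(sentences[i] for i in top)

doc = ("RAG systems retrieve passages to ground generation. "
       "Fixed-size chunks often cut across topic boundaries. "
       "Sentence-level indexing lets retrieval assemble chunks per query. "
       "No semantic boundary detection is needed at index time.")
print(retrieve_chunk(doc, "how does sentence-level indexing help RAG?"))
```

Because the index stores only sentences, no boundary detection runs at indexing time; the "chunk" is decided per query, which is the paradigm shift the abstract describes.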
Related papers
- Structure-Aware Robust Counterfactual Explanations via Conditional Gaussian Network Classifiers [0.26999000177990923]
This work presents a structure-aware robustness-and-counterfactual search method based on conditional Gaussian network classifiers. Results show that our method achieves strong consistency, with direct optimization of the original formulation providing especially stable dependencies. The proposed framework lays the groundwork for future advances in counterfactual reasoning under noncyclic constraints.
arXiv Detail & Related papers (2026-02-08T15:51:45Z)
- Short Chains, Deep Thoughts: Balancing Reasoning Efficiency and Intra-Segment Capability via Split-Merge Optimization [68.89915707647138]
Large Reasoning Models (LRMs) have demonstrated impressive capabilities in solving complex tasks through the generation of long reasoning chains. We propose CoSMo (Split-Merge Optimization), a framework designed to eliminate structural redundancy rather than indiscriminately restricting token volume.
arXiv Detail & Related papers (2026-02-03T05:54:28Z)
- Gated Differentiable Working Memory for Long-Context Language Modeling [80.27483324685434]
We propose Gdwm (Gated Differentiable Working Memory), a framework that introduces a write controller to gate the consolidation process. Experiments on ZeroSCROLLS and LongBench v2 demonstrate that Gdwm achieves comparable or superior performance with 4× fewer gradient steps than uniform baselines.
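A minimal sketch of the write-gating idea, assuming a single sigmoid controller over a fixed-size memory; Gdwm's actual controller architecture and training recipe are not reproduced here.

```python
# Illustrative gated memory write in PyTorch: a sigmoid controller decides
# how much of each new hidden state is consolidated into the memory.
import torch
import torch.nn as nn

class GatedMemoryWrite(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, 1)  # write controller

    def forward(self, memory: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # memory, h: (batch, d_model); g near 0 keeps the old memory intact.
        g = torch.sigmoid(self.gate(torch.cat([memory, h], dim=-1)))
        return g * h + (1 - g) * memory

mem = torch.zeros(2, 64)
write = GatedMemoryWrite(64)
for step in range(5):            # consolidate a short stream of states
    mem = write(mem, torch.randn(2, 64))
print(mem.shape)                 # torch.Size([2, 64])
```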
arXiv Detail & Related papers (2026-01-19T10:00:33Z)
- Accelerate Speculative Decoding with Sparse Computation in Verification [49.74839681322316]
Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel. Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding. We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
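The sketch below illustrates the verification-stage idea for attention only: all draft tokens are scored in one parallel pass, and each query attends only to its top-k keys. The joint attention/FFN/MoE sparsification in the abstract is richer than this single-head numpy toy.

```python
# Sparsified attention for a parallel verification pass (illustrative only).
import numpy as np

def sparse_verify_attention(q, k, v, top_k=4):
    # q: (num_draft, d), k/v: (ctx_len, d); one pass scores every
    # draft position at once instead of token by token.
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (num_draft, ctx_len)
    # Keep only the top-k scores per draft token; mask the rest to -inf.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k:-top_k + 1]
    scores = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                 # (num_draft, d)

rng = np.random.default_rng(0)
out = sparse_verify_attention(rng.normal(size=(5, 32)),
                              rng.normal(size=(128, 32)),
                              rng.normal(size=(128, 32)))
print(out.shape)  # (5, 32)
```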
arXiv Detail & Related papers (2025-12-26T07:53:41Z)
- Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of channel-independent (CI) and channel-dependent (CD) methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z)
- CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling [52.05149789178508]
CCF is a novel context compression framework designed to enable efficient long-context modeling. CCF integrates segment-wise semantic aggregation with key-value memory encoding, forming compact representations. Empirical results on multiple long-context language modeling benchmarks demonstrate that CCF achieves competitive perplexity under high compression ratios.
arXiv Detail & Related papers (2025-09-11T07:13:49Z)
- READER: Retrieval-Assisted Drafter for Efficient LLM Inference [0.0386965802948046]
Autoregressive Language Models instantiate a factorized likelihood over token sequences, yet their strictly sequential decoding process imposes an intrinsic lower bound on inference latency. This bottleneck has emerged as a central obstacle to the scalable deployment of large-scale generative models. We present READER, a speculative decoding framework that bypasses the training of the auxiliary draft model.
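A toy sketch of draft-model-free drafting: reuse continuations retrieved from the already-seen token stream via an n-gram index. Whether READER's retrieval works exactly this way is an assumption, and the verification step against the target model is omitted.

```python
# Toy retrieval-assisted drafter: look up the current n-gram suffix in the
# prior tokens and propose its historical continuation as the draft.
from collections import defaultdict

def build_ngram_index(tokens, n=2):
    index = defaultdict(list)
    for i in range(len(tokens) - n):
        index[tuple(tokens[i:i + n])].append(i + n)
    return index

def propose_draft(tokens, index, n=2, max_draft=4):
    starts = index.get(tuple(tokens[-n:]), [])
    if not starts:
        return []          # no retrieval hit: fall back to normal decoding
    s = starts[-1]         # reuse the most recent continuation
    return tokens[s:s + max_draft]

context = "the cat sat on the mat and the cat sat on".split()
index = build_ngram_index(context)
print(propose_draft(context, index))  # ['the', 'mat', 'and', 'the']
```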
arXiv Detail & Related papers (2025-08-12T16:47:48Z)
- Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers [58.98923344096319]
REFORM is a novel inference framework that efficiently handles long contexts through a two-phase approach. It achieves over 50% and 27% performance gains on RULER and BABILong respectively at 1M context length. It also outperforms baselines on Infinite-Bench and MM-NIAH, demonstrating flexibility across diverse tasks and domains.
arXiv Detail & Related papers (2025-06-01T23:49:14Z)
- UGCE: User-Guided Incremental Counterfactual Exploration [2.2789818122188925]
Counterfactual explanations (CFEs) are a popular approach for interpreting machine learning predictions by identifying minimal feature changes that alter model outputs. Existing methods fail to support iterative updates under evolving user constraints, instead recomputing explanations from scratch with each change, an inefficient and rigid approach. We propose User-Guided Incremental Counterfactual Exploration (UGCE), a genetic-algorithm-based framework that incrementally updates counterfactuals in response to evolving user constraints.
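A minimal sketch of the incremental idea: when the user tightens a constraint, warm-start the genetic search from the previous population instead of recomputing from scratch. The toy classifier, fitness terms, and mutation scheme are illustrative assumptions, not UGCE's actual components.

```python
import random

def toy_model(x):
    return sum(x) > 5.0                       # stand-in binary classifier

def fitness(candidate, original, constraints):
    # Lower is better: stay near the original instance, flip the prediction,
    # and respect the user's per-feature box constraints.
    dist = sum(abs(c - o) for c, o in zip(candidate, original))
    flip = 0 if toy_model(candidate) else 100
    box = sum(100 for lo, hi, i in constraints
              if not lo <= candidate[i] <= hi)
    return dist + flip + box

def evolve(population, original, constraints, generations=100):
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, original, constraints))
        survivors = population[: len(population) // 2]
        children = [[g + random.gauss(0, 0.2) for g in random.choice(survivors)]
                    for _ in range(len(population) - len(survivors))]
        population = survivors + children
    population.sort(key=lambda c: fitness(c, original, constraints))
    return population

random.seed(0)
original = [1.0, 1.0, 1.0]                    # currently classified negative
pop = [[g + random.uniform(0, 3) for g in original] for _ in range(30)]
pop = evolve(pop, original, [(0.0, 4.0, 0)])  # initial user constraints
# The user tightens feature 0: warm-start from the evolved population
# rather than recomputing the counterfactual from scratch.
pop = evolve(pop, original, [(0.0, 1.5, 0)])
print(toy_model(pop[0]), round(fitness(pop[0], original, [(0.0, 1.5, 0)]), 2))
```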
arXiv Detail & Related papers (2025-05-27T15:24:43Z)
- Long Context In-Context Compression by Getting to the Gist of Gisting [50.24627831994713]
GistPool is an in-context compression method with no architectural modification to the decoder transformer. We demonstrate that gisting struggles with longer contexts, with significant performance drops even at minimal compression rates. GistPool preserves the simplicity of gisting, while significantly boosting its performance on long context compression tasks.
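A rough sketch of compression by pooling, assuming simple mean pooling over fixed-length segments; GistPool's learned aggregation and its integration into the decoder are not reproduced here.

```python
# Illustrative "gist by pooling": compress each context segment's token
# states into one pooled vector that later attention can treat as a summary.
import numpy as np

def pool_gists(hidden_states: np.ndarray, segment_len: int) -> np.ndarray:
    # hidden_states: (seq_len, d_model) -> (num_segments, d_model)
    n = hidden_states.shape[0] // segment_len
    segments = hidden_states[: n * segment_len].reshape(n, segment_len, -1)
    return segments.mean(axis=1)

states = np.random.default_rng(0).normal(size=(1024, 256))
gists = pool_gists(states, segment_len=64)
print(gists.shape)   # (16, 256): 16x compression of the context
```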
arXiv Detail & Related papers (2025-04-11T19:23:31Z)
- MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System [11.793639794583498]
This paper introduces a dual-metric evaluation method, comprising Boundary Clarity and Chunk Stickiness. We highlight the inherent limitations of traditional and semantic chunking in handling complex contextual nuances. We devise the Mixture-of-Chunkers (MoC) framework, which consists of a three-stage processing mechanism.
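The snippet below gives embedding-similarity proxies for the two intuitions behind the metrics: a clear boundary separates dissimilar neighbors, and a "sticky" chunk coheres internally. These are illustrative stand-ins, not the paper's formal metric definitions.

```python
import numpy as np

def boundary_clarity(emb_left: np.ndarray, emb_right: np.ndarray) -> float:
    # Lower cross-boundary similarity -> clearer boundary.
    cos = emb_left @ emb_right / (np.linalg.norm(emb_left)
                                  * np.linalg.norm(emb_right))
    return 1.0 - cos

def chunk_stickiness(chunk_embs: np.ndarray) -> float:
    # Mean pairwise cosine similarity among sentences inside one chunk.
    normed = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    sim = normed @ normed.T
    n = len(chunk_embs)
    return (sim.sum() - n) / (n * (n - 1))   # exclude self-similarity

rng = np.random.default_rng(0)
embs = rng.normal(size=(5, 128))             # toy sentence embeddings
print(boundary_clarity(embs[1], embs[2]), chunk_stickiness(embs[:3]))
```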
arXiv Detail & Related papers (2025-03-12T17:59:42Z)
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
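A sketch of a conv-transformer hybrid block in this spirit, where a depthwise convolution performs the token mixing that self-attention would otherwise do; the kernel size and layer layout are assumptions, not CFSR's exact ConvFormer design.

```python
import torch
import torch.nn as nn

class ConvFormerBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        self.norm1 = nn.BatchNorm2d(channels)
        self.mixer = nn.Conv2d(channels, channels, kernel_size,
                               padding=kernel_size // 2, groups=channels)
        self.norm2 = nn.BatchNorm2d(channels)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, 4 * channels, 1), nn.GELU(),
            nn.Conv2d(4 * channels, channels, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.mixer(self.norm1(x))    # conv token mixing + residual
        return x + self.mlp(self.norm2(x))   # channel MLP + residual

x = torch.randn(1, 64, 32, 32)
print(ConvFormerBlock(64)(x).shape)          # torch.Size([1, 64, 32, 32])
```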
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods rarely translate into real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
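A sketch of what micro-structured unification could look like: tile a weight matrix into small blocks and share one magnitude per block (keeping signs), so hardware can store a single value per block. The block size and mean-magnitude rule are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def unify_microblocks(w: np.ndarray, block: int = 4) -> np.ndarray:
    out = w.copy()
    rows, cols = w.shape
    for r in range(0, rows - rows % block, block):
        for c in range(0, cols - cols % block, block):
            tile = w[r:r + block, c:c + block]
            shared = np.abs(tile).mean()        # one magnitude per block
            out[r:r + block, c:c + block] = np.sign(tile) * shared
    return out

w = np.random.default_rng(0).normal(size=(8, 8))
wu = unify_microblocks(w)
print(len(np.unique(np.abs(wu))))  # 4 blocks -> 4 distinct magnitudes
```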
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- Stochastic Reweighted Gradient Descent [4.355567556995855]
We propose an importance-sampling-based algorithm we call SRG (stochastic reweighted gradient).
We pay particular attention to the time and memory overhead of our proposed method.
We present empirical results to support our findings.
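A compact sketch of importance-sampled SGD with the unbiasedness-preserving reweighting: draw example i with probability p_i and scale its gradient by 1/(n·p_i). The gradient-norm sampling table is an assumed stand-in for SRG's actual reweighting rule.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
grad_norms = np.ones(n)                # running per-example gradient norms
for step in range(3000):
    p = grad_norms / grad_norms.sum()  # importance-sampling distribution
    i = rng.choice(n, p=p)
    g = 2 * (X[i] @ w - y[i]) * X[i]   # squared-loss gradient for example i
    w -= 0.01 * g / (n * p[i])         # 1/(n*p_i) keeps the estimate unbiased
    grad_norms[i] = max(0.9 * grad_norms[i]
                        + 0.1 * float(np.linalg.norm(g)), 1e-2)
print(round(float(np.mean((X @ w - y) ** 2)), 4))  # training MSE
```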
arXiv Detail & Related papers (2021-03-23T04:09:43Z)
- Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)