Group Communication with Context Codec for Ultra-Lightweight Source
Separation
- URL: http://arxiv.org/abs/2012.07291v1
- Date: Mon, 14 Dec 2020 06:57:58 GMT
- Title: Group Communication with Context Codec for Ultra-Lightweight Source
Separation
- Authors: Yi Luo, Cong Han, Nima Mesgarani
- Abstract summary: We propose the group communication with context codec (GC3) design to decrease both model size and complexity without sacrificing model performance.
GC3 can achieve performance on par with or better than a wide range of baseline architectures at as little as 2.5% of the model size.
- Score: 32.975741399690214
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Ultra-lightweight model design is an important topic for the deployment of
existing speech enhancement and source separation techniques on low-resource
platforms. Various lightweight model design paradigms have been proposed in
recent years; however, most models still struggle to find a balance between
model size, model complexity, and model performance. In this paper, we propose
the group communication with context codec (GC3) design to decrease both model
size and complexity without sacrificing the model performance. Group
communication splits a high-dimensional feature into groups of low-dimensional
features and applies a module to capture the inter-group dependency. A model
can then be applied to the groups in parallel with a significantly smaller
width. A context codec is applied to decrease the length of a sequential
feature, where a context encoder compresses the temporal context of local
features into a single feature representing the global characteristics of the
context, and a context decoder decompresses the transformed global features
back to the context features. Experimental results show that GC3 can achieve
performance on par with or better than a wide range of baseline architectures
at as little as 2.5% of the model size.
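The data flow described in the abstract can be illustrated with a small shape-level sketch. This is not the authors' implementation: the function names, the mean-based inter-group mixing, the mean-pooling encoder, and the residual broadcast decoder are all hypothetical stand-ins chosen only to show how splitting a feature into groups narrows the width and how a context codec shortens the sequence. In the paper these roles are filled by learned modules.

```python
import numpy as np

def split_into_groups(x, num_groups):
    """Split a (T, N) feature sequence into (T, K, N // K) low-dimensional groups."""
    T, N = x.shape
    if N % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    return x.reshape(T, num_groups, N // num_groups)

def inter_group_mix(groups):
    """Hypothetical stand-in for the inter-group module:
    add the across-group mean to every group so groups share information."""
    return groups + groups.mean(axis=1, keepdims=True)

def context_encode(groups, context_size):
    """Hypothetical stand-in for the context encoder:
    mean-pool each block of `context_size` frames into one summary feature."""
    T, K, M = groups.shape
    if T % context_size != 0:
        raise ValueError("sequence length must be divisible by context_size")
    blocks = groups.reshape(T // context_size, context_size, K, M)
    return blocks.mean(axis=1), blocks

def context_decode(summary, blocks):
    """Hypothetical stand-in for the context decoder:
    broadcast each transformed summary back over its block of local features."""
    restored = blocks + summary[:, None, :, :]
    C, c, K, M = restored.shape
    return restored.reshape(C * c, K, M)

def gc3_sketch(x, num_groups=4, context_size=8):
    """Shape-level walk-through: group split -> encode -> mix -> decode."""
    groups = split_into_groups(x, num_groups)                # (T, K, N/K)
    summary, blocks = context_encode(groups, context_size)   # (T/c, K, N/K)
    summary = inter_group_mix(summary)                       # shorter and narrower
    out = context_decode(summary, blocks)                    # (T, K, N/K)
    return out.reshape(x.shape), summary.shape

x = np.ones((32, 16))
out, summary_shape = gc3_sketch(x)
print(out.shape, summary_shape)
```

With 32 frames of 16 features, 4 groups, and a context size of 8, the processing step in the middle sees only 4 frames of width 4 per group, which is where the savings in width and sequence length would come from under these assumptions.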
Related papers
- Towards Unifying Feature Interaction Models for Click-Through Rate Prediction [19.149554121852724]
We propose a general framework called IPA to unify existing models.
We demonstrate that most existing models can be categorized within our framework by making specific choices for these three components.
We introduce a novel model that achieves competitive results compared to state-of-the-art CTR models.
arXiv Detail & Related papers (2024-11-19T12:04:02Z)
- Collective Model Intelligence Requires Compatible Specialization [29.590052023903457]
We show that as models specialize, the similarity in their feature space structure diminishes, hindering their capacity for collective use.
We propose a new direction for achieving collective model intelligence through what we call compatible specialization.
arXiv Detail & Related papers (2024-11-04T15:59:16Z)
- Two are better than one: Context window extension with multi-grained self-injection [111.1376461868317]
SharedLLM is a novel approach grounded in the design philosophy of multi-grained context compression and query-aware information retrieval.
We introduce a specialized tree-style data structure to efficiently encode, store and retrieve multi-grained contextual information for text chunks.
arXiv Detail & Related papers (2024-10-25T06:08:59Z)
- HM3: Heterogeneous Multi-Class Model Merging [0.0]
We explore training-free model merging techniques to consolidate auxiliary guard-rail models into a single, multi-functional model.
We propose Heterogeneous Multi-Class Model Merging (HM3) as a simple technique for merging multi-class classifiers with heterogeneous label spaces.
We report promising results for merging BERT-based guard models, some of which attain an average F1-score higher than the source models while reducing the inference time by up to 44%.
arXiv Detail & Related papers (2024-09-27T22:42:45Z)
- GrootVL: Tree Topology is All You Need in State Space Model [66.36757400689281]
GrootVL is a versatile multimodal framework that can be applied to both visual and textual tasks.
Our method significantly outperforms existing structured state space models on image classification, object detection and segmentation.
By fine-tuning large language models, our approach achieves consistent improvements in multiple textual tasks at minor training cost.
arXiv Detail & Related papers (2024-06-04T15:09:29Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Speculative Decoding with Big Little Decoder [108.95187338417541]
Big Little Decoder (BiLD) is a framework that can improve inference efficiency and latency for a wide range of text generation applications.
On an NVIDIA T4 GPU, our framework achieves a speedup of up to 2.12x with minimal generation quality degradation.
Our framework is fully plug-and-play and can be applied without any modifications in the training process or model architecture.
arXiv Detail & Related papers (2023-02-15T18:55:29Z)
- Sparsity-guided Network Design for Frame Interpolation [39.828644638174225]
We present a compression-driven network design for frame-based algorithms.
We leverage model pruning through sparsity-inducing optimization to greatly reduce the model size.
We achieve a considerable performance gain with a quarter of the size of the original AdaCoF.
arXiv Detail & Related papers (2022-09-09T23:13:25Z)
- SUNet: Scale-aware Unified Network for Panoptic Segmentation [25.626882426111198]
We propose two lightweight modules to mitigate the problem of segmenting objects of various scales.
We present an end-to-end Scale-aware Unified Network (SUNet) which is more adaptable to multi-scale objects.
arXiv Detail & Related papers (2022-09-07T01:40:41Z)
- Multi-Scale Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition.
MS-SGN achieves state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z)
- Model Patching: Closing the Subgroup Performance Gap with Data Augmentation [50.35010342284508]
We introduce model patching, a framework for improving robustness of machine learning models.
Model patching encourages the model to be invariant to subgroup differences, and focus on class information shared by subgroups.
We instantiate model patching with CAMEL, which (1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and (2) balances subgroup performance using a theoretically-motivated consistency regularizer.
We demonstrate CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error up to 33% relative to the best baseline.
arXiv Detail & Related papers (2020-08-15T20:01:23Z)