The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants
- URL: http://arxiv.org/abs/2505.19797v3
- Date: Wed, 18 Jun 2025 09:47:20 GMT
- Title: The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants
- Authors: Yiqun Zhang, Hao Li, Chenxu Wang, Linyao Chen, Qiaosheng Zhang, Peng Ye, Shi Feng, Daling Wang, Zhen Wang, Xinrun Wang, Jia Xu, Lei Bai, Wanli Ouyang, Shuyue Hu
- Abstract summary: We present the Avengers -- a simple recipe that leverages the collective intelligence of smaller models. With 10 open-source models, the Avengers surpasses GPT-4o, 4.1, and 4.5 in average performance across 15 diverse datasets. In particular, it surpasses GPT-4.1 on mathematics tasks by 18.21% and on code tasks by 7.46%.
- Score: 66.6636608563034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proprietary giants are increasingly dominating the race for ever-larger language models. Can open-source, smaller models remain competitive across a broad range of tasks? In this paper, we present the Avengers -- a simple recipe that leverages the collective intelligence of these smaller models. The Avengers builds upon four lightweight operations: (i) embedding: encode queries using a text embedding model; (ii) clustering: group queries based on their semantic similarity; (iii) scoring: score each model's performance within each cluster; and (iv) voting: improve outputs via repeated sampling and voting. At inference time, each query is embedded and assigned to its nearest cluster. The top-performing model(s) within that cluster are selected to generate the response with repeated sampling. Remarkably, with 10 open-source models (~7B parameters each), the Avengers surpasses GPT-4o, 4.1, and 4.5 in average performance across 15 diverse datasets spanning mathematics, coding, logical reasoning, general knowledge, and affective tasks. In particular, it surpasses GPT-4.1 on mathematics tasks by 18.21% and on code tasks by 7.46%. Furthermore, the Avengers delivers superior out-of-distribution generalization, and remains robust across various embedding models, clustering algorithms, ensemble strategies, and values of its sole parameter -- the number of clusters.
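To make the recipe concrete, here is a minimal sketch of the four operations and the routing step, assuming a sentence-transformers embedder and scikit-learn k-means; the model set, the dev-set signal `is_correct`, and the `generate` callables are hypothetical stand-ins, not the paper's released implementation.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer  # assumed embedding backend


class Avengers:
    """Minimal sketch of the embed -> cluster -> score -> vote recipe.

    `models` maps a model name to a callable `generate(query) -> str`; how
    each of the ten ~7B models is actually served is outside this sketch.
    """

    def __init__(self, models, n_clusters=64, top_k=1, n_samples=5):
        self.models = models            # {name: generate_fn}
        self.n_clusters = n_clusters    # the recipe's sole parameter
        self.top_k = top_k              # top models selected per cluster
        self.n_samples = n_samples      # repeated-sampling budget
        self.embedder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical choice

    def fit(self, queries, is_correct):
        # (i) embed and (ii) cluster the dev queries.
        X = self.embedder.encode(queries, normalize_embeddings=True)
        self.kmeans = KMeans(n_clusters=self.n_clusters, n_init=10).fit(X)
        labels = self.kmeans.labels_
        # (iii) score: per-cluster dev accuracy of each model, where
        # is_correct[name][i] == 1 if model `name` answered query i correctly.
        self.scores = {}
        for k in range(self.n_clusters):
            idx = np.where(labels == k)[0]
            self.scores[k] = {
                name: float(np.mean([is_correct[name][i] for i in idx])) if len(idx) else 0.0
                for name in self.models
            }
        return self

    def answer(self, query):
        # Route the query to its nearest cluster's best model(s) ...
        x = self.embedder.encode([query], normalize_embeddings=True)
        k = int(self.kmeans.predict(x)[0])
        best = sorted(self.scores[k], key=self.scores[k].get, reverse=True)[: self.top_k]
        # ... then (iv) vote over repeated samples.
        samples = [self.models[name](query) for name in best for _ in range(self.n_samples)]
        return Counter(samples).most_common(1)[0][0]
```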
Related papers
- An Enhanced Model-based Approach for Short Text Clustering [58.60681789677676]
Short text clustering has become increasingly important with the popularity of social media like Twitter, Google+, and Facebook. Existing methods can be broadly categorized into two paradigms: topic model-based approaches and deep representation learning-based approaches. We propose a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM), which effectively handles the sparsity and high dimensionality of short texts. Based on several aspects of GSDMM that warrant further refinement, we propose an improved approach, GSDMM+, designed to further optimize its performance.
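For reference, a compact sketch of the collapsed Gibbs sampler for the Dirichlet Multinomial Mixture described above, following the standard GSDMM conditional; the GSDMM+ refinements are not reproduced here, and the hyperparameter defaults are illustrative.

```python
import numpy as np


def gsdmm(docs, vocab_size, n_clusters=20, alpha=0.1, beta=0.1, n_iters=15, seed=0):
    """Compact, unoptimized collapsed Gibbs sampler for the Dirichlet
    Multinomial Mixture. `docs` is a list of documents, each a list of
    integer token ids. Returns one cluster label per document.
    """
    rng = np.random.default_rng(seed)
    K, V = n_clusters, vocab_size
    z = rng.integers(K, size=len(docs))               # initial assignments
    m = np.bincount(z, minlength=K).astype(float)     # documents per cluster
    n_kw = np.zeros((K, V))                           # word counts per cluster
    n_k = np.zeros(K)                                 # token totals per cluster
    for d, doc in enumerate(docs):
        n_k[z[d]] += len(doc)
        for w in doc:
            n_kw[z[d], w] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            # Remove document d from its current cluster.
            k_old = z[d]
            m[k_old] -= 1
            n_k[k_old] -= len(doc)
            for w in doc:
                n_kw[k_old, w] -= 1
            # Standard GSDMM conditional, computed in log space.
            logp = np.log(m + alpha)
            for k in range(K):
                seen = {}
                for w in doc:
                    logp[k] += np.log(n_kw[k, w] + beta + seen.get(w, 0))
                    seen[w] = seen.get(w, 0) + 1
                logp[k] -= np.log(n_k[k] + V * beta + np.arange(len(doc))).sum()
            p = np.exp(logp - logp.max())
            k_new = int(rng.choice(K, p=p / p.sum()))
            # Reinsert document d under the sampled cluster.
            z[d] = k_new
            m[k_new] += 1
            n_k[k_new] += len(doc)
            for w in doc:
                n_kw[k_new, w] += 1
    return z
```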
arXiv Detail & Related papers (2025-07-18T10:07:42Z) - HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization [0.0]
HERCULES is an algorithm and Python package designed for hierarchical k-means clustering of diverse data types. It generates semantically rich titles and descriptions for clusters at each level of the hierarchy. An interactive visualization tool facilitates thorough analysis and understanding of the clustering results.
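A minimal sketch of the recursive k-means skeleton such a hierarchy implies; the `describe` callable stands in for the LLM call that writes cluster titles and descriptions, and none of this mirrors the actual HERCULES package API.

```python
import numpy as np
from sklearn.cluster import KMeans


def recursive_kmeans(embeddings, texts, describe, branching=5,
                     min_size=10, depth=0, max_depth=3):
    """Hierarchical k-means sketch: split, summarize, recurse.

    `describe(texts) -> str` is a stand-in for an LLM summarization call.
    Returns a nested dict: {"size", "summary", "children"}.
    """
    node = {"size": len(texts), "summary": describe(texts), "children": []}
    if len(texts) < min_size or depth >= max_depth:
        return node  # leaf: too small or deep enough
    labels = KMeans(n_clusters=branching, n_init=10).fit_predict(embeddings)
    for k in range(branching):
        idx = np.where(labels == k)[0]
        if len(idx) == 0:
            continue
        node["children"].append(
            recursive_kmeans(embeddings[idx], [texts[i] for i in idx],
                             describe, branching, min_size, depth + 1, max_depth)
        )
    return node
```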
arXiv Detail & Related papers (2025-06-24T20:22:00Z) - Lazy But Effective: Collaborative Personalized Federated Learning with Heterogeneous Data [15.15596911693489]
In Federated Learning, a single global model does not have the best performance for individual clients. We propose a personalized federated learning framework (pFedLIA) that utilizes a computationally efficient influence approximation. Our method has been shown to successfully recover the performance drop the global model suffers due to non-IID data and client laziness in various synthetic and real-world settings.
arXiv Detail & Related papers (2025-05-05T10:26:35Z) - Cluster Specific Representation Learning [1.6727186769396276]
Despite its widespread application, there is no established definition of a "good" representation. We propose a downstream-agnostic formulation: when inherent clusters exist in the data, the representations should be specific to each cluster. Under this idea, we develop a meta-algorithm that jointly learns cluster-specific representations and cluster assignments.
arXiv Detail & Related papers (2024-12-04T16:59:37Z) - Fair Minimum Representation Clustering via Integer Programming [0.6906005491572401]
Clustering is an unsupervised learning task that aims to partition data into a set of clusters.
In this paper, we study the k-means and k-medians clustering problems with the additional constraint that each group must have a minimum level of representation.
We present an alternating minimization algorithm, called MiniReL, that directly incorporates the fairness constraints.
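One way to picture the fairness-constrained assignment step inside such an alternating scheme is as a small integer program; the sketch below uses PuLP with a hypothetical minimum-fraction constraint and is not MiniReL's exact formulation.

```python
import numpy as np
import pulp  # any MILP solver works; PuLP ships with CBC


def fair_assignment(X, centers, groups, min_frac=0.3):
    """One alternating-minimization step: with centers fixed, assign points
    to clusters at minimum total squared distance, subject to each group
    making up at least `min_frac` of every cluster. Illustrative only.
    """
    n, K = len(X), len(centers)
    cost = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    prob = pulp.LpProblem("fair_assignment", pulp.LpMinimize)
    x = pulp.LpVariable.dicts(
        "x", [(i, k) for i in range(n) for k in range(K)], cat="Binary")
    prob += pulp.lpSum(cost[i, k] * x[i, k] for i in range(n) for k in range(K))
    for i in range(n):  # every point joins exactly one cluster
        prob += pulp.lpSum(x[i, k] for k in range(K)) == 1
    for k in range(K):  # minimum representation of each group per cluster
        size_k = pulp.lpSum(x[i, k] for i in range(n))
        for g in set(groups):
            members = [i for i in range(n) if groups[i] == g]
            prob += pulp.lpSum(x[i, k] for i in members) >= min_frac * size_k
    prob.solve(pulp.PULP_CBC_CMD(msg=0))
    return np.array([max(range(K), key=lambda k: pulp.value(x[i, k]))
                     for i in range(n)])
```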
arXiv Detail & Related papers (2024-09-04T00:13:40Z) - Clustered FedStack: Intermediate Global Models with Bayesian Information Criterion [8.478300563501035]
We propose a novel Clustered FedStack framework based on the Stacked Federated Learning (FedStack) framework.
The local clients send their model predictions and output layer weights to a server, which then builds a robust global model.
The framework then clusters the local clients based on their output layer weights using a clustering mechanism.
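A minimal sketch of that grouping step, assuming clients ship flattened output-layer weights and clusters are formed with k-means; the actual framework's aggregation (stacking, BIC-based cluster selection) is richer than this.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_clients_by_head(client_head_weights, n_clusters=3):
    """Group federated clients by their output-layer weights, then average
    within each group to form intermediate global models. Illustrative sketch
    of the grouping idea, not the full Clustered FedStack pipeline.
    """
    W = np.stack([w.ravel() for w in client_head_weights])  # one row per client
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(W)
    cluster_models = {k: W[labels == k].mean(axis=0) for k in range(n_clusters)}
    return labels, cluster_models
```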
arXiv Detail & Related papers (2023-09-20T03:47:53Z) - Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified in a single framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
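As an illustration of such a clustering-oriented reward, the sketch below scores cohesion (point-to-own-centroid similarity) against separation (centroid-to-centroid similarity); the paper's actual reward may be defined differently.

```python
import numpy as np


def clustering_reward(Z, labels):
    """Illustrative reward: within-cluster cohesion minus between-centroid
    similarity. Z: (n, d) node embeddings; labels: (n,) cluster ids.
    """
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # unit-normalize rows
    ks = np.unique(labels)
    centroids = np.stack([Z[labels == k].mean(axis=0) for k in ks])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    # Cohesion: mean similarity of each point to its own cluster centroid.
    cohesion = np.concatenate(
        [Z[labels == k] @ centroids[i] for i, k in enumerate(ks)]).mean()
    if len(ks) < 2:
        return float(cohesion)
    # Separation penalty: mean similarity between distinct centroids.
    off_diag = (centroids @ centroids.T)[~np.eye(len(ks), dtype=bool)]
    return float(cohesion - off_diag.mean())
```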
arXiv Detail & Related papers (2023-08-13T18:12:28Z) - Interpretable Deep Clustering for Tabular Data [7.972599673048582]
Clustering is a fundamental learning task widely used in data analysis.
We propose a new deep-learning framework that predicts interpretable cluster assignments at the instance and cluster levels.
We show that the proposed method can reliably predict cluster assignments in biological, text, image, and physics datasets.
arXiv Detail & Related papers (2023-06-07T21:08:09Z) - ClusterLLM: Large Language Models as a Guide for Text Clustering [45.835625439515]
We introduce ClusterLLM, a novel text clustering framework that leverages feedback from an instruction-tuned large language model, such as ChatGPT.
ClusterLLM consistently improves clustering quality, at an average cost of $0.6 per dataset.
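One common form of LLM feedback for clustering is a triplet query; a minimal sketch follows, with the `llm` completion call left abstract. ClusterLLM's actual prompts, and how the answers are used to fine-tune the embedder, are details of the paper.

```python
def triplet_feedback(llm, anchor, a, b):
    """Ask an instruction-tuned LLM which candidate is semantically closer to
    the anchor. `llm(prompt) -> str` is an abstract completion call; the
    prompt wording here is illustrative, not ClusterLLM's.
    """
    prompt = (
        "Which sentence is more similar in meaning to the anchor?\n"
        f"Anchor: {anchor}\nA: {a}\nB: {b}\n"
        "Answer with exactly one letter, A or B."
    )
    return a if llm(prompt).strip().upper().startswith("A") else b
```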
arXiv Detail & Related papers (2023-05-24T08:24:25Z) - Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that requires no data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z) - Multitask Prompted Training Enables Zero-Shot Task Generalization [70.12770442071657]
We develop a system for mapping general natural language tasks into a human-readable prompted form.
We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks.
The model attains strong zero-shot performance on several standard datasets, often outperforming models 16x its size.
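A toy example of such a mapping, turning a structured NLI instance into a prompt/target pair; the template wording is illustrative rather than one of the paper's released templates.

```python
def to_prompted_form(example):
    """Map a structured NLI instance into a human-readable prompt and target,
    in the spirit of multitask prompted training. Labels follow the common
    entailment/neutral/contradiction convention.
    """
    prompt = (f"{example['premise']}\n"
              f"Question: does this imply that \"{example['hypothesis']}\"? "
              "Yes, no, or maybe?")
    target = {0: "Yes", 1: "Maybe", 2: "No"}[example["label"]]
    return prompt, target


# e.g. to_prompted_form({"premise": "A dog runs.",
#                        "hypothesis": "An animal moves.", "label": 0})
# -> ("A dog runs.\nQuestion: does this imply that \"An animal moves.\"? "
#     "Yes, no, or maybe?", "Yes")
```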
arXiv Detail & Related papers (2021-10-15T17:08:57Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
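A standard way to reparametrize categorical assignment variables for end-to-end training is the Gumbel-softmax trick, sketched below in PyTorch; whether TCC uses this exact estimator is a detail of the paper.

```python
import torch
import torch.nn.functional as F


def soft_assign(logits, tau=0.5, hard=False):
    """Differentiable cluster assignment via Gumbel-softmax: samples a
    (near-)categorical assignment while keeping gradients flowing, so no
    alternating optimization step is needed.
    logits: (batch, n_clusters) unnormalized assignment scores.
    """
    return F.gumbel_softmax(logits, tau=tau, hard=hard)  # rows on the simplex


# usage sketch: y = soft_assign(encoder(x) @ cluster_centers.T)
# `encoder` and `cluster_centers` are hypothetical trainable components;
# the loss backpropagates through y into both.
```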
arXiv Detail & Related papers (2021-06-03T14:59:59Z)