Improving Neural Ranking Models with Traditional IR Methods
- URL: http://arxiv.org/abs/2308.15027v1
- Date: Tue, 29 Aug 2023 05:18:47 GMT
- Title: Improving Neural Ranking Models with Traditional IR Methods
- Authors: Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas,
Bulent Yener
- Abstract summary: Combining TF-IDF, a traditional keyword-matching method, with a shallow embedding model provides a low-cost path to competing with complex neural ranking models on three datasets.
Adding TF-IDF measures also improves the performance of large-scale fine-tuned models on these tasks.
- Score: 13.354623448774877
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural ranking methods based on large transformer models have recently gained
significant attention in the information retrieval community, and have been
adopted by major commercial solutions. Nevertheless, they are computationally
expensive to create, and require a great deal of labeled data for specialized
corpora. In this paper, we explore a low-resource alternative, a
bag-of-embedding model for document retrieval, and find that it is competitive
with large transformer models fine-tuned on information retrieval tasks. Our
results show that a simple combination of TF-IDF, a traditional
keyword-matching method, with a shallow embedding model provides a low-cost
way to compete with the performance of complex neural ranking models on three
datasets. Furthermore, adding TF-IDF measures improves the performance of
large-scale fine-tuned models on these tasks.
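As a rough illustration of the core idea, the sketch below interpolates a TF-IDF cosine score with a cosine similarity over averaged word embeddings. The random embedding table, the interpolation weight `alpha`, and the use of scikit-learn's TfidfVectorizer are our assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: interpolate TF-IDF and bag-of-embedding scores for ranking.
# The embedding table, alpha, and scoring form are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["neural ranking with transformers", "keyword matching with tf-idf"]
query = "tf-idf ranking"

# TF-IDF component: cosine similarity between query and documents.
tfidf = TfidfVectorizer().fit(docs)
tfidf_scores = cosine_similarity(tfidf.transform([query]), tfidf.transform(docs))[0]

# Bag-of-embedding component: average per-token vectors (random stand-ins here;
# a real system would load pretrained shallow embeddings such as word2vec).
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=50) for d in docs + [query] for w in d.split()}
embed = lambda text: np.mean([vocab[w] for w in text.split()], axis=0, keepdims=True)
emb_scores = np.array([cosine_similarity(embed(query), embed(d))[0, 0] for d in docs])

# Simple linear interpolation of the two signals (alpha is a tuning choice).
alpha = 0.5
final = alpha * tfidf_scores + (1 - alpha) * emb_scores
print(np.argsort(-final))  # documents ranked by combined score
```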
Related papers
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) to tackle this data heterogeneity.
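The exact FedPTR objective is not spelled out in this summary; as a generic stand-in, the sketch below runs FedAvg rounds with a FedProx-style proximal penalty toward the global weights, which plays the role of a trajectory-regularization term.

```python
# Minimal sketch of federated rounds with a proximal regularizer; the actual
# FedPTR projected-trajectory term is not reproduced here (assumption: a
# FedProx-style penalty toward the global weights as a generic stand-in).
import numpy as np

def client_update(w_global, data, lr=0.1, mu=0.01, steps=10):
    w = w_global.copy()
    X, y = data
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # squared-loss gradient
        grad += mu * (w - w_global)             # proximal regularization
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
w_global = np.zeros(3)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
for _ in range(5):                              # communication rounds
    updates = [client_update(w_global, d) for d in clients]
    w_global = np.mean(updates, axis=0)         # FedAvg aggregation
print(w_global)
```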
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
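A minimal sketch of the data-model co-design idea, assuming a learnable additive input prompt and one-shot magnitude pruning; the actual VPNs training recipe is not reproduced here.

```python
# Sketch: a learnable input prompt paired with magnitude pruning of the
# weights. The prompt shape, sparsity level, and single linear layer are
# illustrative assumptions, not the VPNs recipe.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))          # dense layer weights
prompt = rng.normal(size=64) * 0.01    # learnable visual prompt (stand-in)

def sparsify(W, sparsity=0.9):
    # Keep only the largest-magnitude weights (one-shot magnitude pruning).
    k = int(W.size * sparsity)
    thresh = np.partition(np.abs(W).ravel(), k)[k]
    return W * (np.abs(W) >= thresh)

x = rng.normal(size=64)                # flattened input image (stand-in)
W_sparse = sparsify(W)
logits = W_sparse @ (x + prompt)       # prompt is added to the input, not W
print((W_sparse != 0).mean())          # realized density
```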
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
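As a hedged sketch of the retrieval side of such a paradigm: LLM-generated passages are stored with embeddings, and the small model consults nearest neighbors. The embedding function and store contents are placeholders, and the RetriKT training objective itself is not reproduced.

```python
# Sketch of retrieval-based knowledge transfer: LLM-generated text is stored
# in a retrieval corpus that the small model can consult.
import numpy as np

rng = np.random.default_rng(0)
# Stand-in "knowledge store": embeddings of LLM-generated passages.
store_texts = ["passage a", "passage b", "passage c"]
store_vecs = rng.normal(size=(3, 32))

def retrieve(query_vec, k=2):
    # Cosine-similarity nearest neighbors over the knowledge store.
    sims = store_vecs @ query_vec / (
        np.linalg.norm(store_vecs, axis=1) * np.linalg.norm(query_vec))
    return [store_texts[i] for i in np.argsort(-sims)[:k]]

print(retrieve(rng.normal(size=32)))
```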
arXiv Detail & Related papers (2023-10-24T07:58:20Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings remains a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on full model fine-tuning may not be a scalable option for very large networks.
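A minimal sketch of the benchmarking step, assuming a frozen linear head and Gaussian input noise as the perturbation; the paper evaluates real pre-trained networks on standard corruption benchmarks.

```python
# Sketch: measure accuracy as input perturbations grow in severity.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))          # frozen "pre-trained" classifier head
X = rng.normal(size=(200, 64))
y = (X @ W.T).argmax(axis=1)           # labels the clean model gets right

for sigma in [0.0, 0.5, 1.0, 2.0]:     # perturbation severity sweep
    X_noisy = X + rng.normal(scale=sigma, size=X.shape)
    acc = ((X_noisy @ W.T).argmax(axis=1) == y).mean()
    print(f"sigma={sigma}: accuracy={acc:.2f}")
```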
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
- Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks [12.293294756969477]
The strength of machine learning models stems from their ability to learn complex function approximations from data.
Complex models tend to memorize the training data, which results in poor generalization on test data.
We present a novel approach to regularize the models by leveraging the information-rich latent embeddings and their high intra-class correlation.
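As a generic stand-in for an embedding-space regularizer that exploits intra-class correlation, the sketch below computes a center-loss-style penalty pulling each embedding toward its class mean; the exact phantom-embedding construction is not reproduced.

```python
# Sketch of an intra-class embedding regularizer (center-loss-style stand-in).
import numpy as np

def intra_class_penalty(embeddings, labels):
    # Mean squared distance of each embedding to its class centroid.
    penalty = 0.0
    for c in np.unique(labels):
        emb_c = embeddings[labels == c]
        penalty += ((emb_c - emb_c.mean(axis=0)) ** 2).sum()
    return penalty / len(embeddings)

rng = np.random.default_rng(0)
emb = rng.normal(size=(32, 16))            # batch of latent embeddings
labels = rng.integers(0, 4, size=32)
print(intra_class_penalty(emb, labels))    # added to the task loss
```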
arXiv Detail & Related papers (2023-04-14T17:15:54Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
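A minimal sketch of parameter-space merging, assuming a uniform weighted average of matching parameters; the paper's contribution is computing the merging weights without access to training data.

```python
# Sketch: weighted average of two fine-tuned checkpoints in parameter space.
import numpy as np

model_a = {"layer.weight": np.ones((4, 4)), "layer.bias": np.zeros(4)}
model_b = {"layer.weight": np.full((4, 4), 3.0), "layer.bias": np.ones(4)}

def merge(models, weights):
    # Weighted average of matching parameters across checkpoints.
    return {k: sum(w * m[k] for w, m in zip(weights, models))
            for k in models[0]}

merged = merge([model_a, model_b], weights=[0.5, 0.5])
print(merged["layer.weight"][0])
```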
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- FedNet2Net: Saving Communication and Computations in Federated Learning with Model Growing [0.0]
Federated learning (FL) is a recently developed area of machine learning.
In this paper, a novel scheme based on the notion of "model growing" is proposed.
The proposed approach is tested extensively on three standard benchmarks and is shown to achieve substantial reduction in communication and client computation.
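The summary does not spell out the growing schedule; the sketch below shows a Net2Net-style function-preserving widening step, which is one plausible reading of "model growing" given the method's name.

```python
# Sketch of function-preserving widening (Net2Net-style): duplicate a hidden
# unit and split its outgoing weight so the network's output is unchanged.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))     # hidden x input
W2 = rng.normal(size=(2, 3))     # output x hidden

def widen(W1, W2, unit):
    W1_new = np.vstack([W1, W1[unit:unit + 1]])     # copy the unit's inputs
    W2_new = np.hstack([W2, W2[:, unit:unit + 1]])  # copy its outgoing column
    W2_new[:, unit] /= 2.0                          # split outgoing weight
    W2_new[:, -1] /= 2.0
    return W1_new, W2_new

x = rng.normal(size=4)
before = W2 @ np.maximum(W1 @ x, 0)
W1w, W2w = widen(W1, W2, unit=1)
after = W2w @ np.maximum(W1w @ x, 0)
print(np.allclose(before, after))                   # True: same function
```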
arXiv Detail & Related papers (2022-07-19T21:54:53Z)
- A transformer-based model for default prediction in mid-cap corporate markets [13.535770763481905]
We study mid-cap companies with less than US $10 billion in market capitalisation.
We look to predict the default probability term structure over the medium term.
We analyze which data sources contribute most to default risk.
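A hedged sketch of one plausible architecture for this setup: a small transformer encoder over a company's feature time series with a multi-horizon sigmoid head. All dimensions, the pooling, and the head are our assumptions.

```python
# Sketch: transformer encoder over financial time series with a
# multi-horizon head for the default probability term structure.
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
head = nn.Linear(16, 6)                 # 6 horizons in the term structure

x = torch.randn(8, 24, 16)              # (companies, months, features)
h = encoder(x).mean(dim=1)              # pool over the time dimension
p_default = torch.sigmoid(head(h))      # per-horizon default probabilities
print(p_default.shape)                  # torch.Size([8, 6])
```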
arXiv Detail & Related papers (2021-11-18T19:01:00Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
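The CP constraint can be made concrete with a worked example: for a matrix input X and a rank-R weight W = sum_r a_r b_r^T, the response <W, X> equals sum_r a_r^T X b_r and never requires vectorizing X. The dimensions below are arbitrary.

```python
# Worked sketch of a rank-R (CP-decomposed) weight acting on a matrix input.
import numpy as np

rng = np.random.default_rng(0)
I, J, R = 5, 7, 3
A = rng.normal(size=(I, R))        # CP factors along the first mode
B = rng.normal(size=(J, R))        # CP factors along the second mode
X = rng.normal(size=(I, J))        # multilinear (matrix) input

cp_score = sum(A[:, r] @ X @ B[:, r] for r in range(R))
full_score = np.sum((A @ B.T) * X)          # same value via the full weight
print(np.allclose(cp_score, full_score))    # True

# Parameter count drops from I*J for the full weight to R*(I+J).
```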
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Heterogeneous Network Embedding for Deep Semantic Relevance Match in E-commerce Search [29.881612817309716]
We design an end-to-end First-and-Second-order Relevance prediction model for e-commerce item relevance.
We introduce external knowledge generated from BERT to refine the network of user behaviors.
Offline experiments show that the new model significantly improves prediction accuracy in terms of human relevance judgments.
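The first-and-second-order model and the BERT-refined behavior graph are beyond this summary; as a generic stand-in, the sketch below scores query-item relevance with two projection towers and a dot product.

```python
# Generic stand-in for a query-item relevance scorer (two-tower dot product);
# not the paper's first-and-second-order architecture.
import numpy as np

rng = np.random.default_rng(0)
W_q = rng.normal(size=(32, 100))    # query tower (stand-in projection)
W_i = rng.normal(size=(32, 100))    # item tower (stand-in projection)

def relevance(q_feat, i_feat):
    # Similarity of the two projected representations.
    return float((W_q @ q_feat) @ (W_i @ i_feat))

print(relevance(rng.normal(size=100), rng.normal(size=100)))
```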
arXiv Detail & Related papers (2021-01-13T03:12:53Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
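A minimal sketch of the alignment-then-average idea, substituting Hungarian matching for the optimal-transport coupling to keep the example short; the paper's algorithm aligns neurons with optimal transport before fusing.

```python
# Sketch of layer-wise fusion with neuron alignment: match neurons of model B
# to model A, then average the aligned weights.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
W_a = rng.normal(size=(6, 10))                      # layer weights, model A
perm = rng.permutation(6)
W_b = W_a[perm] + 0.05 * rng.normal(size=(6, 10))   # permuted, noisy twin

# Cost: distance between every pair of neurons (rows) across the two models.
cost = ((W_a[:, None, :] - W_b[None, :, :]) ** 2).sum(-1)
rows, cols = linear_sum_assignment(cost)
W_fused = 0.5 * (W_a + W_b[cols])          # average after alignment
print(np.abs(W_fused - W_a).mean())        # small: alignment recovered perm
```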
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.