Preference Modeling with Context-Dependent Salient Features
- URL: http://arxiv.org/abs/2002.09615v2
- Date: Sat, 27 Jun 2020 01:45:10 GMT
- Title: Preference Modeling with Context-Dependent Salient Features
- Authors: Amanda Bower and Laura Balzano
- Abstract summary: We consider the problem of estimating a ranking on a set of items from noisy pairwise comparisons given item features.
Our key observation is that two items compared in isolation from other items may be compared based on only a salient subset of features.
- Score: 12.403492796441434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of estimating a ranking on a set of items from noisy
pairwise comparisons given item features. We address the fact that pairwise
comparison data often reflects irrational choice, e.g. intransitivity. Our key
observation is that two items compared in isolation from other items may be
compared based on only a salient subset of features. Formalizing this
framework, we propose the salient feature preference model and prove a finite
sample complexity result for learning the parameters of our model and the
underlying ranking with maximum likelihood estimation. We also provide
empirical results that support our theoretical bounds and illustrate how our
model explains systematic intransitivity. Finally, we demonstrate strong
performance of maximum likelihood estimation of our model on both synthetic
data and two real data sets: the UT Zappos50K data set and comparison data
about the compactness of legislative districts in the US.
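The abstract describes the model only in words, so the following is a minimal sketch of one way the idea could be instantiated, not the authors' implementation. It assumes a logistic (Bradley-Terry-style) link on a weighted feature difference, and it assumes that the salient subset for a pair is the t coordinates on which the two items differ most; that selection rule, the function names, and the toy items are illustrative assumptions not taken from this page. The toy items show how comparing on only the salient features can yield a preference cycle that disappears when all features are used.

```python
import numpy as np

def salient_mask(x_i, x_j, t):
    """Assumed salience rule: keep the t coordinates where the two items differ most."""
    diff = np.abs(x_i - x_j)
    mask = np.zeros_like(diff)
    mask[np.argsort(-diff, kind="stable")[:t]] = 1.0
    return mask

def win_probability(w, x_i, x_j, t):
    """P(item i beats item j): logistic link on the salient part of the feature difference."""
    z = w @ (salient_mask(x_i, x_j, t) * (x_i - x_j))
    return 1.0 / (1.0 + np.exp(-z))

def fit_mle(X, comparisons, t, lr=0.5, steps=2000):
    """Maximum likelihood by plain gradient ascent on the average log-likelihood.

    comparisons is a list of (i, j, y) with y = 1 if item i won, else 0.
    """
    D = np.array([salient_mask(X[i], X[j], t) * (X[i] - X[j]) for i, j, _ in comparisons])
    y = np.array([c[2] for c in comparisons], dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(D @ w)))
        w += lr * D.T @ (y - p) / len(y)
    return w

# Toy items (hypothetical): every item has the same full-feature score w @ x = 5.
w = np.array([1.0, 1.0, 1.0])
a = np.array([3.0, 2.0, 0.0])
b = np.array([0.0, 3.0, 2.0])
c = np.array([2.0, 0.0, 3.0])

# With only the single most salient feature (t=1), a beats b, b beats c,
# and c beats a, each with probability above 0.5: an intransitive cycle.
cycle = [win_probability(w, a, b, 1),
         win_probability(w, b, c, 1),
         win_probability(w, c, a, 1)]

# With all three features (t=3) the same pairs are exact coin flips.
full = [win_probability(w, a, b, 3),
        win_probability(w, b, c, 3),
        win_probability(w, c, a, 3)]
```

Given weights estimated by fit_mle, one plausible way to read off a ranking is to sort items by their full-feature scores, for example np.argsort(-(X @ w_hat)); treating the full-feature score as the underlying ranking is again an assumption of this sketch.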
Related papers
- Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks [59.47851630504264]
Free-text explanations are expressive and easy to understand, but many datasets lack annotated explanation data.
We fine-tune T5-Large and OLMo-7B models and assess the impact of fine-tuning data quality, the number of fine-tuning samples, and few-shot selection methods.
The models are evaluated on 19 diverse OOD datasets across three tasks: natural language inference (NLI), fact-checking, and hallucination detection in abstractive summarization.
(2025-02-07T10:01:32Z)
- A Statistical Framework for Ranking LLM-Based Chatbots [57.59268154690763]
We propose a statistical framework that incorporates key advancements to address specific challenges in pairwise comparison analysis.
First, we introduce a factored tie model that enhances the ability to handle ties in human-judged comparisons.
Second, we extend the framework to model covariance between competitors, enabling deeper insights into performance relationships.
Third, we resolve optimization challenges arising from parameter non-uniqueness by introducing novel constraints.
(2024-12-24T12:54:19Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
(2024-03-11T05:35:38Z)
- Statistical inference for pairwise comparison models [5.487882744996216]
This paper establishes near-optimal asymptotic normality for the maximum likelihood estimator in a broad class of pairwise comparison models.
The key idea lies in identifying the Fisher information matrix as a weighted graph Laplacian, which can be studied via a meticulous spectral analysis.
(2024-01-16T16:14:09Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
(2023-04-04T17:54:32Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
(2021-06-01T22:33:53Z)
- Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions [57.77347280992548]
In this paper, we design two-sample tests for pairwise comparison data and ranking data.
Our test requires essentially no assumptions on the distributions.
By applying our two-sample test on real-world pairwise comparison data, we conclude that ratings and rankings provided by people are indeed distributed differently.
(2020-06-21T20:51:09Z)
- Evaluating Text Coherence at Sentence and Paragraph Levels [17.99797111176988]
We investigate the adaptation of existing sentence ordering methods to a paragraph ordering task.
We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets.
We conclude that the recurrent graph neural network-based model is an optimal choice for coherence modeling.
(2020-06-05T03:31:49Z)
- Interpretable Meta-Measure for Model Performance [4.91155110560629]
We introduce a new meta-score assessment named Elo-based Predictive Power (EPP).
EPP is built on top of other performance measures and allows for interpretable comparisons of models.
We prove the mathematical properties of EPP and support them with empirical results of a large scale benchmark on 30 classification data sets and a real-world benchmark for visual data.
(2020-06-02T14:10:13Z)