COPA: Comparing the Incomparable to Explore the Pareto Front
- URL: http://arxiv.org/abs/2503.14321v1
- Date: Tue, 18 Mar 2025 14:51:42 GMT
- Title: COPA: Comparing the Incomparable to Explore the Pareto Front
- Authors: Adrián Javaloy, Antonio Vergari, Isabel Valera
- Abstract summary: In machine learning (ML), it is common to account for multiple objectives when selecting a model to deploy. It is often unclear how one should compare, aggregate and, ultimately, trade off these objectives.
- Score: 19.11658981007657
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In machine learning (ML), it is common to account for multiple objectives when, e.g., selecting a model to deploy. However, it is often unclear how one should compare, aggregate and, ultimately, trade off these objectives, as they might be measured in different units or scales. For example, when deploying large language models (LLMs), we might care not only about their performance but also about their CO2 consumption. In this work, we investigate how objectives can be sensibly compared and aggregated to navigate their Pareto front. To do so, we propose to make incomparable objectives comparable via their CDFs, approximated by their relative rankings. This allows us to aggregate them while matching user-specific preferences, enabling practitioners to meaningfully navigate and search for models in the Pareto front. We demonstrate the potential impact of our methodology in diverse areas such as LLM selection, domain generalization, and AutoML benchmarking, where classical ways to aggregate and normalize objectives fail.
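As a minimal sketch of the recipe described above, each objective can be mapped through its empirical CDF, approximated by relative rankings, so that all objectives live on a common [0, 1] scale before aggregation. The weighted-sum aggregation and the preference weights below are illustrative assumptions, not necessarily the paper's exact aggregation rule:

```python
import numpy as np
from scipy.stats import rankdata

def ecdf_scores(values, higher_is_better=True):
    """Map raw objective values to [0, 1] via their empirical CDF,
    approximated by relative rankings (average rank / n)."""
    v = np.asarray(values, dtype=float)
    if not higher_is_better:
        v = -v  # flip so that larger is always better
    return rankdata(v, method="average") / len(v)

# Example: five candidate LLMs scored on two incomparable objectives
accuracy = [0.71, 0.69, 0.75, 0.66, 0.73]     # higher is better
co2_kg   = [120.0, 45.0, 300.0, 30.0, 150.0]  # lower is better

acc = ecdf_scores(accuracy, higher_is_better=True)
co2 = ecdf_scores(co2_kg, higher_is_better=False)

# Aggregate under user-specific preference weights (hypothetical values)
weights = np.array([0.7, 0.3])  # e.g., weigh accuracy over CO2 consumption
scores = weights[0] * acc + weights[1] * co2
print("selected model index:", int(np.argmax(scores)))
```

Because ranks are invariant to monotone rescaling, the units and scales of the raw objectives no longer matter, which is what makes the otherwise incomparable objectives comparable.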
Related papers
- Projection Optimization: A General Framework for Multi-Objective and Multi-Group RLHF [13.612504157832708]
Reinforcement Learning with Human Feedback (RLHF) is a widely used fine-tuning approach that aligns machine learning models with human preferences.
In this work, we transform the non-linear aggregation problem into a series of sub-problems and extend our framework to handle multi-group scenarios.
We demonstrate that our algorithmic framework achieves sublinear regret and can be easily adapted to a reward-free algorithm.
arXiv Detail & Related papers (2025-02-21T01:56:52Z) - Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z) - Towards Unified Benchmark and Models for Multi-Modal Perceptual Metrics [37.86612817818566]
General purpose vision-language models, such as CLIP and large multi-modal models (LMMs), can be applied as zero-shot perceptual metrics.
We introduce UniSim-Bench, a benchmark encompassing 7 multi-modal perceptual similarity tasks with a total of 25 datasets.
Our evaluation reveals that while general-purpose models perform reasonably well on average, they often lag behind specialized models on individual tasks.
arXiv Detail & Related papers (2024-12-13T22:38:09Z) - From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons [85.99268361356832]
We introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA).
GEA is a single unified model capable of grounding itself across varied domains through a multi-embodiment action tokenizer.
Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents.
arXiv Detail & Related papers (2024-12-11T15:06:25Z) - Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels [64.94853276821992]
Large multimodal models (LMMs) are increasingly deployed across diverse applications.
Traditional evaluation methods are largely dataset-centric, relying on fixed, labeled datasets and supervised metrics.
We explore unsupervised model ranking for LMMs by leveraging their uncertainty signals, such as softmax probabilities.
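As a rough, label-free illustration of ranking models by uncertainty signals such as softmax probabilities: score each model by its mean maximum softmax probability on a shared unlabeled set and order the models by that confidence. This proxy is an assumption for illustration; the paper may rely on different or additional signals:

```python
import numpy as np

def confidence_score(softmax_outputs):
    """Mean max-softmax probability over unlabeled inputs:
    a simple, label-free uncertainty signal (higher = more confident)."""
    probs = np.asarray(softmax_outputs)  # shape: (n_inputs, n_classes)
    return probs.max(axis=1).mean()

# Hypothetical softmax outputs of two models on the same unlabeled set
model_a = [[0.9, 0.05, 0.05], [0.6, 0.3, 0.1]]
model_b = [[0.4, 0.35, 0.25], [0.5, 0.3, 0.2]]

ranking = sorted({"A": confidence_score(model_a),
                  "B": confidence_score(model_b)}.items(),
                 key=lambda kv: kv[1], reverse=True)
print(ranking)  # models ordered by confidence, no labels needed
```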
arXiv Detail & Related papers (2024-12-09T13:05:43Z) - Deciphering AutoML Ensembles: cattleia's Assistance in Decision-Making [0.0]
Cattleia is an application that deciphers ensembles for regression, multiclass, and binary classification tasks.
It works with models built by three AutoML packages: auto-sklearn, AutoGluon, and FLAML.
arXiv Detail & Related papers (2024-03-19T11:56:21Z) - Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
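For context, the classic baseline behind many MIAs thresholds the target model's per-example loss: training members tend to incur lower loss than non-members. A minimal sketch with hypothetical loss values:

```python
import numpy as np

def loss_threshold_mia(losses, threshold):
    """Predict 'member' when the per-example loss falls below a threshold.
    `losses` would come from evaluating the target model on candidate records."""
    return np.asarray(losses) < threshold

# Hypothetical per-example negative log-likelihoods from a target LM
candidate_losses = np.array([1.2, 3.8, 0.9, 2.5])
print(loss_threshold_mia(candidate_losses, threshold=2.0))  # -> [ True False  True False]
```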
arXiv Detail & Related papers (2024-02-12T17:52:05Z) - Revisiting Few-Shot Object Detection with Vision-Language Models [49.79495118650838]
We revisit the task of few-shot object detection (FSOD) in the context of recent foundational vision-language models (VLMs).
We propose Foundational FSOD, a new benchmark protocol that evaluates detectors pre-trained on any external data.
We discuss our recent CVPR 2024 Foundational FSOD competition and share insights from the community.
arXiv Detail & Related papers (2023-12-22T07:42:00Z) - Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration [32.15773300068426]
Membership Inference Attacks aim to infer whether a target data record has been utilized for model training.
We propose a Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA).
arXiv Detail & Related papers (2023-11-10T13:55:05Z) - Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
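A common way to instantiate "regularizing the prediction consistency between two models" is a symmetric KL penalty between their predictive distributions on the same batch; the loss below is an assumed generic form, not necessarily USMA's exact objective:

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    """Symmetric KL divergence between two models' predictive
    distributions on the same (possibly unlabeled) batch."""
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    kl_ab = F.kl_div(log_p_b, log_p_a.exp(), reduction="batchmean")  # KL(p_a || p_b)
    kl_ba = F.kl_div(log_p_a, log_p_b.exp(), reduction="batchmean")  # KL(p_b || p_a)
    return 0.5 * (kl_ab + kl_ba)

# Hypothetical logits from the two collaborating models
logits_a = torch.randn(8, 10)
logits_b = torch.randn(8, 10)
print(consistency_loss(logits_a, logits_b))
```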
arXiv Detail & Related papers (2023-07-07T08:19:40Z) - Variable Importance Matching for Causal Inference [73.25504313552516]
We describe a general framework called Model-to-Match that achieves these goals.
Model-to-Match uses variable importance measurements to construct a distance metric.
We operationalize the Model-to-Match framework with LASSO.
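One plausible reading of operationalizing Model-to-Match with LASSO: fit a LASSO regression of the outcome on the covariates, then weight each covariate in the matching distance by the magnitude of its coefficient, so unimportant covariates drop out of the metric. The exact weighting below is an illustrative assumption:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Variable importance from LASSO coefficient magnitudes
lasso = Lasso(alpha=0.1).fit(X, y)
w = np.abs(lasso.coef_)  # irrelevant covariates get (near-)zero weight

def weighted_distance(x1, x2, weights=w):
    """Importance-weighted distance between two units, for matching."""
    return np.sqrt(np.sum(weights * (x1 - x2) ** 2))

print(weighted_distance(X[0], X[1]))
```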
arXiv Detail & Related papers (2023-02-23T00:43:03Z) - Learning with Impartiality to Walk on the Pareto Frontier of Fairness, Privacy, and Utility [28.946180502706504]
We argue that machine learning pipelines should not favor one objective over another.
We propose impartially-specified models that show the inherent trade-offs between the objectives.
We provide an answer to the question of where fairness mitigation should be integrated within a privacy-aware ML pipeline.
arXiv Detail & Related papers (2023-02-17T23:23:45Z) - Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results in in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z) - Customized Video QoE Estimation with Algorithm-Agnostic Transfer Learning [1.452875650827562]
Small datasets, limited diversity of user profiles in the source domain, and excessive diversity in target domains are challenges for QoE models.
We present a transfer learning-based ML model training approach that allows decentralized local models to share generic indicators on Mean Opinion Scores (MOS).
We show that the proposed approach is agnostic to the specific ML algorithms stacked upon each other, as it does not require the collaborating local nodes to run the same ML algorithm.
arXiv Detail & Related papers (2020-03-12T15:28:10Z) - Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.