We Should Chart an Atlas of All the World's Models
- URL: http://arxiv.org/abs/2503.10633v2
- Date: Tue, 03 Jun 2025 16:28:07 GMT
- Title: We Should Chart an Atlas of All the World's Models
- Authors: Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana, Liel Amar, Yedid Hoshen
- Abstract summary: We advocate for charting the world's model population in a unified structure we call the Model Atlas. The Model Atlas enables applications in model forensics, meta-ML research, and model discovery.
- Score: 37.19719066562013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Public model repositories now contain millions of models, yet most models remain undocumented and effectively lost. In this position paper, we advocate for charting the world's model population in a unified structure we call the Model Atlas: a graph that captures models, their attributes, and the weight transformations that connect them. The Model Atlas enables applications in model forensics, meta-ML research, and model discovery, challenging tasks given today's unstructured model repositories. However, because most models lack documentation, large atlas regions remain uncharted. Addressing this gap motivates new machine learning methods that treat models themselves as data, inferring properties such as functionality, performance, and lineage directly from their weights. We argue that a scalable path forward is to bypass the unique parameter symmetries that plague model weights. Charting all the world's models will require a community effort, and we hope its broad utility will rally researchers toward this goal.
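The atlas structure described in the abstract can be pictured as a lineage graph: nodes are models with attribute dictionaries, and each edge records the weight transformation that produced a child model. The sketch below is a minimal, hypothetical illustration; all class and field names are choices made here, not definitions from the paper, and it assumes a single parent per model (merged models would need multiple parents).

```python
# Minimal sketch of a Model Atlas as a lineage graph.
# All names here are illustrative, not from the paper.

class ModelAtlas:
    """Nodes are models with attribute dicts; each edge records the
    weight transformation (e.g. fine-tuning, quantization) that
    produced a child. Assumes one parent per model for simplicity."""

    def __init__(self):
        self.attributes = {}  # model_id -> {attribute: value}
        self.parent = {}      # model_id -> (parent_id, transformation)

    def add_model(self, model_id, **attrs):
        self.attributes[model_id] = attrs

    def add_edge(self, parent_id, child_id, transformation):
        self.parent[child_id] = (parent_id, transformation)

    def lineage(self, model_id):
        """Trace a model back to its root, as in model forensics."""
        chain = [model_id]
        while chain[-1] in self.parent:
            chain.append(self.parent[chain[-1]][0])
        return chain
```

For example, recording that a chat model was fine-tuned from a base model and then querying its lineage would walk the chain back to the root.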
Related papers
- GRAM: A Generative Foundation Reward Model for Reward Generalization [48.63394690265176]
We develop a generative reward model that is first trained via large-scale unsupervised learning and then fine-tuned via supervised learning.
This model generalizes well across several tasks, including response ranking, reinforcement learning from human feedback, and task adaptation with fine-tuning.
arXiv Detail & Related papers (2025-06-17T04:34:27Z)
- Learning on Model Weights using Tree Experts [39.90685550999956]
Training machine learning models to infer missing documentation directly from model weights is challenging.
We identify a key property of real-world models: most public models belong to a small set of Model Trees.
We introduce Probing Experts (ProbeX), a theoretically motivated and lightweight method to learn from the weights of a single model layer.
arXiv Detail & Related papers (2024-10-17T17:17:09Z)
- Exploring Model Kinship for Merging Large Language Models [52.01652098827454]
We introduce model kinship, the degree of similarity or relatedness between Large Language Models.
We find that model kinship correlates with the performance gains obtained after model merging.
We propose a new model merging strategy, Top-k Greedy Merging with Model Kinship, which yields better performance on benchmark datasets.
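A rough sketch of what such a kinship-guided merge could look like, using cosine similarity of flattened weights as a stand-in kinship score and uniform weight averaging as the merge operator. Both choices are assumptions for illustration; the paper's actual kinship metric and merging procedure may differ.

```python
import numpy as np

def kinship(w_a, w_b):
    """Stand-in kinship score: cosine similarity of flattened weights.
    (Illustrative only; the paper's metric may be defined differently.)"""
    a, b = w_a.ravel(), w_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_greedy_merge(anchor, candidates, k=2):
    """Merge the anchor with its k most kindred candidates using
    uniform weight averaging, one simple merge operator among many."""
    ranked = sorted(candidates, key=lambda w: kinship(anchor, w), reverse=True)
    return np.mean([anchor, *ranked[:k]], axis=0)
```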
arXiv Detail & Related papers (2024-10-16T14:29:29Z)
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Efficiently Robustify Pre-trained Models [18.392732966487582]
The robustness of large-scale models in real-world settings remains a less-explored topic.
We first benchmark the performance of these models under different perturbations and datasets.
We then discuss how existing robustification schemes based on complete model fine-tuning may not scale to very large networks.
arXiv Detail & Related papers (2023-09-14T08:07:49Z)
- OpenGDA: Graph Domain Adaptation Benchmark for Cross-network Learning [42.48479966907126]
OpenGDA is a benchmark for evaluating graph domain adaptation models.
It provides abundant pre-processed and unified datasets for different types of tasks.
It integrates state-of-the-art models with standardized and end-to-end pipelines.
arXiv Detail & Related papers (2023-07-21T04:11:43Z)
- MGit: A Model Versioning and Management System [7.2678752235785735]
MGit is a model versioning and management system that makes it easier to store, test, update, and collaborate on model derivatives.
MGit is able to reduce the lineage graph's storage footprint by up to 7x and automatically update downstream models in response to updates to upstream models.
arXiv Detail & Related papers (2023-07-14T17:56:48Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
You are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Graph Few-shot Class-incremental Learning [25.94168397283495]
The ability to incrementally learn new classes is vital to all real-world artificial intelligence systems.
In this paper, we investigate the challenging yet practical problem, Graph Few-shot Class-incremental (Graph FCL) problem.
We put forward a Graph Pseudo Incremental Learning paradigm by sampling tasks recurrently from the base classes.
We present a task-sensitive regularizer calculated from task-level attention and node class prototypes to mitigate overfitting onto either novel or base classes.
arXiv Detail & Related papers (2021-12-23T19:46:07Z)
- Scalable Scene Flow from Point Clouds in the Real World [30.437100097997245]
We introduce a new large-scale benchmark for scene flow based on the Waymo Open Dataset.
We show how previous works were limited by the amount of real LiDAR data available.
We introduce FastFlow3D, a model architecture that provides real-time inference on the full point cloud.
arXiv Detail & Related papers (2021-03-01T20:56:05Z)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
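The per-instance training dynamics described above can be summarized with two statistics per example: confidence (mean probability assigned to the gold label across epochs) and variability (its standard deviation across epochs). A minimal sketch, assuming per-epoch gold-label probabilities have already been logged:

```python
import numpy as np

def data_map_stats(gold_probs):
    """Compute Data Map coordinates from per-epoch probabilities
    assigned to the gold label (shape: epochs x instances).
    High confidence / low variability -> easy-to-learn instances;
    low confidence -> hard-to-learn; high variability -> ambiguous."""
    gold_probs = np.asarray(gold_probs)
    confidence = gold_probs.mean(axis=0)   # mean over epochs
    variability = gold_probs.std(axis=0)   # std over epochs
    return confidence, variability
```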
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails [70.01257397390361]
Current datasets only tell you where people are, not where they could be.
We first augment the set of valid, labeled walkable regions by propagating person observations between images, utilizing 3D information to create what we call hidden footprints.
We devise a training strategy designed for such sparse labels, combining a class-balanced classification loss with a contextual adversarial loss.
arXiv Detail & Related papers (2020-08-19T23:19:08Z)
- Model Generalization in Deep Learning Applications for Land Cover Mapping [19.570391828806567]
We show that when deep learning models are trained on data from specific continents/seasons, there is a high degree of variability in model performance on out-of-sample continents/seasons.
This suggests that just because a model accurately predicts land-use classes in one continent or season does not mean that the model will accurately predict land-use classes in a different continent or season.
arXiv Detail & Related papers (2020-08-09T01:50:52Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
- $n$-Reference Transfer Learning for Saliency Prediction [73.17061116358036]
We propose a few-shot transfer learning paradigm for saliency prediction.
The proposed framework is gradient-based and model-agnostic.
The results show that the proposed framework achieves a significant performance improvement.
arXiv Detail & Related papers (2020-07-09T23:20:44Z)
- MapLUR: Exploring a new Paradigm for Estimating Air Pollution using Deep Learning on Map Images [4.7791671364702575]
Land-use regression models are important for the assessment of air pollution concentrations in areas without measurement stations.
We propose the Data-driven, Open, Global (DOG) paradigm that entails models based on purely data-driven approaches using only openly and globally available data.
arXiv Detail & Related papers (2020-02-18T11:21:55Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
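One way to picture the deployment-phase matching is as a maximum mean discrepancy (MMD) comparison between the current task's data and each model's specification set. The sketch below uses an unweighted RBF-kernel MMD as a simplified stand-in; the actual RKME construction uses a weighted reduced set, which this omits.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel matrix between point sets x (n x d) and y (m x d)."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(x, spec):
    """Squared MMD between task samples x and a model's specification
    samples (a small set standing in for its training distribution)."""
    return (rbf_kernel(x, x).mean()
            - 2 * rbf_kernel(x, spec).mean()
            + rbf_kernel(spec, spec).mean())

def most_related_model(task_data, specifications):
    """Deployment phase (simplified): pick the pre-trained model whose
    specification is closest (lowest MMD) to the current task's data."""
    scores = [mmd2(task_data, s) for s in specifications]
    return int(np.argmin(scores))
```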
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.