AnyUp: Universal Feature Upsampling
- URL: http://arxiv.org/abs/2510.12764v1
- Date: Tue, 14 Oct 2025 17:45:17 GMT
- Title: AnyUp: Universal Feature Upsampling
- Authors: Thomas Wimmer, Prune Truong, Marie-Julie Rakotosaona, Michael Oechsle, Federico Tombari, Bernt Schiele, Jan Eric Lenssen
- Abstract summary: We introduce AnyUp, a method for feature upsampling that can be applied to any vision feature at any resolution. Existing learning-based upsamplers for features like DINO or CLIP need to be re-trained for every feature extractor.
- Score: 90.67845351280933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce AnyUp, a method for feature upsampling that can be applied to any vision feature at any resolution, without encoder-specific training. Existing learning-based upsamplers for features like DINO or CLIP need to be re-trained for every feature extractor and thus do not generalize to different feature types at inference time. In this work, we propose an inference-time feature-agnostic upsampling architecture to alleviate this limitation and improve upsampling quality. In our experiments, AnyUp sets a new state of the art for upsampled features, generalizes to different feature types, and preserves feature semantics while being efficient and easy to apply to a wide range of downstream tasks.
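The abstract does not spell out AnyUp's architecture, but the core idea of a feature-agnostic upsampler, one module that works for any channel count without retraining, can be illustrated with a classic guidance-image technique. The sketch below uses joint bilateral upsampling as a stand-in: it upsamples features of arbitrary dimensionality by weighting coarse neighbors according to similarity in the high-resolution image. This is not AnyUp's actual (learned) method, only a minimal illustration of the interface.

```python
import numpy as np

def guided_upsample(feats, image, scale, sigma_range=0.1):
    """Upsample an (h, w, c) feature map to (h*scale, w*scale, c),
    weighting coarse neighbors by similarity in the high-res guidance
    image. Works for any channel count c without retraining, which is
    the "feature-agnostic" property the abstract describes."""
    h, w, c = feats.shape
    H, W = h * scale, w * scale
    out = np.zeros((H, W, c))
    for y in range(H):
        for x in range(W):
            cy, cx = y // scale, x // scale  # coarse cell containing (y, x)
            weights, acc = 0.0, np.zeros(c)
            for fy in range(max(cy - 1, 0), min(cy + 2, h)):
                for fx in range(max(cx - 1, 0), min(cx + 2, w)):
                    # guidance pixel at the coarse neighbor's center
                    gy = min(fy * scale + scale // 2, H - 1)
                    gx = min(fx * scale + scale // 2, W - 1)
                    d = image[y, x] - image[gy, gx]
                    wgt = np.exp(-np.dot(d, d) / (2 * sigma_range ** 2))
                    weights += wgt
                    acc += wgt * feats[fy, fx]
            out[y, x] = acc / weights
    return out
```

Because the weights come only from the guidance image, upsampled features stay sharp across image edges instead of bleeding between regions, which is the main failure mode of plain bilinear upsampling.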
Related papers
- UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders [50.099672495919975]
UPLiFT is an architecture for Universal Pixel-dense Lightweight Feature Transforms.
We show that our Local Attender allows UPLiFT to maintain stable features throughout upsampling.
We also show that it achieves competitive performance with state-of-the-art Coupled Flow Matching models for VAE feature upsampling.
arXiv Detail & Related papers (2026-01-25T18:59:45Z)
- LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models [27.379438040350188]
Feature upsampling offers a promising direction to address this challenge.
We introduce a coordinate-based cross-attention transformer that integrates the high-resolution images with coordinates and low-resolution VFM features.
Our approach effectively captures fine-grained details and adapts flexibly to various input and feature resolutions.
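The coordinate-based cross-attention idea in the LoftUp summary can be sketched concretely: each high-resolution output pixel forms a query from its normalized (x, y) coordinate plus its color, and attends over tokens built from the low-resolution feature map. The projection weights below are random for illustration; in the paper they would be learned, and the exact token construction is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def coord_cross_attention_upsample(lr_feats, hr_image, d_k=16):
    """Toy coordinate-based cross-attention upsampler in the spirit of
    the LoftUp summary. Queries: HR pixel coordinate + color. Keys:
    LR feature tokens with their coordinates. Values: raw LR features."""
    h, w, c = lr_feats.shape
    H, W, ci = hr_image.shape
    # LR tokens: feature vector concatenated with normalized coordinates
    yy, xx = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w),
                         indexing="ij")
    lr_tok = np.concatenate([lr_feats.reshape(h * w, c),
                             np.stack([yy, xx], -1).reshape(h * w, 2)], axis=1)
    # HR queries: normalized coordinate + pixel color
    YY, XX = np.meshgrid(np.linspace(0, 1, H), np.linspace(0, 1, W),
                         indexing="ij")
    q_in = np.concatenate([np.stack([YY, XX], -1).reshape(H * W, 2),
                           hr_image.reshape(H * W, ci)], axis=1)
    Wq = rng.normal(size=(q_in.shape[1], d_k))   # random stand-ins for
    Wk = rng.normal(size=(lr_tok.shape[1], d_k))  # learned projections
    q, k = q_in @ Wq, lr_tok @ Wk
    att = q @ k.T / np.sqrt(d_k)
    att = np.exp(att - att.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)
    # each HR pixel is a convex combination of LR feature vectors
    return (att @ lr_feats.reshape(h * w, c)).reshape(H, W, c)
```

Because queries are built from continuous coordinates rather than a fixed grid, the same module can target any output resolution, matching the summary's claim of flexible adaptation to various input and feature resolutions.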
arXiv Detail & Related papers (2025-04-18T18:46:08Z)
- A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
A popular similarity-based feature upsampling pipeline has been proposed, which utilizes a high-resolution feature as guidance.
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives.
We develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
- Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
arXiv Detail & Related papers (2022-07-07T07:41:32Z)
- Multi-view Feature Augmentation with Adaptive Class Activation Mapping [16.479606228303368]
We propose an end-to-end-trainable feature augmentation module built for image classification.
We sample and ensemble diverse multi-view local features to improve model robustness.
Experiments demonstrate consistent and noticeable performance gains achieved by our multi-view feature augmentation module.
arXiv Detail & Related papers (2022-06-26T19:05:27Z)
- BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
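The decomposition step described in the BIMS-PU summary, breaking a target sampling factor into smaller sub-factors and applying one upsampling sub-step per factor, can be sketched directly. The per-step upsampler below is a deliberately simple nearest-neighbor interpolation stand-in, not the paper's learned module; only the factorization logic mirrors the summary.

```python
import numpy as np

def factorize(r):
    """Break a target sampling factor into smaller sub-factors,
    e.g. 4 -> [2, 2], 8 -> [2, 2, 2], 12 -> [2, 2, 3]."""
    factors, d = [], 2
    while r > 1:
        while r % d == 0:
            factors.append(d)
            r //= d
        d += 1
    return factors

def upsample_step(points, f):
    """One sub-step: grow an (n, 3) cloud to (f*n, 3) by inserting
    points between each point and its nearest neighbor. A toy
    stand-in for one learned upsampling stage."""
    d2 = ((points[:, None] - points[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-matches
    nn = points[d2.argmin(1)]             # nearest neighbor of each point
    new = [points]
    for i in range(1, f):
        new.append(points + (nn - points) * i / f)  # interpolated inserts
    return np.concatenate(new)

def multi_scale_upsample(points, r):
    """Reach factor r via a cascade of smaller sub-steps."""
    for f in factorize(r):
        points = upsample_step(points, f)
    return points
```

Cascading small factors lets each stage refine a modest number of new points instead of hallucinating a dense cloud in one jump, which is the intuition behind the multi-scale design.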
arXiv Detail & Related papers (2022-06-25T13:13:37Z)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
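The POODLE summary describes using unlabeled out-of-distribution samples as a penalty during few-shot learning. A toy version of that idea, pulling labeled features toward their class prototype while pushing OOD features away from all prototypes, is sketched below. The exact loss in the paper is not given in the summary, so the formula here is an illustrative assumption.

```python
import numpy as np

def poodle_style_loss(feats, labels, ood_feats, alpha=0.5):
    """Toy prototype-based loss inspired by the summary: pull labeled
    few-shot features toward their class prototype, push out-of-
    distribution features away from every prototype. The paper's
    actual objective may differ; this is a hedged sketch."""
    classes = np.unique(labels)
    protos = np.stack([feats[labels == c].mean(0) for c in classes])
    # pull term: mean squared distance of each sample to its prototype
    pull = np.mean([((feats[labels == c] - protos[i]) ** 2).sum(1).mean()
                    for i, c in enumerate(classes)])
    # push term: reward OOD features that sit far from all prototypes
    push = -np.mean(((ood_feats[:, None] - protos[None]) ** 2).sum(-1))
    return pull + alpha * push
```

As the summary notes, nothing here depends on the backbone: the loss consumes only feature vectors, so it is agnostic to the feature extractor and adds no pre-training cost.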
arXiv Detail & Related papers (2022-06-08T18:59:21Z)
- Compositional Fine-Grained Low-Shot Learning [58.53111180904687]
We develop a novel compositional generative model for zero- and few-shot learning to recognize fine-grained classes with a few or no training samples.
We propose a feature composition framework that learns to extract attribute features from training samples and combines them to construct fine-grained features for rare and unseen classes.
arXiv Detail & Related papers (2021-05-21T16:18:24Z)
- Universal-Prototype Augmentation for Few-Shot Object Detection [128.4592084104352]
Few-shot object detection (FSOD) aims to strengthen the performance of novel object detection with few labeled samples.
To alleviate the constraint of few samples, enhancing the generalization ability of learned features for novel objects plays a key role.
We propose a new prototype, namely the universal prototype, that is learned from all object categories.
arXiv Detail & Related papers (2021-03-01T15:35:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.