Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
- URL: http://arxiv.org/abs/2511.16301v2
- Date: Mon, 24 Nov 2025 11:32:47 GMT
- Title: Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
- Authors: Minseok Seo, Mark Hamilton, Changick Kim
- Abstract summary: Upsample Anything restores low-resolution features to high-resolution, pixel-wise outputs without any training. It runs in only $\approx0.419\,\text{s}$ per 224x224 image and achieves state-of-the-art performance on semantic segmentation, depth estimation, and both depth and probability map upsampling.
- Score: 38.24831571443335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present \textbf{Upsample Anything}, a lightweight test-time optimization (TTO) framework that restores low-resolution features to high-resolution, pixel-wise outputs without any training. Although Vision Foundation Models demonstrate strong generalization across diverse downstream tasks, their representations are typically downsampled by 14x/16x (e.g., ViT), which limits their direct use in pixel-level applications. Existing feature upsampling approaches depend on dataset-specific retraining or heavy implicit optimization, restricting scalability and generalization. Upsample Anything addresses these issues through a simple per-image optimization that learns an anisotropic Gaussian kernel combining spatial and range cues, effectively bridging Gaussian Splatting and Joint Bilateral Upsampling. The learned kernel acts as a universal, edge-aware operator that transfers seamlessly across architectures and modalities, enabling precise high-resolution reconstruction of features, depth, or probability maps. It runs in only $\approx0.419 \text{s}$ per 224x224 image and achieves state-of-the-art performance on semantic segmentation, depth estimation, and both depth and probability map upsampling. \textbf{Project page:} \href{https://seominseok0429.github.io/Upsample-Anything/}{https://seominseok0429.github.io/Upsample-Anything/}
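The abstract describes a learned kernel that bridges Gaussian Splatting and Joint Bilateral Upsampling (JBU). As a rough illustration of the JBU side of that idea, here is a minimal NumPy sketch of joint bilateral upsampling with an isotropic Gaussian kernel. The paper's actual kernel is anisotropic and optimized per image at test time; the function name and parameters below are illustrative, not the authors' implementation.

```python
import numpy as np

def jbu(feat_lr, guide_hr, scale, sigma_s=1.0, sigma_r=0.1, radius=2):
    """Joint bilateral upsampling of a low-res feature map using a
    high-res guidance image. Isotropic Gaussian version; the paper's
    kernel is anisotropic and learned per image (not shown here)."""
    h_lr, w_lr, c = feat_lr.shape
    h_hr, w_hr = guide_hr.shape[:2]
    out = np.zeros((h_hr, w_hr, c))
    for y in range(h_hr):
        for x in range(w_hr):
            yc, xc = y / scale, x / scale  # query position in low-res coords
            acc, wsum = np.zeros(c), 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    py, px = int(round(yc)) + dy, int(round(xc)) + dx
                    if not (0 <= py < h_lr and 0 <= px < w_lr):
                        continue
                    # spatial weight, measured on the low-res grid
                    ws = np.exp(-((py - yc) ** 2 + (px - xc) ** 2)
                                / (2 * sigma_s ** 2))
                    # range weight from the high-res guidance image
                    gq = guide_hr[y, x]
                    gp = guide_hr[min(int(py * scale), h_hr - 1),
                                  min(int(px * scale), w_hr - 1)]
                    wr = np.exp(-np.sum((gq - gp) ** 2) / (2 * sigma_r ** 2))
                    w = ws * wr
                    acc += w * feat_lr[py, px]
                    wsum += w
            out[y, x] = acc / max(wsum, 1e-8)
    return out
```

In the paper, the spatial and range bandwidths (here the fixed `sigma_s`, `sigma_r`) are instead learned per image through test-time optimization, which is what makes the operator edge-aware yet training-free.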
Related papers
- UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders [50.099672495919975]
UPLiFT is an architecture for Universal Pixel-dense Lightweight Feature Transforms. We show that our Local Attender allows UPLiFT to maintain stable features throughout upsampling. We also show that it achieves competitive performance with state-of-the-art Coupled Flow Matching models for VAE feature upsampling.
arXiv Detail & Related papers (2026-01-25T18:59:45Z) - SimpleMatch: A Simple and Strong Baseline for Semantic Correspondence [1.0039285760896914]
We present SimpleMatch, a framework for semantic correspondence that delivers strong performance even at low resolutions. At a resolution of 252x252 (3.3x smaller than current SOTA methods), SimpleMatch achieves superior performance with 84.1% PCK@0.1 on the SPair-71k benchmark.
arXiv Detail & Related papers (2026-01-18T11:31:46Z) - Depth-Guided Bundle Sampling for Efficient Generalizable Neural Radiance Field Reconstruction [22.057122296909142]
High-resolution images remain computationally intensive due to the need for dense sampling of all rays. We propose a novel depth-guided bundle sampling strategy to accelerate rendering. Our method achieves up to a 1.27 dB PSNR improvement and a 47% increase in FPS on the DTU dataset.
arXiv Detail & Related papers (2025-05-26T10:23:59Z) - Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels [18.729177307412645]
We propose a lightweight upsampling operation, termed Dynamic Lightweight Upsampling (DLU).
Experiments on several mainstream vision tasks show that our DLU achieves performance comparable to, and in some cases better than, the original CARAFE.
arXiv Detail & Related papers (2024-10-29T15:35:14Z) - A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
A popular similarity-based feature upsampling pipeline has been proposed, which utilizes a high-resolution feature as guidance. We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives. We develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts.
arXiv Detail & Related papers (2024-07-02T14:12:21Z) - Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling [2.1465347972460367]
Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries.
This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions.
We introduce a novel method named Edge-preserving Probabilistic Downsampling (EPD).
It utilizes class uncertainty within a local window to produce soft labels, with the window size dictating the downsampling factor.
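The soft-label idea in the summary above can be sketched in a few lines: each low-resolution cell stores the class frequencies of its local window, so thin structures contribute fractional mass instead of vanishing. This is a simplified stand-in for EPD's class-uncertainty weighting; the function name is hypothetical.

```python
import numpy as np

def soft_label_downsample(label, factor, num_classes):
    """Downsample a hard label map to soft labels: each low-res cell
    holds the class frequencies of its (factor x factor) window."""
    h, w = label.shape
    hh, ww = h // factor, w // factor
    soft = np.zeros((hh, ww, num_classes))
    for y in range(hh):
        for x in range(ww):
            win = label[y * factor:(y + 1) * factor,
                        x * factor:(x + 1) * factor]
            counts = np.bincount(win.ravel(), minlength=num_classes)
            soft[y, x] = counts / counts.sum()  # class frequencies sum to 1
    return soft
```

For example, a 2x2 window containing three pixels of class 0 and one of class 1 downsamples to the soft label [0.75, 0.25] rather than a hard 0, preserving evidence of the minority class.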
arXiv Detail & Related papers (2024-04-05T10:01:31Z) - CUF: Continuous Upsampling Filters [25.584630142930123]
In this paper, we consider one of the most important operations in image processing: upsampling.
We propose to parameterize upsampling kernels as neural fields.
This parameterization leads to a compact architecture that obtains a 40-fold reduction in the number of parameters when compared with competing arbitrary-scale super-resolution architectures.
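Parameterizing an upsampling kernel as a neural field means querying a small network at continuous sub-pixel offsets instead of storing a discrete weight grid, which is what makes the kernel compact and scale-agnostic. A toy NumPy sketch with random, untrained weights (purely illustrative, not the CUF architecture):

```python
import numpy as np

# Tiny MLP standing in for the continuous kernel field.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def kernel_weight(dx, dy):
    """Query the continuous kernel field at a sub-pixel offset (dx, dy).
    A real model would be trained so these queries reproduce a good
    upsampling filter at any scale."""
    h = np.tanh(np.array([dx, dy]) @ W1 + b1)
    return float(h @ W2 + b2)
```

Because the field is continuous, the same few thousand MLP parameters can be sampled at any offset and any scale, replacing the large per-scale weight tables of discrete arbitrary-scale upsamplers.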
arXiv Detail & Related papers (2022-10-13T12:45:51Z) - BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
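The sub-step decomposition described above can be illustrated by factoring the target sampling ratio into smaller factors, so that e.g. a 4x upsampling becomes two 2x steps. A hypothetical helper, not the BIMS-PU code:

```python
def decompose_factor(r):
    """Break an up/downsampling factor into its prime sub-step factors,
    e.g. 4 -> [2, 2], 6 -> [2, 3], so each sub-step is cheaper and
    intermediate scales can exchange multi-scale information."""
    steps, d = [], 2
    while r > 1:
        while r % d == 0:
            steps.append(d)
            r //= d
        d += 1
    return steps
```

Applying the sub-steps in sequence recovers the original factor, since the listed factors multiply back to the target ratio.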
arXiv Detail & Related papers (2022-06-25T13:13:37Z) - Toward Real-World Super-Resolution via Adaptive Downsampling Models [58.38683820192415]
This study proposes a novel method to simulate an unknown downsampling process without imposing restrictive prior knowledge.
We propose a generalizable low-frequency loss (LFL) in the adversarial training framework to imitate the distribution of target LR images without using any paired examples.
arXiv Detail & Related papers (2021-09-08T06:00:32Z) - InfinityGAN: Towards Infinite-Resolution Image Synthesis [92.40782797030977]
We present InfinityGAN, a method to generate arbitrary-resolution images.
We show how it trains and infers patch-by-patch seamlessly with low computational resources.
arXiv Detail & Related papers (2021-04-08T17:59:30Z) - Learning Affinity-Aware Upsampling for Deep Image Matting [83.02806488958399]
We show that learning affinity in upsampling provides an effective and efficient approach to exploit pairwise interactions in deep networks.
In particular, results on the Composition-1k matting dataset show that A2U achieves a 14% relative improvement in the SAD metric against a strong baseline.
Compared with the state-of-the-art matting network, we achieve 8% higher performance with only 40% model complexity.
arXiv Detail & Related papers (2020-11-29T05:09:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.