Related papers: CDFI: Compression-Driven Network Design for Frame Interpolation

CDFI: Compression-Driven Network Design for Frame Interpolation

URL: http://arxiv.org/abs/2103.10559v1
Date: Thu, 18 Mar 2021 22:59:42 GMT
Title: CDFI: Compression-Driven Network Design for Frame Interpolation
Authors: Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov
Abstract summary: We propose a compression-driven network design for frame size. We show that a 10X compressed AdaCoF model performs similarly as its original counterpart. Our model performs favorably against other state-of-the-arts in a broad range of datasets.
Score: 24.205673014182356
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: DNN-based frame interpolation--that generates the intermediate frames given two consecutive frames--typically relies on heavy model architectures with a huge number of features, preventing them from being deployed on systems with limited resources, e.g., mobile devices. We propose a compression-driven network design for frame interpolation (CDFI), that leverages model pruning through sparsity-inducing optimization to significantly reduce the model size while achieving superior performance. Concretely, we first compress the recently proposed AdaCoF model and show that a 10X compressed AdaCoF performs similarly as its original counterpart; then we further improve this compressed model by introducing a multi-resolution warping module, which boosts visual consistencies with multi-level details. As a consequence, we achieve a significant performance gain with only a quarter in size compared with the original AdaCoF. Moreover, our model performs favorably against other state-of-the-arts in a broad range of datasets. Finally, the proposed compression-driven framework is generic and can be easily transferred to other DNN-based frame interpolation algorithm. Our source code is available at https://github.com/tding1/CDFI.

Related papers

Semantic Frame Interpolation [66.81586538775366]
Traditional frame tasks primarily focus on scenarios with a small number of frames, no text control, and minimal differences between the first and last frames.<n>Recent community developers have utilized large video models represented by Wan to endow frame-to-frame capabilities.<n>We first propose a new practical Semantic Frame Interpolation (SFI) task from the perspective of academic definition, which covers the above two settings and supports inference at multiple frame rates.
arXiv Detail & Related papers (2025-07-07T16:25:47Z)
Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR.<n>This method is designed through unfolding an SR optimization function constrained by structural similarity.<n>Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting [3.5066463427087777]
We introduce two lightweight models to enhance the versatility of crowd-counting models. These models maintain the same downstream architecture while incorporating two distinct backbones: MobileNet and MobileViT. We leverage Adjacent Feature Fusion to extract diverse scale features from a Pre-Trained Model (PTM) and subsequently combine these features seamlessly.
arXiv Detail & Related papers (2024-01-11T15:13:31Z)
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation [80.33846577924363]
We present All-Pairs Multi-Field Transforms (AMT), a new network architecture for video framegithub. It is based on two essential designs. First, we build bidirectional volumes for all pairs of pixels, and use the predicted bilateral flows to retrieve correlations. Second, we derive multiple groups of fine-grained flow fields from one pair of updated coarse flows for performing backward warping on the input frames separately.
arXiv Detail & Related papers (2023-04-19T16:18:47Z)
Sparsity-guided Network Design for Frame Interpolation [39.828644638174225]
We present a compression-driven network design for frame-based algorithms. We leverage model pruning through sparsity-inducing optimization to greatly reduce the model size. We achieve a considerable performance gain with a quarter of the size of the original AdaCoF.
arXiv Detail & Related papers (2022-09-09T23:13:25Z)
Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture [10.08097582267397]
Convolutional neural networks (CNNs) have been at the forefront of the recent advances in this field. Many of these networks require a lot of parameters, with more parameters meaning a heavier burden. This paper presents a method for parameter reduction for a popular flow-less kernel-based network.
arXiv Detail & Related papers (2022-05-13T16:02:55Z)
DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting [50.17500790309477]
DeMFI-Net is a joint deblurring and multi-frame framework. It converts blurry videos of lower-frame-rate to sharp videos at higher-frame-rate. It achieves state-of-the-art (SOTA) performances for diverse datasets.
arXiv Detail & Related papers (2021-11-19T00:00:15Z)
Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers. We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
arXiv Detail & Related papers (2020-06-13T16:40:26Z)
A Generic Network Compression Framework for Sequential Recommender Systems [71.81962915192022]
Sequential recommender systems (SRS) have become the key technology in capturing user's dynamic interests and generating high-quality recommendations. We propose a compressed sequential recommendation framework, termed as CpRec, where two generic model shrinking techniques are employed. By the extensive ablation studies, we demonstrate that the proposed CpRec can achieve up to 4$sim$8 times compression rates in real-world SRS datasets.
arXiv Detail & Related papers (2020-04-21T08:40:55Z)
Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in their latent space through multi-scale autoregressive priors (mAR) Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data. We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z)
Neural Network Compression Framework for fast model inference [59.65531492759006]
We present a new framework for neural networks compression with fine-tuning, which we called Neural Network Compression Framework (NNCF) It leverages recent advances of various network compression methods and implements some of them, such as sparsity, quantization, and binarization. The framework can be used within the training samples, which are supplied with it, or as a standalone package that can be seamlessly integrated into the existing training code.
arXiv Detail & Related papers (2020-02-20T11:24:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.