CompressAI-Vision: Open-source software to evaluate compression methods for computer vision tasks
- URL: http://arxiv.org/abs/2509.20777v1
- Date: Thu, 25 Sep 2025 06:01:55 GMT
- Title: CompressAI-Vision: Open-source software to evaluate compression methods for computer vision tasks
- Authors: Hyomin Choi, Heeji Han, Chris Rosewarne, Fabien Racapé,
- Abstract summary: A consolidated platform is needed to implement and evaluate compression methods optimized for downstream vision tasks.<n>CompressAI-Vision is introduced as a comprehensive evaluation platform where new coding tools compete to efficiently compress the input of vision network.<n>The evaluation platform has been developed as open-source software and is adopted by the Moving Pictures Experts Group.
- Score: 4.608162684700302
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the increasing use of neural network (NN)-based computer vision applications that process image and video data as input, interest has emerged in video compression technology optimized for computer vision tasks. In fact, given the variety of vision tasks, associated NN models and datasets, a consolidated platform is needed as a common ground to implement and evaluate compression methods optimized for downstream vision tasks. CompressAI-Vision is introduced as a comprehensive evaluation platform where new coding tools compete to efficiently compress the input of vision network while retaining task accuracy in the context of two different inference scenarios: "remote" and "split" inferencing. Our study showcases various use cases of the evaluation platform incorporated with standard codecs (under development) by examining the compression gain on several datasets in terms of bit-rate versus task accuracy. This evaluation platform has been developed as open-source software and is adopted by the Moving Pictures Experts Group (MPEG) for the development the Feature Coding for Machines (FCM) standard. The software is available publicly at https://github.com/InterDigitalInc/CompressAI-Vision.
Related papers
- A Preprocessing Framework for Video Machine Vision under Compression [26.253209831074184]
We propose a video preprocessing framework tailored for machine vision tasks to address this challenge.<n>The proposed method incorporates a neural preprocessor which retaining crucial information for subsequent tasks, resulting in the boosting of rate-accuracy performance.
arXiv Detail & Related papers (2025-12-17T11:26:19Z) - Plug-and-Play Versatile Compressed Video Enhancement [57.62582951699999]
Video compression effectively reduces the size of files, making it possible for real-time cloud computing.<n>However, it comes at the cost of visual quality, challenges the robustness of downstream vision models.<n>We present a versatile-aware enhancement framework that adaptively enhance videos under different compression settings.
arXiv Detail & Related papers (2025-04-21T18:39:31Z) - DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy [22.871591373774802]
We introduce a cutting-edge video compression framework tailored for the age of ubiquitous video data.
Unlike traditional compression methods that prioritize human visual perception, our innovative approach focuses on preserving semantic information critical for deep learning accuracy.
Based on a designed deep learning algorithms, it adeptly segregates essential information from redundancy, ensuring machine learning tasks are fed with data of the highest relevance.
arXiv Detail & Related papers (2024-10-24T03:29:57Z) - The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine [49.16996486119006]
Deep learning has emerged as a powerful tool in point cloud coding.<n> JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding standard.<n>This paper provides a complete technical description of the JPEG PCC standard.
arXiv Detail & Related papers (2024-09-12T15:20:23Z) - Computer Vision Model Compression Techniques for Embedded Systems: A Survey [75.38606213726906]
This paper covers the main model compression techniques applied for computer vision tasks.
We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique.
We also share codes to assist researchers and new practitioners in overcoming initial implementation challenges.
arXiv Detail & Related papers (2024-08-15T16:41:55Z) - Deep Learning-based Compressed Domain Multimedia for Man and Machine: A
Taxonomy and Application to Point Cloud Classification [27.071264214506108]
This paper proposes the first taxonomy for designing compressed domain computer vision solutions.
The potential of the proposed taxonomy is demonstrated for the specific case of point cloud classification.
arXiv Detail & Related papers (2023-10-28T23:38:30Z) - VNVC: A Versatile Neural Video Coding Framework for Efficient
Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z) - Video Coding for Machine: Compact Visual Representation Compression for
Intelligent Collaborative Analytics [101.35754364753409]
Video Coding for Machines (VCM) is committed to bridging to an extent separate research tracks of video/image compression and feature compression.
This paper summarizes VCM methodology and philosophy based on existing academia and industrial efforts.
arXiv Detail & Related papers (2021-10-18T12:42:13Z) - Video Coding for Machines: A Paradigm of Collaborative Compression and
Intelligent Analytics [127.65410486227007]
Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale.
Recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, promote the sustainable and fast development in their own directions.
In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG
arXiv Detail & Related papers (2020-01-10T17:24:13Z) - An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond
Feature and Signal [99.49099501559652]
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding.
We employ a conditional deep generation network to reconstruct video frames with the guidance of learned motion pattern.
By learning to extract sparse motion pattern via a predictive model, the network elegantly leverages the feature representation to generate the appearance of to-be-coded frames.
arXiv Detail & Related papers (2020-01-09T14:18:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.