Towards Transparent Application of Machine Learning in Video Processing
- URL: http://arxiv.org/abs/2105.12700v2
- Date: Thu, 27 May 2021 09:35:54 GMT
- Title: Towards Transparent Application of Machine Learning in Video Processing
- Authors: Luka Murn, Marc Gorriz Blanch, Maria Santamaria, Fiona Rivera, Marta Mrak
- Abstract summary: Machine learning techniques for more efficient video compression and video enhancement have been developed thanks to breakthroughs in deep learning.
New techniques typically come in the form of resource-hungry black-boxes (overly complex with little transparency regarding the inner workings).
The aim of this work is to understand and optimise learned models in video processing applications so systems that incorporate them can be used in a more trustworthy manner.
- Score: 3.491870689686827
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning techniques for more efficient video compression and video
enhancement have been developed thanks to breakthroughs in deep learning. The
new techniques, considered as an advanced form of Artificial Intelligence (AI),
bring previously unforeseen capabilities. However, they typically come in the
form of resource-hungry black-boxes (overly complex with little transparency
regarding the inner workings). Their application can therefore be unpredictable
and generally unreliable for large-scale use (e.g. in live broadcast). The aim
of this work is to understand and optimise learned models in video processing
applications so systems that incorporate them can be used in a more trustworthy
manner. In this context, the presented work introduces principles for
simplification of learned models targeting improved transparency in
implementing machine learning for video production and distribution
applications. These principles are demonstrated on video compression examples,
showing how bitrate savings and reduced complexity can be achieved by
simplifying relevant deep learning models.
Related papers
- DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy [22.871591373774802]
We introduce a cutting-edge video compression framework tailored for the age of ubiquitous video data.
Unlike traditional compression methods that prioritize human visual perception, our innovative approach focuses on preserving semantic information critical for deep learning accuracy.
Based on purpose-designed deep learning algorithms, it adeptly segregates essential information from redundancy, ensuring machine learning tasks are fed with data of the highest relevance.
arXiv Detail & Related papers (2024-10-24T03:29:57Z)
- Stop overkilling simple tasks with black-box models and use transparent models instead [57.42190785269343]
Deep learning approaches are able to extract features autonomously from raw data.
This allows for bypassing the feature engineering process.
Deep learning strategies often outperform traditional models in terms of accuracy.
arXiv Detail & Related papers (2023-02-06T14:28:49Z)
- PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z)
- InternVideo: General Video Foundation Models via Generative and Discriminative Learning [52.69422763715118]
We present general video foundation models, InternVideo, for dynamic and complex video-level understanding tasks.
InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives.
InternVideo achieves state-of-the-art performance on 39 video datasets from extensive tasks including video action recognition/detection, video-language alignment, and open-world video applications.
arXiv Detail & Related papers (2022-12-06T18:09:49Z)
- From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks [1.0554048699217669]
This paper proposes a novel approach to map the knowledge from action recognition to event recognition using an energy-based model.
Such a model can process all frames simultaneously, carrying spatial and temporal information through the learning process.
arXiv Detail & Related papers (2022-11-30T14:47:10Z)
- Frozen CLIP Models are Efficient Video Learners [86.73871814176795]
Video recognition has been dominated by the end-to-end learning paradigm.
Recent advances in Contrastive Vision-Language Pre-training pave the way for a new route for visual recognition tasks.
We present Efficient Video Learning -- an efficient framework for directly training high-quality video recognition models.
arXiv Detail & Related papers (2022-08-06T17:38:25Z)
- Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning [56.676110454594344]
Adaptive Video Super-Resolution (Ada-VSR) uses external as well as internal information through meta-transfer learning and internal learning, respectively.
A model trained using our approach can quickly adapt to a specific video condition with only a few gradient updates, which reduces the inference time significantly.
arXiv Detail & Related papers (2021-08-05T19:59:26Z)
- Analytic Simplification of Neural Network based Intra-Prediction Modes for Video Compression [10.08097582267397]
This paper presents two ways to derive simplified intra-prediction from learnt models.
It shows that these streamlined techniques can lead to efficient compression solutions.
arXiv Detail & Related papers (2020-04-23T10:25:54Z)
- Non-Adversarial Video Synthesis with Learned Priors [53.26777815740381]
We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network and a generator through non-adversarial learning.
Our approach generates superior quality videos compared to the existing state-of-the-art methods.
arXiv Detail & Related papers (2020-03-21T02:57:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.