Related papers: A Review of Machine Learning Methods Applied to Video Analysis Systems

A Review of Machine Learning Methods Applied to Video Analysis Systems

URL: http://arxiv.org/abs/2312.05352v1
Date: Fri, 8 Dec 2023 20:24:03 GMT
Title: A Review of Machine Learning Methods Applied to Video Analysis Systems
Authors: Marios S. Pattichis, Venkatesh Jatla, Alvaro E. Ullao Cerna
Abstract summary: The paper provides a survey of the development of machine-learning techniques for video analysis. We provide summaries of the development of self-supervised learning, semi-supervised learning, active learning, and zero-shot learning for applications in video analysis.
Score: 3.518774226658318
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The paper provides a survey of the development of machine-learning techniques for video analysis. The survey provides a summary of the most popular deep learning methods used for human activity recognition. We discuss how popular architectures perform on standard datasets and highlight the differences from real-life datasets dominated by multiple activities performed by multiple participants over long periods. For real-life datasets, we describe the use of low-parameter models (with 200X or 1,000X fewer parameters) that are trained to detect a single activity after the relevant objects have been successfully detected. Our survey then turns to a summary of machine learning methods that are specifically developed for working with a small number of labeled video samples. Our goal here is to describe modern techniques that are specifically designed so as to minimize the amount of ground truth that is needed for training and testing video analysis systems. We provide summaries of the development of self-supervised learning, semi-supervised learning, active learning, and zero-shot learning for applications in video analysis. For each method, we provide representative examples.

Related papers

A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning [22.870129496984546]
We establish a unified benchmark that enables fair comparisons across different methods. We investigate five critical aspects of self-supervised learning in videos: (1) dataset size, (2) model complexity, (3) data distribution, (4) data noise, and (5) feature representations. We propose a novel approach that significantly reduces training data requirements while surpassing state-of-the-art methods that rely on 10% more pretraining data.
arXiv Detail & Related papers (2025-04-08T15:47:58Z)
A Comprehensive Review of Few-shot Action Recognition [64.47305887411275]
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data. It requires accurately classifying human actions in videos using only a few labeled examples per class. Numerous approaches have driven significant advancements in few-shot action recognition.
arXiv Detail & Related papers (2024-07-20T03:53:32Z)
Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame. ATM outperforms strong video pre-training baselines by 80% on average. We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z)
A Large-Scale Analysis on Self-Supervised Video Representation Learning [15.205738030787673]
We study five different aspects of self-supervised learning important for videos; 1) dataset size, 2) complexity, 3) data distribution, 4) data noise, and, 5)feature analysis. We present several interesting insights from this study which span across different properties of pretraining and target datasets, pretext-tasks, and model architectures. We propose an approach that requires a limited amount of training data and outperforms existing state-of-the-art approaches which use 10x pretraining data.
arXiv Detail & Related papers (2023-06-09T16:27:14Z)
Lifelong Ensemble Learning based on Multiple Representations for Few-Shot Object Recognition [6.282068591820947]
We present a lifelong ensemble learning approach based on multiple representations to address the few-shot object recognition problem. To facilitate lifelong learning, each approach is equipped with a memory unit for storing and retrieving object information instantly. We have performed extensive sets of experiments to assess the performance of the proposed approach in offline, and open-ended scenarios.
arXiv Detail & Related papers (2022-05-04T10:29:10Z)
Continuous Human Action Recognition for Human-Machine Interaction: A Review [39.593687054839265]
Recognising actions within an input video are challenging but necessary tasks for applications that require real-time human-machine interaction. We provide on the feature extraction and learning strategies that are used on most state-of-the-art methods. We investigate the application of such models to real-world scenarios and discuss several limitations and key research directions.
arXiv Detail & Related papers (2022-02-26T09:25:44Z)
Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine. We critically examine the most notable pretext tasks to extract features from image data. We study the performance of various self-supervised techniques keeping all other parameters uniform.
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only. We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
Curriculum Learning: A Survey [65.31516318260759]
Curriculum learning strategies have been successfully employed in all areas of machine learning. We construct a taxonomy of curriculum learning approaches by hand, considering various classification criteria. We build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm.
arXiv Detail & Related papers (2021-01-25T20:08:32Z)
Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots. We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector. We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques. We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process. We derive two techniques that can speed up the active learning loop such as partial uncertainty sampling and larger query size.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
A System for Real-Time Interactive Analysis of Deep Learning Training [66.06880335222529]
Currently available systems are limited to monitoring only the logged data that must be specified before the training process starts. We present a new system that enables users to perform interactive queries on live processes generating real-time information.
arXiv Detail & Related papers (2020-01-05T11:33:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.