Learning Grammar of Complex Activities via Deep Neural Networks
- URL: http://arxiv.org/abs/2101.02774v1
- Date: Thu, 7 Jan 2021 21:48:58 GMT
- Title: Learning Grammar of Complex Activities via Deep Neural Networks
- Authors: Becky Mashaido
- Abstract summary: This report provides a theoretical insight into deep neural networks for video learning, under label constraints.
I build upon previous work in video learning for computer vision, make observations on model performance and propose further mechanisms to help improve our observations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the growing amount of publicly available video data on online
streaming services and an increased interest in applications that analyze
continuous video streams such as autonomous driving, this technical report
provides a theoretical insight into deep neural networks for video learning,
under label constraints. I build upon previous work in video learning for
computer vision, make observations on model performance and propose further
mechanisms to help improve our observations.
Related papers
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z) - Integration and Performance Analysis of Artificial Intelligence and
Computer Vision Based on Deep Learning Algorithms [5.734290974917728]
This paper focuses on the analysis of the application effectiveness of the integration of deep learning and computer vision technologies.
Deep learning achieves a historic breakthrough by constructing hierarchical neural networks, enabling end-to-end feature learning and semantic understanding of images.
The successful experiences in the field of computer vision provide strong support for training deep learning algorithms.
arXiv Detail & Related papers (2023-12-20T09:37:06Z) - Deep Learning Techniques for Video Instance Segmentation: A Survey [19.32547752428875]
Video instance segmentation is an emerging computer vision research area introduced in 2019.
Deep-learning techniques take a dominant role in various computer vision areas.
This survey offers a multifaceted view of deep-learning schemes for video instance segmentation.
arXiv Detail & Related papers (2023-10-19T00:27:30Z) - Learning with Capsules: A Survey [73.31150426300198]
Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations.
Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships.
arXiv Detail & Related papers (2022-06-06T15:05:36Z) - Measuring and modeling the motor system with machine learning [117.44028458220427]
The utility of machine learning in understanding the motor system is promising a revolution in how to collect, measure, and analyze data.
We discuss the growing use of machine learning: from pose estimation, kinematic analyses, dimensionality reduction, and closed-loop feedback, to its use in understanding neural correlates and untangling sensorimotor systems.
arXiv Detail & Related papers (2021-03-22T12:42:16Z) - Video Summarization Using Deep Neural Networks: A Survey [72.98424352264904]
Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.
This work focuses on the recent advances in the area and provides a comprehensive survey of the existing deep-learning-based methods for generic video summarization.
arXiv Detail & Related papers (2021-01-15T11:41:29Z) - Neuro-Symbolic Representations for Video Captioning: A Case for
Leveraging Inductive Biases for Vision and Language [148.0843278195794]
We propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions.
arXiv Detail & Related papers (2020-11-18T20:21:19Z) - Graph-Based Neural Network Models with Multiple Self-Supervised
Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z) - Continual Learning for Anomaly Detection in Surveillance Videos [36.24563211765782]
We propose an online anomaly detection method for surveillance videos using transfer learning and continual learning.
Our proposed algorithm leverages the feature extraction power of neural network-based models for transfer learning, and the continual learning capability of statistical detection methods.
arXiv Detail & Related papers (2020-04-15T16:41:20Z) - Interactive Summarizing -- Automatic Slide Localization Technology as
Generative Learning Tool [10.81386784858998]
Video summarization is an effective technology applied to enhance learners' summarizing experience in a video lecture.
An interactive summarizing model is designed to explain how learners are engaged in the video lecture learning process supported by convolutional neural network.
arXiv Detail & Related papers (2020-02-25T22:22:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.