Related papers: Open-Source Tools for Behavioral Video Analysis: Setup, Methods, and Development

URL: http://arxiv.org/abs/2204.02842v1
Date: Wed, 6 Apr 2022 14:06:43 GMT
Title: Open-Source Tools for Behavioral Video Analysis: Setup, Methods, and Development
Authors: Kevin Luxem, Jennifer J. Sun, Sean P. Bradley, Keerthi Krishnan, Talmo D. Pereira, Eric A. Yttri, Jan Zimmermann, and Mark Laubach
Abstract summary: Methods for video analysis are transforming behavioral quantification to be more precise, scalable, and reproducible. Open-source tools for video analysis have led to new experimental approaches to understand behavior. We review currently available open-source tools for video analysis, how to set them up in a lab that is new to video recording methods, and some issues that should be addressed.
Score: 2.248500763940652
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recently developed methods for video analysis, especially models for pose estimation and behavior classification, are transforming behavioral quantification to be more precise, scalable, and reproducible in fields such as neuroscience and ethology. These tools overcome long-standing limitations of manual scoring of video frames and traditional "center of mass" tracking algorithms to enable video analysis at scale. The expansion of open-source tools for video acquisition and analysis has led to new experimental approaches to understand behavior. Here, we review currently available open-source tools for video analysis, how to set them up in a lab that is new to video recording methods, and some issues that should be addressed by developers and advanced users, including the need to openly share datasets and code, how to compare algorithms and their parameters, and the need for documentation and community-wide standards. We hope to encourage more widespread use and continued development of the tools. They have tremendous potential for accelerating scientific progress for understanding the brain and behavior.

Related papers

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding [126.15907330726067]
We build a Perception Language Model (PLM) in a fully open and reproducible framework for transparent research in image and video understanding. We analyze standard training pipelines without distillation from proprietary models and explore large-scale synthetic data to identify critical data gaps.
arXiv Detail & Related papers (2025-04-17T17:59:56Z)
Watch and Learn: Leveraging Expert Knowledge and Language for Surgical Video Understanding [1.024113475677323]
The lack of datasets hinders the development of accurate and comprehensive workflow analysis solutions. We introduce a novel approach for addressing the sparsity and heterogeneity of data inspired by the human learning procedure of watching experts and understanding their explanations. We present the first comprehensive solution for dense video captioning (DVC) of surgical videos, addressing this task despite the absence of existing datasets in the surgical domain.
arXiv Detail & Related papers (2025-03-14T13:36:13Z)
Understanding Long Videos via LLM-Powered Entity Relation Graphs [51.13422967711056]
GraphVideoAgent is a framework that maps and monitors the evolving relationships between visual entities throughout the video sequence. Our approach demonstrates remarkable effectiveness when tested against industry benchmarks.
arXiv Detail & Related papers (2025-01-27T10:57:24Z)
psifx -- Psychological and Social Interactions Feature Extraction Package [3.560429497877327]
psifx is a plug-and-play multi-modal feature extraction toolkit. It aims to facilitate and democratize the use of state-of-the-art machine learning techniques for human sciences research.
arXiv Detail & Related papers (2024-07-14T16:20:42Z)
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs [20.168429351519055]
Video understanding is a crucial next step for multimodal large language models (MLLMs). We propose VideoNIAH (Video Needle In A Haystack), a benchmark construction framework through synthetic video generation. We conduct a comprehensive evaluation of both proprietary and open-source models, uncovering significant differences in their video understanding capabilities.
arXiv Detail & Related papers (2024-06-13T17:50:05Z)
A Review of Machine Learning Methods Applied to Video Analysis Systems [3.518774226658318]
The paper provides a survey of the development of machine-learning techniques for video analysis. We provide summaries of the development of self-supervised learning, semi-supervised learning, active learning, and zero-shot learning for applications in video analysis.
arXiv Detail & Related papers (2023-12-08T20:24:03Z)
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models [81.84810348214113]
Video-based large language models (Video-LLMs) have been recently introduced, targeting both fundamental improvements in perception and comprehension, and a diverse range of user inquiries. To guide the development of such a model, the establishment of a robust and comprehensive evaluation system becomes crucial. This paper proposes Video-Bench, a new comprehensive benchmark along with a toolkit specifically designed for evaluating Video-LLMs.
arXiv Detail & Related papers (2023-11-27T18:59:58Z)
What and How of Machine Learning Transparency: Building Bespoke Explainability Tools with Interoperable Algorithmic Components [77.87794937143511]
This paper introduces a collection of hands-on training materials for explaining data-driven predictive models. These resources cover the three core building blocks of this technique: interpretable representation composition, data sampling and explanation generation.
arXiv Detail & Related papers (2022-09-08T13:33:25Z)
Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis [60.13902294276283]
We present VideoSham, a dataset consisting of 826 videos (413 real and 413 manipulated). Many of the existing deepfake datasets focus exclusively on two types of facial manipulations -- swapping with a different subject's face or altering the existing face. Our analysis shows that state-of-the-art manipulation detection algorithms only work for a few specific attacks and do not scale well on VideoSham.
arXiv Detail & Related papers (2022-07-26T17:39:04Z)
PosePipe: Open-Source Human Pose Estimation Pipeline for Clinical Research [0.0]
We develop a human pose estimation pipeline that facilitates running state-of-the-art algorithms on data acquired in a clinical context. Our goal in this work is not to train new algorithms, but to advance the use of cutting-edge human pose estimation algorithms for clinical and translational research.
arXiv Detail & Related papers (2022-03-16T17:54:37Z)
Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning [56.676110454594344]
Adaptive Video Super-Resolution (Ada-VSR) uses external as well as internal information through meta-transfer learning and internal learning, respectively. A model trained using our approach can quickly adapt to a specific video condition with only a few gradient updates, which reduces the inference time significantly.
arXiv Detail & Related papers (2021-08-05T19:59:26Z)
DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature [0.7349727826230862]
We open source DRIFT, which allows researchers to track research trends and development over the years. The analysis methods are collated from well-cited research works, with a few of our own methods added for good measure. To demonstrate the utility and efficacy of our tool, we perform a case study on the cs.CL corpus of the arXiv repository and draw inferences from the analysis methods.
arXiv Detail & Related papers (2021-07-02T17:33:25Z)
Non-Adversarial Video Synthesis with Learned Priors [53.26777815740381]
We focus on the problem of generating videos from latent noise vectors, without any reference input frames. We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning. Our approach generates superior quality videos compared to the existing state-of-the-art methods.
arXiv Detail & Related papers (2020-03-21T02:57:33Z)
Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named "COIN" for COmprehensive INstructional video analysis. The COIN dataset contains 11,827 videos of 180 tasks in 12 domains related to our daily life. With a newly developed toolbox, all the videos are annotated efficiently with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.