MouseGPT: A Large-scale Vision-Language Model for Mouse Behavior Analysis
- URL: http://arxiv.org/abs/2503.10212v2
- Date: Thu, 27 Mar 2025 05:38:37 GMT
- Title: MouseGPT: A Large-scale Vision-Language Model for Mouse Behavior Analysis
- Authors: Teng Xu, Taotao Zhou, Youjia Wang, Peng Yang, Simin Tang, Kuixiang Shao, Zifeng Tang, Yifei Liu, Xinyuan Chen, Hongshuang Wang, Xiaohui Wang, Huoqing Luo, Jingya Wang, Ji Hu, Jingyi Yu
- Abstract summary: We introduce MouseGPT, a Vision-Language Model (VLM) that integrates visual cues with natural language to revolutionize mouse behavior analysis. Our holistic analysis framework enables detailed behavior profiling, clustering, and novel behavior discovery, offering deep insights without the need for labor-intensive manual annotation.
- Score: 33.31737496121747
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Analyzing animal behavior is crucial in advancing neuroscience, yet quantifying and deciphering its intricate dynamics remains a significant challenge. Traditional machine vision approaches, despite their ability to detect spontaneous behaviors, fall short due to limited interpretability and reliance on manual labeling, which restricts the exploration of the full behavioral spectrum. Here, we introduce MouseGPT, a Vision-Language Model (VLM) that integrates visual cues with natural language to revolutionize mouse behavior analysis. Built upon our first-of-its-kind dataset - incorporating pose dynamics and open-vocabulary behavioral annotations across over 42 million frames of diverse psychiatric conditions - MouseGPT provides a novel, context-rich method for comprehensive behavior interpretation. Our holistic analysis framework enables detailed behavior profiling, clustering, and novel behavior discovery, offering deep insights without the need for labor-intensive manual annotation. Evaluations reveal that MouseGPT surpasses existing models in precision, adaptability, and descriptive richness, positioning it as a transformative tool for ethology and for unraveling complex behavioral dynamics in animal models.
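The listing describes the workflow only in prose; as a rough illustration, the loop of open-vocabulary captioning followed by clustering of the descriptions can be sketched in a few lines of Python. Everything below is hypothetical: `describe_clip` is a stand-in for whatever VLM inference MouseGPT actually performs, and TF-IDF with k-means is just one generic way to group free-text behavior descriptions, not the paper's method.

```python
# Hypothetical MouseGPT-style analysis loop: caption clips with a VLM, then
# cluster the open-vocabulary descriptions to surface behavior motifs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def describe_clip(clip_path: str) -> str:
    """Placeholder for a VLM call that returns a free-text behavior description."""
    raise NotImplementedError("swap in your VLM inference here")

def profile_behaviors(clip_paths, n_motifs=8):
    captions = [describe_clip(p) for p in clip_paths]   # open-vocabulary text
    vectors = TfidfVectorizer(stop_words="english").fit_transform(captions)
    labels = KMeans(n_clusters=n_motifs, n_init=10).fit_predict(vectors)
    motifs = {}                                         # motif id -> clip paths
    for path, label in zip(clip_paths, labels):
        motifs.setdefault(int(label), []).append(path)
    return motifs
```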
Related papers
- AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming [0.0]
We introduce a multimodal vision framework for precision livestock farming.
We harness the power of GroundingDINO, HQSAM, and ViTPose models.
This suite enables comprehensive behavioral analytics from video data without invasive animal tagging.
arXiv Detail & Related papers (2024-06-14T04:42:44Z)
- A Multimodal Automated Interpretability Agent [63.8551718480664]
MAIA is a system that uses neural models to automate neural model understanding tasks.
We first characterize MAIA's ability to describe (neuron-level) features in learned representations of images.
We then show that MAIA can aid in two additional interpretability tasks: reducing sensitivity to spurious features, and automatically identifying inputs likely to be mis-classified.
arXiv Detail & Related papers (2024-04-22T17:55:11Z)
- Computer Vision for Primate Behavior Analysis in the Wild [61.08941894580172]
Video-based behavioral monitoring has great potential for transforming how we study animal cognition and behavior.
There is still a fairly large gap between the exciting prospects and what can actually be achieved in practice today.
arXiv Detail & Related papers (2024-01-29T18:59:56Z)
- Exploring Novel Object Recognition and Spontaneous Location Recognition Machine Learning Analysis Techniques in Alzheimer's Mice [0.0]
This study is centered on the development, application, and evaluation of a state-of-the-art computational pipeline.
The pipeline integrates three advanced computational models: Any-Maze for initial data collection, DeepLabCut for detailed pose estimation, and Convolutional Neural Networks (CNNs) for nuanced behavioral classification.
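As a rough sketch of the classification stage of such a pipeline (not the authors' code), a compact 1D CNN can map fixed-length windows of DeepLabCut keypoint traces to behavior labels. The keypoint count, window length, class count, and architecture below are all illustrative assumptions.

```python
# Toy behavior classifier over pose windows: input is (batch, 2 * n_keypoints,
# window), one (x, y) channel pair per keypoint across consecutive frames.
import torch
import torch.nn as nn

class PoseWindowCNN(nn.Module):
    def __init__(self, n_keypoints=8, n_classes=4):
        super().__init__()
        in_ch = n_keypoints * 2
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

logits = PoseWindowCNN()(torch.randn(2, 16, 64))  # two 64-frame windows
```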
arXiv Detail & Related papers (2023-12-12T00:54:39Z)
- Multi-intention Inverse Q-learning for Interpretable Behavior Representation [12.135423420992334]
Inverse reinforcement learning (IRL) methods have proven instrumental in reconstructing an animal's intentions underlying complex behaviors.
We introduce the class of hierarchical inverse Q-learning (HIQL) algorithms.
Applied to simulated experiments and several real animal behavior datasets, HIQL outperforms current benchmarks in behavior prediction.
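The single-intention core of inverse Q-learning admits a compact sketch: under a Boltzmann policy pi(a|s) proportional to exp(Q(s,a)), observed action frequencies determine Q up to a per-state constant, so log-frequencies already recover an animal's relative action preferences. The toy code below covers only that reduced case; HIQL's hierarchical multi-intention inference is not reproduced here.

```python
# Toy single-intention inverse Q-learning under a Boltzmann policy assumption:
# log pi(a|s) equals Q(s, a) minus a per-state constant, so log action
# frequencies recover relative action preferences.
import numpy as np

def recover_q(trajectories, n_states, n_actions, eps=1e-6):
    counts = np.full((n_states, n_actions), eps)
    for traj in trajectories:              # traj: iterable of (state, action)
        for s, a in traj:
            counts[s, a] += 1.0
    pi = counts / counts.sum(axis=1, keepdims=True)
    return np.log(pi)                      # Q(s, a) up to a per-state offset

q = recover_q([[(0, 1), (0, 1), (1, 0)]], n_states=2, n_actions=2)
predicted = q.argmax(axis=1)               # most likely action per state
```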
arXiv Detail & Related papers (2023-11-23T09:27:08Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
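The concept-bottleneck idea itself is simple to sketch: force the prediction through a small layer of human-readable concept scores so the decision path can be inspected. The embedding size, concept count, and two-layer layout below are illustrative, not the paper's architecture.

```python
# Minimal concept bottleneck: pooled PLM embedding -> concept scores -> label.
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, d_in=768, n_concepts=16, n_labels=2):
        super().__init__()
        self.to_concepts = nn.Linear(d_in, n_concepts)    # interpretable layer
        self.to_labels = nn.Linear(n_concepts, n_labels)  # decision from concepts

    def forward(self, h):
        concepts = torch.sigmoid(self.to_concepts(h))     # in [0, 1], inspectable
        return concepts, self.to_labels(concepts)

concepts, logits = ConceptBottleneck()(torch.randn(4, 768))
```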
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- LISBET: a machine learning model for the automatic segmentation of social behavior motifs [0.0]
We introduce LISBET (LISBET Is a Social BEhavior Transformer), a machine learning model for detecting and segmenting social interactions.
Using self-supervised learning on body tracking data, our model eliminates the need for extensive human annotation.
In vivo electrophysiology revealed distinct neural signatures in the Ventral Tegmental Area corresponding to motifs identified by our model.
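LISBET's implementation is not reproduced here, but masked modeling over keypoint sequences, one standard self-supervised recipe for body-tracking data, can be sketched as follows; all dimensions are placeholders.

```python
# Mask random frames of a keypoint sequence and train a transformer encoder to
# reconstruct them, so no behavior labels are needed.
import torch
import torch.nn as nn

class MaskedPoseEncoder(nn.Module):
    def __init__(self, n_features=16, d_model=64):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_features)

    def forward(self, x, mask):            # x: (B, T, F); mask: (B, T) bool
        x_in = x.masked_fill(mask.unsqueeze(-1), 0.0)
        return self.head(self.encoder(self.proj(x_in)))

x = torch.randn(4, 128, 16)                # 4 sequences, 128 frames, 8 keypoints
mask = torch.rand(4, 128) < 0.15           # hide 15% of frames
loss = (MaskedPoseEncoder()(x, mask) - x)[mask].pow(2).mean()
```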
arXiv Detail & Related papers (2023-11-07T15:35:17Z)
- Predicting the long-term collective behaviour of fish pairs with deep learning [52.83927369492564]
This study introduces a deep learning model to assess social interactions in the fish species Hemigrammus rhodostomus.
We compare the results of our deep learning approach to experiments and to the results of a state-of-the-art analytical model.
We demonstrate that machine learning models of social interactions can directly compete with their analytical counterparts in subtle experimental observables.
arXiv Detail & Related papers (2023-02-14T05:25:03Z)
- SuperAnimal pretrained pose estimation models for behavioral analysis [42.206265576708255]
Quantification of behavior is critical in applications ranging from neuroscience to veterinary medicine and animal conservation.
We present a series of technical innovations that enable a new method, collectively called SuperAnimal, for developing unified foundation models.
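For reference, DeepLabCut exposes SuperAnimal checkpoints behind a single inference call. The snippet follows the DeepLabCut >= 2.3 API; keyword names have shifted between releases, so treat the exact arguments and the video path as assumptions to verify against your installed version's documentation.

```python
# Run a pretrained SuperAnimal model on a video (path is hypothetical).
import deeplabcut

deeplabcut.video_inference_superanimal(
    ["mouse_openfield.mp4"],
    superanimal_name="superanimal_topviewmouse",
)
```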
arXiv Detail & Related papers (2022-03-14T18:46:57Z)
- Overcoming the Domain Gap in Contrastive Learning of Neural Action Representations [60.47807856873544]
A fundamental goal in neuroscience is to understand the relationship between neural activity and behavior.
We generated a new multimodal dataset consisting of the spontaneous behaviors of fruit flies.
This dataset and our new set of augmentations promise to accelerate the application of self-supervised learning methods in neuroscience.
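The contrastive objective in this line of work is typically an InfoNCE loss that pairs behavior and neural embeddings from the same time window; the generic form (not necessarily the paper's exact loss) looks like this.

```python
# Generic InfoNCE: matched (behavior, neural) pairs sit on the diagonal of the
# similarity matrix and are pulled together; all other pairings are pushed apart.
import torch
import torch.nn.functional as F

def info_nce(z_behavior, z_neural, temperature=0.1):
    za = F.normalize(z_behavior, dim=1)    # (B, D)
    zb = F.normalize(z_neural, dim=1)      # (B, D)
    logits = za @ zb.t() / temperature     # (B, B) cosine similarities
    targets = torch.arange(za.size(0))     # matched pairs on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))
```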
arXiv Detail & Related papers (2021-11-29T15:27:51Z)
- Unsupervised Behaviour Analysis and Magnification (uBAM) using Deep Learning [5.101123537955207]
Motor behaviour analysis provides a non-invasive strategy for identifying motor impairment and tracking its change in response to interventions.
We introduce unsupervised behaviour analysis and magnification (uBAM), an automatic deep learning algorithm for analysing behaviour by discovering and magnifying deviations.
A central aspect is unsupervised learning of posture and behaviour representations to enable an objective comparison of movement.
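The magnification step can be pictured as extrapolation in that learned embedding space: push an impaired sample's embedding further from a healthy reference and decode the result. A toy sketch, assuming embeddings already produced by the unsupervised encoder:

```python
# Exaggerate an impaired embedding's deviation from a healthy reference;
# decoding z_mag with the learned generator would render the magnified posture.
import numpy as np

def magnify(z_reference, z_impaired, lam=2.0):
    return z_reference + lam * (z_impaired - z_reference)  # lam > 1 exaggerates

z_mag = magnify(np.zeros(8), np.full(8, 0.1), lam=3.0)     # -> 0.3 per dimension
```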
arXiv Detail & Related papers (2020-12-16T20:07:36Z)
- Muti-view Mouse Social Behaviour Recognition with Deep Graphical Model [124.26611454540813]
Social behaviour analysis of mice is an invaluable tool for assessing the efficacy of therapies for neurodegenerative diseases.
Because multi-view video recordings can capture rich descriptions of mouse social behaviours, their use in rodent observation is receiving increasing attention.
We propose a novel multi-view latent-attention and dynamic discriminative model that jointly learns view-specific and view-shared sub-structures.
arXiv Detail & Related papers (2020-11-04T18:09:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.