AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming
- URL: http://arxiv.org/abs/2406.09711v1
- Date: Fri, 14 Jun 2024 04:42:44 GMT
- Title: AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming
- Authors: Ahmed Qazi, Taha Razzaq, Asim Iqbal
- Abstract summary: We introduce a multimodal vision framework for precision livestock farming.
We harness the power of GroundingDINO, HQSAM, and ViTPose models.
This suite enables comprehensive behavioral analytics from video data without invasive animal tagging.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce a multimodal vision framework for precision livestock farming, harnessing the power of GroundingDINO, HQSAM, and ViTPose models. This integrated suite enables comprehensive behavioral analytics from video data without invasive animal tagging. GroundingDINO generates accurate bounding boxes around livestock, while HQSAM segments individual animals within these boxes. ViTPose estimates key body points, facilitating posture and movement analysis. Demonstrated on a sheep dataset with grazing, running, sitting, standing, and walking activities, our framework extracts invaluable insights: activity and grazing patterns, interaction dynamics, and detailed postural evaluations. Applicable across species and video resolutions, this framework revolutionizes non-invasive livestock monitoring for activity detection, counting, health assessments, and posture analyses. It empowers data-driven farm management, optimizing animal welfare and productivity through AI-powered behavioral understanding.
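The abstract describes a three-stage pipeline: GroundingDINO proposes bounding boxes, HQSAM segments each animal inside its box, and ViTPose estimates body keypoints per box. The sketch below is a minimal, runnable illustration of that data flow only; the three model calls are hypothetical stubs (the paper's actual models require their own weights and APIs), and the dummy frame, box layout, and keypoint placeholder are assumptions for demonstration.

```python
"""Sketch of the detection -> segmentation -> pose data flow from the abstract.

The model calls are hypothetical stubs standing in for GroundingDINO, HQSAM,
and ViTPose, so the pipeline structure is runnable without the real weights.
"""
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)

@dataclass
class AnimalInstance:
    box: Box
    mask_area: int                    # pixels in the segmentation mask
    keypoints: List[Tuple[int, int]]  # estimated body points

def detect_animals(frame_hw: Tuple[int, int]) -> List[Box]:
    # Stub for GroundingDINO: a text-prompted detector ("sheep") -> boxes.
    h, w = frame_hw
    return [(0, 0, w // 2, h), (w // 2, 0, w, h)]

def segment_animal(box: Box) -> int:
    # Stub for HQSAM: box-prompted segmentation of the individual animal.
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1) // 2  # pretend half the box is animal

def estimate_pose(box: Box) -> List[Tuple[int, int]]:
    # Stub for ViTPose: here just the box centre as a placeholder keypoint.
    x1, y1, x2, y2 = box
    return [((x1 + x2) // 2, (y1 + y2) // 2)]

def analyze_frame(frame_hw: Tuple[int, int]) -> List[AnimalInstance]:
    """Chain the three stages per detected animal, as the abstract outlines."""
    return [
        AnimalInstance(box, segment_animal(box), estimate_pose(box))
        for box in detect_animals(frame_hw)
    ]

if __name__ == "__main__":
    instances = analyze_frame((480, 640))  # (height, width) dummy frame
    print(len(instances))  # per-frame animal count, no tags required
```

Because each downstream stage is prompted by the detector's boxes, per-animal statistics (counts, mask areas, keypoint tracks) fall out of the loop directly, which is what enables tag-free behavioral analytics.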
Related papers
- PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions [57.871692507044344]
Pose estimation aims to accurately identify anatomical keypoints in humans and animals using monocular images.
Current models are typically trained and tested on clean data, potentially overlooking corruptions encountered during real-world deployment.
We introduce PoseBench, a benchmark designed to evaluate the robustness of pose estimation models against real-world corruption.
arXiv Detail & Related papers (2024-06-20T14:40:17Z)
- Public Computer Vision Datasets for Precision Livestock Farming: A Systematic Survey [3.3651853492305177]
This study presents the first systematic survey of publicly available livestock CV datasets.
Among 58 public datasets identified and analyzed, almost half of them are for cattle, followed by swine, poultry, and other animals.
Individual animal detection and color imaging are, respectively, the dominant application and imaging modality for livestock.
arXiv Detail & Related papers (2024-06-15T13:22:41Z)
- Computer Vision for Primate Behavior Analysis in the Wild [61.08941894580172]
Video-based behavioral monitoring has great potential for transforming how we study animal cognition and behavior.
There is still a fairly large gap between the exciting prospects and what can actually be achieved in practice today.
arXiv Detail & Related papers (2024-01-29T18:59:56Z)
- APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond [27.50166679588048]
APTv2 is the pioneering large-scale benchmark for animal pose estimation and tracking.
It comprises 2,749 video clips filtered and collected from 30 distinct animal species.
We provide high-quality keypoint and tracking annotations for a total of 84,611 animal instances.
arXiv Detail & Related papers (2023-12-25T04:49:49Z)
- CattleEyeView: A Multi-task Top-down View Cattle Dataset for Smarter Precision Livestock Farming [6.291219495092237]
We introduce CattleEyeView dataset, the first top-down view multi-task cattle video dataset.
The dataset contains 753 distinct top-down cow instances in 30,703 frames.
We perform benchmark experiments to evaluate the model's performance for each task.
arXiv Detail & Related papers (2023-12-14T09:18:02Z)
- Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images [57.96659470133514]
Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe.
Supervised learning techniques have been successfully deployed to analyze such imagery; however, training them requires annotations from experts.
Reducing the reliance on costly labelled data has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor.
arXiv Detail & Related papers (2023-11-02T08:32:00Z)
- MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior [28.878568752724235]
We introduce MABe22, a benchmark to assess the quality of learned behavior representations.
This dataset is collected from a variety of biology experiments.
We test self-supervised video and trajectory representation learning methods to demonstrate the use of our benchmark.
arXiv Detail & Related papers (2022-07-21T15:51:30Z)
- APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking [77.87449881852062]
APT-36K is the first large-scale benchmark for animal pose estimation and tracking.
It consists of 2,400 video clips collected and filtered from 30 animal species with 15 frames for each video, resulting in 36,000 frames in total.
We benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.
arXiv Detail & Related papers (2022-06-12T07:18:36Z)
- Persistent Animal Identification Leveraging Non-Visual Markers [71.14999745312626]
We aim to locate and provide a unique identifier for each mouse in a cluttered home-cage environment through time.
This is a very challenging problem due to (i) the lack of distinguishing visual features for each mouse, and (ii) the close confines of the scene with constant occlusion.
Our approach achieves 77% accuracy on this animal identification problem, and is able to reject spurious detections when the animals are hidden.
arXiv Detail & Related papers (2021-12-13T17:11:32Z)
- Livestock Monitoring with Transformer [4.298326853567677]
We develop an end-to-end behaviour monitoring system for group-housed pigs to perform simultaneous instance level segmentation, tracking, action recognition and re-identification tasks.
We present starformer, the first end-to-end multiple-object livestock monitoring framework that learns instance-level embeddings for grouped pigs through the use of transformer architecture.
arXiv Detail & Related papers (2021-11-01T10:03:49Z)
- Multi-view Mouse Social Behaviour Recognition with Deep Graphical Model [124.26611454540813]
Social behaviour analysis of mice is an invaluable tool to assess therapeutic efficacy of neurodegenerative diseases.
Because of their potential to yield rich descriptions of mouse social behaviors, multi-view video recordings for rodent observation are receiving increasing attention.
We propose a novel multiview latent-attention and dynamic discriminative model that jointly learns view-specific and view-shared sub-structures.
arXiv Detail & Related papers (2020-11-04T18:09:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.