UniAP: Towards Universal Animal Perception in Vision via Few-shot
Learning
- URL: http://arxiv.org/abs/2308.09953v1
- Date: Sat, 19 Aug 2023 09:13:46 GMT
- Title: UniAP: Towards Universal Animal Perception in Vision via Few-shot
Learning
- Authors: Meiqi Sun, Zhonghan Zhao, Wenhao Chai, Hanjun Luo, Shidong Cao,
Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang
- Abstract summary: We introduce UniAP, a novel Universal Animal Perception model that enables cross-species perception among various visual tasks.
By capitalizing on the shared visual characteristics among different animals and tasks, UniAP enables the transfer of knowledge from well-studied species to those with limited labeled data or even unseen species.
- Score: 24.157933537030086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Animal visual perception is an important technique for automatically
monitoring animal health, understanding animal behaviors, and assisting
animal-related research. However, it is challenging to design a deep
learning-based perception model that can freely adapt to different animals
across various perception tasks, due to the varying poses of a large diversity
of animals, the lack of data for rare species, and the semantic inconsistency of
different tasks. We introduce UniAP, a novel Universal Animal Perception model
that leverages few-shot learning to enable cross-species perception among
various visual tasks. Our proposed model takes support images and labels as
prompt guidance for a query image. Images and labels are processed through a
Transformer-based encoder and a lightweight label encoder, respectively. A
matching module then aggregates information between the prompt guidance and the
query image, followed by a multi-head label decoder that generates outputs
for various tasks. By capitalizing on the shared visual characteristics among
different animals and tasks, UniAP enables the transfer of knowledge from
well-studied species to those with limited labeled data or even unseen species.
We demonstrate the effectiveness of UniAP through comprehensive experiments in
pose estimation, segmentation, and classification tasks on diverse animal
species, showcasing its ability to generalize and adapt to new classes with
minimal labeled examples.
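The abstract describes the architecture only at a high level. The following PyTorch snippet is a minimal, illustrative sketch of one way such a pipeline could be wired together, not the authors' implementation: the patch size, embedding dimension, the fusion of support image and label tokens, the use of multi-head cross-attention as the matching module, and all names such as UniAPSketch and encode_image are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): one reading of the UniAP pipeline from
# the abstract -- a Transformer-based image encoder, a lightweight label encoder,
# a cross-attention matching module between prompt (support) tokens and query
# tokens, and per-task output heads. All sizes and layouts are assumptions.
import torch
import torch.nn as nn


class UniAPSketch(nn.Module):
    def __init__(self, dim=256, num_layers=4, num_heads=8, num_classes=10, num_keypoints=17):
        super().__init__()
        # Patchify 224x224 images into 14x14 = 196 tokens of size `dim`.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.image_encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Lightweight label encoder: dense labels (e.g. keypoint heatmaps or masks
        # rendered as a single channel) are embedded with a small conv layer.
        self.label_encoder = nn.Conv2d(1, dim, kernel_size=16, stride=16)
        # Matching module: query tokens attend to support image/label tokens.
        self.matching = nn.MultiheadAttention(embed_dim=dim, num_heads=num_heads, batch_first=True)
        # Multi-head label decoder: one head per task.
        self.pose_head = nn.Linear(dim, num_keypoints)  # per-token keypoint logits
        self.seg_head = nn.Linear(dim, 2)                # per-token fg/bg logits
        self.cls_head = nn.Linear(dim, num_classes)      # image-level class logits

    def encode_image(self, img):
        tokens = self.patch_embed(img).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.image_encoder(tokens)

    def forward(self, query_img, support_img, support_label):
        q = self.encode_image(query_img)                               # (B, N, dim)
        s_img = self.encode_image(support_img)                         # (B, N, dim)
        s_lab = self.label_encoder(support_label).flatten(2).transpose(1, 2)
        prompt = torch.cat([s_img + s_lab, s_img], dim=1)              # fuse support image/label tokens
        matched, _ = self.matching(query=q, key=prompt, value=prompt)  # aggregate prompt info into query
        return {
            "pose": self.pose_head(matched),                        # (B, N, num_keypoints)
            "segmentation": self.seg_head(matched),                 # (B, N, 2)
            "classification": self.cls_head(matched.mean(dim=1)),   # (B, num_classes)
        }


if __name__ == "__main__":
    model = UniAPSketch()
    q = torch.randn(2, 3, 224, 224)    # query images
    s = torch.randn(2, 3, 224, 224)    # support images
    lab = torch.randn(2, 1, 224, 224)  # support labels rendered as dense maps
    out = model(q, s, lab)
    print({k: v.shape for k, v in out.items()})
```

In this reading, the support label acts as a visual prompt: its tokens are added to the support image tokens so that the query can retrieve task-specific guidance through cross-attention, and each task head decodes the same matched representation.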
Related papers
- An Individual Identity-Driven Framework for Animal Re-Identification [15.381573249551181]
IndivAID is a framework specifically designed for Animal ReID.
It generates image-specific and individual-specific textual descriptions that fully capture the diverse visual concepts of each individual across animal images.
Evaluation against state-of-the-art methods across eight benchmark datasets and a real-world Stoat dataset demonstrates IndivAID's effectiveness and applicability.
arXiv Detail & Related papers (2024-10-30T11:34:55Z)
- GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding [2.79453284883108]
This study evaluates the visual perception capabilities of multimodal large language models in animal activity recognition.
We found that while current multimodal LLMs still require improvement in semantic correspondence and time perception, they already demonstrate initial visual perception capabilities for animal activity recognition.
arXiv Detail & Related papers (2024-06-14T07:30:26Z)
- Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images [57.96659470133514]
Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe.
Supervised learning techniques have been successfully deployed to analyze such imagery; however, training them requires annotations from experts.
Reducing the reliance on costly labelled data has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor.
arXiv Detail & Related papers (2023-11-02T08:32:00Z)
- Multi-Modal Few-Shot Temporal Action Detection [157.96194484236483]
Few-shot (FS) and zero-shot (ZS) learning are two different approaches for scaling temporal action detection to new classes.
We introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD.
arXiv Detail & Related papers (2022-11-27T18:13:05Z)
- CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose [70.59906971581192]
We introduce a novel prompt-based Contrastive learning scheme for connecting Language and AniMal Pose effectively.
CLAMP attempts to bridge the gap by adapting the text prompts to the animal keypoints during network training (see the sketch after this entry).
Experimental results show that our method achieves state-of-the-art performance under the supervised, few-shot, and zero-shot settings.
arXiv Detail & Related papers (2022-06-23T14:51:42Z)
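As a rough illustration of the prompt-based contrastive idea summarized above, the sketch below aligns pooled keypoint features with text-prompt embeddings through a symmetric InfoNCE-style loss; it is not CLAMP's actual implementation, and the function name, feature shapes, and temperature are assumptions.

```python
# Minimal, illustrative sketch of a CLIP-style prompt-keypoint contrastive
# objective, loosely following the idea described for CLAMP; not the paper's
# implementation. Encoders are omitted; the loss form is an assumption.
import torch
import torch.nn.functional as F


def prompt_keypoint_contrastive_loss(keypoint_feats, prompt_feats, temperature=0.07):
    """keypoint_feats: (K, D) visual features pooled at K keypoint locations.
    prompt_feats:   (K, D) text embeddings of K keypoint prompt sentences,
                    e.g. "a photo of the nose of an animal".
    Returns a symmetric InfoNCE loss that pulls matching pairs together."""
    k = F.normalize(keypoint_feats, dim=-1)
    t = F.normalize(prompt_feats, dim=-1)
    logits = k @ t.T / temperature                       # (K, K) similarity matrix
    targets = torch.arange(k.size(0), device=k.device)   # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))


if __name__ == "__main__":
    kp = torch.randn(17, 512)   # e.g. 17 keypoints, 512-d visual features
    txt = torch.randn(17, 512)  # matching text-prompt embeddings
    print(prompt_keypoint_contrastive_loss(kp, txt).item())
```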
- Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding [4.606145900630665]
We create a large and diverse dataset, Animal Kingdom, that provides multiple annotated tasks.
Our dataset contains 50 hours of annotated videos to localize relevant animal behavior segments.
We propose a Collaborative Action Recognition (CARe) model that learns general and specific features for action recognition with unseen new animals.
arXiv Detail & Related papers (2022-04-18T02:05:15Z)
- Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types [50.1843146606122]
A simple form of transfer learning is common in current state-of-the-art computer vision models.
Previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood.
In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains.
arXiv Detail & Related papers (2021-03-24T16:24:20Z)
- Perspectives on individual animal identification from biology and computer vision [58.81800919492064]
We review current advances of computer vision identification techniques to provide both computer scientists and biologists with an overview of the available tools.
We conclude by offering recommendations for starting an animal identification project, illustrate current limitations and propose how they might be addressed in the future.
arXiv Detail & Related papers (2021-02-28T16:50:09Z)
- Transferring Dense Pose to Proximal Animal Classes [83.84439508978126]
We show that it is possible to transfer the knowledge existing in dense pose recognition for humans, as well as in more general object detectors and segmenters, to the problem of dense pose recognition in other classes.
We do this by establishing a DensePose model for the new animal which is also geometrically aligned to humans.
We also introduce two benchmark datasets labelled in the manner of DensePose for the class chimpanzee and use them to evaluate our approach.
arXiv Detail & Related papers (2020-02-28T21:43:53Z)