Related papers: Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space

URL: http://arxiv.org/abs/2512.16133v1
Date: Thu, 18 Dec 2025 03:42:54 GMT
Title: Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space
Authors: Ren Nakagawa, Yang Yang, Risa Shinoda, Hiroaki Santo, Kenji Oyama, Fumio Okura, Takenao Ohkawa
Abstract summary: This paper introduces a method and application for automatically detecting behavioral interactions between grazing cattle from a single image. We propose CattleAct, a data-efficient method for interaction detection by decomposing interactions into the combinations of actions by individual cattle. On top of the proposed method, we develop a practical working system integrating video and GPS inputs.
Score: 18.635930702079563
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper introduces a method and application for automatically detecting behavioral interactions between grazing cattle from a single image, which is essential for smart livestock management in the cattle industry, such as for detecting estrus. Although interaction detection for humans has been actively studied, a non-trivial challenge lies in cattle interaction detection, specifically the lack of a comprehensive behavioral dataset that includes interactions, as the interactions of grazing cattle are rare events. We, therefore, propose CattleAct, a data-efficient method for interaction detection by decomposing interactions into the combinations of actions by individual cattle. Specifically, we first learn an action latent space from a large-scale cattle action dataset. Then, we embed rare interactions via the fine-tuning of the pre-trained latent space using contrastive learning, thereby constructing a unified latent space of actions and interactions. On top of the proposed method, we develop a practical working system integrating video and GPS inputs. Experiments on a commercial-scale pasture demonstrate the accurate interaction detection achieved by our method compared to the baselines. Our implementation is available at https://github.com/rakawanegan/CattleAct.
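
The abstract says rare interactions are embedded by fine-tuning a pre-trained action latent space with contrastive learning, but it does not state which loss is used. As a rough illustration only, the sketch below computes an InfoNCE-style contrastive loss on toy 2-D vectors. The choice of InfoNCE, the temperature value, the function names, and the toy embeddings are all assumptions made for this example and are not taken from the CattleAct implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax score of the positive candidate.

    Assumption: a generic contrastive objective, not the paper's own loss.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Hypothetical toy embeddings standing in for action/interaction features.
anchor = [1.0, 0.0]
close_positive = [0.9, 0.1]   # nearly aligned with the anchor
far_positive = [0.0, 1.0]     # orthogonal to the anchor
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_close = info_nce(anchor, close_positive, negatives)
loss_far = info_nce(anchor, far_positive, negatives)
print(loss_close < loss_far)
```

The loss is smaller when the positive sample lies close to the anchor, which is the property a fine-tuning step of this kind would rely on to pull matching samples together and push the rest apart in the shared latent space.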

Related papers

PEAR: Phrase-Based Hand-Object Interaction Anticipation [20.53329698350243]
First-person hand-object interaction anticipation aims to predict the interaction process based on current scenes and prompts. Existing research typically anticipates only interaction intention while neglecting manipulation. We propose a novel model, PEAR, which jointly anticipates interaction intention and manipulation.
arXiv Detail & Related papers (2024-07-31T10:28:49Z)
Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding. Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction. Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly. Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions. Our proposed method achieves competitive performance on both the V-COCO and the HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
Real-Time Cattle Interaction Recognition via Triple-stream Network [3.3843451892622576]
A cattle localization network outputs high-quality interaction proposals from every detected cattle and feeds them into an interaction recognition network with a triple-stream architecture.
arXiv Detail & Related papers (2022-09-06T06:31:09Z)
Interactiveness Field in Human-Object Interactions [89.13149887013905]
We introduce a previously overlooked interactiveness bimodal prior: given an object in an image, after pairing it with the humans, the generated pairs are either mostly non-interactive, or mostly interactive. We propose new energy constraints based on the cardinality and difference in the inherent "interactiveness field" underlying interactive versus non-interactive pairs. Our method can detect more precise pairs and thus significantly boost HOI detection performance.
arXiv Detail & Related papers (2022-04-16T05:09:25Z)
Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data. Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
Transferable Interactiveness Knowledge for Human-Object Interaction Detection [46.89715038756862]
We explore interactiveness knowledge which indicates whether a human and an object interact with each other or not. We found that interactiveness knowledge can be learned across HOI datasets and bridge the gap between diverse HOI category settings. Our core idea is to exploit an interactiveness network to learn the general interactiveness knowledge from multiple HOI datasets.
arXiv Detail & Related papers (2021-01-25T18:21:07Z)
DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection [53.40028068801092]
We propose a novel one-stage HOI detection approach based on a new concept called interaction region for the HOI problem. Unlike previous methods, our approach concentrates on the densely sampled interaction regions across different scales for each human-object pair. In order to compensate for the detection flaws of a single interaction region, we introduce a novel voting strategy.
arXiv Detail & Related papers (2020-10-02T13:57:58Z)
Learning End-to-End Action Interaction by Paired-Embedding Data Augmentation [10.857323240766428]
A new Interactive Action Translation (IAT) task aims to learn end-to-end action interaction from unlabeled interactive pairs. We propose a Paired-Embedding (PE) method for effective and reliable data augmentation. Experimental results on two datasets show impressive effects and broad application prospects of our method.
arXiv Detail & Related papers (2020-07-16T01:54:16Z)
Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the interaction. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.