AMIL: Adversarial Multi Instance Learning for Human Pose Estimation
- URL: http://arxiv.org/abs/2003.08002v1
- Date: Wed, 18 Mar 2020 01:22:16 GMT
- Title: AMIL: Adversarial Multi Instance Learning for Human Pose Estimation
- Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang
- Abstract summary: We present a structure-aware network to discreetly consider priors during the training of the network.
We propose generative adversarial networks as our learning model in which we design two residual multiple instance learning (MIL) models.
The proposed adversarial residual multi-instance neural network that is based on pooling has been validated on two datasets.
- Score: 24.175298058941515
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Human pose estimation has an important impact on a wide range of applications
from human-computer interface to surveillance and content-based video
retrieval. In human pose estimation, occluded joints and overlapping human
bodies lead to inaccurate pose estimates. To address these problems, by
integrating priors of the structure of human bodies, we present a novel
structure-aware network to discreetly consider such priors during the training
of the network. Typically, learning such constraints is a challenging task.
Instead, we propose generative adversarial networks as our learning model in
which we design two residual multiple instance learning (MIL) models with
identical architecture: one serves as the generator and the other as the
discriminator. The discriminator's task is to distinguish the actual poses
from the fake ones. If the pose generator generates the results that the
discriminator is not able to distinguish from the real ones, the model has
successfully learnt the priors. In the proposed model, the discriminator
differentiates the ground-truth heatmaps from the generated ones, and later the
adversarial loss is back-propagated to the generator. This procedure helps the
generator learn reasonable body configurations and proves advantageous for
improving pose estimation accuracy. Meanwhile, we propose a
novel function for MIL. It is an adjustable structure for both instance
selection and modeling to appropriately pass the information between instances
in a single bag. In the proposed residual MIL neural network, the pooling
action adequately updates the instance contribution to its bag. The proposed
adversarial residual multi-instance neural network that is based on pooling has
been validated on two datasets for the human pose estimation task and
outperforms other state-of-the-art models.
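The adversarial setup and the pooling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss weights or the exact pooling function, so the function names, the `adv_weight` value, and the use of log-sum-exp pooling as a stand-in for the paper's MIL pooling are all assumptions made for illustration. Heatmaps are represented as flat lists of floats, and discriminator outputs as single probabilities.

```python
import math


def bce(p, target, eps=1e-7):
    """Binary cross-entropy for one probability p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """The discriminator should score ground-truth heatmaps as 1 and
    generated heatmaps as 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(pred_heatmap, true_heatmap, d_fake, adv_weight=0.01):
    """Pixel-wise MSE plus an adversarial term that is small when the
    discriminator mistakes the generated heatmap for a real one.
    adv_weight is a hypothetical value, not taken from the paper."""
    n = len(pred_heatmap)
    mse = sum((p - t) ** 2 for p, t in zip(pred_heatmap, true_heatmap)) / n
    return mse + adv_weight * bce(d_fake, 1.0)


def lse_pool(instance_scores, r=5.0):
    """Log-sum-exp MIL pooling (an assumed stand-in for the paper's pooling):
    aggregates instance scores into one bag score. Small r behaves like the
    mean, large r like the max."""
    m = max(instance_scores)  # subtract the max for numerical stability
    n = len(instance_scores)
    total = sum(math.exp(r * (s - m)) for s in instance_scores)
    return m + math.log(total / n) / r
```

In this sketch the generator is rewarded both for matching the ground-truth heatmap and for producing output the discriminator accepts as real, which is the mechanism the abstract credits for learning plausible body configurations. The pooling function shows one common way an instance's contribution to its bag can be weighted smoothly rather than by a hard max.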
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
Transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has assumed an important place for transfer learning tasks.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
arXiv Detail & Related papers (2023-09-05T17:57:31Z)
- Population-Based Evolutionary Gaming for Unsupervised Person Re-identification [26.279581599246224]
Unsupervised person re-identification has achieved great success through the self-improvement of individual neural networks.
We develop a population-based evolutionary gaming (PEG) framework in which a population of diverse neural networks is trained concurrently through selection, reproduction, mutation, and population mutual learning.
PEG produces new state-of-the-art accuracy for person re-identification, indicating the great potential of population-based network cooperative training for unsupervised learning.
arXiv Detail & Related papers (2023-06-08T14:33:41Z)
- Diversity vs. Recognizability: Human-like generalization in one-shot generative models [5.964436882344729]
We propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity.
We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space.
In contrast, disentanglement transports the model along a parabolic curve that could be used to maximize recognizability.
arXiv Detail & Related papers (2022-05-20T13:17:08Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)