Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network
- URL: http://arxiv.org/abs/2404.13983v1
- Date: Mon, 22 Apr 2024 08:44:10 GMT
- Title: Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network
- Authors: Qiwen Deng, Yangcen Liu, Wen Li, Guoqing Wang,
- Abstract summary: We propose a novel Adaptive Affinity-Graph Network (AAGN), which extracts the global affinity between different body parts.
For high-frequency details, a Body Shape Discriminator (BSD) is designed to extract information from both the high-frequency and spatial domains.
Our framework significantly enhances the aesthetic appeal of photos, surpassing all previous work to achieve state-of-the-art in all evaluation metrics.
- Score: 14.361677329761672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a source portrait, the automatic human body reshaping task aims at editing it to an aesthetic body shape. As the technology has been widely used in media, several methods have been proposed, mainly focusing on generating optical flow to warp the body shape. However, those previous works only consider the local transformation of different body parts (arms, torso, and legs), ignoring the global affinity and limiting the capacity to ensure consistency and quality across the entire body. In this paper, we propose a novel Adaptive Affinity-Graph Network (AAGN), which extracts the global affinity between different body parts to enhance the quality of the generated optical flow. Specifically, our AAGN primarily introduces the following designs: (1) We propose an Adaptive Affinity-Graph (AAG) Block that leverages the characteristics of a fully connected graph. AAG represents different body parts as nodes in an adaptive fully connected graph and captures all the affinities between nodes to obtain a global affinity map. This design can improve the consistency between body parts. (2) Since high-frequency details are crucial for photo aesthetics, a Body Shape Discriminator (BSD) is designed to extract information from both the high-frequency and spatial domains. In particular, an SRM filter is utilized to extract high-frequency details, which are combined with spatial features as input to the BSD. With this design, BSD guides the Flow Generator (FG) to pay attention to various fine details rather than rigid pixel-level fitting. Extensive experiments conducted on the BR-5K dataset demonstrate that our framework significantly enhances the aesthetic appeal of reshaped photos, surpassing all previous work to achieve state-of-the-art in all evaluation metrics.
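The abstract describes body parts as nodes of a fully connected graph whose pairwise affinities form a global affinity map. A minimal sketch of that general idea follows. It is not the paper's AAG Block: the scaled dot-product similarity, the row-wise softmax, the weighted-sum aggregation, and the toy 2-D features are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math

def affinity_map(nodes):
    """Row-softmax of scaled dot products between every pair of node features."""
    scale = math.sqrt(len(nodes[0]))
    rows = []
    for query in nodes:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in nodes]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def aggregate(nodes, affinity):
    """Replace each node feature with the affinity-weighted sum over all nodes."""
    dim = len(nodes[0])
    return [
        [sum(w * node[d] for w, node in zip(row, nodes)) for d in range(dim)]
        for row in affinity
    ]

# Hypothetical 2-D features for the three body parts named in the abstract.
parts = {"arms": [1.0, 0.0], "torso": [0.8, 0.6], "legs": [0.0, 1.0]}
features = list(parts.values())
A = affinity_map(features)          # 3x3 map, one row per body part
updated = aggregate(features, A)    # each part now mixes in the other parts
```

Because every node attends to every other node, each updated feature depends on the whole body rather than on one part alone, which is the property the abstract credits for better consistency between body parts.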
Related papers
- GASA-UNet: Global Axial Self-Attention U-Net for 3D Medical Image Segmentation [8.939740171704388]
We introduce a refined U-Net-like model featuring a novel Global Axial Self-Attention (GASA) block.
This block processes image data as a 3D entity, with each 2D plane representing a different anatomical cross-section.
Our model has demonstrated promising improvements in segmentation performance, particularly for smaller anatomical structures.
arXiv Detail & Related papers (2024-09-20T01:23:53Z)
- BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation [0.0]
This paper proposes an innovative U-shaped network called BEFUnet, which enhances the fusion of body and edge information for precise medical image segmentation.
BEFUnet comprises three main modules: a novel Local Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF) module, and a dual-branch encoder.
The LCAF module efficiently fuses edge and body features by selectively performing local cross-attention on features that are spatially close between the two modalities.
arXiv Detail & Related papers (2024-02-13T21:03:36Z)
- Guided Image Restoration via Simultaneous Feature and Image Guided Fusion [67.30078778732998]
We propose a Simultaneous Feature and Image Guided Fusion (SFIGF) network.
It considers feature and image-level guided fusion following the guided filter (GF) mechanism.
Since guided fusion is implemented in both feature and image domains, the proposed SFIGF is expected to faithfully reconstruct both contextual and textural information.
arXiv Detail & Related papers (2023-12-14T12:15:45Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z)
- SIAN: Style-Guided Instance-Adaptive Normalization for Multi-Organ Histopathology Image Synthesis [63.845552349914186]
We propose a style-guided instance-adaptive normalization (SIAN) to synthesize realistic color distributions and textures for different organs.
The four phases work together and are integrated into a generative network to embed image semantics, style, and instance-level boundaries.
arXiv Detail & Related papers (2022-09-02T16:45:46Z)
- Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction [120.08257447708503]
Graph convolutional network based methods that model the relations between body joints have recently shown great promise in 3D skeleton-based human motion prediction.
We propose a novel skeleton-parted graph scattering network (SPGSN).
SPGSN outperforms state-of-the-art methods by remarkable margins of 13.8%, 9.3% and 2.7% in terms of 3D mean per joint position error (MPJPE) on Human3.6M, CMU Mocap and 3DPW datasets, respectively.
arXiv Detail & Related papers (2022-07-31T05:51:39Z)
- Structure-Aware Flow Generation for Human Body Reshaping [15.365236395118982]
We develop an end-to-end flow generation architecture to achieve unprecedentedly controllable performance under arbitrary poses and garments.
For a comprehensive evaluation, we construct the first large-scale body reshaping dataset, namely BR-5K.
Our approach significantly outperforms existing state-of-the-art methods in terms of visual performance, controllability, and efficiency.
arXiv Detail & Related papers (2022-03-09T12:22:38Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Feedback Graph Attention Convolutional Network for Medical Image Enhancement [32.95483574100177]
We propose a novel biomedical image enhancement network, named Feedback Graph Attention Convolutional Network (FB-GACN).
As a key innovation, we consider the global structure of an image by building a graph network from image sub-regions.
Experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T16:46:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.