Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
- URL: http://arxiv.org/abs/2203.14804v1
- Date: Mon, 28 Mar 2022 14:44:45 GMT
- Title: Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
- Authors: Pinaki Nath Chowdhury and Ayan Kumar Bhunia and Viswanatha Reddy
Gajjala and Aneeshan Sain and Tao Xiang and Yi-Zhe Song
- Abstract summary: A significant portion of scene sketches are "partial".
We propose a set-based approach to model cross-modal region associativity in a partially-aware fashion.
Our proposed method is not only robust to partial scene-sketches but also yields state-of-the-art performance on existing datasets.
- Score: 106.59164595640704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We scrutinise an important observation plaguing scene-level sketch research
-- that a significant portion of scene sketches are "partial". A quick pilot
study reveals: (i) a scene sketch does not necessarily contain all objects in
the corresponding photo, due to the subjective holistic interpretation of
scenes, (ii) there exist significant empty (white) regions as a result of
object-level abstraction and, consequently, (iii) existing scene-level
fine-grained sketch-based image retrieval methods collapse as scene sketches
become more partial. To solve this "partial" problem, we advocate for a simple
set-based approach using optimal transport (OT) to model cross-modal region
associativity in a partially-aware fashion. Importantly, we improve upon OT to
further account for holistic partialness by comparing intra-modal adjacency
matrices. Our proposed method is not only robust to partial scene-sketches but
also yields state-of-the-art performance on existing datasets.
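The abstract describes matching sketch regions to photo regions with optimal transport (OT), refined by comparing intra-modal adjacency matrices to account for holistic partialness. The snippet below is only an illustrative sketch of that idea, not the paper's implementation: it uses entropic (Sinkhorn) OT over random stand-in region embeddings, plus a Gromov-Wasserstein-style term that penalises transport plans distorting the relative layout of regions. All features, weights, and function names here are hypothetical.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    # Entropic-regularised OT: find plan P >= 0 with row marginals a and
    # column marginals b minimising <P, cost> - eps * H(P).
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
sketch_feats = rng.normal(size=(3, 8))  # 3 sketched regions (partial scene)
photo_feats = rng.normal(size=(5, 8))   # 5 photo regions
# Cross-modal cost: pairwise distance between region embeddings.
cost = np.linalg.norm(sketch_feats[:, None] - photo_feats[None, :], axis=-1)

a = np.full(3, 1 / 3)  # uniform mass over sketch regions
b = np.full(5, 1 / 5)  # uniform mass over photo regions
P = sinkhorn(cost, a, b)

# Intra-modal adjacency (self-similarity) matrices; penalising their
# discrepancy under the plan P discourages matches that distort the
# relative layout of regions (a Gromov-Wasserstein-style term).
A_s = sketch_feats @ sketch_feats.T
A_p = photo_feats @ photo_feats.T
gw_term = np.sum(
    (A_s[:, :, None, None] - A_p[None, None, :, :]) ** 2
    * P[:, None, :, None]
    * P[None, :, None, :]
)
score = np.sum(P * cost) + 0.1 * gw_term  # lower = better sketch-photo match
```

Because OT only requires the marginals `a` and `b` to be matched, a sketch with fewer regions than its photo still yields a valid soft assignment, which is why a set-based formulation tolerates partial input.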
Related papers
- Multi-Round Region-Based Optimization for Scene Sketching [7.281215486388827]
Scene sketching requires semantic understanding of the scene and consideration of different regions within the scene.
We optimize the different regions of input scene in multiple rounds.
A novel CLIP-Based Semantic loss and a VGG-Based Feature loss are utilized to guide our multi-round optimization.
arXiv Detail & Related papers (2024-10-05T08:04:26Z) - Object-level Scene Deocclusion [92.39886029550286]
We present a new self-supervised PArallel visible-to-COmplete diffusion framework, named PACO, for object-level scene deocclusion.
To train PACO, we create a large-scale dataset with 500k samples to enable self-supervised learning.
Experiments on COCOA and various real-world scenes demonstrate the superior capability of PACO for scene deocclusion, surpassing the state of the arts by a large margin.
arXiv Detail & Related papers (2024-06-11T20:34:10Z) - Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions [14.217367037250296]
Occ$^2$Net is an image matching method that models occlusion relations using 3D occupancy and infers matching points in occluded regions.
We evaluate our method on both real-world and simulated datasets and demonstrate its superior performance over state-of-the-art methods on several metrics.
arXiv Detail & Related papers (2023-08-14T13:09:41Z) - Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis [37.98068169673019]
Implicit neural representations have shown powerful capacity in modeling real-world 3D scenes, offering superior performance in novel view synthesis.
We propose a unified Neural Radiance Field (NeRF) framework to effectively perform joint scene decomposition and composition.
arXiv Detail & Related papers (2023-08-05T10:42:05Z) - Deep Reinforced Attention Regression for Partial Sketch Based Image Retrieval [6.7667046211131066]
Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims at finding a specific image from a large gallery given a query sketch.
Existing approaches still suffer from low accuracy and are sensitive to external noise such as unnecessary strokes in the sketch.
We propose a novel framework that leverages a uniquely designed deep reinforcement learning model that performs a dual-level exploration to deal with partial sketch training and attention region selection.
arXiv Detail & Related papers (2021-11-21T23:12:51Z) - Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z) - Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval [66.37346493506737]
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal retrieval task.
We propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Our approach notably outperforms the state-of-the-art methods on both the Sketchy and TU-Berlin datasets.
arXiv Detail & Related papers (2021-06-22T14:58:08Z) - Perspective Plane Program Induction from a Single Image [85.28956922100305]
We study the inverse graphics problem of inferring a holistic representation for natural images.
We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image.
Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to efficiently solve the problem.
arXiv Detail & Related papers (2020-06-25T21:18:58Z) - Self-Supervised Scene De-occlusion [186.89979151728636]
This paper investigates the problem of scene de-occlusion, which aims to recover the underlying occlusion ordering and complete the invisible parts of occluded objects.
We make the first attempt to address the problem through a novel and unified framework that recovers hidden scene structures without ordering or amodal annotations as supervision.
Based on PCNet-M and PCNet-C, we devise a novel inference scheme to accomplish scene de-occlusion, via progressive ordering recovery, amodal completion and content completion.
arXiv Detail & Related papers (2020-04-06T16:31:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.