Enabling Viewpoint Learning through Dynamic Label Generation
- URL: http://arxiv.org/abs/2003.04651v2
- Date: Tue, 9 Feb 2021 14:35:11 GMT
- Title: Enabling Viewpoint Learning through Dynamic Label Generation
- Authors: Michael Schelling, Pedro Hermosilla, Pere-Pau Vazquez, Timo Ropinski
- Abstract summary: We show how our proposed approach allows for learning viewpoint predictions for models from different object categories.
We show that prediction times are reduced from several minutes to a fraction of a second, as compared to state-of-the-art (SOTA) viewpoint quality evaluation.
- Score: 10.228754362756153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal viewpoint prediction is an essential task in many computer graphics
applications. Unfortunately, common viewpoint qualities suffer from two major
drawbacks: dependency on clean surface meshes, which are not always available,
and the lack of closed-form expressions, which requires a costly search
involving rendering. To overcome these limitations we propose to separate
viewpoint selection from rendering through an end-to-end learning approach,
whereby we reduce the influence of the mesh quality by predicting viewpoints
from unstructured point clouds instead of polygonal meshes. While this makes
our approach insensitive to the mesh discretization during evaluation, it only
becomes possible when resolving label ambiguities that arise in this context.
Therefore, we additionally propose to incorporate the label generation into the
training procedure, making the label decision adaptive to the current network
predictions. We show how our proposed approach allows for learning viewpoint
predictions for models from different object categories and for different
viewpoint qualities. Additionally, we show that prediction times are reduced
from several minutes to a fraction of a second, as compared to state-of-the-art
(SOTA) viewpoint quality evaluation. We will further release the code and
training data, which will, to our knowledge, be the largest viewpoint quality
dataset available.
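The abstract's core idea, making the label decision adaptive to the current network prediction, can be illustrated with a minimal sketch: when several viewpoints are (near-)equally good, the training target is chosen as the candidate closest to what the network currently predicts, rather than a fixed arbitrary choice. This is an illustrative reconstruction, not the paper's actual implementation; the function name and the cosine-similarity selection rule are assumptions.

```python
import numpy as np

def dynamic_label(prediction, candidates):
    """Among several near-optimal candidate viewpoints, pick the one
    closest to the network's current prediction as the training target.

    prediction: unit vector of shape (3,), current predicted view direction.
    candidates: array of shape (K, 3), unit vectors of near-optimal viewpoints.
    """
    # Cosine similarity between the prediction and each candidate direction.
    sims = candidates @ prediction
    return candidates[np.argmax(sims)]

# Toy example: two symmetric, equally good views (e.g. of a symmetric
# object). The label adapts to the prediction instead of forcing an
# arbitrary fixed choice, which would yield contradictory gradients.
cands = np.array([[0.0, 0.0, 1.0],
                  [0.0, 0.0, -1.0]])
pred = np.array([0.1, 0.0, 0.99])
pred /= np.linalg.norm(pred)
label = dynamic_label(pred, cands)
```

Because the prediction points slightly toward the positive z-axis, the positive-z candidate is selected as the label, resolving the ambiguity in favor of the network's current state.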
Related papers
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Vision-language Assisted Attribute Learning [53.60196963381315]
Attribute labeling at large scale is typically incomplete and partial.
Existing attribute learning methods often treat the missing labels as negative or simply ignore them all during training.
We leverage the available vision-language knowledge to explicitly disclose the missing labels for enhancing model learning.
arXiv Detail & Related papers (2023-12-12T06:45:19Z)
- Weakly Supervised Video Individual Counting [126.75545291243142]
Video Individual Counting aims to predict the number of unique individuals in a single video.
We introduce a weakly supervised VIC task, wherein trajectory labels are not provided.
To this end, we devise an end-to-end trainable soft contrastive loss to drive the network to distinguish inflow, outflow, and the remaining individuals.
arXiv Detail & Related papers (2023-12-10T16:12:13Z)
- Credible Remote Sensing Scene Classification Using Evidential Fusion on Aerial-Ground Dual-view Images [6.817740582240199]
Multi-view (multi-source, multi-modal, multi-perspective, etc.) data are being used more frequently in remote sensing tasks.
The issue of data quality becomes more apparent, limiting the potential benefits of multi-view data.
Deep learning is introduced to the task of aerial-ground dual-view remote sensing scene classification.
arXiv Detail & Related papers (2023-01-02T12:27:55Z)
- Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding [61.30276576646909]
We propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu.
We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task.
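The cluster-affiliation idea described above can be sketched briefly: points are clustered, and each point's cluster index serves as its pseudo-label for a self-supervised prediction task. The sketch below substitutes plain hard k-means for the paper's soft clustering; the function name and toy data are illustrative assumptions, not the SoftClu implementation.

```python
import numpy as np

def cluster_pseudo_labels(points, k=2, iters=10, seed=0):
    """Assign pseudo-labels to points via simple k-means: each point's
    cluster index becomes its target for a self-training task
    (a hard-assignment stand-in for soft clustering)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen input points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance of every point to every center, then nearest-center labels.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Update each center to the mean of its assigned points.
        for c in range(k):
            members = points[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels

# Two well-separated toy point blobs; points in the same blob should
# receive the same pseudo-label.
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.1, size=(20, 3))
blob_b = rng.normal(5.0, 0.1, size=(20, 3))
labels = cluster_pseudo_labels(np.vstack([blob_a, blob_b]), k=2)
```

A network trained to predict these cluster indices from raw points learns point-level features without any human annotation, which is the proxy-task idea the summary refers to.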
arXiv Detail & Related papers (2022-10-06T10:18:16Z)
- Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck [3.2790748006553643]
Given that point-wise semantic annotation is expensive, in this paper, we address the challenge of learning models with extremely sparse labels.
We propose a self-supervised 3D representation learning framework named viewpoint bottleneck.
arXiv Detail & Related papers (2021-09-17T13:54:20Z)
- Unsupervised Representation Learning from Pathology Images with Multi-directional Contrastive Predictive Coding [0.33148826359547523]
We present a modification to the CPC framework for use with digital pathology patches.
This is achieved by introducing an alternative mask for building the latent context.
We show that our proposed modification can yield improved deep classification of histology patches.
arXiv Detail & Related papers (2021-05-11T21:17:13Z)
- Weak Multi-View Supervision for Surface Mapping Estimation [0.9367260794056769]
We propose a weakly-supervised multi-view learning approach to learn category-specific surface mapping without dense annotations.
We learn the underlying surface geometry of common categories, such as human faces, cars, and airplanes, given instances from those categories.
arXiv Detail & Related papers (2021-05-04T09:46:26Z)
- Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- A Weakly-Supervised Semantic Segmentation Approach based on the Centroid Loss: Application to Quality Control and Inspection [6.101839518775968]
We propose and assess a new weakly-supervised semantic segmentation approach making use of a novel loss function.
The performance of the approach is evaluated against datasets from two different industry-related case studies.
arXiv Detail & Related papers (2020-10-26T09:08:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.