Visual Ground Truth Construction as Faceted Classification
- URL: http://arxiv.org/abs/2202.08512v1
- Date: Thu, 17 Feb 2022 08:35:23 GMT
- Title: Visual Ground Truth Construction as Faceted Classification
- Authors: Fausto Giunchiglia, Mayukh Bagchi, Xiaolei Diao
- Abstract summary: The key novelty of our approach lies in the fact that we construct the classification hierarchies from visual properties exploiting visual genus-differentiae.
The proposed approach is validated by a set of experiments on the ImageNet hierarchy of musical instruments.
- Score: 4.7590051176368915
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work in Machine Learning and Computer Vision has provided evidence of
systematic design flaws in the development of major object recognition
benchmark datasets. One such example is ImageNet, wherein, for several
categories of images, there are incongruences between the objects they
represent and the labels used to annotate them. The consequences of this
problem are major, in particular considering the large number of machine
learning applications, not least those based on Deep Neural Networks, that have
been trained on these datasets. In this paper we posit the problem to be the
lack of a knowledge representation (KR) methodology providing the foundations
for the construction of these ground truth benchmark datasets. Accordingly, we
propose a solution articulated in three main steps: (i) deconstructing the
object recognition process into four ordered stages grounded in the philosophical
theory of teleosemantics; (ii) based on such stratification, proposing a novel
four-phased methodology for organizing objects in classification hierarchies
according to their visual properties; and (iii) performing such classification
according to the faceted classification paradigm. The key novelty of our
approach lies in the fact that we construct the classification hierarchies from
visual properties exploiting visual genus-differentiae, and not from
linguistically grounded properties. The proposed approach is validated by a set
of experiments on the ImageNet hierarchy of musical instruments.
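The idea of organizing classes by genus and differentia can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `classify` function, the toy hierarchy of musical instruments, and every property name (such as `has_strings`) are hypothetical and chosen only for illustration. Each class is defined by its genus (the parent class) plus the visual properties that distinguish it from its siblings, and an object is classified by descending the hierarchy while its observed properties keep matching.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A class in the hierarchy. Its genus is its parent node; its
    differentia are the visual properties separating it from siblings."""
    name: str
    differentia: frozenset = frozenset()
    children: list = field(default_factory=list)

    def add(self, name, *differentia):
        """Attach a child class distinguished by the given properties."""
        child = Node(name, frozenset(differentia))
        self.children.append(child)
        return child


def classify(root, observed):
    """Descend from the root, following the single child whose differentia
    are all among the observed properties. Stop at the current genus when
    no child matches or when the match is ambiguous."""
    path = [root.name]
    node = root
    while True:
        matches = [c for c in node.children if c.differentia <= observed]
        if len(matches) != 1:
            return path
        node = matches[0]
        path.append(node.name)


# Hypothetical toy hierarchy; the property names are assumptions.
root = Node("musical instrument")
strings = root.add("stringed instrument", "has_strings")
root.add("wind instrument", "has_tube", "has_mouthpiece")
strings.add("guitar", "has_fretted_neck", "waisted_body")
strings.add("harp", "triangular_frame")

print(classify(root, {"has_strings", "triangular_frame"}))
```

In this sketch an object with too few observed properties is labeled only as far down as the evidence supports, so an image showing just strings would stop at "stringed instrument" rather than being forced into a leaf class.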
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment [24.880495520422006]
We introduce category-level neural fields that learn meaningful common 3D information among objects belonging to the same category present in the scene.
Our key idea is to subcategorize objects based on their observed shape for better training of the category-level model.
Experiments on both simulation and real-world datasets demonstrate that our method improves the reconstruction of unobserved parts for several categories.
arXiv Detail & Related papers (2024-06-12T13:09:59Z)
- Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales [54.78115855552886]
We show how to construct over-complete invariants with a Convolutional Neural Network (CNN)-like hierarchical architecture.
With the over-completeness, discriminative features w.r.t. the task can be adaptively formed in a Neural Architecture Search (NAS)-like manner.
For robust and interpretable vision tasks at larger scales, hierarchical invariant representation can be considered as an effective alternative to traditional CNN and invariants.
arXiv Detail & Related papers (2024-02-23T16:50:07Z)
- Incremental Image Labeling via Iterative Refinement [4.7590051176368915]
In particular, the existence of the semantic gap problem leads to a many-to-many mapping between the information extracted from an image and its linguistic description.
This unavoidable bias further leads to poor performance on current computer vision tasks.
We introduce a Knowledge Representation (KR)-based methodology to provide guidelines driving the labeling process.
arXiv Detail & Related papers (2023-04-18T13:37:22Z)
- Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
- Building a visual semantics aware object hierarchy [0.0]
We propose a novel unsupervised method to build a visual-semantics-aware object hierarchy.
Our intuition in this paper comes from real-world knowledge representation where concepts are hierarchically organized.
The evaluation consists of two parts: first, we apply the constructed hierarchy to the object recognition task, and then we compare our visual hierarchy with existing lexical hierarchies to show the validity of our method.
arXiv Detail & Related papers (2022-02-26T00:10:21Z)
- Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z)
- Object Recognition as Classification of Visual Properties [5.1652563977194434]
We present an object recognition process based on Ranganathan's four-phased faceted knowledge organization process.
We briefly introduce the ongoing project MultiMedia UKC, whose aim is to build an object recognition resource.
arXiv Detail & Related papers (2021-12-20T13:50:07Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)