Hierarchical Material Recognition from Local Appearance
- URL: http://arxiv.org/abs/2505.22911v2
- Date: Mon, 02 Jun 2025 16:21:06 GMT
- Title: Hierarchical Material Recognition from Local Appearance
- Authors: Matthew Beveridge, Shree K. Nayar
- Abstract summary: We introduce a taxonomy of materials for hierarchical recognition from local appearance. We contribute a diverse, in-the-wild dataset with images and depth maps of the taxonomy classes. We present a method for hierarchical material recognition based on graph attention networks.
- Score: 6.790905400046194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a taxonomy of materials for hierarchical recognition from local appearance. Our taxonomy is motivated by vision applications and is arranged according to the physical traits of materials. We contribute a diverse, in-the-wild dataset with images and depth maps of the taxonomy classes. Utilizing the taxonomy and dataset, we present a method for hierarchical material recognition based on graph attention networks. Our model leverages the taxonomic proximity between classes and achieves state-of-the-art performance. We demonstrate the model's potential to generalize to adverse, real-world imaging conditions, and that novel views rendered using the depth maps can enhance this capability. Finally, we show the model's capacity to rapidly learn new materials in a few-shot learning setting.
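The abstract describes recognition built on graph attention networks that exploit taxonomic proximity between material classes. As a rough illustration of the underlying mechanism (not the authors' actual model), the following is a minimal NumPy sketch of a single graph-attention layer over a toy taxonomy graph, where nodes are material classes and edges encode taxonomic adjacency; all shapes and names here are illustrative assumptions.

```python
import numpy as np

def gat_layer(H, A, W, a, slope=0.2):
    """One single-head graph-attention layer (generic GAT-style sketch).

    H : (N, F)   node features (e.g., one node per material class)
    A : (N, N)   adjacency with self-loops (1 = edge, 0 = no edge)
    W : (F, Fp)  shared linear transform
    a : (2*Fp,)  attention vector for concatenated [z_i || z_j]
    Returns the aggregated features and the attention matrix.
    """
    Z = H @ W                                    # (N, Fp) transformed features
    Fp = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]) decomposes into source + target terms
    src = Z @ a[:Fp]                             # (N,)
    dst = Z @ a[Fp:]                             # (N,)
    e = src[:, None] + dst[None, :]              # (N, N) raw logits
    e = np.where(e > 0, e, slope * e)            # LeakyReLU
    e = np.where(A > 0, e, -1e9)                 # mask non-neighbors
    e = e - e.max(axis=1, keepdims=True)         # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z, alpha                      # neighborhood-weighted features
```

In a hierarchical setting one would build `A` from the taxonomy (connecting siblings and parent-child classes) so that attention can share evidence between taxonomically close materials; the paper's actual architecture, losses, and training details are not reproduced here.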
Related papers
- Analyzing Hierarchical Structure in Vision Models with Sparse Autoencoders [6.7161402871287645]
The ImageNet hierarchy provides a structured taxonomy of object categories, offering a valuable lens through which to analyze the representations learned by deep vision models. In this work, we conduct a comprehensive analysis of how vision models encode the ImageNet hierarchy, leveraging Sparse Autoencoders (SAEs) to probe their internal representations.
arXiv Detail & Related papers (2025-05-21T19:38:48Z)
- Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark [63.97125827026949]
This paper explores the feasibility of using text-to-image models in a zero-shot setup to generate images for taxonomy concepts. A benchmark is proposed that assesses models' abilities to understand taxonomy concepts and generate relevant, high-quality images. The 12 models are evaluated using 9 novel taxonomy-related text-to-image metrics and human feedback.
arXiv Detail & Related papers (2025-03-13T13:37:54Z)
- Connectivity-Inspired Network for Context-Aware Recognition [1.049712834719005]
We focus on the effect of incorporating circuit motifs found in biological brains to address visual recognition.
Our convolutional architecture is inspired by the connectivity of human cortical and subcortical streams.
We present a new plug-and-play module to model context awareness.
arXiv Detail & Related papers (2024-09-06T15:42:10Z)
- A Hierarchical Architecture for Neural Materials [13.144139872006287]
We introduce a neural appearance model that offers a new level of accuracy.
An inception-based core network structure captures material appearances at multiple scales.
We encode the inputs in frequency space, introduce a gradient-based loss, and apply it adaptively according to the progress of training.
arXiv Detail & Related papers (2023-07-19T17:00:45Z)
- Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning [74.6485604326913]
We provide a new semantic enhanced knowledge graph that contains both expert knowledge and categories semantic correlation.
To propagate information on the knowledge graph, we propose a novel Residual Graph Convolutional Network (ResGCN).
Experiments conducted on the widely used large-scale ImageNet-21K dataset and AWA2 dataset show the effectiveness of our method.
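The summary above mentions propagating information over a knowledge graph with a residual graph convolution. As a generic illustration of that idea (not the ResGCN from the paper), a minimal NumPy sketch of one residual GCN step with symmetric normalization might look as follows; the function name and shapes are assumptions for the example.

```python
import numpy as np

def residual_gcn_layer(H, A, W):
    """One residual graph-convolution step: H' = H + ReLU(A_hat @ H @ W).

    H : (N, F) node features   A : (N, N) adjacency (no self-loops)
    W : (F, F) weight matrix (square so the skip connection type-checks)
    """
    A_self = A + np.eye(A.shape[0])              # add self-loops
    d = A_self.sum(axis=1)
    A_hat = A_self / np.sqrt(np.outer(d, d))     # symmetric normalization D^-1/2 A D^-1/2
    return H + np.maximum(A_hat @ H @ W, 0.0)    # residual skip + ReLU
```

The skip connection lets information flow through many propagation steps without the over-smoothing that plain stacked GCN layers tend to exhibit, which is the usual motivation for residual variants.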
arXiv Detail & Related papers (2022-12-26T13:18:36Z)
- Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z)
- Taxonomy Enrichment with Text and Graph Vector Representations [61.814256012166794]
We address the problem of taxonomy enrichment which aims at adding new words to the existing taxonomy.
We present a new method that achieves strong results on this task with little effort.
We achieve state-of-the-art results across different datasets and provide an in-depth error analysis of mistakes.
arXiv Detail & Related papers (2022-01-21T09:01:12Z)
- Polynomial Networks in Deep Classifiers [55.90321402256631]
We cast the study of deep neural networks under a unifying framework.
Our framework provides insights on the inductive biases of each model.
The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
arXiv Detail & Related papers (2021-04-16T06:41:20Z)
- All About Knowledge Graphs for Actions [82.39684757372075]
We aim to provide a better understanding of knowledge graphs (KGs) that can be utilized for zero-shot and few-shot action recognition.
We study three different construction mechanisms for KGs: action embeddings, action-object embeddings, and visual embeddings.
We present extensive analysis of the impact of different KGs on different experimental setups.
arXiv Detail & Related papers (2020-08-28T01:44:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.