On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey
- URL: http://arxiv.org/abs/2408.04879v2
- Date: Thu, 22 Aug 2024 09:04:29 GMT
- Title: On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey
- Authors: Jingcai Guo, Zhijie Rao, Zhi Chen, Song Guo, Jingren Zhou, Dacheng Tao
- Abstract summary: Zero-shot image recognition (ZSIR) aims at empowering models to recognize and reason in unseen domains.
This paper presents a broad review of recent advances in element-wise ZSIR.
We first attempt to integrate the three basic ZSIR tasks of object recognition, compositional recognition, and foundation model-based open-world recognition into a unified element-wise perspective.
- Score: 82.49623756124357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot image recognition (ZSIR) aims at empowering models to recognize and reason in unseen domains via learning generalized knowledge from limited data in the seen domain. The gist for ZSIR is to execute element-wise representation and reasoning from the input visual space to the target semantic space, which is a bottom-up modeling paradigm inspired by the process by which humans observe the world, i.e., capturing new concepts by learning and combining the basic components or shared characteristics. In recent years, element-wise learning techniques have seen significant progress in ZSIR as well as widespread application. However, to the best of our knowledge, there remains a lack of a systematic overview of this topic. To enrich the literature and provide a sound basis for its future development, this paper presents a broad review of recent advances in element-wise ZSIR. Concretely, we first attempt to integrate the three basic ZSIR tasks of object recognition, compositional recognition, and foundation model-based open-world recognition into a unified element-wise perspective and provide a detailed taxonomy and analysis of the main research approaches. Then, we collect and summarize some key information and benchmarks, such as detailed technical implementations and common datasets. Finally, we sketch out the wide range of its related applications, discuss vital challenges, and suggest potential future directions.
Related papers
- Open World Object Detection: A Survey [16.839310066730533]
Open world object detection (OWOD) is an emerging area of research that adapts this principle to explore new knowledge.
This paper offers a thorough review of the OWOD domain, covering essential aspects, including problem definitions, benchmark datasets, source codes, evaluation metrics, and a comparative study of existing methods.
The paper concludes by addressing the limitations and challenges faced by current OWOD algorithms and proposes directions for future research.
arXiv Detail & Related papers (2024-10-15T05:46:00Z)
- Discovering Conceptual Knowledge with Analytic Ontology Templates for Articulated Objects [42.9186628100765]
We aim to endow machine intelligence with an analogous capability through performing at the conceptual level.
The AOT-driven approach yields benefits from three key perspectives.
arXiv Detail & Related papers (2024-09-18T04:53:38Z)
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, while intelligence centers on model learning and prediction.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions [46.63556358247516]
Entity- and event-level conceptualization plays a pivotal role in generalizable reasoning.
There is currently a lack of a systematic overview that comprehensively examines existing works in the definition, execution, and application of conceptualization.
We present the first comprehensive survey of 150+ papers, categorizing various definitions, resources, methods, and downstream applications related to conceptualization into a unified taxonomy.
arXiv Detail & Related papers (2024-06-16T10:32:41Z)
- Augmented Commonsense Knowledge for Remote Object Grounding [67.30864498454805]
We propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a temporal knowledge graph for improving agent navigation.
ACK consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment.
We add a new pipeline for the commonsense-based decision-making process which leads to more accurate local action prediction.
arXiv Detail & Related papers (2024-06-03T12:12:33Z)
- Towards Data-and Knowledge-Driven Artificial Intelligence: A Survey on Neuro-Symbolic Computing [73.0977635031713]
Neural-symbolic computing (NeSy) has been an active research area of Artificial Intelligence (AI) for many years.
NeSy shows promise of reconciling the advantages of reasoning and interpretability of symbolic representation and robust learning in neural networks.
arXiv Detail & Related papers (2022-10-28T04:38:10Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- Deep Gait Recognition: A Survey [15.47582611826366]
Gait recognition is an appealing biometric modality which aims to identify individuals based on the way they walk.
Deep learning has reshaped the research landscape in this area since 2015 through the ability to automatically learn discriminative representations.
We present a comprehensive overview of breakthroughs and recent developments in gait recognition with deep learning.
arXiv Detail & Related papers (2021-02-18T18:49:28Z)
- A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning [60.335974351919816]
Object perception is a fundamental sub-field of Computer Vision.
Recent works seek ways to integrate knowledge engineering in order to expand the level of intelligence of the visual interpretation of objects.
arXiv Detail & Related papers (2019-12-26T13:26:49Z)