A Review of Emerging Research Directions in Abstract Visual Reasoning
- URL: http://arxiv.org/abs/2202.10284v1
- Date: Mon, 21 Feb 2022 14:58:02 GMT
- Title: A Review of Emerging Research Directions in Abstract Visual Reasoning
- Authors: Miko{\l}aj Ma{\l}ki\'nski and Jacek Ma\'ndziuk
- Abstract summary: We propose a taxonomy to categorise the tasks along 5 dimensions: input shapes, hidden rules, target task, cognitive function, and main challenge.
The perspective taken in this survey allows to characterise problems with respect to their shared and distinct properties, provides a unified view on the existing approaches for solving tasks.
One of them refers to the observation that in the machine learning literature different tasks are considered in isolation, which is in the stark contrast with the way the tasks are used to measure human intelligence.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstract Visual Reasoning (AVR) problems are commonly used to approximate
human intelligence. They test the ability of applying previously gained
knowledge, experience and skills in a completely new setting, which makes them
particularly well-suited for this task. Recently, the AVR problems have become
popular as a proxy to study machine intelligence, which has led to emergence of
new distinct types of problems and multiple benchmark sets. In this work we
review this emerging AVR research and propose a taxonomy to categorise the AVR
tasks along 5 dimensions: input shapes, hidden rules, target task, cognitive
function, and main challenge. The perspective taken in this survey allows to
characterise AVR problems with respect to their shared and distinct properties,
provides a unified view on the existing approaches for solving AVR tasks, shows
how the AVR problems relate to practical applications, and outlines promising
directions for future work. One of them refers to the observation that in the
machine learning literature different tasks are considered in isolation, which
is in the stark contrast with the way the AVR tasks are used to measure human
intelligence, where multiple types of problems are combined within a single IQ
test.
Related papers
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z) - A Critical Analysis on Machine Learning Techniques for Video-based Human Activity Recognition of Surveillance Systems: A Review [1.3693860189056777]
Upsurging abnormal activities in crowded locations urges the necessity for an intelligent surveillance system.
Video-based human activity recognition has intrigued many researchers with its pressing issues.
This paper provides a critical survey of video-based Human Activity Recognition (HAR) techniques.
arXiv Detail & Related papers (2024-09-01T14:43:57Z) - A Comprehensive Review of Few-shot Action Recognition [64.47305887411275]
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data.
It requires accurately classifying human actions in videos using only a few labeled examples per class.
arXiv Detail & Related papers (2024-07-20T03:53:32Z) - Unified Active Retrieval for Retrieval Augmented Generation [69.63003043712696]
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful and applying it to every instruction is sub-optimal.
Existing active retrieval methods face two challenges: 1.
They usually rely on a single criterion, which struggles with handling various types of instructions.
They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated.
arXiv Detail & Related papers (2024-06-18T12:09:02Z) - A Unified View of Abstract Visual Reasoning Problems [0.0]
We introduce a unified view of tasks, where each instance is rendered as a single image with no priori assumptions about the number of panels, their location, or role.
The main advantage of the proposed unified view is the ability to develop universal learning models applicable to various tasks.
Experiments conducted on four datasets with Raven's Progressive Matrices and Visual Analogy Problems show that the proposed unified representation of tasks poses a challenge to state-of-the-art Deep Learning (DL) models and, more broadly, contemporary DL image recognition methods.
arXiv Detail & Related papers (2024-06-16T20:52:44Z) - Effectiveness Assessment of Recent Large Vision-Language Models [78.69439393646554]
This paper endeavors to evaluate the competency of popular large vision-language models (LVLMs) in specialized and general tasks.
We employ six challenging tasks in three different application scenarios: natural, healthcare, and industrial.
We examine the performance of three recent open-source LVLMs, including MiniGPT-v2, LLaVA-1.5, and Shikra, on both visual recognition and localization in these tasks.
arXiv Detail & Related papers (2024-03-07T08:25:27Z) - One Self-Configurable Model to Solve Many Abstract Visual Reasoning
Problems [0.0]
We propose a unified model for solving Single-Choice Abstract visual Reasoning tasks.
The proposed model relies on SCAR-Aware dynamic Layer (SAL), which adapts its weights to the structure of the problem.
Experiments show thatSAL-based models, in general, effectively solves diverse tasks, and its performance is on par with the state-of-the-art task-specific baselines.
arXiv Detail & Related papers (2023-12-15T18:15:20Z) - IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing [88.35145788575348]
Image anomaly detection (IAD) is an emerging and vital computer vision task in industrial manufacturing.
The lack of a uniform IM benchmark is hindering the development and usage of IAD methods in real-world applications.
We construct a comprehensive image anomaly detection benchmark (IM-IAD), which includes 19 algorithms on seven major datasets.
arXiv Detail & Related papers (2023-01-31T01:24:45Z) - Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's
Progressive Matrices [0.0]
We focus on the most common type of tasks -- the Raven's Progressive Matrices ( RPMs) -- and provide a review of the learning methods and deep neural models applied to solve RPMs.
We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.
arXiv Detail & Related papers (2022-01-28T19:24:30Z) - The State of Aerial Surveillance: A Survey [62.198765910573556]
This paper provides a comprehensive overview of human-centric aerial surveillance tasks from a computer vision and pattern recognition perspective.
The main object of interest is humans, where single or multiple subjects are to be detected, identified, tracked, re-identified and have their behavior analyzed.
arXiv Detail & Related papers (2022-01-09T20:13:27Z) - Survey on Reliable Deep Learning-Based Person Re-Identification Models:
Are We There Yet? [19.23187114221822]
Person re-identification (PReID) is one of the most critical problems in intelligent video-surveillance (IVS)
Deep neural networks (DNNs) given their compelling performance on similar vision problems and fast execution at test time.
We present descriptions of each model along with their evaluation on a set of benchmark datasets.
arXiv Detail & Related papers (2020-04-30T16:09:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.