InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion
- URL: http://arxiv.org/abs/2305.17716v4
- Date: Mon, 5 Jun 2023 22:52:57 GMT
- Title: InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion
- Authors: Haobo Yang, Wenyu Wang, Ze Cao, Zhekai Duan, Xuchen Liu
- Abstract summary: This paper introduces a novel approach to evaluating deep learning models' capacity for in-diagram logic interpretation.
We establish a unique dataset, InDL, designed to rigorously test and benchmark these models.
We utilize six classic geometric optical illusions to create a comparative framework between human and machine visual perception.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper introduces a novel approach to evaluating deep learning models'
capacity for in-diagram logic interpretation. Leveraging the intriguing realm
of visual illusions, we establish a unique dataset, InDL, designed to
rigorously test and benchmark these models. Deep learning has witnessed
remarkable progress in domains such as computer vision and natural language
processing. However, models often stumble in tasks requiring logical reasoning
due to their inherent 'black box' characteristics, which obscure the
decision-making process. Our work presents a new lens to understand these
models better by focusing on their handling of visual illusions -- a complex
interplay of perception and logic. We utilize six classic geometric optical
illusions to create a comparative framework between human and machine visual
perception. This methodology offers a quantifiable measure to rank models,
elucidating potential weaknesses and providing actionable insights for model
improvements. Our experimental results affirm the efficacy of our benchmarking
strategy, demonstrating its ability to effectively rank models based on their
logic interpretation ability. As part of our commitment to reproducible
research, the source code and datasets will be made publicly available at
https://github.com/rabbit-magic-wh/InDL
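To make the benchmark construction concrete, the sketch below renders one of the six classic stimuli the abstract mentions, a Müller-Lyer style pair, as a labeled image. This is an illustrative sketch only, not the actual InDL pipeline: the function name, grid dimensions, and labeling convention are invented for illustration. The key idea it demonstrates is that each stimulus carries a ground-truth label (which segment is physically longer) that a human's illusion-biased judgment may contradict, giving a quantifiable human-versus-machine comparison point.

```python
import random

def muller_lyer(width=64, height=32, fin=4, rng=None):
    """Render a tiny Muller-Lyer style stimulus as a 2D grid of 0/1 pixels:
    two horizontal segments of random length, one with outward-pointing fins
    and one with inward-pointing fins. Returns (grid, label), where label is
    0 if the top segment is physically longer or equal, else 1."""
    rng = rng or random.Random()
    grid = [[0] * width for _ in range(height)]

    def put(x, y):
        # Bounds-checked pixel write.
        if 0 <= x < width and 0 <= y < height:
            grid[y][x] = 1

    len_top = rng.randrange(20, 40)
    len_bot = rng.randrange(20, 40)
    for y, seg_len, inward in ((8, len_top, False), (24, len_bot, True)):
        x0 = (width - seg_len) // 2
        x1 = x0 + seg_len
        for x in range(x0, x1 + 1):              # the segment itself
            put(x, y)
        for end_x, sign in ((x0, -1), (x1, 1)):  # diagonal fins at each end
            step = -sign if inward else sign     # inward fins point to center
            for k in range(1, fin + 1):
                put(end_x + step * k, y - k)
                put(end_x + step * k, y + k)

    label = 0 if len_top >= len_bot else 1
    return grid, label
```

A dataset in this spirit would be built by sampling many such (grid, label) pairs, training or evaluating a classifier on them, and comparing its error pattern against the systematic human bias the illusion induces.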
Related papers
- SOLD: Reinforcement Learning with Slot Object-Centric Latent Dynamics [16.020835290802548]
Slot-Attention for Object-centric Latent Dynamics is a novel algorithm that learns object-centric dynamics models from pixel inputs.
We demonstrate that the structured latent space not only improves model interpretability but also provides a valuable input space for behavior models to reason over.
Our results show that SOLD outperforms DreamerV3, a state-of-the-art model-based RL algorithm, across a range of benchmark robotic environments.
arXiv Detail & Related papers (2024-10-11T14:03:31Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms [91.19304518033144]
We aim to align vision models with human aesthetic standards in a retrieval system.
We propose a preference-based reinforcement learning method that fine-tunes vision models to better align them with human aesthetics.
arXiv Detail & Related papers (2024-06-13T17:59:20Z)
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Perception Visualization: Seeing Through the Eyes of a DNN [5.9557391359320375]
We develop a new form of explanation that is radically different in nature from current explanation methods, such as Grad-CAM.
Perception visualization provides a visual representation of what the DNN perceives in the input image by depicting what visual patterns the latent representation corresponds to.
Results of our user study demonstrate that humans can better understand and predict the system's decisions when perception visualizations are available.
arXiv Detail & Related papers (2022-04-21T07:18:55Z)
- Human-Understandable Decision Making for Visual Recognition [30.30163407674527]
We propose a new framework to train a deep neural network by incorporating the prior of human perception into the model learning process.
The effectiveness of our proposed model is evaluated on two classical visual recognition tasks.
arXiv Detail & Related papers (2021-03-05T02:07:33Z)
- Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z)
- Explainable Matrix -- Visualization for Global and Local Interpretability of Random Forest Classification Ensembles [78.6363825307044]
We propose Explainable Matrix (ExMatrix), a novel visualization method for Random Forest (RF) interpretability.
It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates.
ExMatrix's applicability is confirmed via different examples, showing how it can be used in practice to promote the interpretability of RF models.
arXiv Detail & Related papers (2020-05-08T21:03:48Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has raised unanswered questions about what they learn from data.
Generative Adversarial Network (GAN) and multi-objectives are used to furnish a plausible attack to the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.