Complexity in Complexity: Understanding Visual Complexity Through Structure, Color, and Surprise
- URL: http://arxiv.org/abs/2501.15890v2
- Date: Wed, 05 Feb 2025 19:36:23 GMT
- Title: Complexity in Complexity: Understanding Visual Complexity Through Structure, Color, and Surprise
- Authors: Karahan Sarıtaş, Peter Dayan, Tingke Shen, Surabhi S Nath
- Abstract summary: We investigate the failure of an interpretable segmentation-based model to capture structural, color, and surprisal contributions to complexity.
We propose Multi-Scale Sobel Gradient, which measures spatial intensity variations; Multi-Scale Unique Color, which quantifies colorfulness across multiple scales; and surprise scores generated using a Large Language Model.
Our experiments demonstrate that modeling complexity accurately is not as simple as previously thought, requiring additional perceptual and semantic factors to address dataset biases.
- Score: 6.324765782436764
- License:
- Abstract: Understanding human perception of visual complexity is crucial in visual cognition. Recently, Shen et al. (2024) proposed an interpretable segmentation-based model that accurately predicted complexity across various datasets, supporting the idea that complexity can be explained simply. In this work, we investigate the failure of their model to capture structural, color, and surprisal contributions to complexity. To this end, we propose Multi-Scale Sobel Gradient, which measures spatial intensity variations; Multi-Scale Unique Color, which quantifies colorfulness across multiple scales; and surprise scores generated using a Large Language Model. We test our features on existing benchmarks and a novel dataset containing surprising images from Visual Genome. Our experiments demonstrate that modeling complexity accurately is not as simple as previously thought, requiring additional perceptual and semantic factors to address dataset biases. Thus, our results offer deeper insights into how humans assess visual complexity.
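The abstract describes the two hand-crafted features only at a high level. The sketch below is one plausible reading of them, not the paper's reference implementation: the scale factors, the per-scale averaging, and the 8-bin color quantization are assumptions made purely for illustration, and the LLM-based surprise scores are omitted.

```python
# Illustrative sketch (not the authors' code) of Multi-Scale Sobel Gradient
# and Multi-Scale Unique Color, assuming "multi-scale" means computing the
# feature on progressively downsampled copies of an 8-bit BGR image
# (e.g., loaded with cv2.imread) and averaging over scales.
import cv2
import numpy as np


def multi_scale_sobel_gradient(image_bgr, scales=(1.0, 0.5, 0.25)):
    """Mean Sobel gradient magnitude, averaged over several spatial scales."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    values = []
    for s in scales:
        resized = cv2.resize(gray, None, fx=s, fy=s, interpolation=cv2.INTER_AREA)
        gx = cv2.Sobel(resized, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(resized, cv2.CV_32F, 0, 1, ksize=3)
        values.append(np.sqrt(gx ** 2 + gy ** 2).mean())
    return float(np.mean(values))


def multi_scale_unique_color(image_bgr, scales=(1.0, 0.5, 0.25), bins=8):
    """Number of distinct quantized colors, averaged over several scales."""
    values = []
    for s in scales:
        resized = cv2.resize(image_bgr, None, fx=s, fy=s, interpolation=cv2.INTER_AREA)
        quantized = (resized // (256 // bins)).reshape(-1, 3)  # coarse color bins
        values.append(len(np.unique(quantized, axis=0)))
    return float(np.mean(values))
```

Each function returns a single scalar per image, which could then be entered as an additional predictor of human complexity ratings alongside the segmentation-based features.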
Related papers
- Multi-scale structural complexity as a quantitative measure of visual complexity [1.3499500088995464]
We suggest adopting the multi-scale structural complexity (MSSC) measure, an approach that defines the structural complexity of an object as the amount of dissimilarity between distinct scales in its hierarchical organization.
We demonstrate that MSSC correlates with subjective complexity on par with other computational complexity measures, while being more intuitive by definition, consistent across categories of images, and easier to compute.
arXiv Detail & Related papers (2024-08-07T20:26:35Z)
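A rough, illustrative sketch of the MSSC idea summarized in the entry above, assuming that dissimilarity between scales is measured as the mean squared difference between successive coarse-grained renderings of the image; the number of scales and the nearest-neighbor upsampling are choices made for illustration, not the authors' exact formulation.

```python
# Sketch of a multi-scale structural complexity score: coarse-grain the image
# at successively larger block sizes, render each coarse-graining back at full
# resolution, and sum the dissimilarities between consecutive scales.
import cv2
import numpy as np


def mssc_sketch(image_bgr, num_scales=5):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    h, w = gray.shape
    previous = gray
    score = 0.0
    for k in range(1, num_scales + 1):
        factor = 2 ** k
        coarse = cv2.resize(gray, (max(w // factor, 1), max(h // factor, 1)),
                            interpolation=cv2.INTER_AREA)
        rendered = cv2.resize(coarse, (w, h), interpolation=cv2.INTER_NEAREST)
        score += float(np.mean((previous - rendered) ** 2))  # scale-to-scale dissimilarity
        previous = rendered
    return score


# Example usage: complexity = mssc_sketch(cv2.imread("example.jpg"))
```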
- Understanding Visual Feature Reliance through the Lens of Complexity [14.282243225622093]
We introduce a new metric for quantifying feature complexity, based on $\mathscr{V}$-information.
We analyze the complexities of 10,000 features, represented as directions in the penultimate layer, that were extracted from a standard ImageNet-trained vision model.
arXiv Detail & Related papers (2024-07-08T16:21:53Z)
- Simplicity in Complexity: Explaining Visual Complexity using Deep Segmentation Models [6.324765782436764]
We propose to model complexity using segment-based representations of images.
We find that complexity is well explained by a simple linear model with two segment-based features across six diverse image sets.
arXiv Detail & Related papers (2024-03-05T17:21:31Z)
- On the Complexity of Bayesian Generalization [141.21610899086392]
We consider concept generalization at a large scale in the diverse and natural visual spectrum.
We study two modes when the problem space scales up and the complexity of concepts becomes diverse.
arXiv Detail & Related papers (2022-11-20T17:21:37Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulty and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Complexity and Aesthetics in Generative and Evolutionary Art [5.837881923712394]
We examine the concept of complexity as it applies to generative and evolutionary art and design.
We look at the correlations between complexity and individual aesthetic judgement by the artist.
We conclude by discussing the value of direct measures in generative and evolutionary art.
arXiv Detail & Related papers (2022-01-05T06:19:55Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.