Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding
- URL: http://arxiv.org/abs/2407.04859v1
- Date: Fri, 5 Jul 2024 20:44:35 GMT
- Title: Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding
- Authors: Kenneth D. Forbus, Kezhen Chen, Wangcheng Xu, Madeline Usher
- Abstract summary: We have developed a new framework inspired by Marr's Primal Sketch.
The Hybrid Primal Sketch combines computer vision components into an ensemble to produce sketch-like entities.
This paper describes our theoretical framework, summarizes several previous experiments, and outlines a new experiment in progress on diagram understanding.
- Score: 7.687215328455751
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the purposes of perception is to bridge between sensors and conceptual understanding. Marr's Primal Sketch combined initial edge-finding with multiple downstream processes to capture aspects of visual perception such as grouping and stereopsis. Given the progress made in multiple areas of AI since then, we have developed a new framework inspired by Marr's work, the Hybrid Primal Sketch, which combines computer vision components into an ensemble to produce sketch-like entities which are then further processed by CogSketch, our model of high-level human vision, to produce both more detailed shape representations and scene representations which can be used for data-efficient learning via analogical generalization. This paper describes our theoretical framework, summarizes several previous experiments, and outlines a new experiment in progress on diagram understanding.
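The abstract describes an architecture rather than an implementation: an ensemble of vision components emits sketch-like entities, which CogSketch then turns into shape and scene representations for analogical learning. A minimal Python sketch of that data flow follows; every name and interface in it is a hypothetical stand-in, not the authors' actual API.
```python
# Hypothetical sketch of the ensemble-to-CogSketch data flow described in the
# abstract. All names and interfaces here are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class SketchEntity:
    """A sketch-like entity emitted by a vision component (e.g., an edge
    group, region, or detected glyph)."""
    label: str
    polyline: list                                   # (x, y) points approximating the shape
    attributes: dict = field(default_factory=dict)

def hybrid_primal_sketch(image, vision_components, cogsketch):
    """Run an ensemble of vision components, pool their sketch-like entities,
    and hand them to a CogSketch-style model for high-level processing."""
    entities = []
    for component in vision_components:
        # Each component is assumed to return a list of SketchEntity objects.
        entities.extend(component.detect(image))
    # Qualitative shape decomposition and scene-level relations, usable for
    # data-efficient learning via analogical generalization downstream.
    shapes = cogsketch.decompose_shapes(entities)
    scene = cogsketch.build_scene_representation(shapes)
    return shapes, scene
```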
Related papers
- Thinking with Generated Images [30.28526622443551]
We present Thinking with Generated Images, a novel paradigm that transforms how large multimodal models (LMMs) engage with visual reasoning. Our approach enables AI models to engage in the kind of visual imagination and iterative refinement that characterizes human creative, analytical, and strategic thinking.
arXiv Detail & Related papers (2025-05-28T16:12:45Z)
- For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives [3.418398936676879]
This work presents FRESCO, a framework designed to explore the socio-cultural implications of images on social media platforms at scale.
FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques.
The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer.
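As a rough illustration only, a three-level description of this kind could be carried by a simple container; the field names below are assumptions about the summary above, not FRESCO's actual schema.
```python
# Illustrative container for a three-level image description in the spirit of
# the plastic / figurative / enunciation levels; all field names are assumed.
from dataclasses import dataclass, field

@dataclass
class ImageAnalysis:
    plastic: dict = field(default_factory=dict)      # e.g., dominant colors, line orientations
    figurative: list = field(default_factory=list)   # e.g., detected entities or concepts
    enunciation: dict = field(default_factory=dict)  # e.g., camera angle, gaze direction
```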
arXiv Detail & Related papers (2024-07-03T16:57:38Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework Generalizable Model-based Neural Radiance Fields to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
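As a toy rendition of what geometry-guided attention could mean here, geometry-proxy features can query per-view appearance codes via scaled dot-product attention; the dimensions and naming below are assumptions, not GM-NeRF's implementation.
```python
# Toy geometry-guided cross-attention: features on geometry-proxy points
# attend over per-view appearance codes. Purely illustrative.
import numpy as np

def geometry_guided_attention(geo_feats, view_feats):
    """geo_feats: (P, D) features on geometry-proxy points.
    view_feats: (V, D) appearance codes from V source views.
    Returns (P, D) appearance registered onto the proxy."""
    scores = geo_feats @ view_feats.T / np.sqrt(geo_feats.shape[1])  # (P, V)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)                    # softmax over views
    return weights @ view_feats                                      # (P, D)
```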
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- AIGenC: An AI generalisation model via creativity [1.933681537640272]
Inspired by cognitive theories of creativity, this paper introduces a computational model (AIGenC).
It lays down the necessary components to enable artificial agents to learn, use and generate transferable representations.
We discuss the model's capability to yield better out-of-distribution generalisation in artificial agents.
arXiv Detail & Related papers (2022-05-19T17:43:31Z)
- FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations [99.54048050189971]
We present a framework for learning new visual concepts quickly, guided by multiple naturally occurring data streams.
The learned concepts support downstream applications, such as answering questions by reasoning about unseen images.
We demonstrate the effectiveness of our model on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-03-30T19:45:00Z)
- Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations [15.284688801788912]
We show the advantages of prototype representations for understanding and revising the latent space of neural concept learners.
For this purpose, we introduce interactive Concept Swapping Networks (iCSNs).
iCSNs learn to bind conceptual information to specific prototype slots by swapping the latent representations of paired images.
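A minimal sketch of the swapping step, assuming a slot-wise encoder and decoder; this is one reading of the summary above, not the iCSN code itself.
```python
# Minimal sketch of latent concept swapping for paired images that share one
# concept; the encoder/decoder interfaces are assumptions.
def swap_and_reconstruct(encoder, decoder, img_a, img_b, shared_slot):
    """Encode both images into per-concept latent slots, swap the slot that
    should encode the shared concept, and reconstruct; matching
    reconstructions encourage that slot to bind the shared concept."""
    slots_a = encoder(img_a)  # list of per-concept latent slots
    slots_b = encoder(img_b)
    slots_a[shared_slot], slots_b[shared_slot] = (
        slots_b[shared_slot],
        slots_a[shared_slot],
    )
    return decoder(slots_a), decoder(slots_b)
```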
arXiv Detail & Related papers (2021-12-04T09:25:40Z)
- Unsupervised Learning of Compositional Energy Concepts [70.11673173291426]
We propose COMET, which discovers and represents concepts as separate energy functions.
COMET represents both global concepts and objects under a unified framework.
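Read schematically, composing separately learned concept energies amounts to summing them and treating low total energy as agreement with all concepts; the toy function below reflects that reading, not COMET's actual implementation.
```python
# Toy composition of per-concept energy functions: a sample x is scored by
# the sum of independent energies; low total energy means x is consistent
# with every composed concept.
def composed_energy(x, energy_fns):
    return sum(E(x) for E in energy_fns)
```
Generation or editing would then correspond to minimizing this composed energy over x.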
arXiv Detail & Related papers (2021-11-04T17:46:12Z)
- Visually Grounded Concept Composition [31.981204314287282]
We learn the grounding of both primitive and composed concepts by aligning them to images.
We show that learning to compose leads to more robust grounding results, measured in text-to-image matching accuracy.
arXiv Detail & Related papers (2021-09-29T00:38:58Z)
- Learning to Associate Every Segment for Video Panoptic Segmentation [123.03617367709303]
We learn coarse segment-level matching and fine pixel-level matching together.
We show that our per-frame computation model can achieve new state-of-the-art results on Cityscapes-VPS and VIPER datasets.
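Segment-level association is commonly realized by matching segment embeddings across frames; the greedy-matching sketch below illustrates that general idea and is not the paper's exact procedure.
```python
# Hedged sketch of segment association across two frames via cosine
# similarity and greedy one-to-one matching. Illustrative only.
import numpy as np

def associate_segments(emb_prev, emb_curr, threshold=0.5):
    """emb_prev: (M, D), emb_curr: (N, D) L2-normalized segment embeddings.
    Returns a list of (prev_idx, curr_idx) matches above the threshold."""
    sim = emb_prev @ emb_curr.T                 # (M, N) cosine similarities
    matches, used = [], set()
    for i in np.argsort(-sim.max(axis=1)):      # most confident rows first
        j = int(sim[i].argmax())
        if sim[i, j] >= threshold and j not in used:
            matches.append((int(i), j))
            used.add(j)
    return matches
```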
arXiv Detail & Related papers (2021-06-17T13:06:24Z)
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
- SketchEmbedNet: Learning Novel Concepts by Imitating Drawings [125.45799722437478]
We explore properties of image representations learned by training a model to produce sketches of images.
We show that this generative, class-agnostic model produces informative embeddings of images from novel examples, classes, and even novel datasets in a few-shot setting.
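In few-shot settings, such embeddings are typically used with a nearest-centroid rule; the sketch below assumes that setup, with `embed` standing in for the learned encoder.
```python
# Sketch of few-shot classification with embeddings from a sketch-generation
# model: build class centroids from support examples, then classify a query
# by its nearest centroid. The embedding function is a stand-in.
import numpy as np

def few_shot_predict(embed, support_imgs, support_labels, query_img):
    """embed: function mapping an image to a 1-D feature vector."""
    feats = np.stack([embed(im) for im in support_imgs])
    labels = np.asarray(support_labels)
    centroids = {c: feats[labels == c].mean(axis=0) for c in set(support_labels)}
    q = embed(query_img)
    return min(centroids, key=lambda c: np.linalg.norm(q - centroids[c]))
```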
arXiv Detail & Related papers (2020-08-27T16:43:28Z)
- Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis [39.53519330457627]
We propose a novel task of joint few-shot recognition and novel-view synthesis.
We aim to simultaneously learn an object classifier and generate images of that type of object from new viewpoints.
We focus on the interaction and cooperation between a generative model and a discriminative model.
arXiv Detail & Related papers (2020-08-16T19:40:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.