A Sign That Spells: DALL-E 2, Invisual Images and The Racial Politics of
Feature Space
- URL: http://arxiv.org/abs/2211.06323v1
- Date: Wed, 26 Oct 2022 17:49:17 GMT
- Authors: Fabian Offert and Thao Phan
- Abstract summary: We focus on DALL-E 2 and related models as an emergent approach to image-making that operates through the cultural techniques of feature extraction and semantic compression.
We use OpenAI's failed efforts to 'debias' their system as a critical opening to interrogate how systems like DALL-E 2 dissolve and reconstitute politically salient human concepts like race.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we examine how generative machine learning systems produce a
new politics of visual culture. We focus on DALL-E 2 and related models as an
emergent approach to image-making that operates through the cultural techniques
of feature extraction and semantic compression. These techniques, we argue, are
inhuman, invisual, and opaque, yet are still caught in a paradox that is
ironically all too human: the consistent reproduction of whiteness as a latent
feature of dominant visual culture. We use OpenAI's failed efforts to 'debias'
their system as a critical opening to interrogate how systems like DALL-E 2
dissolve and reconstitute politically salient human concepts like race. This
example vividly illustrates the stakes of this moment of transformation, when
so-called foundation models reconfigure the boundaries of visual culture and
when 'doing' anti-racism means deploying quick technical fixes to mitigate
personal discomfort, or more importantly, potential commercial loss.
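The "feature extraction and semantic compression" the abstract names has a concrete technical referent: DALL-E 2 generates images conditioned on CLIP embeddings, and CLIP's shared text-image vector space is arguably the "feature space" of the title. Below is a minimal sketch of that extraction step, assuming the Hugging Face transformers, torch, and Pillow packages and the public openai/clip-vit-base-patch32 checkpoint; the image file and prompts are hypothetical.

```python
# Minimal sketch: CLIP-style feature extraction, the kind of semantic
# compression the paper interrogates. Assumes `pip install transformers
# torch pillow`; "portrait.jpg" is a hypothetical input image.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("portrait.jpg")
prompts = ["a photo of a CEO", "a photo of a nurse"]  # illustrative prompts

# Both modalities are compressed into vectors in one shared feature space.
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# All the model retains of image and text are these similarity scores;
# concepts like race survive only as latent directions in the vectors.
print(outputs.logits_per_image.softmax(dim=-1))
```

Nothing in this pipeline names race explicitly, which is the point at issue: prompt-level interventions of the kind OpenAI reportedly deployed leave the learned feature space, and whatever it encodes, untouched.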
Related papers
- CogMorph: Cognitive Morphing Attacks for Text-to-Image Models
This paper reveals a significant and previously unrecognized ethical risk inherent in text-to-image (T2I) generative models.
We introduce a novel method, termed the Cognitive Morphing Attack (CogMorph), which manipulates T2I models to generate images that retain the original core subjects but embed toxic or harmful contextual elements.
arXiv Detail & Related papers (2025-01-21T01:45:56Z)
- Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art
We show how AI can transcend human cognitive limitations in visual art creation.
Our research hypothesizes that visual art contains a vast unexplored space of conceptual combinations.
We present the Alien Recombination method to identify and generate concept combinations that lie beyond human cognitive availability.
arXiv Detail & Related papers (2024-11-18T11:55:38Z)
- Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
We present Generative Spatial Transformer (GST), a novel auto-regressive framework that jointly addresses spatial localization and view prediction.
Our model simultaneously estimates the camera pose from a single image and predicts the view from a new camera pose, effectively bridging the gap between spatial awareness and visual prediction.
arXiv Detail & Related papers (2024-10-24T17:58:05Z)
- MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
We introduce component-controllable personalization, a novel task that pushes the boundaries of text-to-image (T2I) models.
To overcome the challenges this task poses, we design MagicTailor, an innovative framework that leverages Dynamic Masked Degradation (DM-Deg) to dynamically perturb undesired visual semantics.
arXiv Detail & Related papers (2024-10-17T09:22:53Z)
- Attention is All You Want: Machinic Gaze and the Anthropocene
Computational vision interprets and synthesises representations of the Anthropocene.
We examine how this emergent machinic gaze both looks out, through its compositions of futuristic landscapes, and looks back, towards an observing and observed human subject.
In its varied assistive, surveillant and generative roles, computational vision not only mirrors human desire but articulates oblique demands of its own.
arXiv Detail & Related papers (2024-05-16T00:00:53Z)
- Contextual Emotion Recognition using Large Vision Language Models
Achieving human-level recognition of the apparent emotion of a person in real-world situations remains an unsolved task in computer vision.
In this paper, we examine two major approaches enabled by recent large vision language models.
We demonstrate that a vision language model, fine-tuned even on a small dataset, can significantly outperform traditional baselines.
arXiv Detail & Related papers (2024-05-14T23:24:12Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models that learn to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompting capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding an interactive dialogue by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, a process that impacts several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- AI and Blackness: Towards moving beyond bias and representation
We argue that AI ethics must move beyond the concepts of race-based representation and bias.
Addressing antiblackness in AI requires examining the ontological space that provides a foundation for the design, development, and deployment of AI systems.
arXiv Detail & Related papers (2021-11-05T18:24:54Z)
- PACE: Posthoc Architecture-Agnostic Concept Extractor for Explaining CNNs
We introduce a Posthoc Architecture-agnostic Concept Extractor (PACE) that automatically extracts smaller sub-regions of the image.
PACE tightly integrates the faithfulness of the explanatory framework to the black-box model.
The results from these experiments suggest that over 72% of the concepts extracted by PACE are human-interpretable.
arXiv Detail & Related papers (2021-08-31T13:36:15Z)
- Attack to Fool and Explain Deep Networks
Countering the common view of adversarial perturbations as incomprehensible noise, we provide evidence of human-meaningful patterns in them.
Our major contribution is a novel pragmatic adversarial attack that is subsequently transformed into a tool to interpret the visual models.
arXiv Detail & Related papers (2021-06-20T03:07:36Z)
- Towards decolonising computational sciences
We see the struggle to decolonise these fields as requiring two basic steps.
Grappling with our fields' histories and heritage holds the key to avoiding the mistakes of the past.
We aspire for these fields to progress away from their stagnant, sexist, and racist shared past.
arXiv Detail & Related papers (2020-09-29T18:48:28Z)