Personalizing image enhancement for critical visual tasks: improved
legibility of papyri using color processing and visual illusions
- URL: http://arxiv.org/abs/2104.01106v2
- Date: Mon, 30 Aug 2021 21:28:00 GMT
- Title: Personalizing image enhancement for critical visual tasks: improved
legibility of papyri using color processing and visual illusions
- Authors: Vlad Atanasiu, Isabelle Marthot-Santaniello
- Abstract summary: Methods: Novel enhancement algorithms based on color processing and visual illusions are compared to classic methods in a user experience experiment.
Users exhibited a broad behavioral spectrum, under the influence of factors such as personality and social conditioning, tasks and application domains, expertise level and image quality, and affordances of software, hardware, and interfaces.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract:
  - Purpose: This article develops theoretical, algorithmic, perceptual, and interaction aspects of script legibility enhancement in the visible light spectrum for the purpose of scholarly editing of papyri texts.
  - Methods: Novel legibility enhancement algorithms based on color processing and visual illusions are compared to classic methods in a user experience experiment.
  - Results: (1) The proposed methods outperformed the comparison methods. (2) Users exhibited a broad behavioral spectrum, under the influence of factors such as personality and social conditioning, tasks and application domains, expertise level and image quality, and affordances of software, hardware, and interfaces. No single enhancement method satisfied all factor configurations. Therefore, it is suggested to offer users a broad choice of methods to facilitate personalization, contextualization, and complementarity. (3) A distinction is made between casual and critical vision on the basis of signal ambiguity and error consequences. The criteria of a paradigm for enhancing images for critical applications comprise: interpreting images skeptically; approaching enhancement as a system problem; considering all image structures as potential information; and making uncertainty and alternative interpretations explicit, both visually and numerically.
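The abstract does not detail the enhancement algorithms themselves; as a rough illustration of the color-processing idea only, the following minimal sketch stretches chromatic contrast in CIELAB so that faded ink separates better from the papyrus background (the file name, color space, and stretch range are assumptions, not the paper's method):

```python
# Illustrative sketch only -- not the paper's algorithm. Stretch the a*/b*
# (chroma) channels of a papyrus photograph so subtle ink/background color
# differences become more visible, keeping the original lightness.
import numpy as np
from skimage import io, color
from skimage.exposure import rescale_intensity

rgb = io.imread("papyrus_fragment.png")[..., :3] / 255.0   # hypothetical input file
lab = color.rgb2lab(rgb)                                    # perceptual color space

for ch in (1, 2):                                           # a* and b* channels
    lab[..., ch] = rescale_intensity(lab[..., ch], out_range=(-60, 60))

enhanced = np.clip(color.lab2rgb(lab), 0, 1)
io.imsave("papyrus_enhanced.png", (enhanced * 255).astype(np.uint8))
```

In line with the abstract's conclusion, a fixed transform like this would be one option among several offered to the user rather than a single default.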
Related papers
- A Psychological Study: Importance of Contrast and Luminance in Color to
Grayscale Mapping [2.1481347363838017]
Grayscale images are essential in image processing and computer vision tasks.
To evaluate and compare different decolorization algorithms, we designed a psychological experiment.
We conducted a comparison between two types of algorithms: (i) perceptual-based simple color space conversion algorithms and (ii) spatial contrast-based algorithms.
arXiv Detail & Related papers (2024-02-07T04:51:14Z)
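As a rough illustration of the first family compared above (perceptual, simple color-space conversion), a minimal sketch using standard Rec. 709 luminance weights; the file name is a placeholder, and the spatial contrast-based family is not reproduced here:

```python
# Illustrative sketch: luminance-based decolorization (Rec. 709 weights).
import numpy as np
from skimage import io

rgb = io.imread("input.png")[..., :3].astype(np.float64)    # hypothetical input file
gray = 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]
io.imsave("gray.png", gray.astype(np.uint8))
```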
- Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout designates and separates the locations of different subjects in the image.
arXiv Detail & Related papers (2023-05-30T18:00:06Z)
- Edge-Aware Image Color Appearance and Difference Modeling [0.0]
Humans have developed a keen sense of color and are able to detect subtle differences in appearance.
Applying contrast sensitivity functions and local adaptation rules in an edge-aware manner improves image difference predictions.
arXiv Detail & Related papers (2023-04-20T22:55:16Z)
- Holistic Visual-Textual Sentiment Analysis with Prior Models [64.48229009396186]
We propose a holistic method that achieves robust visual-textual sentiment analysis.
The proposed method consists of four parts: (1) a visual-textual branch to learn features directly from data for sentiment analysis, (2) a visual expert branch with a set of pre-trained "expert" encoders to extract selected semantic visual features, (3) a CLIP branch to implicitly model visual-textual correspondence, and (4) a multimodal feature fusion network based on BERT to fuse multimodal features and make sentiment predictions.
arXiv Detail & Related papers (2022-11-23T14:40:51Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
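A minimal sketch of zero-shot perceptual assessment in the spirit of the paper above, assuming an off-the-shelf CLIP checkpoint and a simple antonym prompt pair (the paper's exact prompts and protocol may differ):

```python
# Illustrative sketch: score an image's "look" by comparing it against an
# antonym prompt pair with a pre-trained CLIP model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")                    # hypothetical input file
prompts = ["a good photo", "a bad photo"]          # swap prompts to probe "feel"

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image      # shape (1, 2)
score = logits.softmax(dim=-1)[0, 0].item()        # probability mass on "good"
print(f"quality score: {score:.3f}")
```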
- Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
The experimental results on two diagnostic VQA-CP benchmark datasets evidently demonstrate its effectiveness.
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- Image-Based Benchmarking and Visualization for Large-Scale Global Optimization [6.5447678518952115]
An image-based framework is proposed that visualizes the solutions to large-scale global optimization problems as images.
In the proposed framework, the pixels visualize decision variables while the entire image represents the overall solution quality.
The proposed framework is then demonstrated on arbitrary benchmark problems with known optima.
arXiv Detail & Related papers (2020-07-24T03:39:23Z)
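A minimal sketch of the visualization idea described above, assuming the problem dimensionality is a perfect square so each decision variable maps to one pixel (the paper's actual rendering and quality encoding may differ):

```python
# Illustrative sketch: render a candidate solution of a large-scale
# optimization problem as an image, one pixel per decision variable.
import numpy as np
import matplotlib.pyplot as plt

dim = 1024                                   # hypothetical problem dimensionality
solution = np.random.uniform(-5, 5, dim)     # stand-in for an optimizer's candidate

side = int(np.sqrt(dim))                     # assumes dim is a perfect square
img = solution.reshape(side, side)

plt.imshow(img, cmap="viridis")
plt.colorbar(label="decision variable value")
plt.title("Candidate solution as an image")
plt.savefig("solution.png")
```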
- A Novel Attention-based Aggregation Function to Combine Vision and Language [55.7633883960205]
We propose a novel fully-attentive reduction method for vision and language.
Specifically, our approach computes a set of scores for each element of each modality employing a novel variant of cross-attention.
We test our approach on image-text matching and visual question answering, building fair comparisons with other reduction choices.
arXiv Detail & Related papers (2020-04-27T18:09:46Z)
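A minimal sketch of an attention-based reduction of this kind, assuming a single learned query vector that scores and pools a set of features (the paper's cross-attention variant between vision and language is more elaborate):

```python
# Illustrative sketch: reduce a variable-size set of features to one vector
# using learned attention scores.
import torch
import torch.nn as nn

class AttentionReduce(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))     # learned probe vector
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats):                           # feats: (batch, n, dim)
        keys = self.proj(feats)
        scores = keys @ self.query / keys.shape[-1] ** 0.5
        weights = scores.softmax(dim=1)                 # one weight per element
        return (weights.unsqueeze(-1) * feats).sum(dim=1)

pooled = AttentionReduce(512)(torch.randn(2, 36, 512))  # e.g. 36 region features
print(pooled.shape)                                     # torch.Size([2, 512])
```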
- Image-to-Image Translation with Text Guidance [139.41321867508722]
The goal of this paper is to embed controllable factors, i.e., natural language descriptions, into image-to-image translation with generative adversarial networks.
We propose several key components, including: (1) the implementation of part-of-speech tagging to filter out non-semantic words in the given description, (2) the adoption of an affine combination module to effectively fuse text and image features from different modalities, and (3) a novel refined multi-stage architecture to strengthen the differential ability of discriminators and the rectification ability of generators.
arXiv Detail & Related papers (2020-02-12T21:09:15Z)
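As a stand-in for component (2) above, a minimal FiLM-style sketch in which text features predict an affine modulation of image features; the paper's actual affine combination module may be structured differently:

```python
# Illustrative sketch: text features produce per-channel scale and shift
# terms that modulate image features (FiLM-style affine combination).
import torch
import torch.nn as nn

class AffineCombine(nn.Module):
    def __init__(self, text_dim, img_channels):
        super().__init__()
        self.scale = nn.Linear(text_dim, img_channels)  # per-channel gain
        self.shift = nn.Linear(text_dim, img_channels)  # per-channel bias

    def forward(self, img_feat, text_feat):             # (B, C, H, W), (B, T)
        gamma = self.scale(text_feat)[..., None, None]
        beta = self.shift(text_feat)[..., None, None]
        return gamma * img_feat + beta

fused = AffineCombine(256, 64)(torch.randn(2, 64, 16, 16), torch.randn(2, 256))
print(fused.shape)                                      # torch.Size([2, 64, 16, 16])
```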