ImageEye: Batch Image Processing Using Program Synthesis
- URL: http://arxiv.org/abs/2304.03253v3
- Date: Wed, 14 Jun 2023 17:28:27 GMT
- Title: ImageEye: Batch Image Processing Using Program Synthesis
- Authors: Celeste Barnaby, Qiaochu Chen, Roopsha Samanta, Isil Dillig
- Abstract summary: This paper presents a new synthesis-based approach for batch image processing.
Our method can apply fine-grained edits to individual objects within the image.
We have implemented the proposed technique in a tool called ImageEye and evaluated it on 50 image editing tasks.
- Score: 7.111443975103331
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a new synthesis-based approach for batch image
processing. Unlike existing tools that can only apply global edits to the
entire image, our method can apply fine-grained edits to individual objects
within the image. For example, our method can selectively blur or crop specific
objects that have a certain property. To facilitate such fine-grained image
editing tasks, we propose a neuro-symbolic domain-specific language (DSL) that
combines pre-trained neural networks for image classification with other
language constructs that enable symbolic reasoning. Our method can
automatically learn programs in this DSL from user demonstrations by utilizing
a novel synthesis algorithm. We have implemented the proposed technique in a
tool called ImageEye and evaluated it on 50 image editing tasks. Our evaluation
shows that ImageEye is able to automate 96% of these tasks.
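To make the idea in the abstract concrete, here is a minimal sketch of learning an object-selection program from a demonstration. It is not ImageEye's actual DSL or synthesis algorithm: the `Obj` record, the `Is`/`Smiling`/`Not`/`And` predicates, and the brute-force enumeration are all hypothetical stand-ins. The `label` and `smiling` fields are assumed to come from pre-trained neural classifiers and are hard-coded here.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Obj:
    """One detected object; the fields stand in for neural classifier outputs."""
    oid: int
    label: str
    smiling: bool = False


Image = Tuple[Obj, ...]
Pred = Tuple[str, Callable[[Obj], bool]]   # (printable name, test function)
Demo = Tuple[Image, FrozenSet[int]]        # (image, ids of objects the user edited)


def candidates(images: List[Image]) -> Iterator[Pred]:
    """Enumerate predicates from smallest to largest (hypothetical toy DSL)."""
    labels = sorted({o.label for img in images for o in img})
    atoms: List[Pred] = [(f"Is({l})", lambda o, l=l: o.label == l) for l in labels]
    atoms.append(("Smiling", lambda o: o.smiling))
    negs: List[Pred] = [(f"Not({n})", lambda o, f=f: not f(o)) for n, f in atoms]
    yield from atoms
    yield from negs
    for (n1, f1), (n2, f2) in combinations(atoms + negs, 2):
        yield (f"And({n1}, {n2})", lambda o, f1=f1, f2=f2: f1(o) and f2(o))


def select(pred: Pred, image: Image) -> FrozenSet[int]:
    """Ids of the objects in `image` that satisfy the predicate."""
    return frozenset(o.oid for o in image if pred[1](o))


def synthesize(demos: List[Demo]) -> Optional[Pred]:
    """Return the first enumerated predicate consistent with every demonstration."""
    for pred in candidates([img for img, _ in demos]):
        if all(select(pred, img) == chosen for img, chosen in demos):
            return pred
    return None


# The user blurs only the non-smiling face (object 1) in one image.
demo_image = (Obj(0, "face", smiling=True), Obj(1, "face"), Obj(2, "cat"))
program = synthesize([(demo_image, frozenset({1}))])

# The learned predicate can then be applied to the rest of the batch.
new_image = (Obj(0, "cat"), Obj(1, "face"), Obj(2, "face", smiling=True), Obj(3, "face"))
print(program[0])                          # And(Is(face), Not(Smiling))
print(sorted(select(program, new_image)))  # [1, 3]
```

Enumerating smaller predicates first biases the search toward simpler programs, a common heuristic in example-based synthesis. A real system would pair the selected objects with an edit action such as blur or crop.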
Related papers
- MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills [37.48977077142813]
We show that a multimodal large language model (MLLM) can be taught to critique raw photographs.
We demonstrate that MLLMs can be first made aware of the underlying image processing operations.
We then synthesize a reasoning dataset by procedurally manipulating expert-edited photos.
arXiv Detail & Related papers (2025-05-09T16:38:27Z) - WISE: Whitebox Image Stylization by Example-based Learning [0.22835610890984162]
Image-based artistic rendering can synthesize a variety of expressive styles using algorithmic image filtering.
We present an example-based image-processing system that can handle a multitude of stylization techniques.
Our method can be optimized in a style-transfer framework or learned in a generative-adversarial setting for image-to-image translation.
arXiv Detail & Related papers (2022-07-29T10:59:54Z) - FlexIT: Towards Flexible Semantic Image Translation [59.09398209706869]
We propose FlexIT, a novel method which can take any input image and a user-defined text instruction for editing.
First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space.
We iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
arXiv Detail & Related papers (2022-03-09T13:34:38Z) - Learning by Planning: Language-Guided Global Image Editing [53.72807421111136]
We develop a text-to-operation model to map the vague editing language request into a series of editing operations.
The only supervision in the task is the target image, which is insufficient for a stable training of sequential decisions.
We propose a novel operation planning algorithm to generate possible editing sequences from the target image as pseudo ground truth.
arXiv Detail & Related papers (2021-06-24T16:30:03Z) - RTIC: Residual Learning for Text and Image Composition using Graph
Convolutional Network [19.017377597937617]
We study the compositional learning of images and texts for image retrieval.
We introduce a novel method that combines the graph convolutional network (GCN) with existing composition methods.
arXiv Detail & Related papers (2021-04-07T09:41:52Z) - A Benchmark and Baseline for Language-Driven Image Editing [81.74863590492663]
We first present a new language-driven image editing dataset that supports both local and global editing.
Our new method treats each editing operation as a sub-module and can automatically predict operation parameters.
We believe our work, including both the benchmark and the baseline, will advance the image editing area towards a more general and free-form level.
arXiv Detail & Related papers (2020-10-05T20:51:16Z) - Text as Neural Operator: Image Manipulation by Text Instruction [68.53181621741632]
In this paper, we study a setting that allows users to edit an image with multiple objects using complex text instructions to add, remove, or change the objects.
The inputs of the task are multimodal including (1) a reference image and (2) an instruction in natural language that describes desired modifications to the image.
We show that the proposed model performs favorably against recent strong baselines on three public datasets.
arXiv Detail & Related papers (2020-08-11T07:07:10Z) - Semantic Photo Manipulation with a Generative Image Prior [86.01714863596347]
GANs are able to synthesize images conditioned on inputs such as user sketch, text, or semantic labels.
It is hard for GANs to precisely reproduce an input image.
In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image.
Our method can accurately reconstruct the input image and synthesize new content, consistent with the appearance of the input image.
arXiv Detail & Related papers (2020-05-15T18:22:05Z) - Semantic Image Manipulation Using Scene Graphs [105.03614132953285]
We introduce a spatio-semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system from existing real-world datasets with no additional annotation effort.
arXiv Detail & Related papers (2020-04-07T20:02:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.