Related papers: SketchRef: a Multi-Task Evaluation Benchmark for Sketch Synthesis

URL: http://arxiv.org/abs/2408.08623v2
Date: Wed, 09 Apr 2025 03:18:01 GMT
Title: SketchRef: a Multi-Task Evaluation Benchmark for Sketch Synthesis
Authors: Xingyue Lin, Xingjian Hu, Shuai Peng, Jianhua Zhu, Liangcai Gao
Abstract summary: SketchRef is the first comprehensive multi-task evaluation benchmark for sketch synthesis. Tasks are divided into five sub-tasks across four domains: animals, common things, human body, and faces. We validate our approach by collecting 7,920 responses from art enthusiasts.
Score: 6.832790933688975
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sketching is a powerful artistic technique for capturing essential visual information about real-world objects and has increasingly attracted attention in image synthesis research. However, the field lacks a unified benchmark to evaluate the performance of various synthesis methods. To address this, we propose SketchRef, the first comprehensive multi-task evaluation benchmark for sketch synthesis. SketchRef fully leverages the shared characteristics between sketches and reference photos. It introduces two primary tasks: category prediction and structural consistency estimation, the latter being largely overlooked in previous studies. These tasks are further divided into five sub-tasks across four domains: animals, common things, human body, and faces. Recognizing the inherent trade-off between recognizability and simplicity in sketches, we are the first to quantify this balance by introducing a recognizability calculation method constrained by simplicity, mRS, ensuring fair and meaningful evaluations. To validate our approach, we collected 7,920 responses from art enthusiasts, confirming the effectiveness of our proposed evaluation metrics. Additionally, we evaluate the performance of existing sketch synthesis methods on our benchmark, highlighting their strengths and weaknesses. We hope this study establishes a standardized benchmark and offers valuable insights for advancing sketch synthesis algorithms.

Related papers

Annotation-Free Human Sketch Quality Assessment [56.71509868378274]
This paper studies sketch quality assessment for the first time, letting you find the badly drawn ones. The key discovery lies in exploiting the magnitude ($L_2$ norm) of a sketch feature as a quantitative quality metric. We show how such a quality assessment capability can for the first time enable three practical sketch applications.
arXiv Detail & Related papers (2025-07-28T06:18:51Z)
Design and Evaluation of Deep Learning-Based Dual-Spectrum Image Fusion Methods [0.0]
Deep learning-based fusion methods have gained attention, but current evaluations rely on general-purpose metrics without standardized benchmarks or downstream task performance. To address these gaps, we construct a high-quality dual-spectrum dataset captured in campus environments. We propose a comprehensive and fair evaluation framework that integrates fusion speed, general metrics, and object detection performance.
arXiv Detail & Related papers (2025-06-09T13:56:32Z)
Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding [7.687215328455751]
We have developed a new framework inspired by Marr's Primal Sketch. The Hybrid Primal Sketch combines computer vision components into an ensemble to produce sketch-like entities. This paper describes our theoretical framework, summarizes several previous experiments, and outlines a new experiment in progress on diagram understanding.
arXiv Detail & Related papers (2024-07-05T20:44:35Z)
CrossScore: Towards Multi-View Image Evaluation and Scoring [24.853612457257697]
The proposed cross-reference image quality assessment method fills a gap in the image assessment landscape. Our method enables accurate image quality assessment without requiring ground truth references.
arXiv Detail & Related papers (2024-04-22T17:59:36Z)
Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System [1.2289361708127877]
This research addresses a critical challenge in the field of generative models, particularly in the generation and evaluation of synthetic images. We introduce a pioneering algorithm to objectively assess the realism of synthetic images. Our algorithm is particularly tailored to address the challenges in generating and evaluating realistic images of Arabic handwritten digits.
arXiv Detail & Related papers (2024-02-27T04:53:53Z)
SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction [19.70530050403922]
Sketching is a powerful tool for creating abstract images that are sparse but meaningful. Current vision algorithms have achieved high performance on a variety of visual tasks. It remains unclear to what extent they understand sketches in a human-like way.
arXiv Detail & Related papers (2023-12-05T13:54:55Z)
A Fine-Grained Image Description Generation Method Based on Joint Objectives [7.565093400979752]
We propose an innovative Fine-grained Image Description Generation model based on Joint Objectives. We introduce new object-based evaluation metrics to more intuitively assess the model's performance in handling description repetition and omission. Experimental results demonstrate that our proposed method significantly improves the CIDEr evaluation metric.
arXiv Detail & Related papers (2023-09-02T03:22:39Z)
CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components [77.33782775860028]
We introduce CarPatch, a novel synthetic benchmark of vehicles. In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view. Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques.
arXiv Detail & Related papers (2023-07-24T11:59:07Z)
Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings [99.9788496281408]
We study how sketches can be used as a weak label to detect salient objects present in an image. To accomplish this, we introduce a photo-to-sketch generation model that aims to generate sequential sketch coordinates corresponding to a given visual photo. Tests prove our hypothesis and delineate how our sketch-based saliency detection model gives a competitive performance compared to the state-of-the-art.
arXiv Detail & Related papers (2023-03-20T23:46:46Z)
TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data. We learn to predict realistic texture of objects from real image collections. We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
A Visual Navigation Perspective for Category-Level Object Pose Estimation [41.60364392204057]
This paper studies category-level object pose estimation based on a single monocular image. Recent advances in pose-aware generative models have paved the way for addressing this challenging task using analysis-by-synthesis.
arXiv Detail & Related papers (2022-03-25T10:57:37Z)
Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation. The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
TISE: A Toolbox for Text-to-Image Synthesis Evaluation [9.092600296992925]
We conduct a study on state-of-the-art methods for single- and multi-object text-to-image synthesis. We propose a common framework for evaluating these methods.
arXiv Detail & Related papers (2021-12-02T16:39:35Z)
Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations. We propose an unsupervised approach to object part discovery and segmentation. Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot learning has been studied to mimic human visual capabilities and learn effective models without the need for exhaustive human annotation. In this paper, we focus on the design of a training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples. We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
Supervised Video Summarization via Multiple Feature Sets with Parallel Attention [4.931399476945033]
We suggest a novel model architecture that combines three feature sets for visual content and motion to predict importance scores. The proposed architecture utilizes an attention mechanism before fusing motion features and features representing the (static) visual content. Comprehensive experimental evaluations are reported for two well-known datasets, SumMe and TVSum.
arXiv Detail & Related papers (2021-04-23T10:46:35Z)
Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis [54.94682858474711]
Class Activation Mapping (CAM) approaches provide an effective visualization by taking weighted averages of the activation maps. We propose a novel set of metrics to quantify explanation maps, which show better effectiveness and simplify comparisons between approaches.
arXiv Detail & Related papers (2021-04-20T21:34:24Z)
Unifying Remote Sensing Image Retrieval and Classification with Robust Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300. We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z)
Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval [147.24102408745247]
We study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail. In this paper, we design a novel network that is capable of cultivating sketch-specific hierarchies and exploiting them to match sketch with photo at corresponding hierarchical levels.
arXiv Detail & Related papers (2020-07-29T20:50:25Z)
On Learning Semantic Representations for Million-Scale Free-Hand Sketches [146.52892067335128]
We study learning semantic representations for million-scale free-hand sketches. We propose a dual-branch CNN-RNN network architecture to represent sketches. We explore learning the sketch-oriented semantic representations in hashing retrieval and zero-shot recognition.
arXiv Detail & Related papers (2020-07-07T15:23:22Z)
A Revised Generative Evaluation of Visual Dialogue [80.17353102854405]
We propose a revised evaluation scheme for the VisDial dataset. We measure consensus between answers generated by the model and a set of relevant answers. We release these sets and code for the revised evaluation scheme as DenseVisDial.
arXiv Detail & Related papers (2020-04-20T13:26:45Z)
Deep Self-Supervised Representation Learning for Free-Hand Sketch [51.101565480583304]
We tackle the problem of self-supervised representation learning for free-hand sketches. The key to the success of our self-supervised learning paradigm lies in our sketch-specific designs. We show that the proposed approach outperforms the state-of-the-art unsupervised representation learning methods.
arXiv Detail & Related papers (2020-02-03T16:28:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.