Visual Exploration of Large-Scale Image Datasets for Machine Learning
with Treemaps
- URL: http://arxiv.org/abs/2205.06935v1
- Date: Sat, 14 May 2022 00:26:20 GMT
- Title: Visual Exploration of Large-Scale Image Datasets for Machine Learning
with Treemaps
- Authors: Donald Bertucci, Md Montaser Hamid, Yashwanthi Anand, Anita
Ruangrotsakun, Delyar Tabatabai, Melissa Perez, and Minsuk Kahng
- Abstract summary: We develop DendroMap, a novel approach to exploring large-scale image datasets for machine learning.
It effectively organizes images by extracting hierarchical cluster structures from high-dimensional representations of images.
It enables users to make sense of the overall distributions of datasets and interactively zoom into specific areas of interest.
- Score: 1.881768127321966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present DendroMap, a novel approach to interactively
exploring large-scale image datasets for machine learning. Machine learning
practitioners often explore image datasets by generating a grid of images or
projecting high-dimensional representations of images into 2-D using
dimensionality reduction techniques (e.g., t-SNE). However, neither approach
effectively scales to large datasets because images are ineffectively organized
and interactions are insufficiently supported. To address these challenges, we
develop DendroMap by adapting Treemaps, a well-known visualization technique.
DendroMap effectively organizes images by extracting hierarchical cluster
structures from high-dimensional representations of images. It enables users to
make sense of the overall distributions of datasets and interactively zoom into
specific areas of interest at multiple levels of abstraction. Our case studies
with widely used image datasets for deep learning demonstrate that users can
discover insights about datasets and trained models by examining the diversity
of images, identifying underperforming subgroups, and analyzing classification
errors. We conducted a user study that evaluates the effectiveness of DendroMap
in grouping and searching tasks by comparing it with a gridified version of
t-SNE and found that participants preferred DendroMap over the compared method.
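
The idea in the abstract, building a cluster hierarchy from high-dimensional image representations and laying it out as a treemap, can be illustrated with a small sketch. This is not the authors' implementation. It assumes random vectors as stand-ins for image embeddings, uses a naive centroid-linkage agglomerative clustering, and uses a simple layout that splits each rectangle along its longer side. The names `agglomerate`, `count_leaves`, and `treemap` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def agglomerate(points):
    """Naive agglomerative clustering with centroid linkage (an assumption,
    not necessarily the linkage used by DendroMap).
    Returns a binary tree: a leaf is a point index, a node is (left, right)."""
    # Each cluster is (centroid, size, subtree).
    clusters = [(list(p), 1, i) for i, p in enumerate(points)]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose centroids are closest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(clusters[a][0], clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        ca, na, ta = clusters[a]
        cb, nb, tb = clusters[b]
        # Merge the pair; the new centroid is the size-weighted mean.
        centroid = [(x * na + y * nb) / (na + nb) for x, y in zip(ca, cb)]
        merged = (centroid, na + nb, (ta, tb))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][2]


def count_leaves(tree):
    """Number of points under a subtree."""
    if isinstance(tree, int):
        return 1
    return count_leaves(tree[0]) + count_leaves(tree[1])


def treemap(tree, x, y, w, h, out=None):
    """Assign each leaf a rectangle (x, y, w, h). Each node's rectangle is
    split along its longer side in proportion to the children's leaf counts,
    so every leaf ends up with the same area."""
    if out is None:
        out = {}
    if isinstance(tree, int):
        out[tree] = (x, y, w, h)
        return out
    left, right = tree
    frac = count_leaves(left) / count_leaves(tree)
    if w >= h:
        treemap(left, x, y, w * frac, h, out)
        treemap(right, x + w * frac, y, w * (1 - frac), h, out)
    else:
        treemap(left, x, y, w, h * frac, out)
        treemap(right, x, y + h * frac, w, h * (1 - frac), out)
    return out


# Stand-in "embeddings": two well-separated groups of 8-dimensional vectors.
random.seed(0)
points = [[random.gauss(cx, 0.1) for _ in range(8)]
          for cx in (0.0, 5.0) for _ in range(6)]

tree = agglomerate(points)
rects = treemap(tree, 0.0, 0.0, 1.0, 1.0)

for idx in sorted(rects):
    x, y, w, h = rects[idx]
    print(f"point {idx}: x={x:.3f} y={y:.3f} w={w:.3f} h={h:.3f}")
```

In this sketch, cutting the tree at a shallower depth would give coarser groups, which is roughly the sense in which a treemap over a cluster hierarchy supports viewing a dataset at multiple levels of abstraction. A real system would replace the random vectors with model-derived image representations and use a more scalable clustering routine.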
Related papers
- Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision.
We construct a taxonomy and review the most prominent papers in recent years.
We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z)
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- SepHRNet: Generating High-Resolution Crop Maps from Remote Sensing
imagery using HRNet with Separable Convolution [3.717258819781834]
We propose a novel deep learning approach that integrates HRNet with separable convolutional layers to capture spatial patterns and self-attention to capture temporal patterns of the data.
The proposed algorithm achieves a high classification accuracy of 97.5% and IoU of 55.2% in generating crop maps.
arXiv Detail & Related papers (2023-07-11T18:07:25Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for
Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, with 10-34% relative improvement across various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
- Learning Efficient Representations for Enhanced Object Detection on
Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv Detail & Related papers (2022-01-22T03:25:24Z)
- Learning Hierarchical Graph Representation for Image Manipulation
Detection [50.04902159383709]
The objective of image manipulation detection is to identify and locate the manipulated regions in the images.
Recent approaches mostly adopt sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in the images.
We propose a hierarchical Graph Convolutional Network (HGCN-Net), which consists of two parallel branches.
arXiv Detail & Related papers (2022-01-15T01:54:25Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- Homography augumented momentum constrastive learning for SAR image
retrieval [3.9743795764085545]
We propose a deep learning-based image retrieval approach using homography transformation augmented contrastive learning.
We also propose a training method for the DNNs induced by contrastive learning that does not require any labeling procedure.
arXiv Detail & Related papers (2021-09-21T17:27:07Z)
- From Heatmaps to Structural Explanations of Image Classifiers [31.44267537307587]
The paper starts by describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network.
Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed I-GOS and iGOS++.
Through the research process, we have gained many insights into building deep network explanations.
arXiv Detail & Related papers (2021-09-13T23:39:57Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Sparse data to structured imageset transformation [0.0]
Machine learning problems involving sparse datasets may benefit from the use of convolutional neural networks if the numbers of samples and features are very large.
We convert such datasets to imagesets while attempting to give each image a structure that is amenable to use with convolutional neural networks.
arXiv Detail & Related papers (2020-05-07T20:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.