GANSlider: How Users Control Generative Models for Images using Multiple
Sliders with and without Feedforward Information
- URL: http://arxiv.org/abs/2202.00965v1
- Date: Wed, 2 Feb 2022 11:25:07 GMT
- Title: GANSlider: How Users Control Generative Models for Images using Multiple
Sliders with and without Feedforward Information
- Authors: Hai Dang, Lukas Mecke, Daniel Buschek
- Abstract summary: We investigate how multiple sliders with and without feedforward visualizations influence users' control of generative models.
We found that more control dimensions (sliders) significantly increase task difficulty and user actions.
Visualizations alone are not always sufficient for users to understand individual control dimensions.
- Score: 33.28541180149195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate how multiple sliders with and without feedforward
visualizations influence users' control of generative models. In an online
study (N=138), we collected a dataset of people interacting with a generative
adversarial network (StyleGAN2) in an image reconstruction task. We found that
more control dimensions (sliders) significantly increase task difficulty and
user actions. Visual feedforward partly mitigates this by enabling more
goal-directed interaction. However, we found no evidence of faster or more
accurate task performance. This indicates a tradeoff between feedforward detail
and implied cognitive costs, such as attention. Moreover, we found that
visualizations alone are not always sufficient for users to understand
individual control dimensions. Our study quantifies fundamental UI design
factors and resulting interaction behavior in this context, revealing
opportunities for improvement in the UI design for interactive applications of
generative models. We close by discussing design directions and further
aspects.
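To make the studied interaction concrete, here is a minimal sketch of how multiple sliders can steer a generative model: each slider moves a latent code along a fixed direction, and a short row of preview images per slider can serve as visual feedforward. The `generator` callable, the direction vectors, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def apply_sliders(w_base, directions, slider_values):
    """Offset a base latent code along one direction per slider.

    w_base:        (d,) base latent vector (e.g., a StyleGAN2 w code)
    directions:    (k, d) one unit-norm edit direction per slider
    slider_values: (k,) current slider positions, e.g., in [-3, 3]
    """
    return w_base + (slider_values[:, None] * directions).sum(axis=0)

def feedforward_previews(generator, w_base, directions, steps=(-2.0, 0.0, 2.0)):
    """Render a short row of preview images per slider as visual feedforward.

    `generator` is any callable mapping a latent vector to an image array;
    here it is a stand-in for a pretrained StyleGAN2 synthesis network.
    """
    return [[generator(w_base + s * d) for s in steps] for d in directions]

# Toy usage with a dummy generator that just reshapes the latent code.
rng = np.random.default_rng(0)
w0 = rng.normal(size=512)
dirs = rng.normal(size=(4, 512))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

edited = apply_sliders(w0, dirs, np.array([1.0, -0.5, 0.0, 2.0]))
previews = feedforward_previews(lambda w: w.reshape(16, 32), w0, dirs)
```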
Related papers
- GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation [13.1192216083304]
We propose Graphs and User Modalities Enhancement (GUME), a novel framework for long-tail multimodal recommendation.
Specifically, we first enhance the user-item graph using multimodal similarity between items.
We then construct two types of user modalities: explicit interaction features and extended interest features.
arXiv Detail & Related papers (2024-07-17T06:29:00Z)
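As a rough illustration of the graph-enhancement step summarized above, the sketch below adds item-item edges to a user-item interaction graph based on multimodal feature similarity. The feature matrix, the top-k rule, and all names are assumptions for illustration, not GUME's actual design.

```python
import numpy as np

def add_multimodal_item_edges(item_feats, k=2):
    """Link each item to its k most similar items by multimodal features.

    item_feats: (n_items, d) array of fused multimodal embeddings (assumed given).
    Returns a list of (i, j) item-item edges to add alongside the user-item graph.
    """
    # Cosine similarity between all item pairs.
    normed = item_feats / np.linalg.norm(item_feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # ignore self-similarity

    edges = []
    for i in range(sim.shape[0]):
        for j in np.argsort(sim[i])[::-1][:k]:
            edges.append((i, int(j)))
    return edges

# Toy usage: 5 items with 8-dimensional multimodal features.
feats = np.random.default_rng(1).normal(size=(5, 8))
print(add_multimodal_item_edges(feats, k=2))
```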
- Identifying User Goals from UI Trajectories [19.492331502146886]
This paper introduces the task of goal identification from observed UI trajectories.
We propose a novel evaluation metric to assess whether two task descriptions are paraphrased within a specific UI environment.
Using our metric and these datasets, we conducted several experiments comparing the performance of humans and state-of-the-art models.
arXiv Detail & Related papers (2024-06-20T13:46:10Z)
- Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation [17.279875204729553]
Zero-Shot Object Navigation (ZSON) enables agents to navigate towards open-vocabulary objects in unknown environments.
We introduce ZIPON, where robots need to navigate to personalized goal objects while engaging in conversations with users.
We propose Open-woRld Interactive persOnalized Navigation (ORION) to make sequential decisions to manipulate different modules for perception, navigation and communication.
arXiv Detail & Related papers (2023-10-12T01:17:56Z)
- Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition [45.0131792009999]
We propose a point cloud-based network named Two-stream Multi-level Dynamic Point Transformer for two-person interaction recognition.
Our model addresses the challenge of recognizing two-person interactions by incorporating local-region spatial information, appearance information, and motion information.
Our network outperforms state-of-the-art approaches in most standard evaluation settings.
arXiv Detail & Related papers (2023-07-22T03:51:32Z)
- Learning Large-scale Universal User Representation with Sparse Mixture of Experts [1.2722697496405464]
We propose SUPERMOE, a generic framework for obtaining high-quality user representations from multiple tasks.
Specifically, user behaviour sequences are encoded by an MoE (Mixture-of-Experts) transformer, which allows the model capacity to scale to billions of parameters.
To deal with the seesaw phenomenon when learning across multiple tasks, we design a new loss function with task indicators.
arXiv Detail & Related papers (2022-07-11T06:19:03Z)
- First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization [112.40598205054994]
We formalize human-machine co-adaptation as a completely unsupervised, mutual-information-based objective for optimizing interfaces.
We conduct an observational study on 540K examples of users operating various keyboard and eye gaze interfaces for typing, controlling simulated robots, and playing video games.
The results show that our mutual information scores are predictive of the ground-truth task completion metrics in a variety of domains.
arXiv Detail & Related papers (2022-05-24T21:57:18Z)
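As a toy illustration of using mutual information as an unsupervised interface score, the snippet below estimates MI between discretized user inputs and interface outputs from co-occurrence counts. The binning, data, and variable names are assumptions, not the paper's actual estimator.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in MI estimate (in nats) between two discrete sequences.

    x, y: integer arrays of equal length, e.g., binned user inputs and
    binned interface outputs observed over one interaction episode.
    """
    x, y = np.asarray(x), np.asarray(y)
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

# Toy usage: outputs that track the inputs score higher than random noise.
rng = np.random.default_rng(2)
inputs = rng.integers(0, 4, size=1000)
print(mutual_information(inputs, inputs))                    # high
print(mutual_information(inputs, rng.integers(0, 4, 1000)))  # near zero
```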
- Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation [97.17517060585875]
We present a unified approach to visual navigation using a novel modular transfer learning model.
Our model can effectively leverage its experience from one source task and apply it to multiple target tasks.
Our approach learns faster, generalizes better, and outperforms SoTA models by a significant margin.
arXiv Detail & Related papers (2022-02-05T00:07:21Z)
- Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems.
KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics.
We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
- Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation [61.114580368455236]
User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems.
We propose the concept of the hyper meta-path and use it to construct hyper meta-paths or hyper meta-graphs that explicitly capture the dependencies among a user's different behaviors.
Leveraging the recent success of graph contrastive learning, we learn embeddings of user behavior patterns adaptively rather than assigning a fixed scheme for modeling the dependencies among different behaviors.
arXiv Detail & Related papers (2021-09-07T04:28:09Z)
- ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities.
arXiv Detail & Related papers (2020-08-14T09:11:18Z)
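The joint-embedding scoring described above can be sketched roughly as follows: visual features of a human-object pair and HOI label embeddings are projected into a shared space and compared by cosine similarity. The projection matrices, dimensions, and names are placeholder assumptions, not ConsNet's architecture.

```python
import numpy as np

def hoi_scores(pair_feat, label_embs, W_vis, W_txt):
    """Score candidate HOI labels for one human-object pair.

    pair_feat:  (dv,) visual feature of the human-object pair
    label_embs: (n_labels, dt) word embeddings of HOI labels
    W_vis:      (dv, d) projection of visual features into the joint space
    W_txt:      (dt, d) projection of label embeddings into the joint space
    Returns one cosine-similarity score per label.
    """
    v = pair_feat @ W_vis                  # (d,)
    t = label_embs @ W_txt                 # (n_labels, d)
    v = v / np.linalg.norm(v)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    return t @ v                           # (n_labels,)

# Toy usage with random placeholder weights.
rng = np.random.default_rng(3)
scores = hoi_scores(rng.normal(size=256), rng.normal(size=(10, 300)),
                    rng.normal(size=(256, 64)), rng.normal(size=(300, 64)))
print(scores.argmax())  # index of the best-scoring HOI label
```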
- Disentangled Graph Collaborative Filtering [100.26835145396782]
Disentangled Graph Collaborative Filtering (DGCF) is a new model for learning informative representations of users and items from interaction data.
By modeling a distribution over intents for each user-item interaction, we iteratively refine the intent-aware interaction graphs and representations.
DGCF achieves significant improvements over several state-of-the-art models like NGCF, DisenGCN, and MacridVAE.
arXiv Detail & Related papers (2020-07-03T15:37:25Z)
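A heavily simplified sketch of the intent-aware refinement idea: each user-item interaction gets a softmax distribution over K latent intents, and user embeddings are rebuilt as intent-weighted sums of interacted items over a few rounds. The chunked embeddings, dimensions, and update rule are simplifying assumptions, not DGCF's exact propagation scheme.

```python
import numpy as np

def intent_aware_refine(user_emb, item_emb, interactions, n_intents=2, rounds=3):
    """Iteratively refine user embeddings with per-interaction intent weights.

    user_emb: (n_users, d), item_emb: (n_items, d), d divisible by n_intents
    interactions: list of (user, item) index pairs
    """
    d = user_emb.shape[1]
    chunk = d // n_intents
    for _ in range(rounds):
        new_user = np.zeros_like(user_emb)
        counts = np.zeros(len(user_emb))
        for u, i in interactions:
            # Per-intent affinity: dot product of matching embedding chunks.
            scores = np.array([
                user_emb[u, k * chunk:(k + 1) * chunk] @ item_emb[i, k * chunk:(k + 1) * chunk]
                for k in range(n_intents)
            ])
            w = np.exp(scores - scores.max())
            w /= w.sum()                     # intent distribution for this edge
            # Aggregate the item, weighting each chunk by its intent probability.
            new_user[u] += np.concatenate(
                [w[k] * item_emb[i, k * chunk:(k + 1) * chunk] for k in range(n_intents)])
            counts[u] += 1
        mask = counts > 0
        user_emb = user_emb.copy()
        user_emb[mask] = new_user[mask] / counts[mask, None]
    return user_emb

# Toy usage: 3 users, 4 items, 8-dim embeddings, 2 intents.
rng = np.random.default_rng(4)
refined = intent_aware_refine(rng.normal(size=(3, 8)), rng.normal(size=(4, 8)),
                              [(0, 1), (0, 2), (1, 0), (2, 3)])
```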
This list is automatically generated from the titles and abstracts of the papers on this site.