Scaffolding Recursive Divergence and Convergence in Story Ideation
- URL: http://arxiv.org/abs/2507.03307v1
- Date: Fri, 04 Jul 2025 05:25:19 GMT
- Title: Scaffolding Recursive Divergence and Convergence in Story Ideation
- Authors: Taewook Kim, Matthew Kay, Yuqian Sun, Melissa Roemmele, Max Kreminski, John Joon Young Chung,
- Abstract summary: Reverger is an AI-powered creativity support tool (CST) that helps users ideate variations of conceptual directions for modifying a story.<n>For divergence, it allows users to collect explored high-level directions and synthesize them into concrete variations.<n>For convergence, it allows users to collect explored high-level directions and synthesize them into concrete variations.
- Score: 19.650585665200254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human creative ideation involves both exploration of diverse ideas (divergence) and selective synthesis of explored ideas into coherent combinations (convergence). While processes of divergence and convergence are often interleaved and nested, existing AI-powered creativity support tools (CSTs) lack support for sophisticated orchestration of divergence and convergence. We present Reverger, an AI-powered CST that helps users ideate variations of conceptual directions for modifying a story by scaffolding flexible iteration between divergence and convergence. For divergence, our tool enables recursive exploration of alternative high-level directions for modifying a specific part of the original story. For convergence, it allows users to collect explored high-level directions and synthesize them into concrete variations. Users can then iterate between divergence and convergence until they find a satisfactory outcome. A within-subject study revealed that Reverger permitted participants to explore more unexpected and diverse high-level directions than a comparable baseline. Reverger users also felt that they had more fine-grained control and discovered more effort-worthy outcomes.
Related papers
- Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking [154.2388970262703]
Unified Vision-Language Models (UVLMs) aim to advance multimodal learning by supporting both understanding and generation within a single framework.<n>We introduce the interleaved Analyzing-Drafting problem-solving loop (AD-Loop), a new think paradigm that alternates between analytic and drafting operations.<n>By interleaving textual thoughts with visual thoughts, AD-Loop enables models to iteratively refine both comprehension and outputs, fostering genuine synergy.
arXiv Detail & Related papers (2026-02-24T23:26:09Z) - Integrating Diverse Assignment Strategies into DETRs [61.61489761918158]
Label assignment is a critical component in object detectors, particularly within DETR-style frameworks.<n>We propose LoRA-DETR, a flexible and lightweight framework that seamlessly integrates diverse assignment strategies into any DETR-style detector.
arXiv Detail & Related papers (2026-01-14T07:28:54Z) - Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models [26.911618039280665]
We propose a process-oriented human-AI co-creation paradigm including divergent and convergent thinking stages.<n>Our paradigm scaffolds both high-level exploration of conceptual ideas in the early divergent thinking phase and low-level exploration of variations in the later convergent thinking phrase.<n>We report on a within-subjects study comparing HAIExplore with a widely used linear chat interface (ChatGPT) for creative image generation.
arXiv Detail & Related papers (2025-12-20T14:58:41Z) - A Survey on Parallel Reasoning [58.66122129692264]
We first present a formal definition of parallel reasoning and clarify its distinction from related concepts like Chain-of-Thought.<n>We then organize and discuss advanced techniques based on a novel taxonomy, including non-interactive reasoning, interactive reasoning, and efficiency-focused decoding strategies.<n>We highlight the core challenges of parallel reasoning and suggest potential directions for future research.
arXiv Detail & Related papers (2025-10-14T05:42:19Z) - Few-Shot, No Problem: Descriptive Continual Relation Extraction [27.296604792388646]
Few-shot Continual Relation Extraction is a crucial challenge for enabling AI systems to identify and adapt to evolving relationships in real-world domains.<n>Traditional memory-based approaches often overfit to limited samples, failing to reinforce old knowledge.<n>We propose a novel retrieval-based solution, starting with a large language model to generate descriptions for each relation.
arXiv Detail & Related papers (2025-02-27T23:44:30Z) - When Disagreements Elicit Robustness: Investigating Self-Repair Capabilities under LLM Multi-Agent Disagreements [56.29265568399648]
We argue that disagreements prevent premature consensus and expand the explored solution space.<n>Disagreements on task-critical steps can derail collaboration depending on the topology of solution paths.
arXiv Detail & Related papers (2025-02-21T02:24:43Z) - GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z) - Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts [39.47316836096974]
We argue that similarity is not always the panacea and totally relying on similarity would sometimes degrade the performance of retrieval augmented generation.
We propose MetRag, a Multi layEred Thoughts enhanced Retrieval Augmented Generation framework.
arXiv Detail & Related papers (2024-05-30T09:50:38Z) - Cantor: Inspiring Multimodal Chain-of-Thought of MLLM [83.6663322930814]
We argue that converging visual context acquisition and logical reasoning is pivotal for tackling visual reasoning tasks.
We propose an innovative multimodal CoT framework, termed Cantor, characterized by a perception-decision architecture.
Our experiments demonstrate the efficacy of the proposed framework, showing significant improvements in multimodal CoT performance.
arXiv Detail & Related papers (2024-04-24T17:59:48Z) - VOLTA: Improving Generative Diversity by Variational Mutual Information Maximizing Autoencoder [38.35049378875308]
We introduce VOLTA, a framework that elevates generative diversity by bridging Transformer with VAE.
We perform comprehensive experiments with two types of Transformers on six datasets to show that our approach can significantly improve generative diversity while maintaining generative quality.
arXiv Detail & Related papers (2023-07-03T08:45:42Z) - Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context
Reasoning with Language Models [58.41943058963672]
We propose a new inference framework called Recursion of Thought (RoT)
RoT introduces several special tokens that the models can output to trigger context-related operations.
Experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems.
arXiv Detail & Related papers (2023-06-12T06:34:16Z) - Visual Chain of Thought: Bridging Logical Gaps with Multimodal
Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data.
Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z) - Combining Reconstruction and Contrastive Methods for Multimodal Representations in RL [16.792949555151978]
Learning self-supervised representations using reconstruction or contrastive losses improves performance and sample complexity of image-based and multimodal reinforcement learning (RL)
Here, different self-supervised loss functions have distinct advantages and limitations depending on the information density of the underlying sensor modality.
We propose Contrastive Reconstructive Aggregated representation Learning (CoRAL), a unified framework enabling us to choose the most appropriate self-supervised loss for each sensor modality.
arXiv Detail & Related papers (2023-02-10T15:57:20Z) - Going Beyond Approximation: Encoding Constraints for Explainable
Multi-hop Inference via Differentiable Combinatorial Solvers [4.726777092009554]
Linear Programming (ILP) provides a viable mechanism to encode explicit and controllable assumptions about explainable multi-hop inference with natural language.
An ILP formulation is non-differentiable and cannot be integrated into broader deep learning architectures.
Diff-Comb Explainer demonstrates improved accuracy and explainability over non-differentiable solvers, Transformers and existing differentiable constraint-based multi-hop inference frameworks.
arXiv Detail & Related papers (2022-08-05T18:07:53Z) - Heterogeneous Target Speech Separation [52.05046029743995]
We introduce a new paradigm for single-channel target source separation where the sources of interest can be distinguished using non-mutually exclusive concepts.
Our proposed heterogeneous separation framework can seamlessly leverage datasets with large distribution shifts.
arXiv Detail & Related papers (2022-04-07T17:14:20Z) - Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z) - Generalized Adversarially Learned Inference [42.40405470084505]
We develop methods of inference of latent variables in GANs by adversarially training an image generator along with an encoder to match two joint distributions of image and latent vector pairs.
We incorporate multiple layers of feedback on reconstructions, self-supervision, and other forms of supervision based on prior or learned knowledge about the desired solutions.
arXiv Detail & Related papers (2020-06-15T02:18:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.