iSNEAK: Partial Ordering as Heuristics for Model-Based Reasoning in Software Engineering
- URL: http://arxiv.org/abs/2310.19125v2
- Date: Mon, 15 Jul 2024 01:59:27 GMT
- Title: iSNEAK: Partial Ordering as Heuristics for Model-Based Reasoning in Software Engineering
- Authors: Andre Lustosa, Tim Menzies
- Abstract summary: iSNEAK is an incremental human-in-the-loop AI problem solver.
We propose the use of partial orderings and tools like iSNEAK to solve the information overload problem.
- Score: 11.166755101891402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A "partial ordering" is a way to heuristically order a set of examples (in a partial ordering, for certain pairs of elements, one precedes the other). While these orderings may only be approximate, they can be useful for guiding a search towards better regions of the data. To illustrate the value of that technique, this paper presents iSNEAK, an incremental human-in-the-loop AI problem solver. iSNEAK uses partial orderings and feedback from humans to prune the space of options. Further, in experiments with a dozen software models of increasing size and complexity (with up to 10,000 variables), iSNEAK asked only a handful of questions to return human-acceptable solutions that outperformed the prior state-of-the-art. We propose the use of partial orderings and tools like iSNEAK to solve the information overload problem, where human experts grow fatigued and make mistakes when asked too many questions. iSNEAK mitigates this problem by letting humans explore complex problem spaces in far less time, with far less effort.
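The idea sketched in the abstract (pairwise human feedback used to prune a large candidate space) can be illustrated with a small toy in Python. This is not the iSNEAK algorithm itself: the candidate encoding, the Euclidean distance measure, and the simulated preference oracle below are illustrative assumptions.

```python
import random

def toy_partial_order_search(candidates, prefer, rounds=5):
    """Prune a candidate pool using pairwise preference feedback.

    `prefer(a, b)` returns the preferred of two candidates (standing in
    for the human oracle). After each comparison, candidates closer to
    the loser than to the winner are discarded, so each "question"
    shrinks the space of options.
    """
    def dist(x, y):
        return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

    pool = list(candidates)
    for _ in range(rounds):
        if len(pool) < 2:
            break
        a, b = random.sample(pool, 2)
        winner = prefer(a, b)
        loser = b if winner is a else a
        # Keep only candidates at least as close to the winner as to the loser.
        pool = [c for c in pool if dist(c, winner) <= dist(c, loser)]
    return pool

# Toy oracle: the "human" prefers candidates closer to the origin.
random.seed(1)
cands = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(200)]
prefer = lambda a, b: a if sum(x * x for x in a) <= sum(x * x for x in b) else b
survivors = toy_partial_order_search(cands, prefer, rounds=6)
print(len(survivors), "candidates remain after 6 questions")
```

After a handful of comparisons, only candidates consistent with the elicited partial ordering survive, which is the sense in which a few questions can stand in for exhaustively scoring every option.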
Related papers
- AI Agents as Universal Task Solvers [94.49762121230042]
We show that the optimal speed-up that a universal solver can achieve using past data is tightly related to its algorithmic information. We argue that the key quantity to optimize when scaling reasoning models is time, whose critical role in learning has so far only been indirectly considered.
arXiv Detail & Related papers (2025-10-14T02:17:54Z) - Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error [18.209374823705446]
We focus on the paradigmatic task of Sudoku and achieve state-of-the-art accuracy (99%) compared to prior neuro-symbolic approaches. Our method integrates imitation learning of simple Sudoku rules with an explicit Depth-First Search (DFS) exploration strategy.
arXiv Detail & Related papers (2025-09-26T07:57:34Z) - Performative Thinking? The Brittle Correlation Between CoT Length and Problem Complexity [23.225139930889522]
This work critically examines whether intermediate token sequence length reflects or correlates with problem difficulty. We train transformer models from scratch on derivational traces of the A* search algorithm. We find that even for the simplest tasks, they often produce excessively long reasoning traces and sometimes fail to generate a solution.
arXiv Detail & Related papers (2025-09-09T02:31:16Z) - Frontier LLMs Still Struggle with Simple Reasoning Tasks [53.497499123166804]
This work studies the performance of frontier language models on a broad set of "easy" reasoning problems. We create a suite of procedurally generated simple reasoning tasks, including counting, first-order logic, proof trees, and travel planning. We show that even state-of-the-art thinking models consistently fail on such problems and for similar reasons.
arXiv Detail & Related papers (2025-07-09T22:22:49Z) - Understanding Complexity in VideoQA via Visual Program Generation [31.207902042321006]
We propose a data-driven approach to analyzing query complexity in Video Question Answering (VideoQA). We experimentally show that humans struggle to predict which questions are difficult for machine learning models. We extend it to automatically generate complex questions, constructing a new benchmark that is 1.9 times harder than the popular NExT-QA.
arXiv Detail & Related papers (2025-05-19T17:55:14Z) - Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models [58.57279229066477]
We study how language models (LMs) solve retrieval tasks in diverse situations.
We introduce ORION, a collection of structured retrieval tasks spanning six domains.
We find that LMs internally decompose retrieval tasks in a modular way.
arXiv Detail & Related papers (2023-12-13T18:36:43Z) - Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach [10.293894471295205]
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments.
We present novel and simple algorithms for both learning goals.
arXiv Detail & Related papers (2023-07-13T05:05:30Z) - Models and algorithms for simple disjunctive temporal problems [0.8793721044482611]
We focus on the case where events may have an arbitrarily large number of release and due dates.
We provide three mathematical models to describe this problem using constraint programming and linear programming.
We implement algorithms from the literature and provide the first in-depth empirical study comparing methods to solve simple disjunctive temporal problems.
arXiv Detail & Related papers (2023-02-06T09:40:24Z) - Successive Prompting for Decomposing Complex Questions [50.00659445976735]
Recent works leverage the capabilities of large language models (LMs) to perform complex question answering in a few-shot setting.
We introduce "Successive Prompting", where we iteratively break down a complex task into a simple task, solve it, and then repeat the process until we get the final solution.
Our best model (with successive prompting) achieves an improvement of 5% absolute F1 on a few-shot version of the DROP dataset.
arXiv Detail & Related papers (2022-12-08T06:03:38Z) - Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work focused on modeling the structures of equations, and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph.
arXiv Detail & Related papers (2022-11-29T18:52:06Z) - Is a Question Decomposition Unit All We Need? [20.66688303609522]
We investigate if humans can decompose a hard question into a set of simpler questions that are relatively easier for models to solve.
We analyze a range of datasets involving various forms of reasoning and find that it is indeed possible to significantly improve model performance.
Our findings indicate that Human-in-the-loop Question Decomposition (HQD) can potentially provide an alternate path to building large LMs.
arXiv Detail & Related papers (2022-05-25T07:24:09Z) - Minimizing Entropy to Discover Good Solutions to Recurrent Mixed Integer Programs [0.0]
Current solvers for mixed-integer programming (MIP) problems are designed to perform well on a wide range of problems.
Recent works have shown that machine learning (ML) can be integrated with an MIP solver to inject domain knowledge and efficiently close the optimality gap.
This paper proposes an online solver that uses the notion of entropy to efficiently build a model with minimal training data and tuning.
arXiv Detail & Related papers (2022-02-07T18:52:56Z) - A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals.
There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance.
We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z) - Efficiently Explaining CSPs with Unsatisfiable Subset Optimization [17.498283247757445]
We build on a recently proposed method for explaining solutions of constraint satisfaction problems.
An explanation here is a sequence of simple inference steps, where the simplicity of an inference step is measured by the number and types of constraints and facts used.
We tackle two emerging questions, namely how to generate explanations that are provably optimal and how to generate them efficiently.
arXiv Detail & Related papers (2021-05-25T08:57:43Z) - Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z) - Understanding Unnatural Questions Improves Reasoning over Text [54.235828149899625]
Complex question answering (CQA) over raw text is a challenging task.
Learning an effective CQA model requires large amounts of human-annotated data.
We address the challenge of learning a high-quality programmer (parser) by projecting natural human-generated questions into unnatural machine-generated questions.
arXiv Detail & Related papers (2020-10-19T10:22:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.