Related papers: Matching Problems to Solutions: An Explainable Way of Solving Machine Learning Problems

URL: http://arxiv.org/abs/2406.15662v1
Date: Fri, 21 Jun 2024 21:39:34 GMT
Title: Matching Problems to Solutions: An Explainable Way of Solving Machine Learning Problems
Authors: Lokman Saleh, Hafedh Mili, Mounir Boukadoum
Abstract summary: Domain experts from all fields are called upon, working with data scientists, to explore the use of ML techniques to solve their problems. This paper focuses on: 1) the representation of domain problems, ML problems, and the main ML solution artefacts, and 2) a matching function that helps identify the ML algorithm family that is most appropriate for the domain problem at hand.
Score: 1.7368964547487398
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Domain experts from all fields are called upon, working with data scientists, to explore the use of ML techniques to solve their problems. Starting from a domain problem/question, ML-based problem-solving typically involves three steps: (1) formulating the business problem (problem domain) as a data analysis problem (solution domain), (2) sketching a high-level ML-based solution pattern, given the domain requirements and the properties of the available data, and (3) designing and refining the different components of the solution pattern. There has to be a substantial body of ML problem-solving knowledge that ML researchers agree on, and that ML practitioners routinely apply to solve the most common problems. Our work deals with capturing this body of knowledge, and embodying it in an ML problem-solving workbench to help domain specialists who are not ML experts to explore the ML solution space. This paper focuses on: 1) the representation of domain problems, ML problems, and the main ML solution artefacts, and 2) a heuristic matching function that helps identify the ML algorithm family that is most appropriate for the domain problem at hand, given the domain (expert) requirements, and the characteristics of the training data. We review related work and outline our strategy for validating the workbench.
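As a rough illustration of what a heuristic matching function of this kind might look like, the sketch below scores a few algorithm families against a problem description and ranks them. Everything in it is an assumption made for illustration: the names `ProblemProfile`, `AlgorithmFamily`, `match_score`, and `rank_families`, the three problem attributes, the family list, and the scoring weights are hypothetical and are not taken from the paper or its workbench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemProfile:
    """Hypothetical description of a domain problem and its training data."""
    has_labels: bool            # data characteristic: is a target column available?
    target_is_numeric: bool     # data characteristic: numeric vs. categorical target
    needs_explainability: bool  # domain (expert) requirement


@dataclass(frozen=True)
class AlgorithmFamily:
    """Hypothetical description of an ML algorithm family."""
    name: str
    supervised: bool
    predicts_numeric: bool
    explainable: bool


# Illustrative catalogue only; a real workbench would hold far richer knowledge.
FAMILIES = [
    AlgorithmFamily("decision trees (classification)", True, False, True),
    AlgorithmFamily("linear regression", True, True, True),
    AlgorithmFamily("neural networks (classification)", True, False, False),
    AlgorithmFamily("clustering", False, False, True),
]


def match_score(problem: ProblemProfile, family: AlgorithmFamily) -> int:
    """Return a heuristic score; the weights are arbitrary assumptions."""
    score = 0
    # The learning setting must agree with the availability of labels.
    if problem.has_labels == family.supervised:
        score += 2
    # For labelled data, the output type should match the target type.
    if problem.has_labels and problem.target_is_numeric == family.predicts_numeric:
        score += 1
    # Reward explainable families when the domain expert requires it.
    if problem.needs_explainability and family.explainable:
        score += 1
    return score


def rank_families(problem: ProblemProfile, families=FAMILIES):
    """Rank families from best to worst match (ties keep catalogue order)."""
    return sorted(families, key=lambda f: match_score(problem, f), reverse=True)


if __name__ == "__main__":
    churn = ProblemProfile(has_labels=True, target_is_numeric=False,
                           needs_explainability=True)
    for family in rank_families(churn):
        print(match_score(churn, family), family.name)
```

Because each criterion contributes a visible amount to the score, the ranking can be explained to a domain expert term by term, which is the kind of property an explainable matching approach would presumably aim for.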

Related papers

Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey [48.53273952814492]
Large Language Models (LLMs) have emerged as powerful tools capable of tackling complex problems across diverse domains. Applying LLMs to real-world problem-solving presents significant challenges, including multi-step reasoning, domain knowledge integration, and result verification.
arXiv Detail & Related papers (2025-05-06T10:53:58Z)
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models [11.706309334631985]
We present Big-Math, a dataset of over 250,000 high-quality math questions with verifiable answers. Big-Math is purposefully made for reinforcement learning (RL).
arXiv Detail & Related papers (2025-02-24T18:14:01Z)
EHOP: A Dataset of Everyday NP-Hard Optimization Problems [66.41749917354159]
Everyday Hard Optimization Problems (EHOP) is a collection of NP-hard optimization problems expressed in natural language. EHOP includes problem formulations that could be found in computer science textbooks, versions that are dressed up as problems that could arise in real life, and variants of well-known problems with inverted rules. We find that state-of-the-art LLMs, across multiple prompting strategies, systematically solve textbook problems more accurately than their real-life and inverted counterparts.
arXiv Detail & Related papers (2025-02-19T14:39:59Z)
Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation [58.799397354312596]
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in System 1 tasks. Research on System2-to-System1 methods has recently surged, exploring System 2 reasoning knowledge via inference-time computation. In this paper, we focus on code generation, which is a representative System 2 task, and identify two primary challenges.
arXiv Detail & Related papers (2025-02-18T03:20:50Z)
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection [60.297079601066784]
We introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in error detection. ErrorRadar evaluates two sub-tasks: error step identification and error categorization. It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions. Results indicate significant challenges still remain, as GPT-4o with best performance is still around 10% behind human evaluation.
arXiv Detail & Related papers (2024-10-06T14:59:09Z)
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data [20.31528845718877]
Large language models (LLMs) have significantly advanced natural language understanding and demonstrated strong problem-solving abilities. This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset.
arXiv Detail & Related papers (2024-06-26T13:02:35Z)
Eliciting Problem Specifications via Large Language Models [4.055489363682198]
Large language models (LLMs) can be utilized to map a problem class into a semi-formal specification. A cognitive system can then use the problem-space specification to solve multiple instances of problems from the problem class.
arXiv Detail & Related papers (2024-05-20T16:19:02Z)
Divide-or-Conquer? Which Part Should You Distill Your LLM? [38.62667131299918]
We devise a similar strategy that breaks down reasoning tasks into a problem decomposition phase and a problem solving phase. We show that the strategy is able to outperform a single stage solution.
arXiv Detail & Related papers (2024-02-22T22:28:46Z)
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? [140.9751389452011]
We study the biases of large language models (LLMs) in relation to those known in children when solving arithmetic word problems. We generate a novel set of word problems for each of these tests, using a neuro-symbolic approach that enables fine-grained control over the problem features.
arXiv Detail & Related papers (2024-01-31T18:48:20Z)
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model [124.68242155098189]
Large language models (LLMs) have shown remarkable proficiency in human-level reasoning and generation capabilities. G-LLaVA demonstrates exceptional performance in solving geometric problems, significantly outperforming GPT-4-V on the MathVista benchmark with only 7B parameters.
arXiv Detail & Related papers (2023-12-18T17:36:20Z)
MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting. We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems. We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z)
Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models [62.96551299003463]
We propose Thought Propagation (TP) to enhance the complex reasoning ability of Large Language Models. TP first prompts LLMs to propose and solve a set of analogous problems that are related to the input one. TP reuses the results of analogous problems to directly yield a new solution or derive a knowledge-intensive plan for execution to amend the initial solution obtained from scratch.
arXiv Detail & Related papers (2023-10-06T01:40:09Z)
MLPro: A System for Hosting Crowdsourced Machine Learning Challenges for Open-Ended Research Problems [1.3254304182988286]
We develop a system which combines the notion of open-ended ML coding problems with the concept of an automatic online code judging platform. We find that for sufficiently unconstrained and complex problems, many experts submit similar solutions, but some experts provide unique solutions which outperform the "typical" solution class.
arXiv Detail & Related papers (2022-04-04T02:56:12Z)
Understanding the Usability Challenges of Machine Learning In High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains. In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions. We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.