DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
- URL: http://arxiv.org/abs/2502.10060v1
- Date: Fri, 14 Feb 2025 10:26:14 GMT
- Title: DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
- Authors: Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier, Bharath Hariharan, Kavita Bala, Carl Vondrick
- Abstract summary: Good interpretation is important in scientific reasoning, as it allows for better decision-making.
This paper introduces an automatic way of obtaining interpretable-by-design models by learning programs that interleave neural networks.
We propose DiSciPLE, an evolutionary algorithm that leverages the common sense and prior knowledge of large language models (LLMs) to create Python programs explaining visual data.
- Score: 61.02102713094486
- Abstract: Visual data is used in numerous scientific workflows, ranging from remote sensing to ecology. As the amount of observational data grows, the challenge is not just to make accurate predictions but also to understand the mechanisms underlying those predictions. Good interpretation is important in scientific workflows, as it allows for better decision-making by providing insights into the data. This paper introduces an automatic way of obtaining such interpretable-by-design models by learning programs that interleave neural networks. We propose DiSciPLE (Discovering Scientific Programs using LLMs and Evolution), an evolutionary algorithm that leverages the common sense and prior knowledge of large language models (LLMs) to create Python programs explaining visual data. Additionally, we propose two improvements, a program critic and a program simplifier, that further help the method synthesize good programs. On three different real-world problems, DiSciPLE learns state-of-the-art programs for novel tasks with no prior literature. For example, for population density estimation we learn programs with 35% lower error than the closest non-interpretable baseline.
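As a rough illustration of the search loop the abstract describes, here is a minimal, hypothetical Python sketch of LLM-guided evolutionary program discovery: an LLM (stubbed out below) proposes offspring programs, candidates are scored against data, and the fittest survive. The function names, fitness metric, and population scheme are illustrative assumptions, not DiSciPLE's actual implementation; the paper's critic and simplifier appear only as a comment.

```python
# Minimal sketch of LLM-guided evolutionary program search (assumptions, not
# the paper's code). llm_propose stands in for a real LLM call.
import random
from typing import Callable, List, Tuple

def llm_propose(parents: List[str]) -> str:
    """Placeholder for an LLM that crosses over and mutates parent programs;
    here we just recombine their lines at random."""
    lines = [ln for p in parents for ln in p.splitlines()]
    random.shuffle(lines)
    return "\n".join(lines[: max(1, len(lines) // 2)])

def fitness(program: str, data: List[Tuple[float, float]]) -> float:
    """Mean squared error of the candidate's predict(x) on the data;
    programs that fail to run or to define predict get infinite error."""
    env: dict = {}
    try:
        exec(program, env)
        predict: Callable[[float], float] = env["predict"]
        return sum((predict(x) - y) ** 2 for x, y in data) / len(data)
    except Exception:
        return float("inf")

def evolve(seeds: List[str], data, generations: int = 10, population: int = 8) -> str:
    pool = list(seeds)
    for _ in range(generations):
        children = [llm_propose(random.sample(pool, 2)) for _ in range(population)]
        # A critic/simplifier pass (as in the paper) would prune and shorten here.
        pool = sorted(pool + children, key=lambda p: fitness(p, data))[:population]
    return pool[0]

data = [(x, 2.0 * x + 1.0) for x in range(10)]
seeds = ["def predict(x):\n    return 2 * x + 1", "def predict(x):\n    return x"]
print(evolve(seeds, data))
```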
Related papers
- Unexpected but informative: What fixation-related potentials tell us about the processing of ambiguous program code [15.510640091254887]
We analyze the online processing of program code patterns that are ambiguous to programmers, but not to the computer.
Relative to unambiguous counterparts in program code, atoms of confusion elicit a late frontal positivity with a duration of about 400 to 700 ms.
We take these data to suggest that the brain engages similar neurocognitive mechanisms in response to unexpected and informative inputs in program code and in natural language.
arXiv Detail & Related papers (2024-12-13T12:38:10Z)
- Statistical investigations into the geometry and homology of random programs [0.2302001830524133]
We show how the relations between random Python programs generated by ChatGPT can be described geometrically and topologically.
We compare publicly available models: ChatGPT-4 and TinyLlama, on a simple problem related to image processing.
We speculate that our approach may in the future give new insights into the structure of programming languages.
arXiv Detail & Related papers (2024-07-05T20:25:02Z)
- PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis [12.778336318809092]
A key challenge in adopting the latest machine learning methods is the representation of programming languages.
To overcome the limitations and challenges of current program representations, we propose a graph-based program representation called PERFOGRAPH.
PERFOGRAPH can capture numerical information and aggregate data structures by introducing new nodes and edges.
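As a toy illustration of a numerically aware program graph, the sketch below builds a tiny graph for one statement and gives a numeric constant its own per-digit nodes. The node and edge naming is an assumption for illustration, not PERFOGRAPH's actual schema.

```python
# Toy numerically aware program graph (illustrative schema, not PERFOGRAPH's).
from typing import Dict, List, Tuple

def number_nodes(value: int):
    """Decompose a constant into per-digit nodes chained in order, so that
    unseen numbers still map onto a bounded vocabulary of digit nodes."""
    digits = [f"digit:{d}" for d in str(value)]
    edges = [(digits[i], digits[i + 1]) for i in range(len(digits) - 1)]
    return digits, edges

def statement_graph(lhs: str, op: str, rhs: int) -> Dict[str, list]:
    """Build a tiny graph for a statement such as `x = x + 128`."""
    nodes: List[str] = [f"var:{lhs}", f"op:{op}"]
    edges: List[Tuple[str, str]] = [(f"var:{lhs}", f"op:{op}")]
    digit_nodes, digit_edges = number_nodes(rhs)
    nodes += digit_nodes
    edges += [(f"op:{op}", digit_nodes[0])] + digit_edges
    return {"nodes": nodes, "edges": edges}

print(statement_graph("x", "+", 128))
```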
arXiv Detail & Related papers (2023-05-31T21:59:50Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, we explore how a mixture of programming and natural languages and multi-epoch training affect model performance.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive [25.679620842010422]
Self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks.
We divide existing graph SSL methods into three categories: contrastive, generative, and predictive.
We also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms.
arXiv Detail & Related papers (2021-05-16T03:30:03Z)
- How could Neural Networks understand Programs? [67.4217527949013]
It is difficult to build a model that better understands programs, either by directly applying off-the-shelf NLP pre-training techniques to the source code or by adding features to the model heuristically.
We propose a novel program semantics learning paradigm, that the model should learn from information composed of (1) the representations which align well with the fundamental operations in operational semantics, and (2) the information of environment transition.
arXiv Detail & Related papers (2021-05-10T12:21:42Z)
- Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL matches the baseline established by state-of-the-art toxicity prediction models trained on domain-specific data, and outperforms it in limited training-data settings.
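A hedged sketch of the reprogramming idea: each target-domain token embedding is sparse-coded over the frozen source model's embedding dictionary. The paper uses a k-SVD solver; this sketch substitutes a single orthogonal-matching-pursuit step from scikit-learn, on synthetic placeholder data with assumed dimensions.

```python
# Sketch of R2DL-style sparse coding: target tokens as sparse combinations of
# source-model embeddings. OMP stands in for the paper's k-SVD solver.
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(0)
d, n_source, n_target = 64, 500, 20           # embed dim, source vocab, target vocab

V = rng.normal(size=(d, n_source))            # frozen source embeddings (dictionary)
V /= np.linalg.norm(V, axis=0, keepdims=True) # OMP assumes unit-norm atoms
T = rng.normal(size=(d, n_target))            # target-domain token embeddings

# Each target token ~ V @ theta with k non-zero coefficients, giving a sparse
# linear map from the target vocabulary into the source model's input space.
Theta = orthogonal_mp(V, T, n_nonzero_coefs=8)   # shape (n_source, n_target)
print("relative reconstruction error:",
      np.linalg.norm(V @ Theta - T) / np.linalg.norm(T))
```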
arXiv Detail & Related papers (2020-12-07T05:50:27Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Learning to learn generative programs with Memoised Wake-Sleep [52.439550543743536]
We study a class of neuro-symbolic generative models in which neural networks are used both for inference and as priors over symbolic, data-generating programs.
We propose the Memoised Wake-Sleep (MWS) algorithm, which extends Wake-Sleep by explicitly storing and reusing the best programs discovered by the inference network throughout training.
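To make the memoisation concrete, here is a simplified sketch (an assumed structure, not the authors' code) of a per-datapoint memory that keeps the k best-scoring programs and reuses them across training iterations; the scoring and proposal functions below are toy stand-ins.

```python
# Toy per-datapoint program memory in the spirit of Memoised Wake-Sleep.
import heapq
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

class ProgramMemory:
    def __init__(self, k: int = 5):
        self.k = k
        # datum id -> min-heap of (score, program), so the worst remembered
        # program is cheap to evict when a better one arrives.
        self.best: Dict[int, List[Tuple[float, str]]] = defaultdict(list)

    def update(self, datum_id: int, program: str, score: float) -> None:
        heap = self.best[datum_id]
        if any(p == program for _, p in heap):
            return  # already memoised
        heapq.heappush(heap, (score, program))
        if len(heap) > self.k:
            heapq.heappop(heap)  # drop the lowest-scoring program

    def top(self, datum_id: int) -> str:
        return max(self.best[datum_id])[1]

# Toy usage: propose programs, score them under a stand-in density, memoise.
memory = ProgramMemory(k=3)
score: Callable[[str, int], float] = lambda prog, x: -abs(len(prog) - x)
for x in (7, 12):
    for prog in ("add1", "reverse", "concat_halves", "identity_map"):
        memory.update(x, prog, score(prog, x))
    print(x, "->", memory.top(x))
```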
arXiv Detail & Related papers (2020-07-06T23:51:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.