Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
- URL: http://arxiv.org/abs/2410.03348v1
- Date: Fri, 4 Oct 2024 12:12:36 GMT
- Title: Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
- Authors: Aaditya Naik, Jason Liu, Claire Wang, Saikat Dutta, Mayur Naik, Eric Wong
- Abstract summary: We propose Dolphin, a framework that scales neurosymbolic learning at a fundamental level by mapping forward chaining and backward gradient propagation in symbolic programs to vectorized computations.
Dolphin introduces a set of abstractions and primitives built directly on top of a high-performance deep learning framework like PyTorch.
We evaluate Dolphin on a suite of 13 benchmarks across 5 neurosymbolic tasks that combine deep learning models for text, image, or video processing with symbolic programs.
- Score: 18.50192747078987
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neurosymbolic learning has emerged as a promising paradigm to incorporate symbolic reasoning into deep learning models. However, existing frameworks are limited in scalability with respect to both the training data and the complexity of symbolic programs. We propose Dolphin, a framework to scale neurosymbolic learning at a fundamental level by mapping both forward chaining and backward gradient propagation in symbolic programs to vectorized computations. For this purpose, Dolphin introduces a set of abstractions and primitives built directly on top of a high-performance deep learning framework like PyTorch, effectively enabling symbolic programs to be written as PyTorch modules. Developers can thereby write neurosymbolic programs in a familiar language like Python and compile them to computation graphs that are amenable to end-to-end differentiation on GPUs. We evaluate Dolphin on a suite of 13 benchmarks across 5 neurosymbolic tasks that combine deep learning models for text, image, or video processing with symbolic programs that involve multi-hop reasoning, recursion, and even black-box functions like Python's eval(). Compared to the baselines Scallop, ISED, and IndeCateR+, which time out on most of these inputs, Dolphin takes only 0.33%-37.17% of the time (2.77% on average) to train these models on the largest input per task. Models written in Dolphin also achieve state-of-the-art accuracies even on the largest benchmarks.
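To make the vectorization idea concrete, consider the classic MNIST-addition task, where a network predicts two digits and a symbolic program computes their sum. The sketch below is hypothetical (the function `vectorized_sum` and the setup are illustrative assumptions, not Dolphin's actual API): it shows how symbolic forward chaining over probability distributions can be written as batched PyTorch tensor operations, so that backward gradient propagation comes for free via autograd.
```python
# A minimal, hypothetical sketch of vectorized neurosymbolic inference
# (illustrative only; not Dolphin's actual API).
import torch

def vectorized_sum(p_a: torch.Tensor, p_b: torch.Tensor) -> torch.Tensor:
    """Map batched distributions over digits 0-9 (each [B, 10]) to a
    distribution over their sum ([B, 19]) in one vectorized step."""
    joint = p_a.unsqueeze(2) * p_b.unsqueeze(1)  # [B, 10, 10]: P(a=i, b=j)
    # The sum of every digit pair (i, j), flattened to length 100.
    sums = (torch.arange(10).unsqueeze(1) + torch.arange(10)).flatten()
    out = torch.zeros(p_a.size(0), 19, device=p_a.device)
    # Scatter-add each pair's probability mass into its sum's bucket.
    return out.index_add_(1, sums.to(p_a.device), joint.flatten(1))

# Usage: logits would come from a digit classifier; only the sum label is
# supervised, and gradients flow back through the symbolic computation.
logits_a = torch.randn(32, 10, requires_grad=True)  # stand-in for CNN outputs
logits_b = torch.randn(32, 10, requires_grad=True)
p_sum = vectorized_sum(logits_a.softmax(dim=1), logits_b.softmax(dim=1))
loss = torch.nn.functional.nll_loss(torch.log(p_sum + 1e-9),
                                    torch.randint(0, 19, (32,)))
loss.backward()  # end-to-end differentiation, GPU-friendly
```
The point of the sketch is that the symbolic step becomes just another differentiable tensor operation, which is what allows it to be batched and fused with the neural network's computation graph.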
Related papers
- Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback [71.89119648053396]
We propose Dolphin, the first closed-loop open-ended auto-research framework.
Dolphin can generate research ideas, perform experiments, and get feedback from experimental results to generate higher-quality ideas.
We highlight that Dolphin can automatically propose methods comparable to the state of the art on some tasks, such as 2D image classification and 3D point classification.
arXiv Detail & Related papers (2025-01-07T16:31:10Z)
- Compositional Generalization Across Distributional Shifts with Sparse Tree Operations [77.5742801509364]
We introduce a unified neurosymbolic architecture called the Differentiable Tree Machine.
We significantly increase the model's efficiency through the use of sparse vector representations of symbolic structures.
We enable its application beyond the restricted set of tree2tree problems to the more general class of seq2seq problems.
arXiv Detail & Related papers (2024-12-18T17:20:19Z)
- Discovering symbolic expressions with parallelized tree search [59.92040079807524]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency when handling complex problems.
We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
arXiv Detail & Related papers (2024-07-05T10:41:15Z)
- Deep Symbolic Optimization for Combinatorial Optimization: Accelerating Node Selection by Discovering Potential Heuristics [10.22111332588471]
We propose Dso4NS, a novel deep symbolic optimization learning framework that combines the advantages of deep learning and symbolic expressions.
Dso4NS guides the search for mathematical expressions within the high-dimensional discrete symbolic space and then incorporates the highest-performing mathematical expressions into a solver.
Experiments demonstrate the effectiveness of Dso4NS in learning high-quality expressions, outperforming existing approaches on a CPU machine.
arXiv Detail & Related papers (2024-06-14T06:02:14Z)
- The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning [54.56905063752427]
Neuro-Symbolic AI (NeSy) holds promise to ensure the safe deployment of AI systems.
Existing pipelines that train the neural and symbolic components sequentially require extensive labelling.
A new architecture, NeSyGPT, fine-tunes a vision-language foundation model to extract symbolic features from raw data.
arXiv Detail & Related papers (2024-02-02T20:33:14Z)
- Scallop: A Language for Neurosymbolic Programming [14.148819428748597]
Scallop is a language that combines the benefits of deep learning and logical reasoning.
It is capable of expressing algorithmic reasoning in diverse and challenging AI tasks.
It provides a succinct interface for machine learning programmers to integrate logical domain knowledge.
arXiv Detail & Related papers (2023-04-10T18:46:53Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
Due to error propagation, the accuracy of program generation drops sharply as decoding unfolds.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z)
- Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs [7.656446581986389]
Imperative programming allows users to implement their deep neural networks (DNNs) easily.
Several systems have been proposed to combine the usability of imperative programming with the optimized performance of symbolic graph execution.
We propose Terra, an imperative-symbolic co-execution system that can handle any imperative DL programs while achieving the optimized performance of symbolic graph execution.
arXiv Detail & Related papers (2022-01-23T09:04:48Z)
- SLASH: Embracing Probabilistic Circuits into Neural Answer Set Programming [15.814914345000574]
We introduce SLASH, a novel deep probabilistic programming language (DPPL).
At its core, SLASH consists of Neural-Probabilistic Predicates (NPPs) and logical programs which are united via answer set programming.
We evaluate SLASH on the benchmark data of MNIST addition as well as novel tasks for DPPLs such as missing data prediction and set prediction with state-of-the-art performance.
arXiv Detail & Related papers (2021-10-07T12:35:55Z)