Related papers: Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks

Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks

URL: http://arxiv.org/abs/2506.02483v1
Date: Tue, 03 Jun 2025 05:54:20 GMT
Title: Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks
Authors: Sina Bagheri Nezhad, Ameeta Agrawal,
Abstract summary: We introduce NeuroSymbolic Augmented Reasoning (NSAR), which combines the benefits of neural and symbolic reasoning during inference.<n>NSAR explicitly extracts symbolic facts from text and generates executable Python code to handle complex reasoning steps.<n>Our results highlight the effectiveness of combining explicit symbolic operations with neural inference for robust, interpretable, and scalable reasoning in multilingual settings.
Score: 1.7648680700685022
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) often struggle to perform multi-target reasoning in long-context scenarios where relevant information is scattered across extensive documents. To address this challenge, we introduce NeuroSymbolic Augmented Reasoning (NSAR), which combines the benefits of neural and symbolic reasoning during inference. NSAR explicitly extracts symbolic facts from text and generates executable Python code to handle complex reasoning steps. Through extensive experiments across seven languages and diverse context lengths, we demonstrate that NSAR significantly outperforms both a vanilla RAG baseline and advanced prompting strategies in accurately identifying and synthesizing multiple pieces of information. Our results highlight the effectiveness of combining explicit symbolic operations with neural inference for robust, interpretable, and scalable reasoning in multilingual settings.

Related papers

Hybrid Neural-LLM Pipeline for Morphological Glossing in Endangered Language Documentation: A Case Study of Jungar Tuvan [6.367163817135528]
We present a hybrid automatic glossing pipeline that combines neural sequence labeling with large language model (LLM) post-correction.<n>We show that retrieval-augmented prompting provides substantial gains over random example selection.<n>We also find that morpheme dictionaries paradoxically hurt performance compared to providing no dictionary at all in most cases.
arXiv Detail & Related papers (2026-03-01T05:03:11Z)
mSCoRe: a $M$ultilingual and Scalable Benchmark for $S$kill-based $Co$mmonsense $Re$asoning [74.97363626515236]
We propose a textbfMultilingual and Scalable Benchmark for textbfSkill-based textbfCommonsense textbfReasoning (textbfmSCoRe)<n>Our benchmark incorporates three key components that are designed to systematically evaluate LLM's reasoning capabilities.<n>Our results reveal the limitations of such reasoning-reinforced models when confronted with nuanced multilingual general and cultural commonsense.
arXiv Detail & Related papers (2025-08-13T18:59:02Z)
Metaphor and Large Language Models: When Surface Features Matter More than Deep Understanding [6.0158981171030685]
This paper presents a comprehensive evaluation of the capabilities of Large Language Models (LLMs) in metaphor interpretation across multiple datasets, tasks, and prompt configurations.<n>We address these limitations by conducting extensive experiments using diverse publicly available datasets with inference and metaphor annotations.<n>The results indicate that LLMs' performance is more influenced by features like lexical overlap and sentence length than by metaphorical content.
arXiv Detail & Related papers (2025-07-21T08:09:11Z)
MMATH: A Multilingual Benchmark for Mathematical Reasoning [94.05289799605957]
We introduce MMATH, a benchmark for multilingual complex reasoning spanning 374 high-quality math problems across 10 typologically diverse languages.<n>We observe that even advanced models like DeepSeek R1 exhibit substantial performance disparities across languages and suffer from a critical off-target issue-generating responses in unintended languages.<n>Our findings offer new insights and practical strategies for advancing the multilingual reasoning capabilities of large language models.
arXiv Detail & Related papers (2025-05-25T12:47:39Z)
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [56.61984030508691]
We present the first mechanistic interpretability study of language confusion.<n>We show that confusion points (CPs) are central to this phenomenon.<n>We show that editing a small set of critical neurons, identified via comparative analysis with a multilingual-tuned counterpart, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z)
Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations [65.11348389219887]
We introduce Dialectic-RAG (DRAG), a modular approach that evaluates retrieved information by comparing, contrasting, and resolving conflicting perspectives.<n>We show the impact of our framework both as an in-context learning strategy and for constructing demonstrations to instruct smaller models.
arXiv Detail & Related papers (2025-04-07T06:55:15Z)
SNeL: A Structured Neuro-Symbolic Language for Entity-Based Multimodal Scene Understanding [0.0]
We introduce SNeL (Structured Neuro-symbolic Language), a versatile query language designed to facilitate nuanced interactions with neural networks processing multimodal data. SNeL's expressive interface enables the construction of intricate queries, supporting logical and arithmetic operators, comparators, nesting, and more. Our evaluations demonstrate SNeL's potential to reshape the way we interact with complex neural networks.
arXiv Detail & Related papers (2023-06-09T17:01:51Z)
Interpretable Multimodal Misinformation Detection with Logic Reasoning [40.851213962307206]
We propose a novel logic-based neural model for multimodal misinformation detection. We parameterize symbolic logical elements using neural representations, which facilitate the automatic generation and evaluation of meaningful logic clauses. Results on three public datasets demonstrate the feasibility and versatility of our model.
arXiv Detail & Related papers (2023-05-10T08:16:36Z)
Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve [78.3500985535601]
We find a surprising connection between multitask learning and robustness to neuron failures. Our experiments show that bilingual language models retain higher performance under various neuron perturbations. We provide a theoretical justification for this robustness by mathematically analyzing linear representation learning.
arXiv Detail & Related papers (2022-10-20T22:23:27Z)
ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering. Our dataset poses great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
Syntax Role for Neural Semantic Role Labeling [77.5166510071142]
Semantic role labeling (SRL) is dedicated to recognizing the semantic predicate-argument structure of a sentence. Previous studies in terms of traditional models have shown syntactic information can make remarkable contributions to SRL performance. Recent neural SRL studies show that syntax information becomes much less important for neural semantic role labeling.
arXiv Detail & Related papers (2020-09-12T07:01:12Z)
Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity. We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.