Kolmogorov-Arnold Networks: A Critical Assessment of Claims, Performance, and Practical Viability
- URL: http://arxiv.org/abs/2407.11075v8
- Date: Sun, 14 Sep 2025 09:09:20 GMT
- Title: Kolmogorov-Arnold Networks: A Critical Assessment of Claims, Performance, and Practical Viability
- Authors: Yuntian Hou, Tianrui Ji, Di Zhang, Angelos Stefanidis,
- Abstract summary: Kolmogorov-Arnold Networks (KANs) have gained significant attention as an alternative to traditional multilayer perceptrons. However, recent systematic evaluations reveal substantial discrepancies between theoretical claims and empirical evidence.
- Score: 5.871394981352996
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Kolmogorov-Arnold Networks (KANs) have gained significant attention as an alternative to traditional multilayer perceptrons, with proponents claiming superior interpretability and performance through learnable univariate activation functions. However, recent systematic evaluations reveal substantial discrepancies between theoretical claims and empirical evidence. This critical assessment examines KANs' actual performance across diverse domains using fair comparison methodologies that control for parameters and computational costs. Our analysis demonstrates that KANs outperform MLPs only in symbolic regression tasks, while consistently underperforming in machine learning, computer vision, and natural language processing benchmarks. The claimed advantages largely stem from B-spline activation functions rather than architectural innovations, and computational overhead (1.36-100x slower) severely limits practical deployment. Furthermore, theoretical claims about breaking the "curse of dimensionality" lack rigorous mathematical foundation. We systematically identify the conditions under which KANs provide value versus traditional approaches, establish evaluation standards for future research, and propose a priority-based roadmap for addressing fundamental limitations. This work provides researchers and practitioners with evidence-based guidance for the rational adoption of KANs while highlighting critical research gaps that must be addressed for broader applicability.
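The abstract attributes KANs' apparent gains to their B-spline activation functions rather than to the architecture itself. As a minimal illustration of that mechanism (a sketch, not the authors' implementation), the code below builds one learnable univariate activation as a weighted sum of B-spline basis functions; the grid size, spline degree, domain, and initialization are illustrative assumptions.

```python
import numpy as np

def bspline_basis(x, grid, k=3):
    """Evaluate all degree-k B-spline basis functions at points x
    via the Cox-de Boor recursion. `grid` is the extended knot vector."""
    x = np.asarray(x)[:, None]                              # (n_points, 1)
    # Degree 0: indicator of each knot interval
    B = ((x >= grid[:-1]) & (x < grid[1:])).astype(float)   # (n_points, n_knots-1)
    for d in range(1, k + 1):
        left = (x - grid[:-(d + 1)]) / (grid[d:-1] - grid[:-(d + 1)]) * B[:, :-1]
        right = (grid[d + 1:] - x) / (grid[d + 1:] - grid[1:-d]) * B[:, 1:]
        B = left + right
    return B                                                # (n_points, n_basis)

class SplineActivation:
    """One learnable univariate activation phi(x) = sum_i c_i B_i(x):
    the per-edge function a KAN places where an MLP has a fixed nonlinearity."""
    def __init__(self, n_intervals=8, k=3, domain=(-1.0, 1.0), rng=None):
        rng = rng or np.random.default_rng(0)
        inner = np.linspace(*domain, n_intervals + 1)
        h = inner[1] - inner[0]
        # Extend the knot vector by k knots on each side so the basis
        # is well-defined over the whole domain
        self.grid = np.concatenate([
            inner[0] - h * np.arange(k, 0, -1), inner,
            inner[-1] + h * np.arange(1, k + 1)])
        self.k = k
        self.coef = rng.normal(scale=0.1, size=n_intervals + k)  # learnable

    def __call__(self, x):
        return bspline_basis(x, self.grid, self.k) @ self.coef
```

In a full KAN layer, one such spline sits on every input-output edge and gradient descent fits the coefficients; the point the abstract makes is that this flexibility, not the layered Kolmogorov-Arnold decomposition, accounts for the symbolic-regression wins.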
Related papers
- Is Softmax Loss All You Need? A Principled Analysis of Softmax-family Loss [91.61796429377041]
The Softmax loss is one of the most widely employed surrogate objectives for classification and ranking tasks. We investigate whether different surrogates achieve consistency with classification and ranking metrics, and analyze their gradient dynamics to reveal distinct convergence behaviors. Our results establish a principled foundation and offer practical guidance for loss selection in large-class machine learning applications.
arXiv Detail & Related papers (2026-01-30T09:24:52Z) - Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling [1.6998720690708842]
The research undertakes a comprehensive comparative analysis of Kolmogorov-Arnold Networks (KAN) and Multi-Layer Perceptrons (MLP). KANs utilize spline-based activation functions and grid-based structures, providing a transformative approach compared to traditional neural network frameworks. The study highlights the transformative capabilities of KANs in advancing intelligent systems.
arXiv Detail & Related papers (2026-01-15T16:26:49Z) - Your Group-Relative Advantage Is Biased [74.57406620907797]
Group-based learning methods rely on group-relative advantage estimation to avoid learned critics. In this work, we uncover a fundamental issue of group-based RL: the group-relative advantage estimator is inherently biased relative to the true (expected) advantage. We propose History-Aware Adaptive Difficulty Weighting (HA-DW), an adaptive reweighting scheme that adjusts advantage estimates based on an evolving difficulty anchor and training dynamics.
arXiv Detail & Related papers (2026-01-13T13:03:15Z) - Theory Trace Card: Theory-Driven Socio-Cognitive Evaluation of LLMs [2.98033672654447]
We argue that many socio-cognitive evaluations proceed without an explicit theoretical specification of the target capability. Without this theoretical grounding, benchmarks that exercise only narrow subsets of a capability are routinely misinterpreted as evidence of broad competence. We introduce the Trace Card, a lightweight documentation artifact designed to accompany socio-cognitive evaluations.
arXiv Detail & Related papers (2026-01-05T08:06:50Z) - Efficient Thought Space Exploration through Strategic Intervention [54.35208611253168]
We propose a novel Hint-Practice Reasoning (HPR) framework that operationalizes this insight through two synergistic components. The framework's core innovation lies in Distributional Inconsistency Reduction (DIR), which dynamically identifies intervention points. Experiments across arithmetic and commonsense reasoning benchmarks demonstrate HPR's state-of-the-art efficiency-accuracy tradeoffs.
arXiv Detail & Related papers (2025-11-13T07:26:01Z) - Scientific Machine Learning with Kolmogorov-Arnold Networks [0.0]
The field of scientific machine learning is increasingly adopting Kolmogorov-Arnold Networks (KANs) for data encoding. KANs address issues with enhanced interpretability and flexibility, enabling more efficient modeling of complex nonlinear interactions. This review categorizes recent progress in KAN-based models across three perspectives: (i) data-driven learning, (ii) physics-informed modeling, and (iii) deep-operator learning.
arXiv Detail & Related papers (2025-07-30T01:26:44Z) - Kolmogorov Arnold Networks (KANs) for Imbalanced Data -- An Empirical Perspective [0.0]
Kolmogorov Arnold Networks (KANs) are an architectural advancement in neural computation that offers a mathematically grounded alternative to standard neural networks. This study presents an empirical evaluation of KANs in the context of class-imbalanced classification, using ten benchmark datasets.
arXiv Detail & Related papers (2025-07-18T17:50:51Z) - A Survey of Model Architectures in Information Retrieval [64.75808744228067]
We focus on two key aspects: backbone models for feature extraction and end-to-end system architectures for relevance estimation.
We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs).
We conclude by discussing emerging challenges and future directions, including architectural optimizations for performance and scalability, handling of multimodal, multilingual data, and adaptation to novel application domains beyond traditional search paradigms.
arXiv Detail & Related papers (2025-02-20T18:42:58Z) - Symmetric Pruning of Large Language Models [61.309982086292756]
Popular post-training pruning methods such as Wanda and RIA are known for their simple, yet effective, designs. This paper introduces new theoretical insights that redefine the standard minimization objective for pruning. We propose complementary strategies that consider both input activations and weight significance.
arXiv Detail & Related papers (2025-01-31T09:23:06Z) - Latenrgy: Model Agnostic Latency and Energy Consumption Prediction for Binary Classifiers [0.0]
Machine learning systems increasingly drive innovation across scientific fields and industry. Yet challenges in compute overhead, specifically during inference, limit their scalability and sustainability. This study addresses critical gaps in the literature, chiefly the lack of generalized predictive techniques for latency and energy consumption.
arXiv Detail & Related papers (2024-12-26T14:51:24Z) - A Survey on Kolmogorov-Arnold Network [0.0]
This review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KAN).
KANs distinguish themselves from traditional neural networks by using learnable, spline-parameterized functions instead of fixed activation functions.
This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
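The computational-efficiency concerns raised here, and quantified in the main abstract's 1.36-100x slowdown figure, follow from a back-of-the-envelope parameter count. Under the common KAN parameterization (one spline with G grid intervals and degree k per input-output edge), a KAN layer carries roughly (G + k) times the parameters of an equal-width MLP layer. The sketch below is a simplified illustration that ignores optional residual base weights and scaling terms; the default grid size and degree are illustrative assumptions.

```python
def kan_layer_params(n_in, n_out, n_intervals=8, degree=3):
    # One learnable spline per edge, each with (n_intervals + degree) coefficients
    return n_in * n_out * (n_intervals + degree)

def mlp_layer_params(n_in, n_out):
    # One scalar weight per edge, plus a bias per output unit
    return n_in * n_out + n_out

# Same width, very different cost: a 256 -> 256 layer
kan = kan_layer_params(256, 256)   # 256 * 256 * 11 = 720,896
mlp = mlp_layer_params(256, 256)   # 256 * 256 + 256 =  65,792
print(kan / mlp)                   # roughly an 11x parameter overhead
```

Each of those spline coefficients also requires evaluating a basis function at inference time, which is why the overhead shows up in wall-clock speed and not just in memory.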
arXiv Detail & Related papers (2024-11-09T05:54:17Z) - An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance.
SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring.
An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
arXiv Detail & Related papers (2024-10-23T09:14:57Z) - Integrating Symbolic Neural Networks with Building Physics: A Study and Proposal [1.160352509486639]
Symbolic neural networks, such as Kolmogorov-Arnold Networks (KAN), offer a promising approach for integrating prior knowledge with data-driven methods.
This study explores the application of KAN in building physics, focusing on predictive modeling, knowledge discovery, and continuous learning.
arXiv Detail & Related papers (2024-10-20T08:30:19Z) - Dynamic Few-Shot Learning for Knowledge Graph Question Answering [3.116231004560997]
Large language models present opportunities for innovative Question Answering over Knowledge Graphs (KGQA).
To bridge this gap, solutions have been proposed that rely on fine-tuning or ad-hoc architectures, achieving good results but limited out-of-domain generalization.
In this study, we introduce a novel approach called Dynamic Few-Shot Learning (DFL).
DFL integrates the efficiency of in-context learning and semantic similarity and provides a generally applicable solution for KGQA with state-of-the-art performance.
arXiv Detail & Related papers (2024-07-01T15:59:17Z) - Towards Learning Foundation Models for Heuristic Functions to Solve Pathfinding Problems [12.990207889359402]
Pathfinding problems are found in robotics, computational science, and natural sciences.
Traditional methods to solve these require training deep neural networks (DNNs) for each new problem domain.
This study introduces a novel foundation model, leveraging deep reinforcement learning to train functions that seamlessly adapt to new domains.
arXiv Detail & Related papers (2024-06-01T16:18:20Z) - Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset
and Comprehensive Framework [51.44863255495668]
Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence.
We present the Multi-Modal Reasoning (COCO-MMR) dataset, a novel dataset that encompasses an extensive collection of open-ended questions.
We propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders.
arXiv Detail & Related papers (2023-07-24T08:58:25Z) - From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks [68.8204255655161]
We propose a new type of continuous-time control system, called AutoencODE, based on a controlled field that drives dynamics.
We show that many architectures can be recovered in regions where the loss function is locally convex.
arXiv Detail & Related papers (2023-07-05T13:26:17Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - CausalBench: A Large-scale Benchmark for Network Inference from
Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z) - Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z) - Polynomial-Spline Neural Networks with Exact Integrals [0.0]
We develop a novel neural network architecture that combines a mixture-of-experts model with free knot B1-spline basis functions.
Our architecture exhibits both $h$- and $p$-refinement for regression problems at the convergence rates expected from approximation theory.
We demonstrate the success of our network on a range of regression and variational problems that illustrate the consistency and exact integrability of our network architecture.
arXiv Detail & Related papers (2021-10-26T22:12:37Z) - Local Propagation in Constraint-based Neural Network [77.37829055999238]
We study a constraint-based representation of neural network architectures.
We investigate a simple optimization procedure that is well suited to fulfil the so-called architectural constraints.
arXiv Detail & Related papers (2020-02-18T16:47:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.