Random at First, Fast at Last: NTK-Guided Fourier Pre-Processing for Tabular DL
- URL: http://arxiv.org/abs/2506.02406v1
- Date: Tue, 03 Jun 2025 03:45:13 GMT
- Title: Random at First, Fast at Last: NTK-Guided Fourier Pre-Processing for Tabular DL
- Authors: Renat Sergazinov, Jing Wu, Shao-An Yin
- Abstract summary: We revisit and repurpose random Fourier mappings as a parameter-free, architecture-agnostic transformation. We show that this approach circumvents the need for ad hoc normalization or additional learnable embeddings. Empirically, we demonstrate that deep networks trained on Fourier-transformed inputs converge more rapidly and consistently achieve strong final performance.
- Score: 4.6774351030379835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While random Fourier features are a classic tool in kernel methods, their utility as a pre-processing step for deep learning on tabular data has been largely overlooked. Motivated by shortcomings in tabular deep learning pipelines - revealed through Neural Tangent Kernel (NTK) analysis - we revisit and repurpose random Fourier mappings as a parameter-free, architecture-agnostic transformation. By projecting each input into a fixed feature space via sine and cosine projections with frequencies drawn once at initialization, this approach circumvents the need for ad hoc normalization or additional learnable embeddings. We show within the NTK framework that this mapping (i) bounds and conditions the network's initial NTK spectrum, and (ii) introduces a bias that shortens the optimization trajectory, thereby accelerating gradient-based training. These effects pre-condition the network with a stable kernel from the outset. Empirically, we demonstrate that deep networks trained on Fourier-transformed inputs converge more rapidly and consistently achieve strong final performance, often with fewer epochs and less hyperparameter tuning. Our findings establish random Fourier pre-processing as a theoretically motivated, plug-and-play enhancement for tabular deep learning.
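The abstract describes the mapping concretely enough to sketch: each input row is multiplied by a frequency matrix drawn once at initialization and expanded into cosine and sine features, with no learnable parameters. The snippet below is a minimal NumPy illustration of that idea; the feature count, the Gaussian frequency scale `sigma`, and the 1/sqrt(m) scaling are illustrative assumptions, not choices prescribed by the paper.

```python
import numpy as np

def random_fourier_features(X, num_features=128, sigma=1.0, seed=0):
    """Map tabular inputs to a fixed sine/cosine feature space.

    Sketch of a parameter-free random Fourier pre-processing step: the
    frequency matrix W is drawn once and never trained. num_features,
    sigma, and the Gaussian frequency distribution are assumptions.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=sigma, size=(d, num_features))  # fixed random frequencies
    Z = X @ W
    # Concatenate cosine and sine projections; scaling keeps feature norms bounded.
    return np.concatenate([np.cos(Z), np.sin(Z)], axis=1) / np.sqrt(num_features)

# Usage: transform once, then train any downstream network on X_rff.
X = np.random.randn(32, 10)          # 32 tabular rows, 10 raw features
X_rff = random_fourier_features(X)   # shape (32, 256)
```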
Related papers
- Scalable Gaussian Processes with Low-Rank Deep Kernel Decomposition [7.532273334759435]
Kernels are key to encoding prior beliefs and data structures in Gaussian process (GP) models. Deep kernel learning enhances kernel flexibility by feeding inputs through a neural network before applying a standard parametric form. We introduce a fully data-driven, scalable deep kernel representation where a neural network directly represents a low-rank kernel.
arXiv Detail & Related papers (2025-05-24T05:42:11Z) - Robust Fourier Neural Networks [1.0589208420411014]
We show that introducing a simple diagonal layer after the Fourier embedding layer makes the network more robust to measurement noise.
Under certain conditions, our proposed approach can also learn functions that are noisy mixtures of nonlinear functions of Fourier features.
arXiv Detail & Related papers (2024-09-03T16:56:41Z) - Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows [6.961408873053586]
Recent advances in operator-type neural networks have shown promising results in approximating Partial Differential Equations (PDEs). These neural networks entail considerable training expenses and may not always achieve the accuracy required in many scientific and engineering disciplines.
arXiv Detail & Related papers (2024-05-27T14:33:06Z) - Fourier Sensitivity and Regularization of Computer Vision Models [11.79852671537969]
We study the frequency sensitivity characteristics of deep neural networks using a principled approach.
We find that computer vision models are consistently sensitive to particular frequencies dependent on the dataset, training method and architecture.
arXiv Detail & Related papers (2023-01-31T10:05:35Z) - Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z) - Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness structure in the frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency-domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z) - NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z) - Functional Regularization for Reinforcement Learning via Learned Fourier Features [98.90474131452588]
We propose a simple architecture for deep reinforcement learning by embedding inputs into a learned Fourier basis.
We show that it improves the sample efficiency of both state-based and image-based RL.
arXiv Detail & Related papers (2021-12-06T18:59:52Z) - Scaling Neural Tangent Kernels via Sketching and Random Features [53.57615759435126]
Recent works report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets.
We design a near input-sparsity time approximation algorithm for NTK, by sketching the expansions of arc-cosine kernels.
We show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on CIFAR-10 dataset while achieving 150x speedup.
arXiv Detail & Related papers (2021-06-15T04:44:52Z) - Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs (see the sketch after this list).
Experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves state-of-the-art accuracy.
arXiv Detail & Related papers (2021-03-01T08:25:26Z)
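The last entry above (Learning Frequency Domain Approximation for Binary Neural Networks) rests on expanding the sign function as a truncated sine series, so that its term-wise derivative can stand in for the zero-almost-everywhere true gradient. Below is a minimal NumPy sketch of that idea; the number of terms and the period are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sine_series_sign(x, num_terms=10, period=2.0):
    """Truncated Fourier (sine) series of the sign function on (-period/2, period/2).

    The sign function is treated as an odd square wave and expanded into
    odd sine harmonics; the term-wise derivative (a sum of cosines) can
    serve as a surrogate gradient. num_terms and period are assumptions.
    """
    T = period / 2.0
    approx = np.zeros_like(x, dtype=float)
    grad = np.zeros_like(x, dtype=float)
    for i in range(num_terms):
        k = 2 * i + 1                                  # odd harmonics only
        approx += (4.0 / np.pi) * np.sin(k * np.pi * x / T) / k
        grad += (4.0 / T) * np.cos(k * np.pi * x / T)  # term-wise derivative
    return approx, grad

x = np.linspace(-0.9, 0.9, 5)
approx, grad = sine_series_sign(x)   # approx ≈ sign(x) inside the period
```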
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.