Employing Discrete Fourier Transform in Representational Learning
- URL: http://arxiv.org/abs/2506.06765v1
- Date: Sat, 07 Jun 2025 11:17:24 GMT
- Title: Employing Discrete Fourier Transform in Representational Learning
- Authors: Raoof HojatJalali, Edmondo Trentin
- Abstract summary: A well-established approach is using autoencoders to extract latent representations at the network's compression point. We propose an alternative learning objective, using the Discrete Fourier Transform (DFT) of the input. We validate our approach by achieving 52.8% top-1 accuracy on CIFAR-10 with ResNet-50 and outperforming the traditional autoencoder by 12.8 points under identical architectural configurations.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image representation learning via input reconstruction is a common technique in machine learning for generating representations that can be effectively utilized by arbitrary downstream tasks. A well-established approach is using autoencoders to extract latent representations at the network's compression point. These representations are valuable because they retain essential information necessary for reconstructing the original input from the compressed latent space. In this paper, we propose an alternative learning objective. Instead of using the raw input as the reconstruction target, we employ the Discrete Fourier Transform (DFT) of the input. The DFT provides meaningful global information at each frequency level, making individual frequency components useful as separate learning targets. When dealing with multidimensional input data, the DFT offers remarkable flexibility by enabling selective transformation across specific dimensions while preserving others in the computation. Moreover, certain types of input exhibit distinct patterns in their frequency distributions, where specific frequency components consistently contain most of the magnitude, allowing us to focus on a subset of frequencies rather than the entire spectrum. These characteristics position the DFT as a viable learning objective for representation learning, and we validate our approach by achieving 52.8% top-1 accuracy on CIFAR-10 with ResNet-50 and outperforming the traditional autoencoder by 12.8 points under identical architectural configurations. Additionally, we demonstrate that training on only the lower-frequency components, those with the highest magnitudes, yields results comparable to using the full frequency spectrum, with only minimal reductions in accuracy.
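The abstract's two key ideas, transforming only selected dimensions and keeping only the highest-magnitude low-frequency components as regression targets, can be sketched in a few lines of numpy. This is an illustrative reconstruction of the idea, not the authors' code; the exact target construction (corner masking, keep fraction, real/imaginary stacking) is an assumption.

```python
import numpy as np

def dft_targets(images, keep_frac=0.25):
    """Build DFT-based regression targets from a batch of images.

    Sketch only: the masking scheme and keep_frac value are illustrative
    assumptions, not the paper's exact construction.

    images: real-valued array of shape (N, H, W).
    keep_frac: per-axis fraction of low-frequency bins to keep.
    """
    # 2-D DFT over the spatial axes only, leaving the batch axis
    # untouched -- the selective-dimension flexibility noted above.
    spectra = np.fft.fft2(images, axes=(-2, -1))

    # After an unshifted fft2, the low frequencies sit in the four
    # corners of the spectrum; keep only those blocks.
    _, h, w = images.shape
    kh, kw = max(1, int(h * keep_frac)), max(1, int(w * keep_frac))
    mask = np.zeros((h, w), dtype=bool)
    mask[:kh, :kw] = mask[:kh, -kw:] = True
    mask[-kh:, :kw] = mask[-kh:, -kw:] = True

    # Real and imaginary parts of the kept coefficients become the
    # reconstruction targets for the encoder.
    selected = spectra[:, mask]                       # (N, n_kept)
    return np.concatenate([selected.real, selected.imag], axis=1)

targets = dft_targets(np.random.rand(4, 32, 32))
print(targets.shape)   # (4, 512): 256 kept coefficients, real + imag
```

A network such as ResNet-50 would then regress these targets instead of reconstructing raw pixels, and shrinking `keep_frac` trades a small accuracy loss for a much smaller target vector, mirroring the low-frequency result reported above.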
Related papers
- FreqU-FNet: Frequency-Aware U-Net for Imbalanced Medical Image Segmentation [0.0]
FreqU-FNet is a novel U-shaped segmentation architecture operating in the frequency domain. The framework incorporates a frequency-domain module that leverages Low-Pass Convolution and Daubechies wavelet-based downsampling. Experiments on multiple medical segmentation benchmarks demonstrate that FreqU-FNet consistently outperforms both CNN and Transformer baselines.
arXiv Detail & Related papers (2025-05-23T06:51:24Z)
- Cross-Frequency Implicit Neural Representation with Self-Evolving Parameters [52.574661274784916]
Implicit neural representation (INR) has emerged as a powerful paradigm for visual data representation. We propose a self-evolving cross-frequency INR using the Haar wavelet transform (termed CF-INR), which decouples data into four frequency components and employs INRs in the wavelet space. We evaluate CF-INR on a variety of visual data representation and recovery tasks, including image regression, inpainting, denoising, and cloud removal.
arXiv Detail & Related papers (2025-04-15T07:14:35Z) - Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection [8.607385112274882]
Deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images.<n>Existing deep learning-based RGB-T SOD models suffer from two major limitations.<n>We propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet) for accurate RGB-T SOD.
arXiv Detail & Related papers (2024-11-27T14:55:16Z) - F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring [8.296475046681696]
We propose a novel approach based on the Fractional Fourier Transform (FRFT), a unified spatial-frequency representation.
We show that the performance of our proposed method is superior to other state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2024-09-03T17:05:12Z) - Frequency-Aware Deepfake Detection: Improving Generalizability through
Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z) - Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
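The core mechanism described here, measuring a distribution distance between spectra rather than a pixelwise error, can be illustrated with a minimal sketch. This is not the paper's exact FDL formulation; the sorted-magnitude distance below is an illustrative stand-in chosen because Fourier magnitudes discard translation.

```python
import numpy as np

def frequency_distribution_loss(pred, target):
    """Illustrative frequency-domain distribution distance (an
    assumption, not the paper's exact FDL). Comparing distributions
    of spectral magnitudes is insensitive to small misalignments,
    since |FFT| is invariant to circular translation.
    """
    mag_p = np.abs(np.fft.fft2(pred)).ravel()
    mag_t = np.abs(np.fft.fft2(target)).ravel()
    # 1-D Wasserstein-style distance between the two magnitude
    # distributions: mean absolute difference of sorted samples.
    return float(np.mean(np.abs(np.sort(mag_p) - np.sort(mag_t))))

img = np.random.rand(32, 32)
shifted = np.roll(img, 3, axis=1)                 # misaligned copy
print(frequency_distribution_loss(img, shifted))  # ~0: spectra match
```

A pixelwise L2 loss between `img` and `shifted` would be large despite the images being identical up to translation, which is exactly the failure mode a frequency-domain distribution loss avoids.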
arXiv Detail & Related papers (2024-02-28T09:27:41Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring [39.720032882926176]
We present an effective and efficient method that explores the properties of Transformers in the frequency domain for high-quality image deblurring.
We formulate the proposed FSAS and DFFN into an asymmetrical network based on an encoder and decoder architecture.
arXiv Detail & Related papers (2022-11-22T13:08:03Z)
- Deep Frequency Filtering for Domain Generalization [55.66498461438285]
Deep Neural Networks (DNNs) have preferences for some frequency components in the learning process.
We propose Deep Frequency Filtering (DFF) for learning domain-generalizable features.
We show that applying our proposed DFF on a plain baseline outperforms the state-of-the-art methods on different domain generalization tasks.
arXiv Detail & Related papers (2022-03-23T05:19:06Z)
- Fourier Disentangled Space-Time Attention for Aerial Video Recognition [54.80846279175762]
We present an algorithm, Fourier Activity Recognition (FAR), for UAV video activity recognition.
Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent from the background.
We have evaluated our approach on multiple UAV datasets including UAV Human RGB, UAV Human Night, Drone Action, and NEC Drone.
arXiv Detail & Related papers (2022-03-21T01:24:53Z)
- On Transfer Learning of Traditional Frequency and Time Domain Features in Turning [1.0965065178451106]
We use traditional signal processing tools to identify chatter in accelerometer signals obtained from a turning experiment.
The tagged signals are then used to train a classifier.
Our results show that features extracted from the Fourier spectrum are the most informative when training a classifier and testing on data from the same cutting configuration.
arXiv Detail & Related papers (2020-08-28T14:47:57Z)
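The turning-experiment entry above describes the classical pipeline of extracting Fourier-spectrum features from accelerometer signals before training a classifier. A minimal sketch, assuming a simple band-energy scheme (the binning and band count are illustrative choices, not the paper's):

```python
import numpy as np

def fourier_features(signal, n_bins=8):
    """Traditional frequency-domain feature extraction: bin the
    one-sided power spectrum into n_bins contiguous band energies.
    Sketch only; the specific binning is an illustrative assumption.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2      # one-sided power
    bands = np.array_split(spectrum, n_bins)          # contiguous bands
    return np.array([band.sum() for band in bands])   # energy per band

# A 100 Hz tone sampled at 1 kHz: the energy concentrates in the
# second of eight bands (roughly 63-125 Hz for this signal length).
t = np.arange(1024) / 1000.0
feats = fourier_features(np.sin(2 * np.pi * 100 * t))
print(np.argmax(feats))   # 1
```

Feature vectors like `feats`, computed per tagged signal, would then be fed to an off-the-shelf classifier, matching the entry's observation that Fourier-spectrum features are most informative within a single cutting configuration.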
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.