Deep Learning, Machine Learning -- Digital Signal and Image Processing: From Theory to Application
- URL: http://arxiv.org/abs/2410.20304v1
- Date: Sun, 27 Oct 2024 02:00:33 GMT
- Title: Deep Learning, Machine Learning -- Digital Signal and Image Processing: From Theory to Application
- Authors: Weiche Hsieh, Ziqian Bi, Junyu Liu, Benji Peng, Sen Zhang, Xuanhe Pan, Jiawei Xu, Jinlang Wang, Keyu Chen, Caitlyn Heqi Yin, Pohsun Feng, Yizhu Wen, Tianyang Wang, Ming Li, Jintao Ren, Qian Niu, Silin Chen, Ming Liu,
- Abstract summary: Digital Signal Processing (DSP) and Digital Image Processing (DIP) with Machine Learning (ML) and Deep Learning (DL) are popular research areas in Computer Vision.
We highlight transformative applications in image enhancement, filtering techniques, and pattern recognition.
We implement algorithms that optimize real-time data processing, forming a foundation for scalable, high-performance solutions in computer vision.
- Score: 17.284340134417285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digital Signal Processing (DSP) and Digital Image Processing (DIP) with Machine Learning (ML) and Deep Learning (DL) are popular research areas in Computer Vision and related fields. We highlight transformative applications in image enhancement, filtering techniques, and pattern recognition. By integrating frameworks like the Discrete Fourier Transform (DFT), Z-Transform, and Fourier Transform methods, we enable robust data manipulation and feature extraction essential for AI-driven tasks. Using Python, we implement algorithms that optimize real-time data processing, forming a foundation for scalable, high-performance solutions in computer vision. This work illustrates the potential of ML and DL to advance DSP and DIP methodologies, contributing to artificial intelligence, automated feature extraction, and applications across diverse domains.
Related papers
- MGCR-Net:Multimodal Graph-Conditioned Vision-Language Reconstruction Network for Remote Sensing Change Detection [55.702662643521265]
We propose the multimodal graph-conditioned vision-language reconstruction network (MGCR-Net) to explore the semantic interaction capabilities of multimodal data.<n> Experimental results on four public datasets demonstrate that MGCR achieves superior performance compared to mainstream CD methods.
arXiv Detail & Related papers (2025-08-03T02:50:08Z) - DRL-based Dolph-Tschebyscheff Beamforming in Downlink Transmission for Mobile Users [52.9870460238443]
We propose a deep reinforcement learning-based blind beamforming technique using a learnable Dolph-Tschebyscheff antenna array.
Our simulation results show that the proposed method can support data rates very close to the best possible values.
arXiv Detail & Related papers (2025-02-03T11:50:43Z) - From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing [16.755590790629153]
This review examines the development and application of multi-modal language models (MLLMs) in remote sensing.
We focus on their ability to interpret and describe satellite imagery using natural language.
Key applications such as scene description, object detection, change detection, text-to-image retrieval, image-to-text generation, and visual question answering are discussed.
arXiv Detail & Related papers (2024-11-05T12:14:22Z) - Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as the Inverted Image Pyramid Networks (PIIP)
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z) - Harnessing Machine Learning for Discerning AI-Generated Synthetic Images [2.6227376966885476]
We employ machine learning techniques to discern between AI-generated and genuine images.
We refine and adapt advanced deep learning architectures like ResNet, VGGNet, and DenseNet.
The experimental results were significant, demonstrating that our optimized deep learning models outperform traditional methods.
arXiv Detail & Related papers (2024-01-14T20:00:37Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - Random resistive memory-based deep extreme point learning machine for
unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM)
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Attention Mechanism for Contrastive Learning in GAN-based Image-to-Image
Translation [3.90801108629495]
We propose a GAN-based model that is capable of generating high-quality images across different domains.
We leverage Contrastive Learning to train the model in a self-supervised way using image data acquired in the real world using real sensors and simulated images from 3D games.
arXiv Detail & Related papers (2023-02-23T14:23:23Z) - Deep Learning Computer Vision Algorithms for Real-time UAVs On-board
Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z) - Concurrent Neural Tree and Data Preprocessing AutoML for Image
Classification [0.5735035463793008]
Current state-of-the-art (SOTA) methods do not include traditional methods for manipulating input data as part of the algorithmic search space.
We adapt the Evolutionary Multi-objective Algorithm Design Engine (EMADE), a multi-objective evolutionary search framework for traditional machine learning methods, to perform neural architecture search.
We show that including these methods as part of the search space shows potential to provide benefits to performance on the CIFAR-10 image classification benchmark dataset.
arXiv Detail & Related papers (2022-05-25T20:03:09Z) - Multimodal Representation Learning With Text and Images [2.998895355715139]
This project leverages multimodal AI and matrix factorization techniques for representation learning, on text and image data simultaneously.
The learnt representations are evaluated using downstream classification and regression tasks.
arXiv Detail & Related papers (2022-04-30T03:25:01Z) - Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL achieves the baseline established by state of the art toxicity prediction models trained on domain-specific data and outperforms the baseline in a limited training-data setting.
arXiv Detail & Related papers (2020-12-07T05:50:27Z) - Graph signal processing for machine learning: A review and new
perspectives [57.285378618394624]
We review a few important contributions made by GSP concepts and tools, such as graph filters and transforms, to the development of novel machine learning algorithms.
We discuss exploiting data structure and relational priors, improving data and computational efficiency, and enhancing model interpretability.
We provide new perspectives on future development of GSP techniques that may serve as a bridge between applied mathematics and signal processing on one side, and machine learning and network science on the other.
arXiv Detail & Related papers (2020-07-31T13:21:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.