GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III
- URL: http://arxiv.org/abs/2504.06067v1
- Date: Tue, 08 Apr 2025 14:09:23 GMT
- Title: GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III
- Authors: Hao Li, Zhenyu Liang, Ran Cheng
- Abstract summary: We propose TensorNSGA-III, a fully tensorized implementation of NSGA-III for large-scale many-objective optimization. TensorNSGA-III maintains the exact selection and variation mechanisms of NSGA-III while achieving significant acceleration. Results show that TensorNSGA-III achieves speedups of up to $3629\times$ over the CPU version of NSGA-III.
- Score: 13.487945730611193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NSGA-III is one of the most widely adopted algorithms for tackling many-objective optimization problems. However, its CPU-based design severely limits scalability and computational efficiency. To address these limitations, we propose TensorNSGA-III, a fully tensorized implementation of NSGA-III that leverages GPU parallelism for large-scale many-objective optimization. Unlike conventional GPU-accelerated evolutionary algorithms that rely on heuristic approximations to improve efficiency, TensorNSGA-III maintains the exact selection and variation mechanisms of NSGA-III while achieving significant acceleration. By reformulating the selection process with tensorized data structures and an optimized caching strategy, our approach effectively eliminates computational bottlenecks inherent in traditional CPU-based and naïve GPU implementations. Experimental results on widely used numerical benchmarks show that TensorNSGA-III achieves speedups of up to $3629\times$ over the CPU version of NSGA-III. Additionally, we validate its effectiveness in multiobjective robotic control tasks, where it discovers diverse and high-quality behavioral solutions. Furthermore, we investigate the critical role of large population sizes in many-objective optimization and demonstrate the scalability of TensorNSGA-III in such scenarios. The source code is available at https://github.com/EMI-Group/evomo
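The core of NSGA-III's environmental selection is associating each candidate solution with its nearest reference direction. Below is a minimal, hypothetical sketch of how this association step can be expressed as one batched tensor operation; all names are illustrative, and the actual TensorNSGA-III implementation lives in the linked EMI-Group/evomo repository.

```python
import torch

def associate(objs: torch.Tensor, refs: torch.Tensor):
    """objs: (n, m) normalized objectives; refs: (r, m) reference directions.

    Returns, for every solution, the index of its nearest reference line and
    the perpendicular distance to it, computed for all n*r pairs at once.
    """
    refs = refs / refs.norm(dim=1, keepdim=True)   # unit-length directions
    proj = objs @ refs.T                           # (n, r) scalar projections
    # Perpendicular distance to a line through the origin with unit
    # direction w is sqrt(|f|^2 - (f . w)^2).
    sq = objs.pow(2).sum(dim=1, keepdim=True) - proj.pow(2)
    dist = sq.clamp_min(0).sqrt()                  # guard tiny negatives
    return dist.argmin(dim=1), dist.min(dim=1).values

# Example: 10,000 solutions, 5 objectives, 210 reference points on the GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
idx, d = associate(torch.rand(10_000, 5, device=device),
                   torch.rand(210, 5, device=device))
```

Because the n-by-r distance matrix comes out of two matrix products, the whole association runs as a handful of GPU kernels instead of a Python loop over solutions.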
Related papers
- A Non-Dominated Sorting Evolutionary Algorithm Updating When Required [0.0]
NSGA-III relies on uniformly distributed reference points to promote diversity in many-objective optimization problems. This paper proposes NSGA-III with Update when Required (NSGA-III-UR), a hybrid algorithm that selectively activates reference vector adaptation.
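As a purely hypothetical illustration of an "update when required" policy (the stagnation trigger and adaptation rule below are assumptions, not the mechanism NSGA-III-UR actually uses), adaptation can be gated on niche occupancy:

```python
import torch

def maybe_adapt(refs, niche_counts, history, patience=5, tol=1e-3):
    """Adapt reference vectors only when niche occupancy has stagnated."""
    occupancy = (niche_counts > 0).float().mean().item()  # active-niche fraction
    history.append(occupancy)
    stalled = (len(history) >= patience and
               max(history[-patience:]) - min(history[-patience:]) < tol)
    if stalled:
        # Placeholder adaptation: pull vectors toward the occupancy-weighted
        # centroid, then renormalize. The paper's actual rule differs.
        w = niche_counts.float() / niche_counts.sum().clamp_min(1)
        centroid = (w.unsqueeze(1) * refs).sum(dim=0, keepdim=True)
        refs = 0.9 * refs + 0.1 * centroid
        refs = refs / refs.norm(dim=1, keepdim=True)
        history.clear()
    return refs
```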
arXiv Detail & Related papers (2025-07-05T02:28:10Z) - Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization [14.87046071090259]
3D Gaussian Splatting (3DGS) has recently gained significant attention for high-quality and efficient view synthesis. Despite its impressive algorithmic performance, real-time rendering on resource-constrained devices remains a major challenge due to tight power and area budgets.
arXiv Detail & Related papers (2025-06-08T10:14:54Z) - FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting [57.97160965244424]
3D Gaussian splatting (3DGS) has enabled various applications in 3D scene representation and novel view synthesis. Previous approaches have focused on pruning less important Gaussians, effectively compressing 3DGS. We present an elastic inference method for 3DGS, achieving substantial rendering performance without additional fine-tuning.
arXiv Detail & Related papers (2025-06-04T17:17:57Z) - Second-order Optimization of Gaussian Splats with Importance Sampling [51.95046424364725]
3D Gaussian Splatting (3DGS) is widely used for novel view rendering due to its high quality and fast inference time.
We propose a novel second-order optimization strategy based on Levenberg-Marquardt (LM) and Conjugate Gradient (CG).
Our method achieves a $3\times$ speedup over standard LM and outperforms Adam by $6\times$ when the Gaussian count is low.
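A generic sketch of the LM-plus-CG combination named above: one damped Gauss-Newton step whose normal equations $(J^\top J + \lambda I)\,\delta = -J^\top r$ are solved matrix-free by conjugate gradients. The tiny dense problem is a placeholder, not the paper's Gaussian-splat formulation.

```python
import torch

def lm_step_cg(J, r, lam, iters=50, tol=1e-8):
    """One Levenberg-Marquardt update direction via conjugate gradients."""
    b = -J.T @ r
    Av = lambda v: J.T @ (J @ v) + lam * v   # matvec without forming J^T J
    x = torch.zeros_like(b)
    res = b - Av(x)
    p = res.clone()
    rs = res @ res
    for _ in range(iters):
        Ap = Av(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        res = res - alpha * Ap
        rs_new = res @ res
        if rs_new.sqrt() < tol:
            break
        p = res + (rs_new / rs) * p
        rs = rs_new
    return x

J = torch.randn(100, 10)         # Jacobian of residuals (placeholder)
r = torch.randn(100)             # current residual vector (placeholder)
dx = lm_step_cg(J, r, lam=1e-2)  # damped Gauss-Newton step
```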
arXiv Detail & Related papers (2025-04-17T12:52:08Z) - Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization [11.508416084439443]
Evolutionary multiobjective optimization (EMO) has made significant strides over the past two decades.
Traditional EMO algorithms face substantial performance limitations due to insufficient parallelism and scalability.
We propose to parallelize EMO algorithms on GPU via the tensorization methodology.
Our experiments show that the tensorized EMO algorithms achieve speedups of up to 1113x compared to their CPU-based counterparts.
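To make the tensorization methodology concrete, here is a hedged sketch of one variation operator, simulated binary crossover (SBX), applied to an entire population in a single batched expression rather than a per-pair Python loop. Population shapes and hyperparameters are illustrative.

```python
import torch

def sbx(parents1, parents2, eta=15.0):
    """Batched SBX; parents1/parents2 are (n, d) decision-variable tensors."""
    u = torch.rand_like(parents1)
    # Spread factor beta, drawn elementwise for every variable of every pair.
    beta = torch.where(
        u <= 0.5,
        (2.0 * u) ** (1.0 / (eta + 1.0)),
        (0.5 / (1.0 - u)) ** (1.0 / (eta + 1.0)),
    )
    c1 = 0.5 * ((1 + beta) * parents1 + (1 - beta) * parents2)
    c2 = 0.5 * ((1 - beta) * parents1 + (1 + beta) * parents2)
    return c1, c2

device = "cuda" if torch.cuda.is_available() else "cpu"
pop = torch.rand(8192, 100, device=device)
child1, child2 = sbx(pop[::2], pop[1::2])  # all pairs crossed in one shot
```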
arXiv Detail & Related papers (2025-03-26T07:30:23Z) - DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds [71.37326848614133]
We propose DashGaussian, a scheduling scheme over the optimization complexity of 3DGS. We show that our method accelerates the optimization of various 3DGS backbones by 45.7% on average.
arXiv Detail & Related papers (2025-03-24T07:17:27Z) - Accelerating Sparse Graph Neural Networks with Tensor Core Optimization [0.0]
Graph neural networks (GNNs) have seen extensive application in domains such as social networks, bioinformatics, and recommendation systems. Traditional computing methods are insufficient to meet the performance demands of GNNs. Recent research has explored parallel acceleration using Tensor Cores and CUDA Cores, but significant challenges persist.
arXiv Detail & Related papers (2024-12-16T01:57:53Z) - Octopus Inspired Optimization Algorithm: Multi-Level Structures and Parallel Computing Strategies [21.96416191573034]
The Octopus Inspired Optimization (OIO) algorithm is inspired by the neural structure of the octopus, especially its hierarchical and decentralised interaction properties. OIO shows faster convergence and higher accuracy, particularly when dealing with multimodal functions and high-dimensional optimisation problems. It is especially suitable for application scenarios that require fast, efficient and robust optimisation methods, such as robot path planning, supply chain management, and energy system management.
arXiv Detail & Related papers (2024-10-10T14:27:38Z) - Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs.
At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads.
At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
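A speculative PyTorch-level sketch of the algorithmic idea only: a multi-exit network whose dropout stays active at inference so that repeated forward passes yield Monte Carlo estimates of predictive uncertainty. Layer sizes and the exit layout are invented for illustration; the paper's contribution is the FPGA co-design built around such networks.

```python
import torch
import torch.nn as nn

class MultiExitMCDropout(nn.Module):
    def __init__(self, in_dim=32, hidden=64, classes=10, p=0.2):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p))
        self.exit1 = nn.Linear(hidden, classes)   # early exit head
        self.block2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p))
        self.exit2 = nn.Linear(hidden, classes)   # final exit head

    def forward(self, x):
        h = self.block1(x)
        return self.exit1(h), self.exit2(self.block2(h))

@torch.no_grad()
def mc_predict(model, x, samples=20):
    model.train()  # keep dropout active for Monte Carlo sampling
    probs = torch.stack([torch.softmax(model(x)[1], dim=-1) for _ in range(samples)])
    return probs.mean(0), probs.std(0)   # predictive mean and uncertainty

model = MultiExitMCDropout()
mean, std = mc_predict(model, torch.randn(4, 32))
```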
arXiv Detail & Related papers (2024-06-20T17:08:42Z) - T-GAE: Transferable Graph Autoencoder for Network Alignment [79.89704126746204]
T-GAE is a graph autoencoder framework that leverages transferability and stability of GNNs to achieve efficient network alignment without retraining.
Our experiments demonstrate that T-GAE outperforms the state-of-the-art optimization method and the best GNN approach by up to 38.7% and 50.8%, respectively.
arXiv Detail & Related papers (2023-10-05T02:58:29Z) - Massively Parallel Genetic Optimization through Asynchronous Propagation
of Populations [50.591267188664666]
Propulate is an evolutionary optimization algorithm and software package for global optimization.
We provide an MPI-based implementation of our algorithm, which features variants of selection, mutation, crossover, and migration.
We find that Propulate is up to three orders of magnitude faster without sacrificing solution accuracy.
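A rough sketch, assuming mpi4py, of the asynchronous island-model pattern the abstract describes: each rank evolves its own subpopulation and exchanges migrants via nonblocking messages, so no rank ever waits at a synchronization barrier. Propulate's real protocol is more involved; this only illustrates the communication style.

```python
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
population = [[random.uniform(-5, 5) for _ in range(10)] for _ in range(32)]

for generation in range(100):
    # ... evolve `population` locally (selection, crossover, mutation) ...
    if size > 1 and generation % 10 == 0:
        dest = random.choice([r for r in range(size) if r != rank])
        # Fire-and-forget migrant; the request handle is dropped for brevity.
        comm.isend(population[0], dest=dest, tag=7)
    # Drain any migrants that already arrived, without ever blocking.
    while size > 1 and comm.iprobe(source=MPI.ANY_SOURCE, tag=7):
        population[-1] = comm.recv(source=MPI.ANY_SOURCE, tag=7)
```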
arXiv Detail & Related papers (2023-01-20T18:17:34Z) - A Mathematical Runtime Analysis of the Non-dominated Sorting Genetic
Algorithm III (NSGA-III) [9.853329403413701]
The Non-dominated Sorting Genetic Algorithm II (NSGA-II) is the most prominent multi-objective evolutionary algorithm for real-world applications.
We provide the first mathematical runtime analysis of the NSGA-III, a refinement of the NSGA-II aimed at better handling more than two objectives.
arXiv Detail & Related papers (2022-11-15T15:10:36Z) - Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and
Algorithm Co-design [66.39546326221176]
Attention-based neural networks have become pervasive in many AI tasks.
The use of the attention mechanism and feed-forward network (FFN) demands excessive computational and memory resources.
This paper proposes a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs.
arXiv Detail & Related papers (2022-09-20T09:28:26Z) - Optimizing Memory Efficiency of Graph Neural Networks on Edge Computing
Platforms [10.045922468883486]
Graph neural networks (GNNs) have achieved state-of-the-art performance on various industrial tasks.
A feature decomposition approach is proposed for memory efficiency optimization of GNN inference.
The proposed approach achieves strong memory optimization on various GNN models covering a wide range of datasets, speeding up inference by up to 3x.
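One plausible reading of the feature-decomposition idea, sketched below under assumed shapes: compute the sparse neighborhood aggregation one feature slice at a time, so peak memory tracks the chunk width rather than the full feature dimension. Tensors and chunk size are placeholders, not the paper's configuration.

```python
import torch

def chunked_aggregate(adj, x, chunk=64):
    """adj: (n, n) sparse adjacency; x: (n, d) dense node features."""
    out = torch.empty_like(x)
    for start in range(0, x.shape[1], chunk):
        sl = slice(start, start + chunk)
        # Aggregate one feature slice; peak memory tracks `chunk`, not `d`.
        out[:, sl] = torch.sparse.mm(adj, x[:, sl].contiguous())
    return out

n, d, e = 5000, 512, 20_000
idx = torch.randint(0, n, (2, e))                              # random edges
adj = torch.sparse_coo_tensor(idx, torch.ones(e), (n, n)).coalesce()
agg = chunked_aggregate(adj, torch.randn(n, d))
```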
arXiv Detail & Related papers (2021-04-07T11:15:12Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.