Six Sigma For Neural Networks: Taguchi-based optimization
- URL: http://arxiv.org/abs/2509.25213v1
- Date: Mon, 22 Sep 2025 06:50:25 GMT
- Title: Six Sigma For Neural Networks: Taguchi-based optimization
- Authors: Sai Varun Kodathala
- Abstract summary: This study introduces the application of Taguchi Design of Experiments methodology, a statistical optimization technique traditionally used in quality engineering. To address the multi-objective nature of machine learning optimization, five different approaches were developed to simultaneously optimize training accuracy, validation accuracy, training loss, and validation loss. Results demonstrate that Approach 3, combining weighted accuracy metrics with logarithmically transformed loss functions, achieved optimal performance with 98.84% training accuracy and 86.25% validation accuracy while maintaining minimal loss values.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The optimization of hyperparameters in convolutional neural networks (CNNs) remains a challenging and computationally expensive process, often requiring extensive trial-and-error approaches or exhaustive grid searches. This study introduces the application of Taguchi Design of Experiments methodology, a statistical optimization technique traditionally used in quality engineering, to systematically optimize CNN hyperparameters for professional boxing action recognition. Using an L12(2^11) orthogonal array, eight hyperparameters including image size, color mode, activation function, learning rate, rescaling, shuffling, vertical flip, and horizontal flip were systematically evaluated across twelve experimental configurations. To address the multi-objective nature of machine learning optimization, five different approaches were developed to simultaneously optimize training accuracy, validation accuracy, training loss, and validation loss using Signal-to-Noise ratio analysis. The study employed a novel logarithmic scaling technique to unify conflicting metrics and enable comprehensive multi-quality assessment within the Taguchi framework. Results demonstrate that Approach 3, combining weighted accuracy metrics with logarithmically transformed loss functions, achieved optimal performance with 98.84% training accuracy and 86.25% validation accuracy while maintaining minimal loss values. The Taguchi analysis revealed that learning rate emerged as the most influential parameter, followed by image size and activation function, providing clear guidance for hyperparameter prioritization in CNN optimization.
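The pipeline behind these numbers is simple to reproduce: train one CNN per row of the orthogonal array, collapse the four responses of each run into a single figure of merit, convert it to a signal-to-noise (S/N) ratio, and rank factors by the gap between their per-level S/N means. A minimal Python sketch follows; the Plackett-Burman-style array construction, the weights in composite_response, and the dummy run results are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# One standard construction of the two-level L12 (Plackett-Burman) array:
# cyclic shifts of a generator row plus an all-low row -> 12 runs x 11 columns.
# The study assigns 8 of the 11 columns to hyperparameters (image size,
# color mode, activation, learning rate, rescaling, shuffling, flips).
gen = np.array([1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0])
L12 = np.array([np.roll(gen, k) for k in range(11)] + [np.zeros(11, int)])

def sn_larger_is_better(y):
    """Taguchi larger-is-better signal-to-noise ratio, in dB."""
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / y ** 2))

def composite_response(train_acc, val_acc, train_loss, val_loss,
                       w=(0.4, 0.4, 0.1, 0.1)):
    """Approach-3-style response: weighted accuracies plus log-transformed
    losses (-log10 turns 'small loss' into 'large score'). The weights w
    are hypothetical; the abstract does not state the exact values."""
    return (w[0] * train_acc + w[1] * val_acc
            - w[2] * np.log10(train_loss) - w[3] * np.log10(val_loss))

# Dummy metrics for the 12 runs; in practice each row comes from one training.
rng = np.random.default_rng(0)
runs = zip(rng.uniform(0.90, 0.99, 12), rng.uniform(0.80, 0.90, 12),
           rng.uniform(0.05, 0.30, 12), rng.uniform(0.30, 0.60, 12))
sn = np.array([sn_larger_is_better([composite_response(*r)]) for r in runs])

# Main-effects analysis: the per-factor gap between level means ranks influence.
for col in range(8):
    lo, hi = sn[L12[:, col] == 0].mean(), sn[L12[:, col] == 1].mean()
    print(f"factor {col}: best level = {int(hi > lo)}, delta = {abs(hi - lo):.3f} dB")
```

The final loop is the Taguchi main-effects step: the factor with the largest delta is the most influential, which is how the paper arrives at the ranking learning rate, then image size, then activation function.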
Related papers
- Intelligent Optimization of Multi-Parameter Micromixers Using a Scientific Machine Learning Framework
This paper introduces a novel framework leveraging cutting-edge Scientific Machine Learning (Sci-ML) methodologies. The proposed method provides instantaneous solutions to a spectrum of complex, multidimensional optimization problems.
arXiv Detail & Related papers (2025-11-10T23:54:15Z)
- Beyond the Ideal: Analyzing the Inexact Muon Update
We present the first analysis of the inexact update at Muon's core. We reveal a fundamental coupling between this inexactness and the optimal step size and momentum.
arXiv Detail & Related papers (2025-10-22T18:01:07Z)
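For context on where the inexactness enters: Muon orthogonalizes the momentum matrix with a few Newton-Schulz iterations, so the applied update is only an approximation of the ideal orthogonal factor. A sketch of that iteration, using the quintic coefficients popularized by the reference Muon implementation (an assumption here, not something stated in the summary above):

```python
import torch

def newton_schulz_orthogonalize(G, steps=5):
    """Approximate the nearest orthogonal factor of a 2-D update matrix G.
    Fewer steps are cheaper but more inexact; the paper above analyzes how
    this inexactness couples with the optimal step size and momentum."""
    a, b, c = 3.4445, -4.7750, 2.0315      # quintic iteration coefficients
    X = G / (G.norm() + 1e-7)              # scale so singular values are <= 1
    if G.shape[0] > G.shape[1]:
        X = X.T                            # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if G.shape[0] > G.shape[1] else X
```

As steps grows, the result approaches the exact orthogonal factor from the SVD of G; the analysis above concerns what happens when it does not.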
- SO-PIFRNN: Self-optimization physics-informed Fourier-features randomized neural network for solving partial differential equations
This study proposes a self-optimization physics-informed Fourier-features randomized neural network (SO-PIFRNN) framework. The inner-level optimization determines the output layer weights of the neural network via the least squares method. The experimental results affirm that SO-PIFRNN exhibits superior approximation accuracy and frequency capture capability.
arXiv Detail & Related papers (2025-08-07T02:08:34Z)
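The inner-level solve mentioned above is the classic randomized-network step: with the hidden Fourier features fixed and untrained, the output weights are the solution of a linear least-squares problem. A minimal sketch using 1-D function fitting as a stand-in for the PDE setting (the feature scales are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)[:, None]   # sample points
y = np.sin(6 * np.pi * x[:, 0])           # target, standing in for a PDE solution

# Fixed random Fourier features: frequencies and phases are drawn once, never trained.
n_feat = 100
W = rng.normal(0.0, 20.0, (1, n_feat))
b = rng.uniform(0.0, 2.0 * np.pi, n_feat)
H = np.cos(x @ W + b)                     # (200, n_feat) hidden activations

# Inner-level optimization: output-layer weights via linear least squares.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
print("max abs error:", np.abs(H @ beta - y).max())
```

In SO-PIFRNN the rows of the design matrix would instead be PDE-residual and boundary terms at collocation points, but the least-squares structure of the inner level is the same.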
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
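Read as a multi-task update, the method needs a direction that lowers the retain loss while raising the forget loss without either gradient dominating. The sketch below is a hypothetical reconstruction on that reading: both gradients are rescaled to unit norm before being differenced; the paper's adaptive learning-rate component is omitted.

```python
import torch

def ngdiff_step(model, retain_loss, forget_loss, lr=1e-3):
    """One normalized gradient-difference update (hypothetical sketch):
    descend the retain-loss gradient, ascend the forget-loss gradient,
    each normalized so the trade-off is controlled rather than
    magnitude-driven."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_r = torch.autograd.grad(retain_loss, params, retain_graph=True)
    g_f = torch.autograd.grad(forget_loss, params)
    norm_r = torch.sqrt(sum(g.pow(2).sum() for g in g_r)) + 1e-12
    norm_f = torch.sqrt(sum(g.pow(2).sum() for g in g_f)) + 1e-12
    with torch.no_grad():
        for p, gr, gf in zip(params, g_r, g_f):
            p -= lr * (gr / norm_r - gf / norm_f)
```

The per-objective normalization is what provides the control over the trade-off highlighted above: the step weights both objectives equally regardless of their raw gradient magnitudes.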
- Unveiling the optimization process of Physics Informed Neural Networks: How accurate and competitive can PINNs be?
This study investigates the potential accuracy of physics-informed neural networks, contrasting their approach with previous similar works and traditional numerical methods. We find that selecting improved optimization algorithms significantly enhances the accuracy of the results. Simple modifications to the loss function may also improve precision, offering an additional avenue for enhancement.
arXiv Detail & Related papers (2024-05-07T11:50:25Z)
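One optimizer upgrade consistent with that finding (a common PINN practice, not necessarily the authors' exact setup) is to warm-start with Adam and finish with full-batch L-BFGS. A self-contained sketch for the toy ODE u'(x) = u(x) with u(0) = 1:

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
x = torch.linspace(0.0, 1.0, 100).reshape(-1, 1).requires_grad_(True)

def pinn_loss():
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    residual = (du - u).pow(2).mean()                   # enforce u' = u
    bc = (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()   # enforce u(0) = 1
    return residual + bc

# Stage 1: Adam roughs in the solution.
adam = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    adam.zero_grad()
    loss = pinn_loss()
    loss.backward()
    adam.step()

# Stage 2: full-batch L-BFGS typically drives the loss several orders lower.
lbfgs = torch.optim.LBFGS(net.parameters(), max_iter=500)
def closure():
    lbfgs.zero_grad()
    loss = pinn_loss()
    loss.backward()
    return loss
lbfgs.step(closure)
```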
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which makes convergence faster, more stable, and more accurate.
Our model's network parameters are reduced to only 37% of the compared state-of-the-art model's, and the average gap between our solutions and the expert solutions decreases from 6.8% to 1.3%.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
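The imitation objective above reduces to step-wise classification: at each scheduling step the expert's next job is the label and the still-unscheduled jobs are the classes. A minimal sketch with a plain MLP scorer standing in for the paper's graph-based model (a deliberate simplification):

```python
import torch

class NextJobScorer(torch.nn.Module):
    """Scores each remaining job; trained to match the expert's next pick."""
    def __init__(self, n_features):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(n_features, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 1))

    def forward(self, job_feats):                # (n_jobs, n_features)
        return self.mlp(job_feats).squeeze(-1)   # (n_jobs,) logits

def imitation_loss(model, job_feats, expert_order):
    # Walk the expert permutation; at each step the expert's chosen job
    # is the classification target over the jobs not yet scheduled.
    remaining = list(range(job_feats.shape[0]))
    loss = 0.0
    for chosen in expert_order:
        logits = model(job_feats[remaining]).unsqueeze(0)
        target = torch.tensor([remaining.index(chosen)])
        loss = loss + torch.nn.functional.cross_entropy(logits, target)
        remaining.remove(chosen)
    return loss / len(expert_order)

# Example: 5 jobs with 3 features each; the expert schedules them as [2, 0, 4, 1, 3].
model = NextJobScorer(3)
loss = imitation_loss(model, torch.randn(5, 3), [2, 0, 4, 1, 3])
loss.backward()
```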
- Precision Machine Learning
We compare various function approximation methods and study how they scale with increasing parameters and data.
We find that neural networks can often outperform classical approximation methods on high-dimensional examples.
We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z)
- Improved Algorithms for Neural Active Learning
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Learning Neural Network Subspaces
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
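The line-of-networks idea can be sketched directly: keep two endpoint weight sets, draw a random interpolation coefficient each batch, and train the interpolated model, so every point on the segment becomes a usable network. A minimal single-layer sketch (a reading of the method, not the authors' code):

```python
import torch

class LineLinear(torch.nn.Module):
    """A linear layer whose weights live on a segment between two endpoints."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.w1 = torch.nn.Parameter(torch.randn(d_out, d_in) * d_in ** -0.5)
        self.w2 = torch.nn.Parameter(torch.randn(d_out, d_in) * d_in ** -0.5)

    def forward(self, x, alpha):
        w = alpha * self.w1 + (1.0 - alpha) * self.w2   # a point on the line
        return x @ w.T

layer = LineLinear(10, 2)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
for _ in range(100):
    alpha = torch.rand(()).item()                       # new point every batch
    loss = torch.nn.functional.cross_entropy(layer(x, alpha), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The published method also regularizes the endpoints away from each other so the segment does not collapse to a single solution; that term is omitted here.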
- Accelerating gradient-based topology optimization design with dual-model neural networks
In this work, neural networks are used as efficient surrogate models for forward and sensitivity calculations.
To improve the accuracy of sensitivity analyses, dual-model neural networks that are trained with both forward and sensitivity data are constructed.
For the problem of size 64x64, the efficiency gain is 137x for forward calculations and 74x for sensitivity analysis.
arXiv Detail & Related papers (2020-09-14T07:52:55Z)
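Training on forward and sensitivity data together is essentially Sobolev-style regression: the surrogate's input gradient is penalized against the FEM sensitivities alongside the usual value loss. A minimal sketch (the shapes and the FEM-supplied arrays are assumptions):

```python
import torch

surrogate = torch.nn.Sequential(
    torch.nn.Linear(64 * 64, 256), torch.nn.Tanh(), torch.nn.Linear(256, 1))

def dual_model_loss(x, y_fem, dydx_fem, alpha=1.0):
    """x: (B, 4096) flattened density fields; y_fem: (B, 1) FEM responses;
    dydx_fem: (B, 4096) FEM sensitivities. The second term fits the
    surrogate's input gradient to the FEM sensitivities."""
    x = x.detach().requires_grad_(True)
    y = surrogate(x)
    dydx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    return ((y - y_fem) ** 2).mean() + alpha * ((dydx - dydx_fem) ** 2).mean()

# Dummy batch standing in for FEM-generated training data:
loss = dual_model_loss(torch.rand(16, 64 * 64),
                       torch.rand(16, 1), torch.rand(16, 64 * 64))
loss.backward()
```

Fitting values alone tends to give noisy input gradients; the gradient-matching term is what makes the surrogate's sensitivities reliable enough to drive gradient-based optimization.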
- Self-Directed Online Machine Learning for Topology Optimization
Self-directed Online Learning Optimization integrates a Deep Neural Network (DNN) with Finite Element Method (FEM) calculations.
Our algorithm was tested on four types of problems, including compliance minimization, fluid-structure optimization, heat transfer enhancement, and truss optimization.
It reduced the computational time by 2 to 5 orders of magnitude compared with directly using heuristic methods, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z)
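Schematically, the loop alternates between fitting the DNN on all FEM-labeled designs so far and letting it screen a large candidate pool so that only a handful of designs per round receive a real FEM call. A sketch under that reading, with fem_evaluate, fit, and surrogate as hypothetical stand-ins:

```python
import numpy as np

def self_directed_loop(surrogate, fit, fem_evaluate, dim,
                       n_init=32, n_rounds=20, pool_size=1024, k=8, seed=0):
    """Surrogate-in-the-loop optimization: the DNN screens candidates so
    only the k most promising designs per round are sent to the FEM solver."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_init, dim))                 # initial random designs
    y = np.array([fem_evaluate(x) for x in X])    # expensive FEM labels
    for _ in range(n_rounds):
        fit(surrogate, X, y)                      # retrain the DNN on all data
        pool = rng.random((pool_size, dim))       # cheap candidate designs
        best = np.argsort(surrogate(pool))[:k]    # DNN picks top-k (minimization)
        y_new = np.array([fem_evaluate(x) for x in pool[best]])
        X = np.vstack([X, pool[best]])
        y = np.concatenate([y, y_new])
    return X[np.argmin(y)]                        # best design found
```

The published algorithm is more careful about where new candidates are drawn (concentrating them near the incumbent design), but the retrain-screen-evaluate cycle above is the core pattern behind the reported speedups.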
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.