Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
- URL: http://arxiv.org/abs/2511.07329v1
- Date: Mon, 10 Nov 2025 17:31:39 GMT
- Title: Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
- Authors: Yash Mittal, Dmitry Ignatov, Radu Timofte
- Abstract summary: It introduces FractalNet, a fractal-inspired computational architecture for advanced large language model analysis. The new set-up involves a template-driven generator, runner, and evaluation framework that, through systematic permutations of convolutional, normalization, activation, and dropout layers, can create more than 1,200 variants of neural networks. The paper positions fractal design as a feasible and resource-efficient method of automated architecture exploration.
- Score: 50.11146543029802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces FractalNet, a fractal-inspired computational architecture for advanced large language model analysis that targets large-scale model diversity in an efficient manner. The framework couples a template-driven generator, runner, and evaluation pipeline that, through systematic permutations of convolutional, normalization, activation, and dropout layers, can create more than 1,200 neural-network variants. Fractal templates allow for structural recursion and multi-column pathways, so models grow deeper and wider in a balanced way. Training uses PyTorch, Automatic Mixed Precision (AMP), and gradient checkpointing, and is carried out on the CIFAR-10 dataset for five epochs. The results show that fractal-based architectures achieve strong performance while remaining computationally efficient. The paper positions fractal design as a feasible and resource-efficient method for automated architecture exploration.
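The abstract describes the generator only at a high level; the following is a minimal PyTorch sketch of the idea, assuming an illustrative `FractalBlock` that joins a shallow column with two recursively nested deeper columns, plus a permutation loop over normalization, activation, dropout, and depth choices. The names and layer choices are assumptions, not the authors' code.

```python
# Hypothetical sketch of a fractal template with structural recursion and
# multi-column pathways; layer choices are assumptions, not the paper's code.
import itertools
import torch
import torch.nn as nn

class FractalBlock(nn.Module):
    """Depth-d block: a one-conv 'shallow' column joined with a deeper column
    built from two nested (d-1)-blocks, averaged at the join."""
    def __init__(self, channels, depth, norm=nn.BatchNorm2d, act=nn.ReLU, p=0.1):
        super().__init__()
        self.shallow = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            norm(channels), act(), nn.Dropout2d(p))
        self.deep = (nn.Sequential(
            FractalBlock(channels, depth - 1, norm, act, p),
            FractalBlock(channels, depth - 1, norm, act, p))
            if depth > 1 else None)

    def forward(self, x):
        out = self.shallow(x)
        if self.deep is not None:
            out = (out + self.deep(x)) / 2  # join the two columns
        return out

# Template-driven generation: permuting a few layer choices already yields
# dozens of variants; richer templates would push this past 1,200.
variants = list(itertools.product(
    [nn.BatchNorm2d, nn.InstanceNorm2d],      # normalization
    [nn.ReLU, nn.GELU, nn.SiLU],              # activation
    [0.0, 0.1, 0.3],                          # dropout rate
    [1, 2, 3, 4]))                            # fractal depth
norm, act, p, depth = variants[0]
model = FractalBlock(32, depth, norm, act, p)
print(model(torch.randn(2, 32, 16, 16)).shape, len(variants))
```

A training loop in the paper's spirit would additionally wrap the forward pass in `torch.autocast` (AMP) and checkpoint the deep columns with `torch.utils.checkpoint` to trade compute for memory.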
Related papers
- DNAD: Differentiable Neural Architecture Distillation [6.026956571669411]
The differentiable neural architecture distillation (DNAD) algorithm is developed from two cores: search by deleting and search by imitating. DNAD achieves a top-1 error rate of 23.7% on ImageNet classification with a model of 6.0M parameters and 598M FLOPs. A super-network progressive shrinking (SNPS) algorithm is developed within the framework of differentiable architecture search (DARTS).
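The summary does not explain the two search operators; as a rough, hypothetical illustration of "search by deleting" in a DARTS-style setting, one can keep a softmax over candidate operations and periodically drop the weakest one. The sketch below is generic DARTS machinery, not the DNAD algorithm itself.

```python
# Generic DARTS-style mixed operation with a crude "search by deleting"
# step; illustrative only, not the DNAD algorithm.
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity()])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # arch weights
        self.active = list(range(len(self.ops)))

    def forward(self, x):
        w = torch.softmax(self.alpha[self.active], dim=0)
        return sum(wi * self.ops[i](x) for wi, i in zip(w, self.active))

    def delete_weakest(self):
        # "Search by deleting": drop the op with the smallest architecture weight.
        if len(self.active) > 1:
            w = torch.softmax(self.alpha[self.active], dim=0)
            self.active.pop(int(w.argmin()))

op = MixedOp(16)
y = op(torch.randn(1, 16, 8, 8))
op.delete_weakest()  # shrink the candidate set between search epochs
```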
arXiv Detail & Related papers (2025-04-25T08:49:31Z)
- ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity. This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics. Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and a Kendall's tau of 0.53 on the FlexiBERT benchmark.
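The exact weight statistic ZeroLM uses is not given in this summary; the sketch below scores untrained models with a generic statistic (mean absolute weight times standard deviation per layer) purely to show the shape of a data-free, zero-cost proxy pipeline.

```python
# Generic data-free proxy: score candidate models from weight statistics
# at initialization, then rank them. The statistic itself is an assumption.
import torch.nn as nn

def weight_stat_score(model: nn.Module) -> float:
    score = 0.0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            w = m.weight.detach()
            score += w.abs().mean().item() * w.std().item()
    return score

candidates = {
    "small": nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10)),
    "wide":  nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10)),
}
ranking = sorted(candidates, key=lambda k: weight_stat_score(candidates[k]),
                 reverse=True)
print(ranking)  # no training data or gradients required
```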
arXiv Detail & Related papers (2025-03-24T13:11:22Z)
- STAR: Synthesis of Tailored Architectures [61.080157488857516]
We propose a new approach for the synthesis of tailored architectures (STAR). Our approach combines a novel search space based on the theory of linear input-varying systems, supporting a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free, evolutionary algorithms to optimize for multiple model quality and efficiency metrics. Using STAR, we optimize large populations of new architectures, leveraging diverse computational units and interconnection patterns, improving over highly-optimized Transformers and striped hybrid models on the frontier of quality, parameter size, and inference cache for autoregressive language modeling.
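STAR's genome encoding is hierarchical and tied to linear input-varying systems, which this summary does not detail; the sketch below shows only the generic gradient-free loop (mutate, recombine, select) over flat integer genomes, with a toy fitness standing in for real quality and efficiency metrics.

```python
# Toy gradient-free evolutionary loop over integer "architecture genomes";
# the fitness function is a stand-in, not STAR's multi-metric objective.
import random

GENOME_LEN, POP, GENERATIONS = 8, 20, 30

def fitness(genome):                      # placeholder objective
    return -abs(sum(genome) - 12) - 0.1 * max(genome)

def mutate(g):
    g = g[:]
    g[random.randrange(len(g))] = random.randint(0, 4)
    return g

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

pop = [[random.randint(0, 4) for _ in range(GENOME_LEN)] for _ in range(POP)]
for _ in range(GENERATIONS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]              # select
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children              # refine and recombine
print(max(pop, key=fitness))
```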
arXiv Detail & Related papers (2024-11-26T18:42:42Z)
- Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures [12.038095281876071]
In this study, we incorporate fractal features into a deep learning model by extending the fractal dimension (FD) to the pixel level using a sliding-window technique.
The resulting fractal feature maps (FFMs) are then incorporated as additional input to the model and as additional weight in the loss function.
Experiments on five tubular structure datasets validate the effectiveness and robustness of our approach.
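The paper's exact FD estimator is not specified in this summary; below is a standard box-counting estimate of fractal dimension computed in a sliding window over a binary image, which is one common way to build such pixel-level feature maps.

```python
# Sliding-window box-counting fractal dimension over a binary image;
# a standard estimator, assumed here, not necessarily the paper's exact one.
import numpy as np

def box_counting_fd(patch: np.ndarray) -> float:
    """Estimate FD of a square binary patch from the slope of log N(s) vs log(1/s)."""
    n = patch.shape[0]
    sizes = [s for s in (1, 2, 4, 8) if s <= n]
    counts = []
    for s in sizes:
        m = n - n % s                                 # crop to a multiple of s
        boxes = patch[:m, :m].reshape(m // s, s, m // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum() + 1e-9)
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

def fractal_feature_map(img: np.ndarray, win: int = 16) -> np.ndarray:
    h, w = img.shape
    ffm = np.zeros((h, w), dtype=float)
    for i in range(0, h - win + 1):
        for j in range(0, w - win + 1):
            ffm[i + win // 2, j + win // 2] = box_counting_fd(img[i:i + win, j:j + win])
    return ffm

img = np.random.rand(64, 64) > 0.5
print(fractal_feature_map(img).shape)  # same spatial size as the input
```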
arXiv Detail & Related papers (2024-07-20T05:22:59Z)
- Learning From Simplicial Data Based on Random Walks and 1D Convolutions [6.629765271909503]
We propose SCRaWl, a simplicial complex neural network architecture based on random walks and fast 1D convolutions.
We empirically evaluate SCRaWl on real-world datasets and show that it outperforms other simplicial neural networks.
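SCRaWl's walks run over simplicial complexes; the sketch below simplifies to plain random walks on a graph (the 1-skeleton) followed by a 1D convolution over the features gathered along each walk, just to show the walk-then-convolve pattern.

```python
# Simplified walk-then-convolve pattern: random walks on a graph, node
# features gathered along each walk, then a Conv1d over the walk axis.
# SCRaWl itself walks over simplicial complexes; this is only the 1-skeleton.
import torch
import torch.nn as nn

def random_walks(adj: torch.Tensor, length: int, n_walks: int) -> torch.Tensor:
    n = adj.shape[0]
    probs = adj / adj.sum(dim=1, keepdim=True)        # row-stochastic transitions
    walks = torch.zeros(n_walks, length, dtype=torch.long)
    walks[:, 0] = torch.randint(n, (n_walks,))
    for t in range(1, length):
        walks[:, t] = torch.multinomial(probs[walks[:, t - 1]], 1).squeeze(1)
    return walks

n, d, length = 10, 4, 8
adj = (torch.rand(n, n) > 0.5).float() + torch.eye(n)  # self-loops keep rows nonzero
feats = torch.randn(n, d)
walks = random_walks(adj, length, n_walks=32)
walk_feats = feats[walks].permute(0, 2, 1)             # (walks, channels, steps)
conv = nn.Conv1d(d, 16, kernel_size=3, padding=1)
print(conv(walk_feats).shape)                          # torch.Size([32, 16, 8])
```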
arXiv Detail & Related papers (2024-04-04T13:27:22Z)
- Symplectic Autoencoders for Model Reduction of Hamiltonian Systems [0.0]
It is crucial to preserve the symplectic structure associated with the system in order to ensure long-term numerical stability.
We propose a new neural network architecture in the spirit of autoencoders, which are established tools for dimension reduction.
In order to train the network, a non-standard gradient descent approach is applied.
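For context on what "preserving the symplectic structure" means for a reduction map: a linear map A from R^{2k} to R^{2n} is symplectic when A^T J_{2n} A = J_{2k}. The sketch below just checks this condition numerically; it is textbook background, not the paper's autoencoder.

```python
# Background check of the symplectic condition A^T J_2n A = J_2k for a
# linear lift; illustrative only, not the paper's autoencoder architecture.
import numpy as np

def J(n: int) -> np.ndarray:
    """Canonical symplectic matrix [[0, I], [-I, 0]] of size 2n x 2n."""
    I = np.eye(n)
    return np.block([[np.zeros((n, n)), I], [-I, np.zeros((n, n))]])

def is_symplectic(A: np.ndarray, tol: float = 1e-10) -> bool:
    two_n, two_k = A.shape
    return np.allclose(A.T @ J(two_n // 2) @ A, J(two_k // 2), atol=tol)

# A block embedding that satisfies the condition: map (q, p) in R^2 into R^6
# by sending q to the first position coordinate and p to the first momentum.
n, k = 3, 1
A = np.zeros((2 * n, 2 * k))
A[:k, :k] = np.eye(k)            # q-part
A[n:n + k, k:] = np.eye(k)       # p-part
print(is_symplectic(A))          # True
```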
arXiv Detail & Related papers (2023-12-15T18:20:25Z)
- Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture [68.13678918660872]
We design a more capable parameter-sharing architecture based on the matrix product operator (MPO).
MPO decomposition can reorganize and factorize the information of a parameter matrix into two parts.
Our architecture shares the central tensor across all layers for reducing the model size.
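The summary gives the idea but not the construction; the sketch below factorizes a weight matrix into two local tensors with a reshaped, truncated SVD, which is the basic move behind MPO decomposition (the paper's scheme uses more cores and shares the central one across layers).

```python
# Two-core illustration of MPO-style factorization of a weight matrix via a
# reshaped, truncated SVD; the paper's scheme uses more cores and shares the
# central tensor across layers.
import numpy as np

m1, m2, n1, n2, r = 4, 8, 4, 8, 10        # W is (m1*m2) x (n1*n2)
W = np.random.randn(m1 * m2, n1 * n2)

# Group (output, input) indices locally, then split with an SVD.
T = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
U, S, Vt = np.linalg.svd(T, full_matrices=False)
core1 = (U[:, :r] * np.sqrt(S[:r])).reshape(m1, n1, r)          # local tensor 1
core2 = (np.sqrt(S[:r])[:, None] * Vt[:r]).reshape(r, m2, n2)   # local tensor 2

# Reconstruct and measure the truncation error.
T_hat = np.einsum("air,rbj->aibj", core1, core2).reshape(m1 * n1, m2 * n2)
W_hat = T_hat.reshape(m1, n1, m2, n2).transpose(0, 2, 1, 3).reshape(W.shape)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```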
arXiv Detail & Related papers (2023-03-27T02:34:09Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
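The summary only names the family; as a hypothetical illustration of the randomization idea, the sketch below subsamples states in an HMM forward pass and rescales for the missing mass, staying differentiable so it composes with autograd. This is a generic construction, not the paper's estimator.

```python
# Generic randomized forward pass for an HMM: at each step, sum transitions
# over a sampled subset of source states and rescale by N/k. Differentiable,
# so it composes with autograd; an approximation, not the paper's estimator.
import torch

def randomized_forward(log_pi, log_A, log_obs, k):
    """log_pi: (N,), log_A: (N, N), log_obs: (T, N); approximate log-likelihood."""
    N = log_pi.shape[0]
    alpha = log_pi + log_obs[0]                       # (N,) log forward messages
    for t in range(1, log_obs.shape[0]):
        idx = torch.randperm(N)[:k]                   # sample k source states
        scale = torch.log(torch.tensor(N / k))        # correct for missing mass
        alpha = torch.logsumexp(
            alpha[idx, None] + log_A[idx], dim=0) + scale + log_obs[t]
    return torch.logsumexp(alpha, dim=0)

N, T, k = 1000, 20, 64
log_pi = torch.log_softmax(torch.randn(N), dim=0)
log_A = torch.log_softmax(torch.randn(N, N), dim=1)
log_obs = torch.randn(T, N, requires_grad=True)
ll = randomized_forward(log_pi, log_A, log_obs, k)
ll.backward()                                         # gradients flow through
print(ll.item(), log_obs.grad.shape)
```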
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Segmentation and Recovery of Superquadric Models using Convolutional Neural Networks [2.454342521577328]
We present a two-stage approach built around convolutional neural networks (CNNs).
In the first stage, our approach uses a Mask R-CNN model to identify superquadric-like structures in depth scenes.
We are able to describe complex structures with a small number of interpretable parameters.
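As background on why superquadrics need only a few interpretable parameters: the standard inside-outside function below uses three sizes and two shape exponents. This is the classical formulation, not the paper's CNN.

```python
# Classical superquadric inside-outside function: five interpretable
# parameters (sizes a1, a2, a3 and shape exponents e1, e2) define the shape.
# F < 1 inside, F = 1 on the surface, F > 1 outside.
import numpy as np

def superquadric_F(x, y, z, a1, a2, a3, e1, e2):
    xy = (np.abs(x / a1) ** (2 / e2) + np.abs(y / a2) ** (2 / e2)) ** (e2 / e1)
    return xy + np.abs(z / a3) ** (2 / e1)

# e1 = e2 = 1 gives an ellipsoid; points on the unit sphere evaluate to 1.
pts = np.random.randn(5, 3)
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(superquadric_F(pts[:, 0], pts[:, 1], pts[:, 2], 1, 1, 1, 1, 1))  # ~1.0
```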
arXiv Detail & Related papers (2020-01-28T18:17:48Z)