Related papers: No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks

No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks

URL: http://arxiv.org/abs/2405.02369v1
Date: Fri, 3 May 2024 09:12:46 GMT
Title: No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks
Authors: Feng-Lei Fan, Meng Wang, Hang-Cheng Dong, Jianwei Ma, Tieyong Zeng,
Abstract summary: Since the human brain is a task-based neuron user, can the artificial network design go from the task-based architecture design to the task-based neuron design? We propose a two-step framework for prototyping task-based neurons. Experiments show that the proposed task-based neuron design is not only feasible but also delivers competitive performance over other state-of-the-art models.
Score: 25.30801109401654
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Biologically, the brain does not rely on a single type of neuron that universally functions in all aspects. Instead, it acts as a sophisticated designer of task-based neurons. In this study, we address the following question: since the human brain is a task-based neuron user, can the artificial network design go from the task-based architecture design to the task-based neuron design? Since methodologically there are no one-size-fits-all neurons, given the same structure, task-based neurons can enhance the feature representation ability relative to the existing universal neurons due to the intrinsic inductive bias for the task. Specifically, we propose a two-step framework for prototyping task-based neurons. First, symbolic regression is used to identify optimal formulas that fit input data by utilizing base functions such as logarithmic, trigonometric, and exponential functions. We introduce vectorized symbolic regression that stacks all variables in a vector and regularizes each input variable to perform the same computation, which can expedite the regression speed, facilitate parallel computation, and avoid overfitting. Second, we parameterize the acquired elementary formula to make parameters learnable, which serves as the aggregation function of the neuron. The activation functions such as ReLU and the sigmoidal functions remain the same because they have proven to be good. Empirically, experimental results on synthetic data, classic benchmarks, and real-world applications show that the proposed task-based neuron design is not only feasible but also delivers competitive performance over other state-of-the-art models.

Related papers

Language Model Circuits Are Sparse in the Neuron Basis [50.460651620833055]
We show that textbfMLP neurons are as sparse a feature basis as SAEs.<n>This work advances automated interpretability of language models without additional training costs.
arXiv Detail & Related papers (2026-01-30T05:41:19Z)
Principled Approaches for Extending Neural Architectures to Function Spaces for Operator Learning [78.88684753303794]
Deep learning has predominantly advanced through applications in computer vision and natural language processing.<n>Neural operators are a principled way to generalize neural networks to mappings between function spaces.<n>This paper identifies and distills the key principles for constructing practical implementations of mappings between infinite-dimensional function spaces.
arXiv Detail & Related papers (2025-06-12T17:59:31Z)
NeuronSeek: On Stability and Expressivity of Task-driven Neurons [19.773883759021764]
Prototyping task-driven neurons (referred to as NeuronSeek) employs symbolic regression (SR) to discover the optimal neuron formulation.<n>This work replaces symbolic regression with tensor decomposition (TD) to discover optimal neuronal formulations.<n>We establish theoretical guarantees that modifying the aggregation functions with common activation functions can empower a network with a fixed number of parameters to approximate any continuous function with an arbitrarily small error.
arXiv Detail & Related papers (2025-06-01T01:36:27Z)
Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data [3.46029409929709]
State-of-the-art systems neuroscience experiments yield large-scale multimodal data, and these data sets require new tools for analysis. Inspired by the success of large pretrained models in vision and language domains, we reframe the analysis of large-scale, cellular-resolution neuronal spiking data into an autoregressive generation problem. We first trained Neuroformer on simulated datasets, and found that it both accurately predicted intrinsically simulated neuronal circuit activity, and also inferred the underlying neural circuit connectivity, including direction.
arXiv Detail & Related papers (2023-10-31T20:17:32Z)
WaLiN-GUI: a graphical and auditory tool for neuron-based encoding [73.88751967207419]
Neuromorphic computing relies on spike-based, energy-efficient communication. We develop a tool to identify suitable configurations for neuron-based encoding of sample-based data into spike trains. The WaLiN-GUI is provided open source and with documentation.
arXiv Detail & Related papers (2023-10-25T20:34:08Z)
Neuron to Graph: Interpreting Language Model Neurons at Scale [8.32093320910416]
This paper introduces a novel automated approach designed to scale interpretability techniques across a vast array of neurons within Large Language Models. We propose Neuron to Graph (N2G), an innovative tool that automatically extracts a neuron's behaviour from the dataset it was trained on and translates it into an interpretable graph.
arXiv Detail & Related papers (2023-05-31T14:44:33Z)
Emergent Modularity in Pre-trained Transformers [127.08792763817496]
We consider two main characteristics of modularity: functional specialization of neurons and function-based neuron grouping. We study how modularity emerges during pre-training, and find that the modular structure is stabilized at the early stage. It suggests that Transformers first construct the modular structure and then learn fine-grained neuron functions.
arXiv Detail & Related papers (2023-05-28T11:02:32Z)
Learning to Act through Evolution of Neural Diversity in Random Neural Networks [9.387749254963595]
In most artificial neural networks (ANNs), neural computation is abstracted to an activation function that is usually shared between all neurons. We propose the optimization of neuro-centric parameters to attain a set of diverse neurons that can perform complex computations.
arXiv Detail & Related papers (2023-05-25T11:33:04Z)
Redundancy and Concept Analysis for Code-trained Language Models [5.726842555987591]
Code-trained language models have proven to be highly effective for various code intelligence tasks. They can be challenging to train and deploy for many software engineering applications due to computational bottlenecks and memory constraints. We perform the first neuron-level analysis for source code models to identify textitimportant neurons within latent representations.
arXiv Detail & Related papers (2023-05-01T15:22:41Z)
POPPINS : A Population-Based Digital Spiking Neuromorphic Processor with Integer Quadratic Integrate-and-Fire Neurons [50.591267188664666]
We propose a population-based digital spiking neuromorphic processor in 180nm process technology with two hierarchy populations. The proposed approach enables the developments of biomimetic neuromorphic system and various low-power, and low-latency inference processing applications.
arXiv Detail & Related papers (2022-01-19T09:26:34Z)
Two-argument activation functions learn soft XOR operations like cortical neurons [6.88204255655161]
We learn canonical activation functions with two input arguments, analogous to basal and apical dendrites. Remarkably, the resultant nonlinearities often produce soft XOR functions. Networks with these nonlinearities learn faster and perform better than conventional ReLU nonlinearities with matched parameter counts.
arXiv Detail & Related papers (2021-10-13T17:06:20Z)
The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain. In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts. We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z)
Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy. We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.