Continuous Thought Machines
- URL: http://arxiv.org/abs/2505.05522v3
- Date: Wed, 28 May 2025 00:50:21 GMT
- Title: Continuous Thought Machines
- Authors: Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, Llion Jones
- Abstract summary: We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two core innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process a history of incoming signals; and (2) neural synchronization employed as a latent representation. We demonstrate the CTM's strong performance and versatility across a range of challenging tasks, including ImageNet-1K classification, solving 2D mazes, sorting, parity computation, question-answering, and RL tasks.
- Score: 9.222873822861954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biological brains demonstrate complex neural activity, where the timing and interplay between neurons are critical to how brains process information. Most deep learning architectures simplify neural activity by abstracting away temporal dynamics. In this paper we challenge that paradigm. By incorporating neuron-level processing and synchronization, we can effectively reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two core innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process a history of incoming signals; and (2) neural synchronization employed as a latent representation. The CTM aims to strike a balance between oversimplified neuron abstractions that improve computational efficiency, and biological realism. It operates at a level of abstraction that effectively captures essential temporal dynamics while remaining computationally tractable for deep learning. We demonstrate the CTM's strong performance and versatility across a range of challenging tasks, including ImageNet-1K classification, solving 2D mazes, sorting, parity computation, question-answering, and RL tasks. Beyond displaying rich internal representations and offering a natural avenue for interpretation owing to its internal process, the CTM is able to perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances. The goal of this work is to share the CTM and its associated innovations, rather than pushing for new state-of-the-art results. To that end, we believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems.
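To make the two innovations concrete, here is a minimal, illustrative NumPy sketch of the CTM's core loop. This is not the authors' implementation: the variable names, the tanh nonlinearity, and the single weight vector per neuron are simplifying assumptions, and the paper uses learned per-neuron MLPs plus a learned readout over the synchronization matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
N, H, T = 8, 5, 20   # neurons, history length, internal "thought" ticks

# (1) Neuron-level temporal processing: each neuron owns private weights
# applied to a sliding history of its incoming pre-activations.
per_neuron_w = rng.normal(size=(N, H))          # one weight vector per neuron
W_in = rng.normal(size=(N, N)) / np.sqrt(N)     # shared recurrent mixing

history = np.zeros((N, H))                      # per-neuron pre-activation history
activations = np.zeros((T, N))

z = rng.normal(size=N)                          # initial post-activations
for t in range(T):
    pre = W_in @ z                              # incoming signal to each neuron
    history = np.roll(history, -1, axis=1)
    history[:, -1] = pre
    # each neuron applies its own temporal filter to its own history
    z = np.tanh(np.einsum("nh,nh->n", per_neuron_w, history))
    activations[t] = z

# (2) Neural synchronization as the latent representation: time-averaged
# pairwise products of activation traces (an N x N synchronization matrix).
sync = activations.T @ activations / T
latent = sync[np.triu_indices(N)]               # flattened upper triangle
print(latent.shape)                             # (N*(N+1)//2,) readout features
```

The key point is that the latent fed to any downstream readout is not a single activation vector but the pairwise synchronization of activation traces accumulated over internal ticks.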
Related papers
- Application of an attention-based CNN-BiLSTM framework for in vivo two-photon calcium imaging of neuronal ensembles: decoding complex bilateral forelimb movements from unilateral M1 [0.511850618931844]
Decoding behavior, such as movement, from multiscale brain networks remains a central objective in neuroscience. In this study, we employ a hybrid deep learning framework, an attention-based CNN-BiLSTM model, to decode skilled and complex forelimb movements. Our findings demonstrate that the intricate movements of both ipsilateral and contralateral forelimbs can be accurately decoded from unilateral M1 neuronal ensembles.
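For readers unfamiliar with this architecture family, a generic attention-based CNN-BiLSTM decoder looks roughly like the PyTorch sketch below. It is a stand-in, not the paper's model; the layer sizes and the additive attention head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    """Generic decoder: 1D conv features -> BiLSTM -> attention pooling -> class."""
    def __init__(self, n_neurons: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_neurons, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # additive attention scores
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, neurons)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, hidden)
        h, _ = self.lstm(h)                     # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        ctx = (w * h).sum(dim=1)                # weighted temporal pooling
        return self.head(ctx)

model = CNNBiLSTMAttention(n_neurons=100, n_classes=4)
logits = model(torch.randn(2, 200, 100))        # 2 trials, 200 frames, 100 neurons
print(logits.shape)                             # torch.Size([2, 4])
```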
arXiv Detail & Related papers (2025-04-23T17:43:00Z)
- Artificial Kuramoto Oscillatory Neurons [65.16453738828672]
It has long been known in both neuroscience and AI that "binding" between neurons leads to a form of competitive learning. We introduce Artificial Kuramoto Oscillatory Neurons (AKOrN), which can be combined with arbitrary connectivity designs such as fully connected, convolutional, or attentive mechanisms. We show that this idea provides performance improvements across a wide spectrum of tasks such as unsupervised object discovery, adversarial robustness, uncertainty quantification, and reasoning.
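AKOrN builds on the classical Kuramoto model, whose update rule is standard and easy to sketch. The NumPy version below is illustrative only; the coupling constant, connectivity, and step size are arbitrary choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, dt, steps = 32, 1.5, 0.05, 400

omega = rng.normal(size=N)           # natural frequencies
theta = rng.uniform(0, 2 * np.pi, N)
A = rng.random((N, N)) < 0.2         # sparse random coupling (any connectivity works)

for _ in range(steps):
    # classic Kuramoto update: each phase is pulled toward coupled neighbours
    diff = theta[None, :] - theta[:, None]          # theta_j - theta_i
    theta += dt * (omega + (K / N) * (A * np.sin(diff)).sum(axis=1))

# order parameter r in [0, 1]: degree of synchronization ("binding")
r = np.abs(np.exp(1j * theta).mean())
print(f"synchronization r = {r:.2f}")
```

The order parameter r measures how strongly the phases have bound together, which is the kind of synchronization-as-binding signal the paper exploits.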
arXiv Detail & Related papers (2024-10-17T17:47:54Z)
- Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
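CSDP itself is specific to the paper, but the forward-forward idea it builds on is easy to illustrate: each layer is trained with a local goodness objective, so no error signal crosses layers. Below is a hedged PyTorch sketch of a generic forward-forward-style update, not the CSDP rule; the threshold, random data, and single-layer setup are assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
W = torch.randn(784, 256, requires_grad=True)   # one layer's weights
opt = torch.optim.SGD([W], lr=0.01)
threshold = 2.0

def goodness(x):
    # Hinton-style goodness: mean squared activation of the layer
    return torch.relu(x @ W).pow(2).mean(dim=1)

x_pos = torch.randn(64, 784)        # "positive" (real) samples
x_neg = torch.randn(64, 784)        # "negative" (corrupted) samples

for _ in range(100):
    # push goodness of positives above threshold, negatives below it;
    # the loss is layer-local, so no error is backpropagated across layers
    loss = F.softplus(torch.cat([
        threshold - goodness(x_pos),    # want goodness > threshold
        goodness(x_neg) - threshold,    # want goodness < threshold
    ])).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```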
arXiv Detail & Related papers (2024-09-17T04:48:45Z)
- Single Neuromorphic Memristor closely Emulates Multiple Synaptic Mechanisms for Energy Efficient Neural Networks [71.79257685917058]
We demonstrate memristive nano-devices based on SrTiO3 that inherently emulate multiple synaptic functions. These memristors operate in a non-filamentary, low-conductance regime, which enables stable and energy-efficient operation.
arXiv Detail & Related papers (2024-02-26T15:01:54Z)
- The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks [64.08042492426992]
We introduce the Expressive Leaky Memory (ELM) neuron model, a biologically inspired model of a cortical neuron. Our ELM neuron can accurately match the input-output relationship of a detailed biophysical cortical neuron model with under ten thousand trainable parameters.
We evaluate it on various tasks with demanding temporal structures, including the Long Range Arena (LRA) datasets.
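The "leaky memory" at the heart of such phenomenological models is a bank of leaky integrators with heterogeneous timescales. A minimal NumPy sketch follows; it is illustrative only, omits the ELM paper's gating and readout details, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_mem, T = 16, 32, 100

W_in = rng.normal(size=(d_mem, d_in)) * 0.1    # input projection
W_mem = rng.normal(size=(d_mem, d_mem)) * 0.1  # memory-to-memory mixing
tau = rng.uniform(2.0, 50.0, size=d_mem)       # per-unit memory timescales
lam = np.exp(-1.0 / tau)                       # leak factors from timescales

m = np.zeros(d_mem)
for t in range(T):
    x = rng.normal(size=d_in)                  # incoming synaptic input
    # leaky memory update: slow units (large tau) retain information longer
    m = lam * m + (1 - lam) * np.tanh(W_in @ x + W_mem @ m)

print(m[:4])   # memory state usable as the neuron's output features
```

The spread of timescales is what lets a single such unit handle long-horizon tasks that a one-timescale neuron cannot.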
arXiv Detail & Related papers (2023-06-14T13:34:13Z)
- Contrastive-Signal-Dependent Plasticity: Self-Supervised Learning in Spiking Neural Circuits [61.94533459151743]
This work addresses the challenge of designing neurobiologically-motivated schemes for adjusting the synapses of spiking networks.
Our experimental simulations demonstrate a consistent advantage over other biologically-plausible approaches when training recurrent spiking networks.
arXiv Detail & Related papers (2023-03-30T02:40:28Z)
- Increasing Liquid State Machine Performance with Edge-of-Chaos Dynamics Organized by Astrocyte-modulated Plasticity [0.0]
The liquid state machine (LSM) tunes its internal weights without backpropagation of gradients.
Recent findings suggest that astrocytes, a long-neglected non-neuronal brain cell, modulate synaptic plasticity and brain dynamics.
We propose the neuron-astrocyte liquid state machine (NALSM) that addresses under-performance through self-organized near-critical dynamics.
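As background, a liquid state machine is a reservoir computer: internal weights stay fixed (here random, scaled toward the edge of chaos) and only a linear readout is fit. The rate-based echo-state sketch below is a simplified stand-in for a spiking LSM and contains none of the paper's astrocyte mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 500

# fixed random reservoir: internal weights are never trained by backprop
W = rng.normal(size=(N, N)) / np.sqrt(N)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # scale toward edge of chaos
W_in = rng.normal(size=N)

u = np.sin(np.arange(T) * 0.1)                    # input signal
states = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# only the linear readout is fit (here by least squares), e.g. to
# predict the input one step ahead
target = np.roll(u, -1)
w_out, *_ = np.linalg.lstsq(states[:-1], target[:-1], rcond=None)
pred = states[:-1] @ w_out
print(f"readout MSE: {np.mean((pred - target[:-1])**2):.4f}")
```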
arXiv Detail & Related papers (2021-10-26T23:04:40Z)
- Mapping and Validating a Point Neuron Model on Intel's Neuromorphic Hardware Loihi [77.34726150561087]
We investigate the potential of Intel's fifth-generation neuromorphic chip, Loihi. Loihi is based on Spiking Neural Networks (SNNs), which emulate the neurons in the brain.
We find that Loihi replicates classical simulations very efficiently and scales notably well in terms of both time and energy performance as the networks get larger.
arXiv Detail & Related papers (2021-09-22T16:52:51Z)
- A brain basis of dynamical intelligence for AI and computational neuroscience [0.0]
More brain-like capacities may demand new theories, models, and methods for designing artificial learning systems.
This article was inspired by our symposium on dynamical neuroscience and machine learning at the 6th Annual US/NIH BRAIN Initiative Investigators Meeting.
arXiv Detail & Related papers (2021-05-15T19:49:32Z)
- Neuromorphic Algorithm-hardware Codesign for Temporal Pattern Learning [11.781094547718595]
We derive an efficient training algorithm for Leaky Integrate and Fire neurons, which is capable of training an SNN to learn complex spatio-temporal patterns.
We have developed a CMOS circuit implementation for a memristor-based network of neurons and synapses that retains critical neural dynamics with reduced complexity.
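The underlying neuron model is standard and worth stating: a discrete-time leaky integrate-and-fire unit integrates input with a leak and emits a spike on crossing threshold. A minimal NumPy sketch follows; the leak factor, threshold, and hard reset are generic choices, not the paper's trained algorithm or CMOS circuit.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 100, 10
beta, v_th = 0.9, 1.0          # leak factor and firing threshold

I = rng.random((T, N)) * 0.3   # input current per time step
v = np.zeros(N)                # membrane potentials
spikes = np.zeros((T, N))

for t in range(T):
    v = beta * v + I[t]        # leaky integration of input current
    fired = v >= v_th
    spikes[t] = fired
    v = np.where(fired, 0.0, v)  # hard reset after a spike

print(f"mean firing rate: {spikes.mean():.3f} spikes/step")
```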
arXiv Detail & Related papers (2021-04-21T18:23:31Z)
- A Spiking Neural Network Emulating the Structure of the Oculomotor System Requires No Learning to Control a Biomimetic Robotic Head [0.0]
A neuromorphic oculomotor controller is placed at the heart of our in-house biomimetic robotic head prototype.
The controller is unique in the sense that all data are encoded and processed by a spiking neural network (SNN).
We report the robot's target tracking ability, demonstrate that its eye kinematics are similar to those reported in human eye studies, and show that biologically constrained learning can be used to further refine its performance.
arXiv Detail & Related papers (2020-02-18T13:03:06Z)