How to Square Tensor Networks and Circuits Without Squaring Them
- URL: http://arxiv.org/abs/2512.17090v1
- Date: Thu, 18 Dec 2025 21:36:54 GMT
- Title: How to Square Tensor Networks and Circuits Without Squaring Them
- Authors: Lorenzo Loconte, Adrián Javaloy, Antonio Vergari
- Abstract summary: We show how to parameterize squared circuits to overcome their marginalization overhead. We also show that our proposed conditions on squared circuits come with no expressiveness loss, while enabling more efficient learning.
- Score: 18.229583996292146
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Squared tensor networks (TNs) and their extension as computational graphs--squared circuits--have been used as expressive distribution estimators that still support closed-form marginalization. However, the squaring operation introduces additional complexity when computing the partition function or marginalizing variables, which hinders their applicability in ML. To address this issue, canonical forms of TNs are parameterized via unitary matrices to simplify the computation of marginals. However, these canonical forms do not apply to circuits, as circuits can represent factorizations that do not directly map to a known TN. Inspired by the role of orthogonality in canonical forms and of determinism in circuits enabling tractable maximization, we show how to parameterize squared circuits so as to overcome their marginalization overhead. Our parameterizations unlock efficient marginalization even for factorizations that differ from TNs but are encoded as circuits, whose structure would otherwise make marginalization computationally hard. Finally, our experiments on distribution estimation show that our proposed conditions on squared circuits come with no expressiveness loss, while enabling more efficient learning.
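To make the marginalization overhead concrete, here is a minimal NumPy sketch (illustrative code, not the authors' implementation) for the simplest squared TN, a matrix product state (MPS). Computing the partition function of the squared model generally requires contracting a "doubled" network, while a left-canonical (orthonormal) parameterization, obtained here by a QR sweep, makes the same quantity readable off a single core:

```python
import numpy as np

def random_mps(n, d=2, chi=4, seed=0):
    """Random MPS cores A[k] of shape (left bond, physical dim, right bond)."""
    rng = np.random.default_rng(seed)
    dims = [1] + [chi] * (n - 1) + [1]
    return [rng.standard_normal((dims[k], d, dims[k + 1])) for k in range(n)]

def squared_norm_by_doubling(cores):
    """Z = sum_x psi(x)^2, contracting the doubled network site by site.
    This is the overhead of squaring: the transfer operator is chi^2 x chi^2."""
    E = np.ones((1, 1))
    for A in cores:
        E = np.einsum('ab,asc,bsd->cd', E, A, A)  # E <- sum_s (A^s)^T E A^s
    return E.item()

def left_orthonormalize(cores):
    """QR sweep into left-canonical form: every core (except the last) becomes
    an isometry, sum_s (A^s)^T A^s = I, without changing the encoded function."""
    cores = [A.copy() for A in cores]
    for k in range(len(cores) - 1):
        l, d, r = cores[k].shape
        Q, R = np.linalg.qr(cores[k].reshape(l * d, r))
        cores[k] = Q.reshape(l, d, Q.shape[1])
        cores[k + 1] = np.einsum('ab,bsc->asc', R, cores[k + 1])
    return cores

cores = random_mps(n=6)
canon = left_orthonormalize(cores)
Z_doubled = squared_norm_by_doubling(cores)
Z_canonical = np.sum(canon[-1] ** 2)          # only the last core is needed
print(np.isclose(Z_doubled, Z_canonical))     # True
```

The same orthogonality idea is what the paper generalizes from TNs to circuits whose structure has no direct MPS counterpart.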
Related papers
- Graph-based Clustering Revisited: A Relaxation of Kernel $k$-Means Perspective [73.18641268511318]
We propose a graph-based clustering algorithm that relaxes only the orthonormal constraint to derive clustering results.
To incorporate the doubly stochastic constraint into gradient-based optimization, we transform the non-negative constraint into a class-probability parameterization.
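For context, here is a minimal sketch (illustrative names, not the paper's algorithm) of the classical spectral relaxation of kernel k-means that such methods start from; the paper's contribution lies in how non-negativity is handled on top of this relaxation:

```python
import numpy as np

def relaxed_kernel_kmeans(K, k):
    """Spectral relaxation of kernel k-means: keep orthonormality H^T H = I
    but drop discreteness, so
        max_H trace(H^T K H)  s.t.  H^T H = I
    is solved exactly by the top-k eigenvectors of the kernel matrix K."""
    w, V = np.linalg.eigh(K)          # eigenvalues in ascending order
    H = V[:, -k:]                     # orthonormal relaxed assignment matrix
    return H, w[-k:].sum()            # embedding and optimal relaxed objective

# toy usage: RBF kernel over two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
sq_dists = ((X[:, None] - X[None]) ** 2).sum(-1)
H, obj = relaxed_kernel_kmeans(np.exp(-sq_dists), k=2)
# rows of H are the relaxed embedding; a final rounding step (the part such
# papers redesign) turns them into hard cluster assignments
```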
arXiv Detail & Related papers (2025-09-23T09:14:39Z)
- Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths.
Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope.
We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, orthogonal transformations, and general invertible maps.
This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
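As a toy illustration of the permutation symmetry that prior LMC work exploits (the paper's point is that richer symmetry classes are needed for Transformers), consider a two-layer MLP in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, o = 5, 16, 3
W1, b1 = rng.standard_normal((h, d)), rng.standard_normal(h)
W2, b2 = rng.standard_normal((o, h)), rng.standard_normal(o)

def forward(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# permuting the hidden units (rows of W1/b1, columns of W2) is a symmetry:
# the permuted network computes exactly the same function
p = rng.permutation(h)
x = rng.standard_normal(d)
print(np.allclose(forward(x, W1, b1, W2, b2),
                  forward(x, W1[p], b1[p], W2[:, p], b2)))  # True

# LMC asks whether (1-t)*theta_A + t*theta_B stays low-loss for all t once
# the two models are aligned; richer symmetry classes give more alignments
```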
arXiv Detail & Related papers (2025-06-28T01:46:36Z)
- Orthogonal Finetuning Made Scalable [92.34573849209238]
Orthogonal finetuning (OFT) offers highly parameter-efficient adaptation while preventing catastrophic forgetting, but its high runtime and memory demands limit practical deployment.
We identify the core computational bottleneck in OFT as its weight-centric implementation, which relies on costly matrix-matrix multiplications with cubic complexity.
We propose OFTv2, an input-centric reformulation that instead uses matrix-vector multiplications (i.e., matrix-free computation), reducing the computational cost to quadratic.
These modifications allow OFTv2 to achieve up to 10x faster training and 3x lower GPU memory usage without compromising performance.
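A minimal sketch of the weight-centric vs. input-centric distinction, using a plain Cayley transform for the orthogonal factor (OFTv2 itself uses block-diagonal factors and a Neumann-series approximation, so this shows only the core idea):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 64
W0 = rng.standard_normal((d, d))                    # frozen pretrained weight
A = rng.standard_normal((d, d)) * 1e-2
S = A - A.T                                         # skew-symmetric generator
R = np.linalg.solve(np.eye(d) - S, np.eye(d) + S)   # Cayley: R is orthogonal

X = rng.standard_normal((d, n))                     # a batch of inputs

# weight-centric: materialize the adapted weight first (matrix-matrix, ~O(d^3))
Y_weight_centric = (R @ W0) @ X

# input-centric: never form R @ W0; rotate the activations instead
# (matrix-vector products per input, ~O(d^2) each)
Y_input_centric = R @ (W0 @ X)

print(np.allclose(Y_weight_centric, Y_input_centric))  # True
```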
arXiv Detail & Related papers (2025-06-24T17:59:49Z)
- Scaling Probabilistic Circuits via Monarch Matrices [109.65822339230853]
Probabilistic Circuits (PCs) are tractable representations of probability distributions.
We propose a novel sparse and structured parameterization for the sum blocks in PCs.
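A sketch of a Monarch-structured matrix-vector product, under one common convention of the Monarch factorization (M = P·L·P·R with block-diagonal L, R and P the transpose/perfect-shuffle permutation); parameters and matvec cost scale as O(n^1.5) instead of O(n^2) for a dense sum block:

```python
import numpy as np

def monarch_matvec(L, R, x):
    """y = M x with M = P @ L_bd @ P @ R_bd, where L_bd, R_bd are block-diagonal
    (m blocks of size m x m) and P is the transpose permutation on an m x m grid
    (its own inverse for square grids)."""
    m = L.shape[0]
    X = x.reshape(m, m)                   # view x as an m x m grid
    X = np.einsum('bij,bj->bi', R, X)     # block-diagonal R_bd acts per row
    X = X.T                               # permutation P = transpose
    X = np.einsum('bij,bj->bi', L, X)     # block-diagonal L_bd on shuffled rows
    return X.T.reshape(-1)                # apply P again and flatten

m = 4                                     # n = m * m = 16
rng = np.random.default_rng(0)
L = rng.standard_normal((m, m, m))        # m blocks of size m x m
R = rng.standard_normal((m, m, m))
x = rng.standard_normal(m * m)
y = monarch_matvec(L, R, x)               # O(n^1.5) work, 2*m^3 parameters
```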
arXiv Detail & Related papers (2025-06-14T07:39:15Z)
- The Limits of Tractable Marginalization [23.716205079188608]
Marginalization -- summing a function over all assignments to a subset of its inputs -- is a fundamental computational problem.
We show that whenever there is an efficient real RAM performing virtual evidence marginalization for a function, there are small circuits for that function's multilinear representation.
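A toy example of why circuit structure matters (illustrative, not the paper's construction): for a decomposable, factorized multilinear function, marginalizing a variable amounts to replacing its indicator leaves by their sum, one feed-forward pass instead of an exponential sum over assignments:

```python
import numpy as np

# A tiny decomposable circuit over Booleans x1, x2, x3:
#   f(x1,x2,x3) = (a*x1 + b*(1-x1)) * (c*x2 + d*(1-x2)) * (e*x3 + g*(1-x3))
a, b, c, d, e, g = 0.3, 0.7, 0.9, 0.1, 0.5, 0.5

def f(x1, x2, x3):
    return (a*x1 + b*(1-x1)) * (c*x2 + d*(1-x2)) * (e*x3 + g*(1-x3))

# brute force: sum over both values of x2
brute = f(1, 0, 1) + f(1, 1, 1)

# circuit marginalization: set x2's indicator pair to (1, 1), i.e. c*1 + d*1
circuit = (a*1 + b*0) * (c*1 + d*1) * (e*1 + g*0)
print(np.isclose(brute, circuit))  # True
```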
arXiv Detail & Related papers (2025-04-17T07:54:56Z)
- On Faster Marginalization with Squared Circuits via Orthonormalization [5.211074429497916]
We show how to parameterize squared circuits to ensure they encode already normalized distributions.
We then use this parameterization to devise an algorithm to compute any marginal of squared circuits that is more efficient than a previously known one.
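The shallow case shows the idea: if the circuit output is a linear combination of orthonormal functions, the partition function of the squared model equals the squared weight norm, so unit-norm weights give a self-normalized distribution. A minimal sketch (the paper extends this to deep circuits):

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 10, 4                # domain size and number of components

# orthonormal input functions over the domain: columns of a QR factor
F, _ = np.linalg.qr(rng.standard_normal((D, K)))   # F.T @ F == I_K

theta = rng.standard_normal(K)
w = theta / np.linalg.norm(theta)   # unit-norm mixture weights

c = F @ w                           # circuit output c(x) for every x
p = c ** 2                          # squared circuit: p(x) = c(x)^2

# the partition function is ||w||^2 = 1 by construction: no doubled
# contraction over the squared network is needed
print(np.isclose(p.sum(), 1.0))     # True
```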
arXiv Detail & Related papers (2024-12-10T19:37:03Z)
- Diagonalization without Diagonalization: A Direct Optimization Approach for Solid-State Density Functional Theory [8.922374095111797]
We present a novel approach to address the challenges of variable occupation numbers in direct optimization of density functional theory.
Our method incorporates physical constraints on both the eigenfunctions and the occupations into the parameterization.
It produces the correct Fermi-Dirac distribution of the occupation numbers and yields band structures consistent with those obtained with SCF methods in Quantum Espresso.
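For reference, a small sketch of the Fermi-Dirac occupation constraint itself, with the chemical potential found by bisection so the occupations sum to the electron count (this stands alone and is not the paper's parameterization, which builds the constraint into the optimization):

```python
import numpy as np

def fermi_dirac_occupations(eps, n_electrons, kT=0.05):
    """Occupations f_i = 1 / (1 + exp((eps_i - mu)/kT)), with the chemical
    potential mu fixed by bisection so that sum_i f_i = n_electrons."""
    lo, hi = eps.min() - 10 * kT, eps.max() + 10 * kT
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        f = 1.0 / (1.0 + np.exp((eps - mu) / kT))
        if f.sum() < n_electrons:
            lo = mu          # too few electrons: raise the chemical potential
        else:
            hi = mu
    return f, mu

eps = np.array([-1.2, -0.8, -0.1, 0.05, 0.6])   # toy band eigenvalues
f, mu = fermi_dirac_occupations(eps, n_electrons=3.0)
print(f.round(3), f.sum())                      # occupations summing to 3
```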
arXiv Detail & Related papers (2024-11-06T11:03:40Z)
- Refined Risk Bounds for Unbounded Losses via Transductive Priors [67.12679195076387]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression.
Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors.
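A compact sketch of the base exponential weights algorithm over K experts (the paper's refinement lies in the choice of transductive priors; this toy run just uses a uniform prior):

```python
import numpy as np

def exponential_weights(losses, eta=0.5, prior=None):
    """At round t, play the posterior w_t proportional to
    prior * exp(-eta * cumulative_loss); returns the weights at every round."""
    T, K = losses.shape
    prior = np.ones(K) / K if prior is None else prior
    cum = np.zeros(K)
    history = []
    for t in range(T):
        w = prior * np.exp(-eta * cum)
        w /= w.sum()
        history.append(w)
        cum += losses[t]
    return np.array(history)

# toy run: the second expert is better on average
rng = np.random.default_rng(0)
losses = np.stack([rng.uniform(0, 1, 100), rng.uniform(0, 0.5, 100)], axis=1)
W = exponential_weights(losses)
print(W[-1].round(3))   # mass concentrates on the second expert
```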
arXiv Detail & Related papers (2024-10-29T00:01:04Z)
- On the Relationship Between Monotone and Squared Probabilistic Circuits [32.42702089668762]
Probabilistic circuits are a unifying representation of functions as graphs of weighted sums and products.
Recently, it was proposed to instead represent densities as the square of the circuit function (squared circuits).
This raises the question of whether we can reconcile, and indeed improve upon, the two modeling approaches.
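A toy contrast of the two parameterizations on a small discrete domain (illustrative only): a monotone circuit mixes nonnegative normalized components, while a squared circuit uses real, possibly negative weights and squares the output:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 8, 3
phi = np.abs(rng.standard_normal((D, K)))   # nonnegative input functions
phi /= phi.sum(0)                           # each column sums to 1 over the domain

# monotone circuit: convex mixture of normalized components
w_mono = np.array([0.2, 0.5, 0.3])
p_mono = phi @ w_mono                       # already a distribution

# squared circuit: real weights (cancellation allowed), square, then normalize
w_sq = rng.standard_normal(K)
c = phi @ w_sq
p_sq = c ** 2 / (c ** 2).sum()

print(p_mono.sum(), p_sq.sum())             # both 1.0
```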
arXiv Detail & Related papers (2024-08-01T18:56:08Z)
- Cons-training Tensor Networks: Embedding and Optimization Over Discrete Linear Constraints [2.8834278113855896]
We introduce a novel family of tensor networks, termed constrained matrix product states (MPS).
These MPS exactly incorporate arbitrary discrete linear constraints, including inequalities, into sparse block structures.
These networks are particularly tailored for modeling distributions with support strictly over the feasible space.
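The simplest instance is an equality constraint on the Hamming weight: bond indices that track the running count of ones yield sparse, block-structured cores whose support is exactly the feasible set. A toy sketch (the paper handles general discrete linear constraints, including inequalities):

```python
import numpy as np
from itertools import product

def fixed_weight_mps(n, k):
    """MPS over {0,1}^n supported exactly on {x : sum(x) = k}.
    Bond indices track the running count of ones, so each core is a sparse
    sub-diagonal block rather than a dense tensor."""
    cores = []
    for i in range(n):
        L = min(i, k) + 1
        R = min(i + 1, k) + 1
        A = np.zeros((L, 2, R))          # (left bond, symbol, right bond)
        for c in range(L):
            A[c, 0, c] = 1.0             # reading a 0 keeps the count at c
            if c + 1 < R:
                A[c, 1, c + 1] = 1.0     # reading a 1 increments the count
        cores.append(A)
    return cores

def amplitude(cores, x, k):
    v = np.ones(1)
    for A, s in zip(cores, x):
        v = v @ A[:, s, :]
    return v[k]                          # select final count == k

n, k = 4, 2
cores = fixed_weight_mps(n, k)
for x in product([0, 1], repeat=n):
    assert (amplitude(cores, x, k) == 1.0) == (sum(x) == k)
print("support = all length-4 strings with exactly 2 ones")
```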
arXiv Detail & Related papers (2024-05-15T00:13:18Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)