Tensor Network for Supervised Learning at Finite Temperature
- URL: http://arxiv.org/abs/2104.05439v1
- Date: Fri, 9 Apr 2021 05:02:36 GMT
- Title: Tensor Network for Supervised Learning at Finite Temperature
- Authors: Haoxiang Lin, Shuqian Ye, Xi Zhu
- Abstract summary: The finite temperature tensor network (FTTN) imports thermal perturbation into the matrix product states framework.
Instead of summing individual losses, FTTN treats the loss as a thermal average computed from entanglement with the environment.
The temperature-like parameter can be optimized automatically, giving each dataset an individual temperature.
- Score: 1.4717465036484292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large variation of datasets is a huge barrier for image classification
tasks. In this paper, we embrace this observation and introduce the finite
temperature tensor network (FTTN), which imports thermal perturbation into
the matrix product states framework by placing all images in an environment
with constant temperature, in analogy to energy-based learning. A tensor network
is chosen since it is the best platform to introduce thermal fluctuation.
Unlike traditional network structures, which directly take the summation of
individual losses as the loss function, FTTN treats the loss as a thermal
average computed from the entanglement with the environment. The temperature-like
parameter can be optimized automatically, giving each dataset an
individual temperature. FTTN improves both test accuracy and
convergence speed on several datasets. The non-zero temperature automatically
separates similar features, avoiding the misclassifications seen in the previous
architecture. Thermal fluctuation may also bring improvements in other
frameworks, and the dataset temperature could likewise be used to improve
training.
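The loss construction can be made concrete. Below is a minimal, hypothetical sketch of a thermal-average (free-energy-style) loss with a learnable temperature-like parameter, in the spirit of the abstract; the Boltzmann weighting, the `log_temp` parameterization, and all names are illustrative assumptions, not the paper's actual implementation.

```python
import math
import torch
import torch.nn as nn

class ThermalAverageLoss(nn.Module):
    """Hypothetical sketch of a thermal-average loss: instead of summing
    per-sample losses, weight them Boltzmann-style through a learnable
    temperature-like parameter T (an assumption, not the paper's exact form)."""

    def __init__(self, init_log_temp: float = 0.0):
        super().__init__()
        # Optimize log(T) so that T = exp(log_temp) stays positive.
        self.log_temp = nn.Parameter(torch.tensor(init_log_temp))

    def forward(self, per_sample_losses: torch.Tensor) -> torch.Tensor:
        temp = self.log_temp.exp()
        n = per_sample_losses.numel()
        # Free energy F = -T * log( mean_i exp(-L_i / T) ).
        # As T -> infinity this recovers the plain mean loss; at small T the
        # best-fit samples dominate, changing which samples drive the gradient.
        return -temp * (torch.logsumexp(-per_sample_losses / temp, dim=0)
                        - math.log(n))

# Usage: per-sample losses from any model (an MPS classifier or otherwise).
loss_fn = ThermalAverageLoss()
batch_losses = torch.rand(32, requires_grad=True)
loss = loss_fn(batch_losses)
loss.backward()  # gradients flow to the model and to log_temp alike
```

Because `log_temp` is an ordinary trainable parameter, each dataset would end up with its own learned temperature, matching the abstract's claim of an individual temperature per dataset.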
Related papers
- A finite element-based physics-informed operator learning framework for spatiotemporal partial differential equations on arbitrary domains [33.7054351451505]
We propose a novel finite element-based physics operator learning framework for predicting dynamics governed by partial differential equations (PDEs).
The proposed framework takes a temperature field at the current time step as input and predicts the temperature field at the next time step.
The networks predict the temperature evolution over time for any initial temperature field, with high accuracy compared to the FEM solution.
arXiv Detail & Related papers (2024-05-21T02:41:40Z)
- To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO [68.69840111477367]
We present a principled framework for learning a small yet generalizable temperature prediction network (TempNet) to improve LFMs.
Our experiments on LLMs and CLIP models demonstrate that TempNet greatly improves the performance of existing solutions or models.
arXiv Detail & Related papers (2024-04-06T09:55:03Z)
- Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training [58.20089993899729]
This paper proposes TempBalance, a straightforward yet effective layerwise learning rate method.
We show that TempBalance significantly outperforms ordinary SGD and carefully-tuned spectral norm regularization.
We also show that TempBalance outperforms a number of state-of-the-art metrics and schedulers.
arXiv Detail & Related papers (2023-12-01T05:38:17Z)
- Small Temperature is All You Need for Differentiable Architecture Search [8.93957397187611]
Differentiable architecture search (DARTS) yields highly efficient gradient-based neural architecture search (NAS).
We propose to close the gap between the relaxed supernet in training and the pruned finalnet in evaluation by using a small temperature (see the softmax sketch after this list).
arXiv Detail & Related papers (2023-06-12T04:01:57Z)
- Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization [51.41175648612714]
We propose a new robust contrastive loss inspired by distributionally robust optimization (DRO).
We show that our algorithm automatically learns a suitable $\tau$ for each sample (a per-sample sketch follows this list).
Our method outperforms prior strong baselines on unimodal and bimodal datasets.
arXiv Detail & Related papers (2023-05-19T19:25:56Z)
- Fine-tune your Classifier: Finding Correlations With Temperature [2.071516130824992]
We analyze the impact of temperature on classification tasks by describing a dataset as a set of statistics computed on representations.
We study the correlation between these extracted statistics and the observed optimal temperatures.
arXiv Detail & Related papers (2022-10-18T09:48:46Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A post-hoc approach to compensate for miscalibrated neural networks is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy (see the sketch after this list).
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
- Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and a time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO_2$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
- Thermal Prediction for Efficient Energy Management of Clouds using Machine Learning [31.735983199708013]
We study data from a private cloud and show the presence of thermal variations.
We propose a gradient boosting machine learning model for temperature prediction (a sketch follows this list).
In addition, we propose a dynamic scheduling algorithm to minimize the peak temperature of hosts.
arXiv Detail & Related papers (2020-11-07T00:55:47Z)
- Critical Phenomena in Complex Networks: from Scale-free to Random Networks [77.34726150561087]
We study critical phenomena in a class of configuration network models with hidden variables controlling links between pairs of nodes.
We find analytical expressions for the average node degree, the expected number of edges, and the Landau and Helmholtz free energies, as a function of the temperature and number of nodes.
arXiv Detail & Related papers (2020-08-05T18:57:38Z)
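The small-temperature mechanism from the DARTS entry above is easy to see in isolation: dividing the architecture logits by a small temperature drives the relaxed softmax mixture toward the one-hot selection used after pruning. A minimal sketch with invented logits, not the paper's code:

```python
import torch

def arch_weights(alpha: torch.Tensor, temperature: float) -> torch.Tensor:
    """Softmax over candidate-operation logits: a smaller temperature makes
    the relaxed supernet mixture closer to the argmax-pruned finalnet."""
    return torch.softmax(alpha / temperature, dim=-1)

alpha = torch.tensor([1.0, 0.5, -0.2])  # logits for three candidate operations
print(arch_weights(alpha, 1.0))         # soft mixture, roughly [0.52, 0.32, 0.16]
print(arch_weights(alpha, 0.05))        # nearly one-hot on the first operation
```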
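For the automatic temperature individualization entry, the core idea of a per-sample $\tau$ can be sketched as an InfoNCE-style loss in which each anchor owns a learnable temperature. This is a generic illustration under assumed shapes, not the paper's DRO algorithm:

```python
import torch
import torch.nn.functional as F

def per_sample_infonce(z, z_pos, log_tau):
    """InfoNCE with an individual temperature tau_i per anchor (illustrative).
    z, z_pos: (N, D) L2-normalized embeddings; log_tau: (N,) learnable."""
    tau = log_tau.exp().unsqueeze(1)       # (N, 1), parameterized to stay positive
    logits = (z @ z_pos.t()) / tau         # each row scaled by its own tau_i
    targets = torch.arange(z.size(0))      # positive pairs sit on the diagonal
    return F.cross_entropy(logits, targets)

z = F.normalize(torch.randn(8, 16), dim=1)
z_pos = F.normalize(z + 0.1 * torch.randn(8, 16), dim=1)
log_tau = torch.zeros(8, requires_grad=True)      # tau_i = 1 at initialization
per_sample_infonce(z, z_pos, log_tau).backward()  # gradients reach each tau_i
```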
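The sample-dependent temperature scaling entry admits a similarly small sketch: a head predicts one positive temperature per input, and the logits are rescaled before the softmax. The linear head and its sizes are assumptions for illustration only:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveTemperature(nn.Module):
    """Illustrative per-input temperature scaling: a small head maps each
    logit vector to its own T > 0; calibrated probs = softmax(logits / T).
    The linear head is an assumption, not the paper's architecture."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.head = nn.Linear(num_classes, 1)

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        temp = F.softplus(self.head(logits)) + 1e-3   # (N, 1), strictly positive
        return torch.softmax(logits / temp, dim=-1)

calibrator = AdaptiveTemperature(num_classes=10)
probs = calibrator(torch.randn(4, 10))  # each row is scaled by its own temperature
```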
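Finally, the thermal-prediction entry's modeling step, mapping host telemetry to temperature with gradient boosting, has a natural baseline shape in scikit-learn. The features and synthetic data below are invented purely for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-ins for host telemetry: CPU load, fan speed, ambient temp.
X = rng.uniform(0, 1, size=(1000, 3))
y = 40 + 30 * X[:, 0] - 5 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 1, 1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out hosts:", model.score(X_test, y_test))
# A scheduler could then place load so the predicted peak temperature stays low.
```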