On the Versatile Uses of Partial Distance Correlation in Deep Learning
- URL: http://arxiv.org/abs/2207.09684v1
- Date: Wed, 20 Jul 2022 06:36:11 GMT
- Title: On the Versatile Uses of Partial Distance Correlation in Deep Learning
- Authors: Xingjian Zhen, Zihang Meng, Rudrasis Chakraborty, Vikas Singh
- Abstract summary: This paper revisits a (less widely known) idea from statistics, called distance correlation (and its partial variant), designed to evaluate correlation between feature spaces of different dimensions.
We describe the steps necessary to carry out its deployment for large scale models.
This opens the door to a surprising array of applications, ranging from conditioning one deep model w.r.t. another and learning disentangled representations, to optimizing diverse models that are directly more robust to adversarial attacks.
- Score: 47.11577420740119
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Comparing the functional behavior of neural network models, whether it is a
single network over time or two (or more) networks during or post-training, is
an essential step in understanding what they are learning (and what they are
not), and for identifying strategies for regularization or efficiency
improvements. Despite recent progress, e.g., comparing vision transformers to
CNNs, systematic comparison of function, especially across different networks,
remains difficult and is often carried out layer by layer. Approaches such as
canonical correlation analysis (CCA) are applicable in principle, but have been
sparingly used so far. In this paper, we revisit a (less widely known) idea from
statistics, called distance correlation (and its partial variant), designed to
evaluate correlation between feature spaces of different dimensions. We
describe the steps necessary to carry out its deployment for large scale models
-- this opens the door to a surprising array of applications ranging from
conditioning one deep model w.r.t. another and learning disentangled
representations, to optimizing diverse models that are directly more robust to
adversarial attacks. Our experiments suggest a versatile
regularizer (or constraint) with many advantages, which avoids some of the
common difficulties one faces in such analyses. Code is at
https://github.com/zhenxingjian/Partial_Distance_Correlation.
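For a concrete picture of the quantity the abstract revisits, here is a minimal NumPy sketch of the standard (double-centered, V-statistic) estimator of distance correlation, plus the usual partial-correlation-style combination of pairwise values. This is an illustration, not the authors' implementation: the linked repository should be treated as the reference, and it uses the bias-corrected, U-centered estimators rather than the plain version below. All function names and the small demo are assumptions made for this sketch.

```python
import numpy as np

def _pairwise_dist(x):
    # Euclidean distance matrix between the rows of x (shape n x d).
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def _double_center(d):
    # A_ij = d_ij - row_mean_i - col_mean_j + grand_mean
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    # x: (n, p), y: (n, q). The feature dimensions p and q may differ,
    # which is the property that makes dCor usable across different networks.
    A = _double_center(_pairwise_dist(x))
    B = _double_center(_pairwise_dist(y))
    dcov2 = max((A * B).mean(), 0.0)  # guard against tiny negative round-off
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

def partial_distance_correlation(x, y, z):
    # Correlation between x and y with the component explained by z removed,
    # combining pairwise values in the familiar partial-correlation form.
    rxy = distance_correlation(x, y)
    rxz = distance_correlation(x, z)
    ryz = distance_correlation(y, z)
    denom = np.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    return (rxy - rxz * ryz) / denom if denom > 0 else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(256, 64))  # stand-in for one model's features
    x = z @ rng.normal(size=(64, 128)) + 0.1 * rng.normal(size=(256, 128))
    y = z @ rng.normal(size=(64, 32)) + 0.1 * rng.normal(size=(256, 32))
    print(distance_correlation(x, y))             # high: both views depend on z
    print(partial_distance_correlation(x, y, z))  # near zero once z is partialled out
```

In the applications the abstract lists, terms like these serve as differentiable regularizers, e.g., conditioning a new model's features on an existing model's features by penalizing their (partial) distance correlation on each mini-batch; a framework port would compute the same matrices on batch features.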
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse Function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
- Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis [17.989809995141044]
We propose CCA Merge, which is based on Canonical Correlation Analysis.
We show that CCA Merge works significantly better than past methods when more than two models are merged.
arXiv Detail & Related papers (2024-07-07T14:21:04Z)
- Reducing Computational Costs in Sentiment Analysis: Tensorized Recurrent Networks vs. Recurrent Networks [0.12891210250935145]
Anticipating audience reaction towards a certain text is integral to several facets of society, including politics, research, and commercial industries.
Sentiment analysis (SA) is a useful natural language processing (NLP) technique that utilizes lexical/statistical and deep learning methods to determine whether different-sized texts exhibit positive, negative, or neutral emotions.
arXiv Detail & Related papers (2023-06-16T09:18:08Z)
- Compare learning: bi-attention network for few-shot learning [6.559037166322981]
Metric learning, one family of few-shot learning methods, addresses this challenge by first learning a deep distance metric to determine whether a pair of images belongs to the same category.
In this paper, we propose a novel approach named Bi-attention network to compare the instances, which can measure the similarity between embeddings of instances precisely, globally and efficiently.
arXiv Detail & Related papers (2022-03-25T07:39:10Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for feature extraction of two views by finding maximally correlated linear projections of them (a minimal CCA sketch appears after this list).
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- The Devil Is in the Details: An Efficient Convolutional Neural Network for Transport Mode Detection [3.008051369744002]
Transport mode detection is a classification problem aiming to design an algorithm that can infer the transport mode of a user given multimodal signals.
We show that a small, optimized model can perform as well as a current deep model.
arXiv Detail & Related papers (2021-09-16T08:05:47Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
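Since CCA appears both in the main abstract above and in several of the related papers, the sketch below shows the textbook construction for reference: center and whiten each view, then take the SVD of the whitened cross-covariance, whose singular values are the canonical correlations. This is a generic illustration under assumed names (cca, inv_sqrt, eps), not code from any paper listed here; the ridge term is an added numerical-stability assumption.

```python
import numpy as np

def cca(x, y, k, eps=1e-6):
    # Classical CCA: direction pairs (u_i, v_i) maximizing corr(x @ u_i, y @ v_i).
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / (n - 1) + eps * np.eye(x.shape[1])  # ridge for stability
    cyy = y.T @ y / (n - 1) + eps * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)

    def inv_sqrt(m):
        # Symmetric inverse square root via eigendecomposition.
        w, v = np.linalg.eigh(m)
        return v @ np.diag(1.0 / np.sqrt(np.clip(w, eps, None))) @ v.T

    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
    u, s, vt = np.linalg.svd(wx @ cxy @ wy)
    # s[:k] are the top-k canonical correlations; the returned matrices
    # project each view onto its canonical directions.
    return s[:k], wx @ u[:, :k], wy @ vt[:k].T
```

The contrast motivates the paper above: CCA compares two views through linear projections of matched dimension, whereas (partial) distance correlation compares whole feature spaces of different dimensions directly.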