Self-learn to Explain Siamese Networks Robustly
- URL: http://arxiv.org/abs/2109.07371v1
- Date: Wed, 15 Sep 2021 15:28:39 GMT
- Title: Self-learn to Explain Siamese Networks Robustly
- Authors: Chao Chen, Yifan Shen, Guixiang Ma, Xiangnan Kong, Srinivas
Rangarajan, Xi Zhang, Sihong Xie
- Abstract summary: Learning to compare two objects is used in digital forensics, face recognition, and brain network analysis, especially when labeled data is scarce.
As these applications make high-stakes decisions involving societal values like fairness and transparency, it is critical to explain the learned models.
- Score: 22.913886901196353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning to compare two objects is essential in applications such as
digital forensics, face recognition, and brain network analysis, especially
when labeled data is scarce and imbalanced. As these applications make
high-stakes decisions and involve societal values like fairness and
transparency, it is critical to explain the learned models. We aim to study
post-hoc explanations of Siamese networks (SN) widely used in learning to
compare. We characterize the instability of gradient-based explanations due to
the additional compared object in SN, in contrast to architectures with a
single input instance. We propose an optimization framework that derives global
invariance from unlabeled data using self-learning to promote the stability of
local explanations tailored for specific query-reference pairs. The
optimization problems can be solved using gradient descent-ascent (GDA) for
constrained optimization, or SGD for KL-divergence regularized unconstrained
optimization, with convergence proofs, especially when the objective functions
are nonconvex due to the Siamese architecture. Quantitative results and case
studies on tabular and graph data from neuroscience and chemical engineering
show that the framework respects the self-learned invariance while robustly
optimizing the faithfulness and simplicity of the explanation. We further
demonstrate the convergence of GDA experimentally.
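To make the instability concrete, here is a minimal PyTorch sketch (toy data and names, not the paper's code): gradient-based saliency of the same query is computed against two different references and the two saliency maps are compared. For a single-input architecture the explanation of a fixed input would not change; in a Siamese network it shifts with the reference.

```python
# Minimal sketch (toy data, not the paper's code): gradient-based saliency in a
# Siamese network depends on the compared reference, the instability the paper
# characterizes. A single-input classifier has no such dependence.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

class SiameseNet(nn.Module):
    """Tiny shared encoder; similarity = negative squared embedding distance."""
    def __init__(self, d_in=10, d_emb=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 16), nn.ReLU(), nn.Linear(16, d_emb))

    def forward(self, x, x_ref):
        return -((self.encoder(x) - self.encoder(x_ref)) ** 2).sum(dim=-1)

def saliency(model, x, x_ref):
    """Gradient of the similarity score with respect to the query features."""
    x = x.clone().requires_grad_(True)
    model(x, x_ref).sum().backward()
    return x.grad.detach()

model = SiameseNet()
query = torch.randn(1, 10)
ref_a, ref_b = torch.randn(1, 10), torch.randn(1, 10)

g_a = saliency(model, query, ref_a)   # explanation of `query` against reference A
g_b = saliency(model, query, ref_b)   # same query, different reference

# The two explanations of the SAME query typically disagree noticeably.
cos = F.cosine_similarity(g_a.flatten(), g_b.flatten(), dim=0)
print(f"cosine similarity between the two saliency maps: {cos.item():.3f}")
```

The framework in the paper then pulls such pair-specific explanations toward a global invariance self-learned from unlabeled data, solved with GDA or KL-regularized SGD as described above.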
Related papers
- Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z) - Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes model learning by paying closer attention to those training samples with large differences in explanations.
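The blurb leaves the consistency metric and reweighting rule unspecified; the sketch below is one plausible instantiation (an assumption, not the paper's exact scheme) that compares the saliency of each sample with that of a lightly perturbed copy and upweights samples whose explanations disagree.

```python
# Hypothetical instantiation of explanation-consistency reweighting; the paper's
# exact metric and schedule may differ. Saliency of each sample is compared with
# saliency of a lightly perturbed copy, and inconsistent samples get more weight.
import torch
import torch.nn.functional as F

def saliency(model, x, y):
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return x.grad.detach().flatten(1)

def consistency_weights(model, x, y, noise=0.05):
    g_clean = saliency(model, x, y)
    g_pert = saliency(model, x + noise * torch.randn_like(x), y)
    consistency = F.cosine_similarity(g_clean, g_pert, dim=1)   # per-sample, in [-1, 1]
    w = 1.0 + (1.0 - consistency) / 2.0                         # less consistent -> heavier
    return (w / w.mean()).detach()

def weighted_step(model, opt, x, y):
    w = consistency_weights(model, x, y)
    opt.zero_grad()                                             # clear grads from saliency passes
    loss = (w * F.cross_entropy(model(x), y, reduction="none")).mean()
    loss.backward()
    opt.step()

# toy usage
model = torch.nn.Sequential(torch.nn.Linear(20, 3))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
weighted_step(model, opt, torch.randn(8, 20), torch.randint(0, 3, (8,)))
```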
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - Semi-parametric Expert Bayesian Network Learning with Gaussian Processes
and Horseshoe Priors [26.530289799110562]
This paper proposes a model that learns semi-parametric relationships in an Expert Bayesian Network (SEBN).
We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components.
In real-world datasets with unknown ground truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability.
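For readers unfamiliar with the horseshoe prior the model relies on, the standard construction (a global-local Gaussian scale mixture with half-Cauchy scales) can be drawn in a few lines; the sketch below illustrates why it keeps nonlinear components minimal, and is not SEBN itself.

```python
# The standard horseshoe prior (half-Cauchy global and local scales over a
# Gaussian), shown only to illustrate why it keeps nonlinear components
# minimal: most coefficients are shrunk hard, a few escape. Not SEBN itself.
import numpy as np

rng = np.random.default_rng(0)
p = 1000

tau = abs(rng.standard_cauchy())              # global scale, tau ~ C+(0, 1)
lam = np.abs(rng.standard_cauchy(size=p))     # local scales, lambda_j ~ C+(0, 1)
beta = rng.normal(0.0, tau * lam)             # beta_j ~ N(0, (tau * lambda_j)^2)

shrunk = np.mean(np.abs(beta) < 0.1 * np.abs(beta).std())
print(f"fraction of coefficients near zero: {shrunk:.2f}")
print(f"largest |beta|: {np.abs(beta).max():.1f}")
```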
arXiv Detail & Related papers (2024-01-29T18:57:45Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance.
However, symmetries provide fixed hard constraints on the functions a network can represent; they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - Discovering Dynamic Causal Space for DAG Structure Learning [64.763763417533]
We propose a dynamic causal space for DAG structure learning, coined CASPER.
It integrates the graph structure into the score function as a new measure in the causal space to faithfully reflect the causal distance between the estimated and the ground-truth DAG.
arXiv Detail & Related papers (2023-06-05T12:20:40Z) - Provable Robust Saliency-based Explanations [16.217374556142484]
Experiments with a wide spectrum of network architectures and data modalities demonstrate that R2ET attains higher explanation robustness under stealthy attacks while retaining model accuracy.
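The summary does not spell out R2ET's mechanism. As generic context, the sketch below measures what such stealthy attacks target: the stability of the top-k saliency ranking under a small input perturbation. The toy model, budget, and metric here are illustrative assumptions, not the paper's protocol.

```python
# Generic measurement of saliency-ranking robustness, the quantity stealthy
# attacks in this setting try to degrade; illustrative only, not R2ET itself.
import torch

def saliency(model, x):
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.detach()

def topk_overlap(g1, g2, k=10):
    """Fraction of shared feature indices among the top-k of two saliency maps."""
    t1 = set(g1.abs().topk(k).indices.tolist())
    t2 = set(g2.abs().topk(k).indices.tolist())
    return len(t1 & t2) / k

# toy nonlinear model and a small perturbation with budget eps (both illustrative)
model = torch.nn.Sequential(torch.nn.Linear(50, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
x = torch.randn(50)
x_adv = x + 0.01 * torch.randn_like(x)

# a robust explainer keeps this near 1.0 under small perturbations
print("top-10 overlap:", topk_overlap(saliency(model, x), saliency(model, x_adv)))
```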
arXiv Detail & Related papers (2022-12-28T22:05:32Z) - Simple Stochastic and Online Gradient Descent Algorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
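To ground the pairwise setting, here is a minimal online gradient descent sketch for one common pairwise objective, an AUC-style hinge on score gaps, where each incoming example is paired with buffered past examples. The buffering scheme and step size are illustrative; the paper's algorithms and rates are not reproduced here.

```python
# Minimal OGD for pairwise learning (illustrative, not the paper's algorithm):
# the loss depends on PAIRS (x_i, x_j) with y_i != y_j; each new example is
# paired with buffered past examples, and a hinge on the score gap is minimized.
import numpy as np

rng = np.random.default_rng(0)
d, T, lr = 20, 500, 0.05
w = np.zeros(d)
buffer = []   # past (x, y) examples; bounded buffers are the usual memory trade-off

for t in range(T):
    y = int(rng.integers(0, 2))                 # toy binary label
    x = rng.normal(size=d) + (2.0 * y - 1.0)    # class-shifted features
    grad = np.zeros(d)
    for xp, yp in buffer:
        if yp == y:
            continue                            # the pairwise loss needs opposite labels
        pos, neg = (x, xp) if y == 1 else (xp, x)
        if 1.0 - w @ (pos - neg) > 0.0:         # hinge active: want score(pos) > score(neg) + 1
            grad -= pos - neg
    if buffer:
        w -= lr * grad / len(buffer)            # one OGD step on the averaged pair loss
    buffer.append((x, y))

print("trained ||w||:", round(float(np.linalg.norm(w)), 3))
```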
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - Are conditional GANs explicitly conditional? [0.0]
This paper proposes two contributions for conditional Generative Adversarial Networks (cGANs).
The first main contribution is an analysis of cGANs to show that they are not explicitly conditional.
The second contribution is a new method, called a contrario, that explicitly models conditionality for both parts of the adversarial architecture.
arXiv Detail & Related papers (2021-06-28T22:49:27Z) - Bounding Information Leakage in Machine Learning [26.64770573405079]
This paper investigates fundamental bounds on information leakage.
We identify and bound the success rate of the worst-case membership inference attack.
We derive bounds on the mutual information between the sensitive attributes and model parameters.
arXiv Detail & Related papers (2021-05-09T08:49:14Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
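The coupling the summary refers to is easiest to see on the simplest bilinear objective. The sketch below runs plain simultaneous gradient descent on a rank-one factorization, a baseline showing each variable's gradient passing through the other; it is not CoGD's sparsity-aware kernel.

```python
# Plain simultaneous gradient descent on the bilinear objective
# f(u, v) = ||M - u v^T||_F^2: each variable's gradient passes through the
# other, which is the coupling CoGD exploits. Baseline only, not CoGD's kernel.
import numpy as np

rng = np.random.default_rng(0)
m, n, lr = 30, 20, 0.01
M = np.outer(rng.normal(size=m), rng.normal(size=n))   # rank-one target
u, v = rng.normal(size=m), rng.normal(size=n)

for step in range(2000):
    R = np.outer(u, v) - M            # residual
    gu = 2.0 * R @ v                  # df/du depends on v ...
    gv = 2.0 * R.T @ u                # ... and df/dv depends on u
    u, v = u - lr * gu, v - lr * gv   # synchronous step in both variables

print("final loss:", float(np.linalg.norm(np.outer(u, v) - M) ** 2))
```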
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.