Related papers: Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability

URL: http://arxiv.org/abs/2408.08448v4
Date: Wed, 11 Sep 2024 06:12:17 GMT
Title: Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability
Authors: Haniyeh Ehsani Oskouie, Lionel Levine, Majid Sarrafzadeh
Abstract summary: This paper introduces a novel approach for assessing a newly trained model's performance based on another known model. The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output.
Score: 2.6708879445664584
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As Artificial Intelligence (AI) models are increasingly integrated into critical systems, the need for a robust framework to establish the trustworthiness of AI is increasingly paramount. While collaborative efforts have established conceptual foundations for such a framework, there remains a significant gap in developing concrete, technically robust methods for assessing AI model quality and performance. A critical drawback in the traditional methods for assessing the validity and generalizability of models is their dependence on internal developer datasets, rendering it challenging to independently assess and verify their performance claims. This paper introduces a novel approach for assessing a newly trained model's performance based on another known model by calculating correlation between neural networks. The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output. This approach has implications for memory efficiency, allowing for the use of smaller networks when high correlation exists between networks of different sizes. Additionally, the method provides insights into robustness, suggesting that if two highly correlated networks are compared and one demonstrates robustness when operating in production environments, the other is likely to exhibit similar robustness. This contribution advances the technical toolkit for responsible AI, supporting more comprehensive and nuanced evaluations of AI models to ensure their safe and effective deployment. Code is available at https://github.com/aheldis/Cross-model-correlation.git.
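The neuron-matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes that "similar output" is measured by absolute Pearson correlation over a shared set of inputs, and that per-neuron scores are averaged into one network-level score. The function names `best_match_correlation` and `network_correlation` are hypothetical.

```python
import numpy as np


def best_match_correlation(acts_a, acts_b, eps=1e-12):
    """For each neuron (column) of acts_a, return the highest absolute
    Pearson correlation with any neuron (column) of acts_b.

    acts_a: shape (n_samples, n_neurons_a), outputs of network A
    acts_b: shape (n_samples, n_neurons_b), outputs of network B
    Both must be recorded on the same inputs, in the same order.
    Assumption: Pearson correlation stands in for "similar output".
    """
    acts_a = np.asarray(acts_a, dtype=float)
    acts_b = np.asarray(acts_b, dtype=float)
    if acts_a.shape[0] != acts_b.shape[0]:
        raise ValueError("both networks must be evaluated on the same inputs")

    # Center each neuron's outputs and scale them to unit length, so that
    # a dot product between two columns equals their Pearson correlation.
    za = acts_a - acts_a.mean(axis=0)
    zb = acts_b - acts_b.mean(axis=0)
    za /= np.linalg.norm(za, axis=0) + eps  # eps guards constant neurons
    zb /= np.linalg.norm(zb, axis=0) + eps

    # corr[i, j]: correlation between neuron i of A and neuron j of B.
    corr = za.T @ zb
    return np.abs(corr).max(axis=1)


def network_correlation(acts_a, acts_b):
    """Average best-match score over all neurons of network A (an assumed
    way to aggregate per-neuron scores into a single number)."""
    return float(best_match_correlation(acts_a, acts_b).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(200, 4))
    # Network B reuses two of A's neurons (rescaled) plus one unrelated neuron.
    b = np.column_stack([-3.0 * a[:, 2] + 1.0, 2.0 * a[:, 0], rng.normal(size=200)])
    print(best_match_correlation(a, b))
    print(network_correlation(a, b))
```

Note that the score is asymmetric: it asks whether every neuron of A has a counterpart in B, so swapping the arguments can give a different value when the networks differ in size.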

Related papers

Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction [55.914891182214475]
We introduce neural network reprogrammability as a unifying framework for model adaptation. We present a taxonomy that categorizes such information manipulation approaches across four key dimensions. We also analyze remaining technical challenges and ethical considerations.
arXiv Detail & Related papers (2025-06-05T05:42:27Z)
Optimizing Deep Neural Networks using Safety-Guided Self Compression [0.0]
This study introduces a novel safety-driven quantization framework that prunes and quantizes neural network weights. The proposed methodology is rigorously evaluated on both a convolutional neural network (CNN) and an attention-based language model. Experimental results reveal that our framework achieves up to a 2.5% enhancement in test accuracy relative to the original unquantized models.
arXiv Detail & Related papers (2025-05-01T06:50:30Z)
An XAI-based Analysis of Shortcut Learning in Neural Networks [2.592470112714595]
We introduce the neuron spurious score to quantify a neuron's dependence on spurious features. Our results show that spurious features are partially disentangled, but the degree of disentanglement varies across model architectures. Our results lay the groundwork for the development of novel methods to mitigate spurious correlations and make AI models safer to use in practice.
arXiv Detail & Related papers (2025-04-22T07:40:45Z)
An unified approach to link prediction in collaboration networks [0.0]
This article investigates and compares three approaches to link prediction in collaboration networks. The ERGM is employed to capture general structural patterns within the network. The GCN and Word2Vec+MLP models leverage deep learning techniques to learn adaptive structural representations of nodes and their relationships.
arXiv Detail & Related papers (2024-11-01T22:40:39Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST). IST is a recently proposed and highly effective technique for solving the aforementioned problems. We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
Work In Progress: Safety and Robustness Verification of Autoencoder-Based Regression Models using the NNV Tool [0.0]
This work introduces robustness verification for autoencoder-based regression neural network (NN) models. We introduce two definitions of robustness evaluation metrics for autoencoder-based regression models. As per the authors' understanding, this work in progress paper is the first to show possible reachability analysis of autoencoder-based NNs.
arXiv Detail & Related papers (2022-07-14T09:10:30Z)
Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications. In this paper we propose an uncertainty quantification approach by modelling the distribution of features. We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem. We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair. A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents. One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
Context-dependent self-exciting point processes: models, methods, and risk bounds in high dimensions [21.760636228118607]
High-dimensional autoregressive point processes model how current events trigger or inhibit future events, such as how activity by one member of a social network can affect the future activity of his or her neighbors. We leverage ideas from compositional time series and regularization methods in machine learning to conduct network estimation for high-dimensional marked point processes.
arXiv Detail & Related papers (2020-03-16T20:22:43Z)
Learning Queuing Networks by Recurrent Neural Networks [0.0]
We propose a machine-learning approach to derive performance models from data. We exploit a deterministic approximation of their average dynamics in terms of a compact system of ordinary differential equations. This allows for an interpretable structure of the neural network, which can be trained from system measurements to yield a white-box parameterized model.
arXiv Detail & Related papers (2020-02-25T10:56:47Z)
Feature Importance Estimation with Self-Attention Networks [0.0]
Black-box neural network models are widely used in industry and science, yet are hard to understand and interpret. Recently, the attention mechanism was introduced, offering insights into the inner workings of neural language models. This paper explores the use of attention-based neural network mechanisms for estimating feature importance, as a means of explaining the models learned from propositional (tabular) data.
arXiv Detail & Related papers (2020-02-11T15:15:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.