NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
- URL: http://arxiv.org/abs/2407.14561v4
- Date: Tue, 01 Apr 2025 16:04:53 GMT
- Title: NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
- Authors: Jaden Fiotto-Kaufman, Alexander R. Loftus, Eric Todd, Jannik Brinkmann, Koyena Pal, Dmitrii Troitskii, Michael Ripa, Adam Belfki, Can Rager, Caden Juang, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Nikhil Prakash, Carla Brodley, Arjun Guha, Jonathan Bell, Byron C. Wallace, David Bau
- Abstract summary: We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks.
- Score: 58.83169560132308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks. NNsight is an open-source system that extends PyTorch to introduce deferred remote execution. The National Deep Inference Fabric (NDIF) is a scalable inference service that executes NNsight requests, allowing users to share GPU resources and pretrained models. These technologies are enabled by the Intervention Graph, an architecture developed to decouple experimental design from model runtime. Together, this framework provides transparent and efficient access to the internals of deep neural networks such as very large language models (LLMs) without imposing the cost or complexity of hosting customized models individually. We conduct a quantitative survey of the machine learning literature that reveals a growing gap in the study of the internals of large-scale AI. We demonstrate the design and use of our framework to address this gap by enabling a range of research methods on huge models. Finally, we conduct benchmarks to compare performance with previous approaches. Code, documentation, and tutorials are available at https://nnsight.net/.
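The deferred-execution workflow the abstract describes can be illustrated with a minimal sketch based on the public tutorials at https://nnsight.net/; the model name, layer index, prompt, and remote flag below are illustrative assumptions rather than details taken from this paper.
```python
# Hedged sketch of an NNsight-style interpretability experiment, following
# the public tutorials at https://nnsight.net/. The model name, layer
# index, and prompt are illustrative assumptions, not taken from the paper.
from nnsight import LanguageModel

# Wrap a pretrained Hugging Face model in NNsight's tracing interface.
model = LanguageModel("openai-community/gpt2", device_map="auto")

prompt = "The Eiffel Tower is in the city of"

# Inside the trace context, accesses to module inputs/outputs are recorded
# into an intervention graph and executed when the context exits.
with model.trace(prompt):
    # Save the hidden states produced by transformer block 5.
    hidden = model.transformer.h[5].output[0].save()
    # Save the logits from the language-modeling head.
    logits = model.lm_head.output.save()

# Saved proxies resolve to concrete tensors once the trace has run
# (the exact accessor may differ across NNsight versions).
print(hidden.shape, logits.shape)

# The same experiment can be deferred to NDIF-hosted models by passing
# remote=True, e.g. `with model.trace(prompt, remote=True): ...`
```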
Related papers
- NNTile: a machine learning framework capable of training extremely large GPT language models on a single node [83.9328245724548]
NNTile is based on the StarPU library, which implements task-based parallelism and schedules all provided tasks onto all available processing units.
It means that a particular operation, necessary to train a large neural network, can be performed on any of the CPU cores or GPU devices.
arXiv Detail & Related papers (2025-04-17T16:22:32Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- pyvene: A Library for Understanding and Improving PyTorch Models via Interventions [79.72930339711478]
pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how pyvene provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
arXiv Detail & Related papers (2024-03-12T16:46:54Z)
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows [72.40917624485822]
We introduce DataDreamer, an open source Python library that allows researchers to implement powerful large language model workflows.
DataDreamer also helps researchers adhere to best practices that we propose to encourage open science.
arXiv Detail & Related papers (2024-02-16T00:10:26Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models [74.64651681052628]
We introduce ModelScope-Agent, a customizable agent framework for real-world applications based on open-source LLMs as controllers.
It provides a user-friendly system library, with customizable engine design to support model training on multiple open-source LLMs.
A comprehensive framework has been proposed, spanning tool-use data collection, tool retrieval, tool registration, memory control, customized model training, and evaluation.
arXiv Detail & Related papers (2023-09-02T16:50:30Z)
- nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales [65.01417261415833]
We present an approach to predict the pre-training loss based on our observations that Maximal Update Parametrization (μP) enables accurate fitting of scaling laws.
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
Our goal with nanoLM is to empower researchers with limited resources to reach meaningful conclusions on large models.
arXiv Detail & Related papers (2023-04-14T00:45:01Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification [62.997667081978825]
The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks.
Firstly, we will present the theoretical background of the Bag of Visual Words model and Deep Convolutional Neural Networks (DCNN).
Secondly, we will implement a Bag of Visual Words model and the VGG16 CNN architecture.
arXiv Detail & Related papers (2022-04-11T11:34:43Z)
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z)
- Efficient Neural Architecture Search with Performance Prediction [0.0]
We use neural architecture search (NAS) to find the best network architecture for the task at hand.
Existing NAS algorithms generally evaluate the fitness of a new architecture by fully training from scratch.
An end-to-end offline performance predictor is proposed to accelerate the evaluation of sampled architectures.
arXiv Detail & Related papers (2021-08-04T05:44:16Z)
- DGL-LifeSci: An Open-Source Toolkit for Deep Learning on Graphs in Life Science [5.3825788156200565]
We present DGL-LifeSci, an open-source package for deep learning on graphs in life science.
DGL-LifeSci is a Python toolkit based on RDKit, PyTorch and Deep Graph Library.
It allows GNN-based modeling on custom datasets for molecular property prediction, reaction prediction and molecule generation.
arXiv Detail & Related papers (2021-06-27T13:27:47Z)
- Understanding Neural Code Intelligence Through Program Simplification [3.9704927572880253]
We propose a model-agnostic approach to identify critical input features for models in code intelligence systems.
Our approach, SIVAND, uses simplification techniques that reduce the size of input programs of a CI model.
We believe that SIVAND's extracted features may help understand neural CI systems' predictions and learned behavior.
arXiv Detail & Related papers (2021-06-07T05:44:29Z)
- PyHealth: A Python Library for Health Predictive Models [53.848478115284195]
PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
arXiv Detail & Related papers (2021-01-11T22:02:08Z)
- Minimum Viable Model Estimates for Machine Learning Projects [0.0]
We present a technique for estimating the minimum required performance characteristics of a predictive model.
The technique has been implemented into the open source application MinViME.
arXiv Detail & Related papers (2021-01-02T01:01:20Z)
- Direct Federated Neural Architecture Search [0.0]
We present an effective approach for direct federated NAS that is hardware-agnostic, computationally lightweight, and a one-stage method to search for ready-to-deploy neural network models.
Our results show an order of magnitude reduction in resource consumption while edging out prior art in accuracy.
arXiv Detail & Related papers (2020-10-13T08:11:35Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
- PHOTONAI -- A Python API for Rapid Machine Learning Model Development [2.414341608751139]
PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development.
It functions as a unifying framework allowing the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences.
arXiv Detail & Related papers (2020-02-13T10:33:05Z)