Related papers: Integrating kNN with Foundation Models for Adaptable and Privacy-Aware Image Classification

Integrating kNN with Foundation Models for Adaptable and Privacy-Aware Image Classification

URL: http://arxiv.org/abs/2402.12500v1
Date: Mon, 19 Feb 2024 20:08:13 GMT
Title: Integrating kNN with Foundation Models for Adaptable and Privacy-Aware Image Classification
Authors: Sebastian Doerrich, Tobias Archut, Francesco Di Salvo, Christian Ledig
Abstract summary: Traditional deep learning models implicity encode knowledge limiting their transparency and ability to adapt to data changes. We address this limitation by storing embeddings of the underlying training data independently of the model weights. Our approach integrates the $k$-Nearest Neighbor ($k$-NN) classifier with a vision-based foundation model, pre-trained self-supervised on natural images.
Score: 0.13108652488669734
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Traditional deep learning models implicity encode knowledge limiting their transparency and ability to adapt to data changes. Yet, this adaptability is vital for addressing user data privacy concerns. We address this limitation by storing embeddings of the underlying training data independently of the model weights, enabling dynamic data modifications without retraining. Specifically, our approach integrates the $k$-Nearest Neighbor ($k$-NN) classifier with a vision-based foundation model, pre-trained self-supervised on natural images, enhancing interpretability and adaptability. We share open-source implementations of a previously unpublished baseline method as well as our performance-improving contributions. Quantitative experiments confirm improved classification across established benchmark datasets and the method's applicability to distinct medical image classification tasks. Additionally, we assess the method's robustness in continual learning and data removal scenarios. The approach exhibits great promise for bridging the gap between foundation models' performance and challenges tied to data privacy. The source code is available at https://github.com/TobArc/privacy-aware-image-classification-with-kNN.

Related papers

Prompt-Driven and Training-Free Forgetting Approach and Dataset for Large Language Models [4.824120664293887]
We propose an Automatic dataset Creation Framework based on prompt-based layered editing and training-free local feature removal. The ForgetMe dataset encompasses a diverse set of real and synthetic scenarios, including CUB-200-2011 (Birds), Stanford-Dogs, ImageNet, and a synthetic cat dataset. We apply LoRA fine-tuning on Stable Diffusion to achieve selective unlearning on this dataset and validate the effectiveness of both the ForgetMe dataset and the Entangled metric.
arXiv Detail & Related papers (2025-04-17T01:44:57Z)
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning. Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning. We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
arXiv Detail & Related papers (2025-03-20T04:49:40Z)
ConceptDrift: Uncovering Biases through the Lens of Foundation Models [5.025665239455297]
We show that ConceptDrift can be scaled to automatically identify biases in datasets without human prior knowledge. We propose two bias identification evaluation protocols to fill the gap in the previous work and show that our method significantly improves over SoTA methods. Our method is not bounded to a single modality, and we empirically validate it both on image (Waterbirds, CelebA, ImageNet, and text datasets (CivilComments))
arXiv Detail & Related papers (2024-10-24T17:59:16Z)
Privacy-preserving datasets by capturing feature distributions with Conditional VAEs [0.11999555634662634]
Conditional Variational Autoencoders (CVAEs) trained on feature vectors extracted from large pre-trained vision foundation models. Our method notably outperforms traditional approaches in both medical and natural image domains. Results underscore the potential of generative models to significantly impact deep learning applications in data-scarce and privacy-sensitive environments.
arXiv Detail & Related papers (2024-08-01T15:26:24Z)
Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks. We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback. Specifically, we utilize a zero-shot-based method to extract the most downstream task-related subset of the pre-training data. We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z)
Source-Free Domain Adaptation Guided by Vision and Vision-Language Pre-Training [23.56208527227504]
Source-free domain adaptation (SFDA) aims to adapt a source model trained on a fully-labeled source domain to a related but unlabeled target domain. In the conventional SFDA pipeline, a large data (e.g. ImageNet) pre-trained feature extractor is used to initialize the source model. We introduce an integrated framework to incorporate pre-trained networks into the target adaptation process.
arXiv Detail & Related papers (2024-05-05T14:48:13Z)
Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset. Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets. We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models. We introduce textitCLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models. Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER) Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
Content-Aware Differential Privacy with Conditional Invertible Neural Networks [0.7102341019971402]
Invertible Neural Networks (INNs) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones.
arXiv Detail & Related papers (2022-07-29T11:52:16Z)
Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect. There always exists severe domain shift between the pretraining and downstream ABSA datasets. We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
A Curriculum-style Self-training Approach for Source-Free Semantic Segmentation [91.13472029666312]
We propose a curriculum-style self-training approach for source-free domain adaptive semantic segmentation. Our method yields state-of-the-art performance on source-free semantic segmentation tasks for both synthetic-to-real and adverse conditions.
arXiv Detail & Related papers (2021-06-22T10:21:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.