MINT-Demo: Membership Inference Test Demonstrator
- URL: http://arxiv.org/abs/2503.08332v1
- Date: Tue, 11 Mar 2025 11:45:05 GMT
- Title: MINT-Demo: Membership Inference Test Demonstrator
- Authors: Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana, Ruben Vera-Rodriguez,
- Abstract summary: MINT is a technique for experimentally determining whether certain data has been used during the training of machine learning models. We conduct experiments with popular face recognition models and 5 public databases containing over 22M images.
- Score: 13.795574322456797
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present the Membership Inference Test Demonstrator, to emphasize the need for more transparent machine learning training processes. MINT is a technique for experimentally determining whether certain data has been used during the training of machine learning models. We conduct experiments with popular face recognition models and 5 public databases containing over 22M images. Promising results of up to 89% accuracy are achieved, suggesting that it is possible to recognize if an AI model has been trained with specific data. Finally, we present a MINT platform as a demonstrator of this technology, aimed at promoting transparency in AI training.
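The core MINT idea, auditing whether a sample was in a model's training set by looking at the statistics the model produces on that sample, can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the authors' implementation: a single scalar summary of the activations and a midpoint threshold stand in for the learned MINT auditing networks, and the activation data is simulated.

```python
import random

def activation_statistic(activations):
    """Summarize a sample's activation vector (here: its mean).
    Training samples tend to produce systematically different
    statistics than unseen samples, which is what MINT exploits."""
    return sum(activations) / len(activations)

def fit_threshold(member_stats, nonmember_stats):
    """Pick the midpoint between the two class means as a decision
    threshold (a stand-in for the learned MINT auditing model)."""
    m = sum(member_stats) / len(member_stats)
    n = sum(nonmember_stats) / len(nonmember_stats)
    return (m + n) / 2, m > n

def predict_member(stat, threshold, members_above):
    """Classify a sample as 'used in training' or not."""
    return stat > threshold if members_above else stat < threshold

# Toy data: training members' activations are drawn with a slightly
# higher mean than non-members', mimicking a memorization signal.
random.seed(0)
members = [[random.gauss(0.6, 0.2) for _ in range(64)] for _ in range(200)]
nonmembers = [[random.gauss(0.4, 0.2) for _ in range(64)] for _ in range(200)]

thr, above = fit_threshold([activation_statistic(a) for a in members],
                           [activation_statistic(a) for a in nonmembers])
correct = sum(predict_member(activation_statistic(a), thr, above) for a in members)
correct += sum(not predict_member(activation_statistic(a), thr, above) for a in nonmembers)
accuracy = correct / 400
```

With a real audited model, the simulated Gaussian activations would be replaced by activations extracted from the model on known member and non-member samples.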
Related papers
- Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs [14.618008816273784]
MINT is a general approach to determine if given data was used for training machine learning models.
This work focuses on its application to the domain of Natural Language Processing.
arXiv Detail & Related papers (2025-03-10T14:32:56Z)
- Curating Demonstrations using Online Experience [52.59275477573012]
We show that Demo-SCORE can effectively identify suboptimal demonstrations without manual curation.
Demo-SCORE achieves over 15-35% higher absolute success rate in the resulting policy compared to the base policy trained with all original demonstrations.
arXiv Detail & Related papers (2025-03-05T17:58:16Z)
- Is my Data in your AI Model? Membership Inference Test with Application to Face Images [18.402616111394842]
This article introduces the Membership Inference Test (MINT), a novel approach that aims to empirically assess if given data was used during the training of AI/ML models.
We propose two MINT architectures designed to learn the distinct activation patterns that emerge when an Audited Model is exposed to data used during its training process.
Experiments are carried out using six publicly available databases, comprising over 22 million face images in total.
arXiv Detail & Related papers (2024-02-14T15:09:01Z)
- Development of an NLP-driven computer-based test guide for visually impaired students [0.28647133890966986]
This paper presents an NLP-driven Computer-Based Test guide for visually impaired students.
It employs pre-trained speech technology methods to provide real-time assistance and support to visually impaired students.
arXiv Detail & Related papers (2024-01-22T21:59:00Z)
- Exploring Large-scale Unlabeled Faces to Enhance Facial Expression Recognition [12.677143408225167]
We propose a semi-supervised learning framework that utilizes unlabeled face data to train expression recognition models effectively.
Our method uses a dynamic threshold module that can adaptively adjust the confidence threshold to fully utilize the face recognition data.
In the ABAW5 EXPR task, our method achieved excellent results on the official validation set.
arXiv Detail & Related papers (2023-03-15T13:43:06Z)
- Leveraging Demonstrations to Improve Online Learning: Quality Matters [54.98983862640944]
We show that the degree of improvement must depend on the quality of the demonstration data.
We propose an informed TS algorithm that utilizes the demonstration data in a coherent way through Bayes' rule.
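An informed Thompson Sampling scheme of this kind can be sketched for a Bernoulli bandit, where demonstration outcomes are folded into the Beta priors via conjugate Bayes updates. This is a simplified illustration under assumed settings (two arms, unweighted demonstrations), not the paper's actual algorithm:

```python
import random

def informed_ts(true_means, demos, horizon, seed=0):
    """Bernoulli-bandit Thompson Sampling whose Beta priors are
    initialized from demonstration data via Bayes' rule."""
    rng = random.Random(seed)
    k = len(true_means)
    # Start from a uniform Beta(1, 1) prior, then absorb each
    # demonstrated (arm, reward) pair exactly like an observed pull.
    alpha = [1.0] * k
    beta = [1.0] * k
    for arm, reward in demos:
        alpha[arm] += reward
        beta[arm] += 1 - reward
    total = 0
    for _ in range(horizon):
        # Sample a plausible mean per arm and play the best one.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = samples.index(max(samples))
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total += reward
    return total

# Demonstrations that correctly favor arm 1 concentrate the prior
# on the better arm before any online interaction happens.
demos = [(1, 1)] * 20 + [(0, 0)] * 20
reward_informed = informed_ts([0.3, 0.7], demos, horizon=500)
reward_uninformed = informed_ts([0.3, 0.7], [], horizon=500)
```

The paper's point about demonstration quality shows up here directly: demos generated by a poor policy would skew the priors and can make the informed variant worse than the uninformed one.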
arXiv Detail & Related papers (2023-02-07T08:49:12Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 nation's report card data mining competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition [118.73025093045652]
We propose a pre-training model, MEmoBERT, for multimodal emotion recognition.
Unlike the conventional "pre-train, finetune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as a masked text prediction.
Our proposed MEmoBERT significantly enhances emotion recognition performance.
arXiv Detail & Related papers (2021-10-27T09:57:00Z)
- Fairness in the Eyes of the Data: Certifying Machine-Learning Models [38.09830406613629]
We present a framework that allows certifying the fairness degree of a model based on an interactive and privacy-preserving test.
We tackle two scenarios, where either the test data is privately available only to the tester or is publicly known in advance, even to the model creator.
We provide a cryptographic technique to automate fairness testing and certified inference with only black-box access to the model at hand while hiding the participants' sensitive data.
arXiv Detail & Related papers (2020-09-03T09:22:39Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep the resulting dataset manageable, we propose applying a dataset distillation strategy that compresses it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.