SMELLNET: A Large-scale Dataset for Real-world Smell Recognition
- URL: http://arxiv.org/abs/2506.00239v1
- Date: Fri, 30 May 2025 21:15:25 GMT
- Authors: Dewei Feng, Carol Li, Wei Dai, Paul Pu Liang
- Abstract summary: We use portable gas and chemical sensors to create SmellNet. SmellNet is the first large-scale database that digitizes a diverse range of smells in the natural world. We train AI models for real-time classification of substances based on their smell alone.
- Score: 24.9959802608091
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability of AI to sense and identify various substances based on their smell alone can have profound impacts on allergen detection (e.g., smelling gluten or peanuts in a cake), monitoring the manufacturing process, and sensing hormones that indicate emotional states, stress levels, and diseases. Despite these broad impacts, there are virtually no large scale benchmarks, and therefore little progress, for training and evaluating AI systems' ability to smell in the real world. In this paper, we use portable gas and chemical sensors to create SmellNet, the first large-scale database that digitizes a diverse range of smells in the natural world. SmellNet contains about 180,000 time steps of 50 substances (spanning nuts, spices, herbs, fruits, and vegetables) with 50 hours of data. Using SmellNet, we train AI models for real-time classification of substances based on their smell alone. Our best methods leverage sequence models, contrastive learning to integrate high-resolution Gas Chromatography-Mass Spectrometry molecular data, and a new temporal difference method that identifies sharp changes in sensor readings. Our best models achieve up to 65.35% accuracy on pre-recorded data, and generalize to real-world conditions with 10.71% accuracy on nuts and 25.38% on spices in the challenging 50-way online classification task. Despite these promising results, SmellNet highlights many technical challenges in building AI for smell, including richer feature learning, on-edge smell models, and robustness to environmental changes.
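The abstract mentions a temporal difference method that identifies sharp changes in sensor readings but does not spell out its details. As a rough illustration only, the sketch below assumes the simplest form of the idea: subtract each reading from the one `lag` steps earlier, then flag time steps where any channel jumps by more than a threshold. The function names, the `lag` and `threshold` parameters, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def temporal_difference(
    readings: Sequence[Sequence[float]], lag: int = 1
) -> List[List[float]]:
    """Return readings[t] - readings[t - lag], per channel, for every valid t.

    readings is a (time_steps x channels) table of sensor values.
    The result has time_steps - lag rows.
    """
    if lag < 1 or lag >= len(readings):
        raise ValueError("lag must be between 1 and time_steps - 1")
    return [
        [now - before for now, before in zip(readings[t], readings[t - lag])]
        for t in range(lag, len(readings))
    ]


def sharp_change_steps(
    readings: Sequence[Sequence[float]], lag: int = 1, threshold: float = 1.0
) -> List[int]:
    """Return original time indices where any channel changes by more than threshold."""
    diffs = temporal_difference(readings, lag)
    return [
        i + lag  # shift back to an index into the original series
        for i, row in enumerate(diffs)
        if max(abs(d) for d in row) > threshold
    ]


# Toy two-channel signal (assumed data): channel 0 steps up at t = 3.
toy = [[0.0, 5.0], [0.1, 5.0], [0.1, 5.1], [2.0, 5.1], [2.1, 5.0]]
print(sharp_change_steps(toy, lag=1, threshold=1.0))
```

In a classification pipeline, the differenced rows could be fed to a sequence model in place of, or alongside, the raw readings. Whether SmellNet does this in the same way is not stated in the abstract.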
Related papers
- Diffusion Models for Increasing Accuracy in Olfaction Sensors and Datasets [0.0]
We introduce a novel machine learning method using diffusion-based molecular generation to enhance odour localization accuracy. Our framework enhances the ability of olfaction-vision models on robots to accurately associate odours with their correct sources.
arXiv Detail & Related papers (2025-05-31T08:22:09Z)
- Navigating the Fragrance space Via Graph Generative Models And Predicting Odors [0.2749898166276853]
We explore a suite of generative modelling techniques to efficiently navigate and explore the complex landscapes of odor and the broader chemical space. Unlike traditional approaches, we not only generate molecules but also predict the odor likeliness with ROC AUC score of 0.97 and assign probable odor labels.
arXiv Detail & Related papers (2025-01-30T22:00:23Z)
- Machine Learning for Methane Detection and Quantification from Space -- A survey [49.7996292123687]
Methane (CH_4) is a potent anthropogenic greenhouse gas, contributing 86 times more to global warming than Carbon Dioxide (CO_2) over 20 years.
This work expands existing information on operational methane point source detection sensors in the Short-Wave Infrared (SWIR) bands.
It reviews the state-of-the-art for traditional as well as Machine Learning (ML) approaches.
arXiv Detail & Related papers (2024-08-27T15:03:20Z)
- Data Science In Olfaction [1.4499463058550683]
We conceptualize smell from a Data Science and AI perspective, that relates the properties of odorants to how they are sensed and analyzed in the olfactory system from the nose to the brain.
Drawing distinctions to color vision, we argue that smell presents unique measurement challenges, including the complexity of stimuli, the high dimensionality of the sensory apparatus, as well as what constitutes ground truth.
We present results using machine learning-based classification of neural responses to odors as they are recorded in the mouse olfactory bulb with calcium imaging.
arXiv Detail & Related papers (2024-04-08T13:25:02Z)
- Olfactory Label Prediction on Aroma-Chemical Pairs [0.2749898166276853]
We present graph neural network models capable of accurately predicting the odor qualities arising from blends of aroma-chemicals.
In this paper, we apply both existing and novel approaches to a dataset we gathered consisting of labeled pairs of molecules.
arXiv Detail & Related papers (2023-12-26T17:18:09Z)
- Limitations in odour recognition and generalisation in a neuromorphic olfactory circuit [0.07589017023705934]
We present an odour-learning algorithm that runs on a neuromorphic architecture and is inspired by circuits described in the mammalian olfactory bulb.
They assess the algorithm's performance in "rapid online learning and identification" of gaseous odorants and odorless gases.
arXiv Detail & Related papers (2023-09-20T18:00:05Z)
- Fast and Functional Structured Data Generators Rooted in Out-of-Equilibrium Physics [44.97217246897902]
We address the challenge of using energy-based models to produce high-quality, label-specific data in structured datasets.
Traditional training methods encounter difficulties due to inefficient Markov chain Monte Carlo mixing.
We use a novel training algorithm that exploits non-equilibrium effects.
arXiv Detail & Related papers (2023-07-13T15:08:44Z)
- ChemVise: Maximizing Out-of-Distribution Chemical Detection with the Novel Application of Zero-Shot Learning [60.02503434201552]
This research proposes learning approximations of complex exposures from training sets of simple ones.
We demonstrate that applying this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes.
arXiv Detail & Related papers (2023-02-09T20:19:57Z)
- Graph Neural Networks with Trainable Adjacency Matrices for Fault Diagnosis on Multivariate Sensor Data [69.25738064847175]
It is necessary to consider the behavior of the signals in each sensor separately, to take into account their correlation and hidden relationships with each other.
The graph nodes can be represented as data from the different sensors, and the edges can display the influence of these data on each other.
The authors propose constructing the graph during the training of the graph neural network. This allows models to be trained on data where the dependencies between the sensors are not known in advance.
arXiv Detail & Related papers (2022-10-20T11:03:21Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Real-time detection of uncalibrated sensors using Neural Networks [62.997667081978825]
An online machine-learning based uncalibration detector for temperature, humidity and pressure sensors was developed.
The solution integrates an Artificial Neural Network as main component which learns from the behavior of the sensors under calibrated conditions.
The obtained results show that the proposed solution is able to detect uncalibrations for deviation values of 0.25 degrees, 1% RH and 1.5 Pa, respectively.
arXiv Detail & Related papers (2021-02-02T15:44:39Z)
- Unassisted Noise Reduction of Chemical Reaction Data Sets [59.127921057012564]
We propose a machine learning-based, unassisted approach to remove chemically wrong entries from data sets.
Our results show an improved prediction quality for models trained on the cleaned and balanced data sets.
arXiv Detail & Related papers (2021-02-02T09:34:34Z)