Cost of Dietary Data Acquisition with Smart Group Catering
- URL: http://arxiv.org/abs/2001.00367v1
- Date: Thu, 2 Jan 2020 09:25:57 GMT
- Title: Cost of Dietary Data Acquisition with Smart Group Catering
- Authors: Jiapeng Dong and Pengju Wang and Weiqiang Sun
- Abstract summary: The need for dietary data management is growing with public awareness of food intakes.
As human labor is involved in both cases, manpower allocation is critical to data quality.
This paper has studied the relation between the quality of dietary data and the manpower invested.
- Score: 4.511923587827301
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The need for dietary data management is growing with public awareness of food
intakes. As a result, there are increasing deployments of smart canteens where
dietary data is collected through either Radio Frequency Identification (RFID)
or Computer Vision (CV)-based solutions. As human labor is involved in both
cases, manpower allocation is critical to data quality. Where manpower
requirements are underestimated, data quality is compromised. This paper has
studied the relation between the quality of dietary data and the manpower
invested, using numerical simulations based on real data collected from
multiple smart canteens. We found that in both RFID and CV-based systems, the
long-term cost of dietary data acquisition is dominated by manpower. Our study
provides a comprehensive understanding of the cost composition for dietary data
acquisition and useful insights toward future cost-effective systems.
Related papers
- Economics of Sourcing Human Data [27.26816810619047]
We argue that the widespread use of large language models threatens the quality and integrity of human-generated data.
Existing data collection systems prioritize speed, scale, and efficiency at the cost of intrinsic human motivation.
We propose rethinking data collection systems to align with contributors' intrinsic motivations.
arXiv Detail & Related papers (2025-02-11T17:51:52Z)
- A monthly sub-national Harmonized Food Insecurity Dataset for comprehensive analysis and predictive modeling [0.11292693568898363]
This paper introduces the Harmonized Food Insecurity dataset (HFID), an open-source resource consolidating four key data sources.
The HFID serves as a vital tool for food security experts and humanitarian agencies, providing a unified resource for analyzing food security conditions.
The scientific community can also leverage the HFID to develop data-driven predictive models, enhancing the capacity to forecast and prevent future food crises.
arXiv Detail & Related papers (2025-01-10T16:13:57Z)
- RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models [96.43285670458803]
Uni-Food is a unified food dataset that comprises over 100,000 images with various food labels.
Uni-Food is designed to provide a more holistic approach to food data analysis.
We introduce a novel Linear Rectification Mixture of Diverse Experts (RoDE) approach to address the inherent challenges of food-related multitasking.
arXiv Detail & Related papers (2024-07-17T16:49:34Z)
- Copycats: the many lives of a publicly available medical imaging dataset [12.98380178359767]
Medical Imaging (MI) datasets are fundamental to artificial intelligence in healthcare.
MI datasets used to be proprietary, but have become increasingly available to the public, including on community-contributed platforms (CCPs) like Kaggle or HuggingFace.
While open data is important to enhance the redistribution of data's public value, we find that the current CCP governance model fails to uphold the quality needed and recommended practices for sharing, documenting, and evaluating datasets.
arXiv Detail & Related papers (2024-02-09T12:01:22Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches [59.38343165508926]
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating.
Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images.
We introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 synthetic 2D food images with associated dietary information.
We also collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism.
arXiv Detail & Related papers (2023-09-14T13:29:41Z)
- rWISDM: Repaired WISDM, a Public Dataset for Human Activity Recognition [0.0]
Human Activity Recognition (HAR) has come into the spotlight in recent scientific research because of its applications in domains such as healthcare, athletic competitions, smart cities, and smart homes.
This paper presents the methods by which other researchers may identify and correct similar problems in public datasets.
arXiv Detail & Related papers (2023-05-17T13:55:50Z)
- How Much More Data Do I Need? Estimating Requirements for Downstream Tasks [99.44608160188905]
Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance?
Overestimating or underestimating data requirements incurs substantial costs that could be avoided with an adequate budget.
Using our guidelines, practitioners can accurately estimate data requirements of machine learning systems to gain savings in both development time and data acquisition costs.
arXiv Detail & Related papers (2022-07-04T21:16:05Z)
- Data Smells in Public Datasets [7.1460275491017144]
We introduce a novel catalogue of data smells that can be used to indicate early signs of problems in machine learning systems.
To understand the prevalence of data quality issues in datasets, we analyse 25 public datasets and identify 14 data smells.
arXiv Detail & Related papers (2022-03-15T15:44:20Z)
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.