Training Machine Learning Models to Characterize Temporal Evolution of
Disadvantaged Communities
- URL: http://arxiv.org/abs/2303.03677v1
- Date: Tue, 7 Mar 2023 06:33:40 GMT
- Title: Training Machine Learning Models to Characterize Temporal Evolution of
Disadvantaged Communities
- Authors: Milan Jain, Narmadha Meenu Mohankumar, Heng Wan, Sumitrra Ganguly,
Kyle D Wilson, and David M Anderson
- Abstract summary: The Justice40 initiative of the Department of Energy (DOE), USA, identifies census tracts across the USA to determine where climate and energy investments are or are not accruing.
The DAC status not only helps in determining the eligibility for future Justice40-related investments but is also critical for exploring ways to achieve equitable distribution of resources.
In this paper, machine learning (ML) models are trained on publicly available census data from recent years to classify DAC status at the census tract level, and the trained model is then used to classify DAC status for historical years.
- Score: 2.1242970730855126
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The disadvantaged communities (DAC) designation, as defined by the Justice40
initiative of the Department of Energy (DOE), USA, identifies census tracts across the USA to
determine where the benefits of climate and energy investments are or are not
currently accruing. DAC status not only helps determine
eligibility for future Justice40-related investments but is also critical for
exploring ways to achieve an equitable distribution of resources. However,
designing inclusive and equitable strategies requires not just a good
understanding of current demographics but also a deeper analysis of how
those demographics have changed over the years. In this
paper, machine learning (ML) models are trained on publicly available census
data from recent years to classify DAC status at the census tract level,
and the trained model is then used to classify DAC status for historical years.
A detailed analysis of the feature and model selection along with the evolution
of disadvantaged communities between 2013 and 2018 is presented in this study.
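The train-on-recent, classify-historical workflow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the feature names, the synthetic tract data, the toy labelling rule, and the nearest-centroid classifier (a simple stand-in, since the abstract does not name the models the authors selected).

```python
import random

# Hypothetical feature names; the paper's actual feature set is not listed here.
FEATURES = ("poverty_rate", "unemployment_rate", "energy_burden")


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_nearest_centroid(rows, labels):
    """Learn one centroid per class (1 = DAC, 0 = non-DAC)."""
    return {
        cls: centroid([r for r, y in zip(rows, labels) if y == cls])
        for cls in set(labels)
    }


def predict(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(model, key=lambda cls: sq_dist(model[cls]))


def make_synthetic_tracts(n, seed):
    """Synthetic stand-in for census data: features in [0, 1] per tract."""
    rng = random.Random(seed)
    return {f"tract_{i:04d}": [rng.random() for _ in FEATURES] for i in range(n)}


def toy_dac_label(row):
    """Toy labelling rule (an assumption, NOT the Justice40 definition)."""
    return int(sum(row) / len(row) > 0.5)


# "Recent" year: features plus known DAC labels.
recent = make_synthetic_tracts(300, seed=0)
recent_labels = {t: toy_dac_label(x) for t, x in recent.items()}

# "Historical" year: same tracts, perturbed features, no labels available.
rng = random.Random(1)
historical = {
    t: [min(1.0, max(0.0, v + rng.uniform(-0.15, 0.15))) for v in x]
    for t, x in recent.items()
}

# Train on the recent year only.
tract_ids = sorted(recent)
model = fit_nearest_centroid(
    [recent[t] for t in tract_ids], [recent_labels[t] for t in tract_ids]
)

train_accuracy = sum(
    predict(model, recent[t]) == recent_labels[t] for t in tract_ids
) / len(tract_ids)

# Apply the trained model to the historical year.
historical_pred = {t: predict(model, historical[t]) for t in tract_ids}

# Temporal evolution: (historical status, recent status) -> number of tracts.
transitions = {}
for t in tract_ids:
    key = (historical_pred[t], recent_labels[t])
    transitions[key] = transitions.get(key, 0) + 1

print(f"training accuracy: {train_accuracy:.2f}")
print("transitions (historical -> recent):", transitions)
```

The `transitions` table counts tracts by their classified historical status and their known recent status, which is one simple way to summarize how DAC status changed between two years. A real analysis would replace the synthetic data with census features and evaluate the model on held-out labelled tracts before applying it to years without labels.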
Related papers
- Transforming Social Science Research with Transfer Learning: Social Science Survey Data Integration with AI [0.4944564023471818]
Large-N nationally representative surveys, which have profoundly shaped American politics scholarship, represent related but distinct domains.
Our study introduces a novel application of transfer learning (TL) to address these gaps.
Models pre-trained on the Cooperative Election Study dataset are fine-tuned for use in the American National Election Studies dataset.
arXiv Detail & Related papers (2025-01-11T16:01:44Z)
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
We introduce VLBiasBench, a benchmark to evaluate biases in Large Vision-Language Models (LVLMs).
VLBiasBench features a dataset that covers nine distinct categories of social biases, including age, disability status, gender, nationality, physical appearance, race, religion, profession, social economic status, as well as two intersectional bias categories: race x gender and race x social economic status.
We conduct extensive evaluations on 15 open-source models as well as two advanced closed-source models, yielding new insights into the biases present in these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- Understanding Intrinsic Socioeconomic Biases in Large Language Models [4.276697874428501]
We introduce a novel dataset of one million English sentences to quantify socioeconomic biases.
Our findings reveal pervasive socioeconomic biases in both established models like GPT-2 and state-of-the-art models like Llama 2 and Falcon.
arXiv Detail & Related papers (2024-05-28T23:54:44Z)
- Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information [50.29934517930506]
DAFair is a novel approach to address social bias in language models.
We leverage prototypical demographic texts and incorporate a regularization term during the fine-tuning process to mitigate bias.
arXiv Detail & Related papers (2024-03-14T15:58:36Z)
- Assessing Generalization for Subpopulation Representative Modeling via In-Context Learning [5.439020425819001]
This study evaluates the ability of Large Language Model (LLM)-based Subpopulation Representative Models (SRMs) to generalize from empirical data.
We explore generalization across response variables and demographic subgroups.
arXiv Detail & Related papers (2024-02-12T01:55:51Z)
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of former knowledge when learning new knowledge.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL.
The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
- Predicting Socio-Economic Well-being Using Mobile Apps Data: A Case Study of France [5.254432021398321]
This work investigates mobile app data to predict socio-economic features.
We present a large-scale study using data that captures traffic of thousands of mobile applications by approximately 30 million users.
Using the app usage patterns, our best model can estimate socio-economic indicators.
arXiv Detail & Related papers (2023-01-15T18:12:16Z)
- Explaining Cross-Domain Recognition with Interpretable Deep Classifier [100.63114424262234]
Interpretable Deep Classifier (IDC) learns the nearest source samples of a target sample as evidence upon which the classifier makes the decision.
Our IDC leads to a more explainable model with almost no accuracy degradation and effectively calibrates classification for optimum reject options.
arXiv Detail & Related papers (2022-11-15T15:58:56Z)
- Estimating a new panel MSK dataset for comparative analyses of national absorptive capacity systems, economic growth, and development in low and middle income economies [0.0]
Low- and middle-income countries (LMICs) are rarely part of any empirical discourse on growth, development, and innovation.
This work offers a new complete panel dataset with no missing values for LMICs eligible for IDA's support.
arXiv Detail & Related papers (2021-09-12T14:48:07Z)
- Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level, to a grid of 300 spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.