Investigation into U.S. Citizen and Non-Citizen Worker Health Insurance and Employment
- URL: http://arxiv.org/abs/2601.00896v1
- Date: Wed, 31 Dec 2025 16:00:34 GMT
- Title: Investigation into U.S. Citizen and Non-Citizen Worker Health Insurance and Employment
- Authors: Annabelle Yao
- Abstract summary: This study uses statistical analysis and machine learning clustering techniques to analyze socioeconomic integration and inequality. Using statistical tests, we identified the proportion of the population with healthcare insurance, quality education, and employment. The five clusters our analysis identifies reveal that while citizenship status shows no association with workforce participation, significant disparities exist in access to employer-sponsored health insurance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Socioeconomic integration is a critical dimension of social equity, yet persistent disparities remain in access to health insurance, education, and employment across different demographic groups. While previous studies have examined isolated aspects of inequality, there is limited research that integrates both statistical analysis and advanced machine learning to uncover hidden structures within population data. This study leverages statistical analysis ($\chi^2$ test of independence and two-proportion z-test) and machine learning clustering techniques -- K-Modes and K-Prototypes -- along with t-SNE visualization and CatBoost classification to analyze socioeconomic integration and inequality. Using statistical tests, we identified the proportion of the population with healthcare insurance, quality education, and employment. With this data, we concluded that there was an association between employment and citizenship status. Moreover, we were able to determine 5 distinct population groups using Machine Learning classification. The five clusters our analysis identifies reveal that while citizenship status shows no association with workforce participation, significant disparities exist in access to employer-sponsored health insurance. Each cluster represents a distinct demographic of the population, showing that there is a primary split along the lines of educational attainment which separates Clusters 0 and 4 from Clusters 1, 2, and 3. Furthermore, labor force status and nativity serve as secondary differentiators. Non-citizens are also disproportionately concentrated in precarious employment without benefits, highlighting systemic inequalities in healthcare access. By uncovering demographic clusters that face compounded disadvantages, this research contributes to a more nuanced understanding of socioeconomic stratification.
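The two statistical tests named in the abstract can be sketched on a 2x2 table as follows. This is a minimal illustration using only the Python standard library, not the author's code: the function names and the counts (insured vs. uninsured workers in two groups) are hypothetical and chosen only to show the mechanics. The chi-square statistic is computed without a continuity correction, and the z-test uses the pooled proportion.

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided two-proportion z-test using the pooled proportion.

    Returns (z, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (NOT from the paper): rows are two groups,
# columns are [insured, uninsured].
table = [[620, 380], [410, 590]]
chi2, p_chi2 = chi2_independence_2x2(table)
z, p_z = two_proportion_z_test(620, 1000, 410, 1000)
```

On a 2x2 table the two tests agree: the uncorrected chi-square statistic equals the square of the pooled z statistic, so both yield the same two-sided p-value.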
Related papers
- An External Fairness Evaluation of LinkedIn Talent Search [55.18656975953939]
We conduct an independent, third-party audit for bias of LinkedIn's Talent Search ranking system.
We focus on potential ranking bias across two attributes: gender and race.
Our analysis reveals an under-representation of minority groups in early ranks.
arXiv Detail & Related papers (2025-11-13T19:10:49Z)
- Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach [53.824673312331626]
Implicit Demography Inference (IDI) module uses k-means clustering to mitigate bias in Speech Emotion Recognition (SER).
Experiments show that pseudo-labeling IDI reduces subgroup disparities, improving fairness metrics by over 28%.
Unsupervised IDI yields more than a 4.6% improvement in fairness metrics with a drop of less than 3.6% in SER performance.
arXiv Detail & Related papers (2025-05-20T14:50:44Z)
- Assessing Racial Disparities in Healthcare Expenditures via Mediator Distribution Shifts [4.357338639836869]
This study develops a framework for decomposing such disparities through shifts in the distributions of mediating variables.
We examine the extent to which expenditure disparities would persist or be reduced if mediators such as insurance access, health behaviors, or health status were equalized across racial groups.
arXiv Detail & Related papers (2025-04-30T14:23:50Z)
- Sample Selection Bias in Machine Learning for Healthcare [17.549969100454803]
We focus on sample selection bias (SSB), a specific type of bias where the study population is less representative of the target population.
Existing machine learning techniques try to correct the bias mostly by balancing distributions between the study and the target populations.
We propose a new research direction for addressing SSB, based on the target population identification rather than the bias correction.
arXiv Detail & Related papers (2024-05-13T15:30:35Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- The Impact of Socioeconomic Factors on Health Disparities [0.0]
We examined data from the US Census and the CDC to determine the degree to which specific socioeconomic factors correlate with both specific and general health metrics.
Our results indicate that certain socioeconomic factors, like income and educational attainment, are highly correlated with aggregate measures of health.
arXiv Detail & Related papers (2022-12-01T00:00:40Z)
- Sociodemographic inequalities in student achievement: An intersectional multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) with application to students in London, England [0.0]
We study sociodemographic inequalities in student achievement across two cohorts of students in London, England.
We find substantial strata-level variation in achievement composed primarily by additive rather than interactive effects.
We conclude that policymakers should pay greater attention to multiply marginalized students.
arXiv Detail & Related papers (2022-11-10T16:16:52Z)
- Fair Machine Learning in Healthcare: A Review [90.22219142430146]
We analyze the intersection of fairness in machine learning and healthcare disparities.
We provide a critical review of the associated fairness metrics from a machine learning standpoint.
We propose several new research directions that hold promise for developing ethical and equitable ML applications in healthcare.
arXiv Detail & Related papers (2022-06-29T04:32:10Z)
- Assessing Social Determinants-Related Performance Bias of Machine Learning Models: A case of Hyperchloremia Prediction in ICU Population [6.8473641147443995]
We evaluated four classifiers built to predict Hyperchloremia, a condition that often results from aggressive fluids administration in the ICU population.
We observed that adding social determinants features in addition to the lab-based ones improved model performance on all patients.
We urge future researchers to design models that proactively adjust for potential biases and include subgroup reporting.
arXiv Detail & Related papers (2021-11-18T03:58:50Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)