Related papers: Machine Learning-Based Estimation and Goodness-of-Fit for Large-Scale Confirmatory Item Factor Analysis

URL: http://arxiv.org/abs/2109.09500v1
Date: Mon, 20 Sep 2021 12:53:01 GMT
Title: Machine Learning-Based Estimation and Goodness-of-Fit for Large-Scale Confirmatory Item Factor Analysis
Authors: Christopher J. Urban and Daniel J. Bauer
Abstract summary: We investigate novel parameter estimation and goodness-of-fit (GOF) assessment methods for large-scale item factor analysis (IFA). For parameter estimation, we extend Urban and Bauer's (2021) deep learning algorithm for exploratory IFA to the confirmatory setting. For GOF assessment, we explore new simulation-based tests and indices.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We investigate novel parameter estimation and goodness-of-fit (GOF) assessment methods for large-scale confirmatory item factor analysis (IFA) with many respondents, items, and latent factors. For parameter estimation, we extend Urban and Bauer's (2021) deep learning algorithm for exploratory IFA to the confirmatory setting by showing how to handle user-defined constraints on loadings and factor correlations. For GOF assessment, we explore new simulation-based tests and indices. In particular, we consider extensions of the classifier two-sample test (C2ST), a method that tests whether a machine learning classifier can distinguish between observed data and synthetic data sampled from a fitted IFA model. The C2ST provides a flexible framework that integrates overall model fit, piece-wise fit, and person fit. Proposed extensions include a C2ST-based test of approximate fit in which the user specifies what percentage of observed data can be distinguished from synthetic data as well as a C2ST-based relative fit index that is similar in spirit to the relative fit indices used in structural equation modeling. Via simulation studies, we first show that the confirmatory extension of Urban and Bauer's (2021) algorithm produces more accurate parameter estimates as the sample size increases and obtains comparable estimates to a state-of-the-art confirmatory IFA estimation procedure in less time. We next show that the C2ST-based test of approximate fit controls the empirical type I error rate and detects when the number of latent factors is misspecified. Finally, we empirically investigate how the sampling distribution of the C2ST-based relative fit index depends on the sample size.
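The classifier two-sample test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the normal approximation to the null distribution of test accuracy, the `delta` tolerance used to mimic an approximate-fit test, and the helper `simulate_items` (a stand-in for sampling synthetic responses from a fitted IFA model) are all assumptions made for illustration.

```python
import math
import random


def simulate_items(n, p, n_items, rng):
    """Hypothetical stand-in for sampling binary item responses from a model."""
    return [[1 if rng.random() < p else 0 for _ in range(n_items)] for _ in range(n)]


def nearest_centroid_fit(rows, labels):
    """Return the per-class mean vector (a deliberately simple classifier)."""
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def nearest_centroid_predict(centroids, row):
    """Predict the class whose centroid is closest in squared distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def c2st(observed, synthetic, delta=0.0, seed=0):
    """Classifier two-sample test (sketch).

    Labels observed rows 1 and synthetic rows 0, trains on one half,
    and tests whether held-out accuracy exceeds 0.5 + delta.
    delta = 0 gives an exact-fit test; delta > 0 is an assumed way to
    express a tolerated level of distinguishability.
    """
    rng = random.Random(seed)
    data = [(r, 1) for r in observed] + [(r, 0) for r in synthetic]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]
    centroids = nearest_centroid_fit([r for r, _ in train], [y for _, y in train])
    correct = sum(nearest_centroid_predict(centroids, r) == y for r, y in test)
    acc = correct / len(test)
    # One-sided z-test: under the null, accuracy is roughly normal around null_acc.
    null_acc = 0.5 + delta
    se = math.sqrt(null_acc * (1.0 - null_acc) / len(test))
    z = (acc - null_acc) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return acc, p_value
```

In this sketch, a held-out accuracy near 0.5 means the classifier cannot tell observed from synthetic responses, which is consistent with adequate fit; an accuracy well above 0.5 + `delta` yields a small p-value and signals misfit. In practice a more flexible classifier (for example a neural network) would replace the nearest-centroid rule.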

Related papers

Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators [15.294324192338484]
The Area Under the Risk-Coverage Curve (AURC) has emerged as the foremost evaluation metric for assessing the performance of selective classifier (SC) systems. We derive empirical AURC plug-in estimators for finite sample scenarios. We empirically validate the effectiveness of our estimators through experiments across multiple datasets.
arXiv Detail & Related papers (2024-10-20T11:14:51Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours. We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length. This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
L-C2ST: Local Diagnostics for Posterior Approximations in Simulation-Based Inference [63.22081662149488]
L-C2ST allows for a local evaluation of the posterior estimator at any given observation. It offers theoretically grounded and easy-to-interpret diagnostics. On standard SBI benchmarks, L-C2ST provides comparable results to C2ST and outperforms alternative local approaches.
arXiv Detail & Related papers (2023-06-06T10:53:26Z)
ALMERIA: Boosting pairwise molecular contrasts with scalable methods [0.0]
ALMERIA is a tool for estimating compound similarities and activity prediction based on pairwise molecular contrasts. It has been implemented using scalable software and methods to exploit large volumes of data. Experiments show state-of-the-art performance for molecular activity prediction.
arXiv Detail & Related papers (2023-04-28T16:27:06Z)
Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
Computational Efficient Approximations of the Concordance Probability in a Big Data Setting [0.0]
We propose two estimation methods that calculate the concordance probability in a fast and accurate way. Experiments on two real-life data sets confirm the conclusions of the artificial simulations.
arXiv Detail & Related papers (2021-05-21T15:09:53Z)
Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions [57.77347280992548]
In this paper, we design two-sample tests for pairwise comparison data and ranking data. Our test requires essentially no assumptions on the distributions. By applying our two-sample test on real-world pairwise comparison data, we conclude that ratings and rankings provided by people are indeed distributed differently.
arXiv Detail & Related papers (2020-06-21T20:51:09Z)
Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties. We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE). When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.