Cohort Characteristics and Factors Associated with Cannabis Use among
Adolescents in Canada Using Pattern Discovery and Disentanglement Method
- URL: http://arxiv.org/abs/2109.01739v1
- Date: Fri, 3 Sep 2021 21:53:14 GMT
- Title: Cohort Characteristics and Factors Associated with Cannabis Use among
Adolescents in Canada Using Pattern Discovery and Disentanglement Method
- Authors: Peiyuan Zhou, Andrew K.C. Wong, Yang Yang, Scott T. Leatherdale, Kate
Battista, Zahid A. Butt, George Michalopoulos, Helen Chen
- Abstract summary: We aim to discover significant frequent and rare associations among behavioral factors related to cannabis use in Canadian adolescents.
We use a subset of the COMPASS dataset, which contains 18,761 records of students in grades 9 to 12 with 31 selected features (attributes) covering various characteristics.
We then use the Pattern Discovery and Disentanglement (PDD) algorithm to detect strong and rare (yet statistically significant) associations in the dataset.
- Score: 3.77251737220697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: COMPASS is a longitudinal, prospective cohort study collecting data annually
from students attending high school in jurisdictions across Canada. We aimed to
discover significant frequent/rare associations of behavioral factors among
Canadian adolescents related to cannabis use. We used a subset of the COMPASS
dataset, which contains 18,761 records of students in grades 9 to 12 with 31
selected features (attributes) involving various characteristics, from living
habits to academic performance. We then used the Pattern Discovery and
Disentanglement (PDD) algorithm that we have developed to detect strong and
rare (yet statistically significant) associations from the dataset. PDD used
criteria derived from disentangled statistical spaces (known as
Re-projected Adjusted-Standardized Residual Vector Spaces, denoted RARV). It
outperformed methods that use other criteria popular in the literature
(i.e., support and confidence). Association results showed that PDD can
discover: i) a smaller set of succinct, significant associations in clusters;
ii) frequent and rare, yet significant, patterns supported by relevant
population health studies; iii) patterns from a dataset with extremely
imbalanced groups (majority class : minority class = 88.3% : 11.7%).
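The abstract contrasts residual-based criteria with support and confidence. The sketch below is not the PDD algorithm and does not implement RARV; it only computes the classical adjusted standardized residual of a single contingency-table cell alongside support and confidence, to illustrate why a rare pattern can still be statistically significant. The function name and all counts are hypothetical and chosen for illustration only; the 11.7% minority share is the only figure taken from the abstract.

```python
import math


def association_stats(n_ab, n_a, n_b, n):
    """Support, confidence, and adjusted standardized residual for the
    co-occurrence of attribute value A with class value B in n records.

    n_ab: records with both A and B; n_a: records with A; n_b: records with B.
    """
    support = n_ab / n          # fraction of all records showing A and B together
    confidence = n_ab / n_a     # fraction of A-records that also show B
    expected = n_a * n_b / n    # co-occurrence count expected under independence
    # Classical adjusted residual: deviation from independence, scaled so that
    # it is approximately standard normal when A and B are independent.
    variance = expected * (1 - n_a / n) * (1 - n_b / n)
    adj_residual = (n_ab - expected) / math.sqrt(variance)
    return support, confidence, adj_residual


# Hypothetical counts: 10,000 records, a minority class of 11.7% (1,170 records),
# and a rare attribute value seen in only 200 records, 80 of them in the minority class.
support, confidence, residual = association_stats(n_ab=80, n_a=200, n_b=1170, n=10000)
print(f"support={support:.4f} confidence={confidence:.2f} adjusted residual={residual:.2f}")
```

With these made-up counts the support is below 1%, so a minimum-support threshold would likely discard the pattern, while the adjusted residual lies far above the conventional 1.96 cutoff for significance at the 5% level.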
Related papers
- Forecasting Mortality in the Middle-Aged and Older Population of England: A 1D-CNN Approach [0.0]
This study considers the English Longitudinal Study of Ageing (ELSA) survey, conducted every two years.
We use one-dimensional convolutional neural networks (1D-CNNs) to forecast mortality using socio-demographics, diseases, mobility impairment, Activities of Daily Living (ADLs) and Instrumental Activities of Daily Living (IADLs).
As our dataset is highly imbalanced, we try different over- and undersampling methods and find that over-representing the small class improves the results.
arXiv Detail & Related papers (2024-11-01T02:20:19Z)
- Machine Learning Predicts Upper Secondary Education Dropout as Early as the End of Primary School [0.0]
This study expanded the modeling horizon by utilizing a 13-year longitudinal dataset, encompassing data from kindergarten to Grade 9.
Our methodology incorporated a comprehensive range of parameters, including students' academic and cognitive skills, motivation, behavior, well-being, and officially recorded dropout data.
The machine learning models developed in this study demonstrated notable classification ability, achieving a mean area under the curve (AUC) of 0.61 with data up to Grade 6 and an improved AUC of 0.65 with data up to Grade 9.
arXiv Detail & Related papers (2024-03-01T13:18:08Z)
- What's Race Got to do with it? Predicting Youth Depression Across Racial Groups Using Machine and Deep Learning [0.0]
Depression is a common yet serious mental disorder that affects millions of U.S. high schoolers every year.
This study proposes a similar approach, utilizing machine learning (ML) and artificial neural network (ANN) models to classify depression in a student.
arXiv Detail & Related papers (2023-08-21T13:59:50Z)
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the light-weight active learner which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- Evaluation of data imputation strategies in complex, deeply-phenotyped data sets: the case of the EU-AIMS Longitudinal European Autism Project [0.0]
We evaluate different imputation strategies to fill in missing values in clinical data from a large (total N=764) dataset.
We consider a total of 160 clinical measures divided into 15 overlapping subsets of participants.
arXiv Detail & Related papers (2022-01-20T21:50:38Z)
- Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Multi-characteristic Subject Selection from Biased Datasets [79.82881947891589]
We present a constrained optimization-based method that finds the best possible sampling fractions for the different population subgroups.
Our results show that our proposed method outperforms the baselines for all problem variations by up to 90%.
arXiv Detail & Related papers (2020-12-18T15:55:27Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
In particular, face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)