Can x2vec Save Lives? Integrating Graph and Language Embeddings for
Automatic Mental Health Classification
- URL: http://arxiv.org/abs/2001.01126v1
- Date: Sat, 4 Jan 2020 20:56:21 GMT
- Authors: Alexander Ruch
- Abstract summary: I show how merging graph and language embedding models (metapath2vec and doc2vec) avoids resource limits.
When integrated, both data produce highly accurate predictions (90%, with 10% false-positives and 12% false-negatives).
These results extend research on the importance of simultaneously analyzing behavior and language in massive networks.
- Score: 91.3755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph and language embedding models are becoming commonplace in large scale
analyses given their ability to represent complex sparse data densely in
low-dimensional space. Integrating these models' complementary relational and
communicative data may be especially helpful if predicting rare events or
classifying members of hidden populations - tasks requiring huge and sparse
datasets for generalizable analyses. For example, due to social stigma and
comorbidities, mental health support groups often form in amorphous online
groups. Predicting suicidality among individuals in these settings using
standard network analyses is prohibitive due to resource limits (e.g., memory),
and adding auxiliary data like text to such models exacerbates complexity- and
sparsity-related issues. Here, I show how merging graph and language embedding
models (metapath2vec and doc2vec) avoids these limits and extracts unsupervised
clustering data without domain expertise or feature engineering. Graph and
language distances to a suicide support group have little correlation (ρ <
0.23), implying the two models are not embedding redundant information. When
used separately to predict suicidality among individuals, graph and language
data generate relatively accurate results (69% and 76%, respectively); however,
when integrated, both data produce highly accurate predictions (90%, with 10%
false-positives and 12% false-negatives). Visualizing graph embeddings
annotated with predictions of potentially suicidal individuals shows the
integrated model could classify such individuals even if they are positioned
far from the support group. These results extend research on the importance of
simultaneously analyzing behavior and language in massive networks and efforts
to integrate embedding models for different kinds of data when predicting and
classifying, particularly when they involve rare events.
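
The workflow described in the abstract can be illustrated with a minimal Python sketch; it is not the author's implementation. It assumes each user already has a graph vector (e.g., from metapath2vec) and a text vector (e.g., from doc2vec), that integration is done by plain concatenation, and that the false-positive and false-negative figures are rates over actual negatives and actual positives, respectively. The abstract does not specify any of these details. The helper names (`concat`, `ranks`, `spearman`, `error_rates`) and all toy numbers are hypothetical.

```python
import math


def concat(graph_vec, text_vec):
    """Integrate two embeddings by concatenation (an assumed strategy)."""
    return list(graph_vec) + list(text_vec)


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    # Raises ZeroDivisionError if either input is constant.
    return cov / (sx * sy)


def error_rates(y_true, y_pred):
    """Accuracy, false-positive rate, and false-negative rate for 0/1 labels.

    Assumes both classes occur in y_true; otherwise a rate is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


if __name__ == "__main__":
    # Toy distances of five hypothetical users to the support group.
    graph_dist = [0.2, 0.9, 0.4, 0.7, 0.1]
    text_dist = [0.5, 0.3, 0.8, 0.6, 0.4]
    print("rho:", spearman(graph_dist, text_dist))

    # One integrated feature vector for a hypothetical user.
    print("integrated vector:", concat([0.1, 0.3], [0.7, 0.2, 0.5]))

    # Toy labels and predictions (1 = flagged as potentially suicidal).
    print(error_rates([1, 1, 0, 0], [1, 0, 0, 1]))
```

A low rank correlation between the two distance lists would suggest, as the abstract argues, that the graph and language models capture different information, which is the motivation for feeding a classifier the integrated vector rather than either embedding alone.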
Related papers
- GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language
Models [33.56759621666477]
We present a benchmark dataset for evaluating the integration of graph knowledge into language models.
The proposed dataset is designed to evaluate graph-language models' ability to understand graphs and make use of them for answer generation.
We perform experiments with language-only models and the proposed graph-language model to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
arXiv Detail & Related papers (2023-10-12T16:46:58Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Neural Graph Revealers [2.2721854258621064]
We propose Neural Graph Revealers (NGRs) to efficiently merge sparse graph recovery methods with Probabilistic Graphical Models.
NGRs view the neural networks as a 'glass box' or, more specifically, as a multitask learning framework.
We show experimental results for sparse graph recovery and probabilistic inference on data from Gaussian graphical models and a multimodal infant mortality dataset from the Centers for Disease Control and Prevention.
arXiv Detail & Related papers (2023-02-27T08:40:45Z)
- A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information for the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
- Graph-in-Graph (GiG): Learning interpretable latent graphs in
non-Euclidean domain for biological and healthcare applications [52.65389473899139]
Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain.
Recent works have shown that considering relationships between input data samples has a positive regularizing effect on the downstream task.
We propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications.
arXiv Detail & Related papers (2022-04-01T10:01:37Z)
- Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases.
Our method trains a lower capacity model in an ensemble with a higher capacity model.
We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
- Multilayer Clustered Graph Learning [66.94201299553336]
We use contrastive loss as a data fidelity term, in order to properly aggregate the observed layers into a representative graph.
Experiments show that our method leads to a clustered clusters w.r.t.
We learn a clustering algorithm for solving clustering problems.
arXiv Detail & Related papers (2020-10-29T09:58:02Z)
- Edge-variational Graph Convolutional Networks for Uncertainty-aware
Disease Prediction [7.6146285961466]
We propose a generalizable framework that can automatically integrate imaging data with non-imaging data in populations for uncertainty-aware disease prediction.
Experimental results on four databases show that our method can consistently and significantly improve the diagnostic accuracy for Autism spectrum disorder, Alzheimer's disease, and ocular diseases.
arXiv Detail & Related papers (2020-09-06T15:53:17Z)
- Generative Compositional Augmentations for Scene Graph Prediction [27.535630110794855]
Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language.
We consider a challenging problem of compositional generalization that emerges in this task due to a long tail data distribution.
We propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs.
arXiv Detail & Related papers (2020-07-11T12:11:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.