Identifying Illicit Drug Dealers on Instagram with Large-scale
Multimodal Data Fusion
- URL: http://arxiv.org/abs/2108.08301v1
- Date: Wed, 18 Aug 2021 04:29:47 GMT
- Authors: Chuanbo Hu, Minglei Yin, Bin Liu, Xin Li, Yanfang Ye
- Abstract summary: Illicit drug trafficking via social media sites such as Instagram has become a severe problem.
How to identify illicit drug dealers from social media data has remained a technical challenge.
We propose to tackle the problem of illicit drug dealer identification by constructing a large-scale multimodal dataset.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Illicit drug trafficking via social media sites such as Instagram has become
a severe problem, thus drawing a great deal of attention from law enforcement
and public health agencies. How to identify illicit drug dealers from social
media data has remained a technical challenge due to the following reasons. On
the one hand, the available data are limited because of privacy concerns with
crawling social media sites; on the other hand, the diversity of drug dealing
patterns makes it difficult to reliably distinguish drug dealers from common
drug users. Unlike existing methods that focus on posting-based detection, we
propose to tackle the problem of illicit drug dealer identification by
constructing a large-scale multimodal dataset named Identifying Drug Dealers on
Instagram (IDDIG). In total, nearly 4,000 user accounts, of which over 1,400 are
drug dealers, have been collected from Instagram with multiple data sources
including post comments, post images, homepage bio, and homepage images. We
then design a quadruple-based multimodal fusion method to combine the multiple
data sources associated with each user account for drug dealer identification.
Experimental results on the constructed IDDIG dataset demonstrate the
effectiveness of the proposed method in identifying drug dealers (almost 95%
accuracy). Moreover, we have developed a hashtag-based community detection
technique for discovering evolving patterns, especially those related to
geography and drug types.
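The abstract does not describe how the hashtag-based community detection works. As a rough illustration only, the sketch below links hashtags that co-occur in posts and treats connected components of that graph as communities. This is a simple stand-in, not the authors' technique; the function names, the `min_count` threshold, and the sample hashtags are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(posts, min_count=2):
    """Link two hashtags when they appear together in at least min_count posts."""
    counts = defaultdict(int)
    for tags in posts:
        # Sort so each unordered pair is always counted under the same key.
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    graph = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def communities(graph):
    """Return the connected components of the graph, largest first."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first search from `start`
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


# Made-up hashtag lists, one per post (placeholders, not real data).
posts = [
    ["#tagA", "#tagB", "#city1"],
    ["#tagA", "#tagB"],
    ["#tagB", "#city1"],
    ["#tagC", "#city2"],
    ["#tagC", "#city2"],
    ["#tagA", "#tagC"],
]
groups = communities(cooccurrence_graph(posts))
```

With `min_count=2`, pairs seen only once (such as `#tagA` with `#tagC`) do not form an edge, so the sample splits into two groups. Tracking how such groups change across time windows would be one way to look for evolving patterns, though a dedicated community detection algorithm would likely be needed on real data.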
Related papers
- Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training [51.87027943520492]
We present a novel paradigm Diffusion-ReID to efficiently augment and generate diverse images based on known identities.
Benefiting from our proposed paradigm, we first create a new large-scale person Re-ID dataset Diff-Person, which consists of over 777K images from 5,183 identities.
arXiv Detail & Related papers (2024-06-10T06:26:03Z)
- Detection of Opioid Users from Reddit Posts via an Attention-based Bidirectional Recurrent Neural Network [11.491225833044021]
We take advantage of recent advances in machine learning to identify opioid users on Reddit.
Posts from more than 1,000 users who have posted on three subreddits over a period of one month have been collected.
We apply an attention-based bidirectional long short-term memory (LSTM) model to identify opioid users.
arXiv Detail & Related papers (2024-02-09T22:12:20Z)
- Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts [26.161892748901252]
We present a corpus of 2500 opioid-related posts from various subreddits labeled with six different phases of opioid use.
For every post, we annotate span-level explanations and crucially study their role both in annotation quality and model development.
arXiv Detail & Related papers (2023-11-15T16:05:55Z)
- Unveiling the Potential of Knowledge-Prompted ChatGPT for Enhancing Drug
Trafficking Detection on Social Media [30.791563171321062]
We propose an analytical framework to compose knowledge-informed prompts, which serve as the interface through which humans interact with and use LLMs to perform the detection task.
Our experimental findings demonstrate that the proposed framework outperforms other baseline language models in terms of drug trafficking detection accuracy.
The implications of our research extend to social networks, emphasizing the importance of incorporating prior knowledge and scenario-based prompts into analytical tools to improve online security and public safety.
arXiv Detail & Related papers (2023-07-07T16:15:59Z)
- Neural Bandits for Data Mining: Searching for Dangerous Polypharmacy [63.135687276599114]
Some polypharmacies, deemed inappropriate, may be associated with adverse health outcomes such as death or hospitalization.
We propose the OptimNeuralTS strategy to efficiently mine claims datasets and build a predictive model of the association between drug combinations and health outcomes.
Our method can detect up to 72% of potentially inappropriate polypharmacies (PIPs) while maintaining an average precision score of 99% using 30 000 time steps.
arXiv Detail & Related papers (2022-12-10T03:43:23Z)
- Knowledge-Driven New Drug Recommendation [88.35607943144261]
We develop a drug-dependent multi-phenotype few-shot learner to bridge the gap between existing and new drugs.
EDGE eliminates the false-negative supervision signal using an external drug-disease knowledge base.
Results show that EDGE achieves 7.3% improvement on the ROC-AUC score over the best baseline.
arXiv Detail & Related papers (2022-10-11T16:07:52Z)
- An Integrated System of Drug Matching and Abnormal Approval Number
Correction [0.0]
This paper creates an integrated system for matching drug products from two data sources.
Our integrated system achieves 98.3% drug matching accuracy, with 99.2% precision and 97.5% recall.
arXiv Detail & Related papers (2022-07-01T11:19:50Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity
Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Detection of Illicit Drug Trafficking Events on Instagram: A Deep
Multimodal Multilabel Learning Approach [18.223055392013542]
We conduct the first systematic study on fine-grained detection of illicit drug trafficking events (IDTEs) on Instagram.
Specifically, our model takes text and image data as the input and combines multimodal information to predict multiple labels of illicit drugs.
We have constructed a large-scale dataset MM-IDTE with manually annotated multiple drug labels to support fine-grained detection of illicit drugs.
arXiv Detail & Related papers (2021-08-19T21:16:21Z)
- Two Step Joint Model for Drug Drug Interaction Extraction [82.49278654043577]
Drug-Drug Interaction (DDI) Extraction from Drug Labels challenge of Text Analysis Conference (TAC) 2018.
We propose a two-step joint model to detect DDI and its related mentions jointly.
A sequence tagging system (CNN-GRU encoder-decoder) first finds precipitants, then in the second step searches for each precipitant's fine-grained trigger and determines its DDI.
arXiv Detail & Related papers (2020-08-28T15:30:08Z)
- Whose Tweets are Surveilled for the Police: An Audit of Social-Media
Monitoring Tool via Log Files [69.02688684221265]
We obtained log files from the Corvallis (Oregon) Police Department's use of social media monitoring software called DigitalStakeout.
These log files include the results of proprietary searches by DigitalStakeout that were running over a period of 13 months and include 7240 social media posts.
We observe differences in the demographics of the users whose Tweets are flagged by DigitalStakeout compared to the demographics of the Twitter users in the region.
arXiv Detail & Related papers (2020-01-23T19:35:12Z)