DiPD: Disruptive event Prediction Dataset from Twitter
- URL: http://arxiv.org/abs/2111.15629v1
- Date: Thu, 25 Nov 2021 13:16:21 GMT
- Title: DiPD: Disruptive event Prediction Dataset from Twitter
- Authors: Sanskar Soni, Dev Mehta, Vinush Vishwanath, Aditi Seetha and Satyendra Singh Chouhan
- Abstract summary: Riots and protests, if they get out of control, can cause havoc in a country.
This dataset collects tweets of past or ongoing events known to have caused disruption.
It contains 94855 records of unique events and 168706 records of unique non-events.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Riots and protests, if they get out of control, can cause havoc in a country. We
have seen examples of this, such as the BLM movement, climate strikes, the CAA
movement, and many more, which caused disruption on a large scale. Our motive
behind creating this dataset was to use it to develop machine learning systems
that can give their users insight into ongoing trending events and alert
them to events that could lead to disruption in the nation. If an event
starts getting out of control, monitoring it allows it to be handled and mitigated
before the matter escalates. This dataset collects tweets of past or ongoing
events known to have caused disruption and labels these tweets as 1. We also
collect tweets that are considered non-eventful and label them as 0 so that
they can also be used to train a classification system. The dataset contains
94855 records of unique events and 168706 records of unique non-events, for a
total of 263561 records. We extract multiple features from the
tweets, such as the user's follower count and the user's location, to
understand the impact and reach of the tweets. This dataset might be useful in
various event-related machine learning problems such as event classification
and event recognition.
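The record counts in the abstract imply a class imbalance that a classifier trained on this data would likely need to account for. The following is a minimal sketch, not the authors' method: it checks the reported total and derives per-class weights using the common "balanced" heuristic, total / (number of classes * class count). The function name `class_summary` and the dictionary keys are hypothetical, introduced only for illustration.

```python
# Record counts as reported in the abstract.
EVENT_COUNT = 94855       # tweets labeled 1 (disruptive events)
NON_EVENT_COUNT = 168706  # tweets labeled 0 (non-events)


def class_summary(n_pos: int, n_neg: int) -> dict:
    """Summarize a binary dataset: total size, positive share, class weights.

    Hypothetical helper for illustration; the weights follow the common
    'balanced' heuristic total / (n_classes * class_count).
    """
    total = n_pos + n_neg
    return {
        "total": total,
        "event_fraction": n_pos / total,
        "weight_event": total / (2 * n_pos),
        "weight_non_event": total / (2 * n_neg),
    }


summary = class_summary(EVENT_COUNT, NON_EVENT_COUNT)
print(f"total records:    {summary['total']}")
print(f"event fraction:   {summary['event_fraction']:.3f}")
print(f"weight (label 1): {summary['weight_event']:.3f}")
print(f"weight (label 0): {summary['weight_non_event']:.3f}")
```

Roughly 36% of the records carry label 1, so the minority (event) class receives the larger weight under this heuristic. Whether to reweight, resample, or leave the distribution as is depends on the downstream task.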
Related papers
- Improving Event Definition Following For Zero-Shot Event Detection [66.27883872707523]
Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types.
We aim to improve zero-shot event detection by training models to better follow event definitions.
arXiv Detail & Related papers (2024-03-05T01:46:50Z)
- CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification [51.58605842457186]
We present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting.
Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data.
arXiv Detail & Related papers (2023-10-23T07:01:09Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z)
- Traffic Event Detection as a Slot Filling Problem [18.61490760235035]
We introduce the new problem of extracting fine-grained traffic information from Twitter streams by making publicly available the two (constructed) traffic-related datasets from Belgium and the Brussels capital region.
We propose the use of several methods that process the two subtasks either separately or in a joint setting, and we evaluate the effectiveness of the proposed methods for solving the traffic event detection problem.
arXiv Detail & Related papers (2021-09-13T15:02:40Z)
- On Informative Tweet Identification For Tracking Mass Events [0.0]
We investigate machine learning methods for automatically identifying informative tweets among those that are relevant to a target event.
We propose a hybrid model that leverages both the handcrafted features and the automatically learned ones.
Our experiments on several large datasets of real-world events show that the latter approaches significantly outperform the former.
arXiv Detail & Related papers (2021-01-14T15:10:42Z)
- Unsupervised Label-aware Event Trigger and Argument Classification [73.86358632937372]
We propose an unsupervised event extraction pipeline, which first identifies events with available tools (e.g., SRL) and then automatically maps them to pre-defined event types.
We leverage pre-trained language models to contextually represent pre-defined types for both event triggers and arguments.
We successfully map 83% of the triggers and 54% of the arguments to the correct types, almost doubling the performance of previous zero-shot approaches.
arXiv Detail & Related papers (2020-12-30T17:47:24Z)
- Leveraging Event Specific and Chunk Span features to Extract COVID Events from tweets [0.0]
We describe our system entry for WNUT 2020 Shared Task-3.
The task was aimed at automating the extraction of a variety of COVID-19 related events from Twitter.
The system ranks 1st at the leader-board with F1 of 0.6598, without using any ensembles or additional datasets.
arXiv Detail & Related papers (2020-12-18T04:49:32Z)
- Event-Related Bias Removal for Real-time Disaster Events [67.2965372987723]
Social media has become an important tool to share information about crisis events such as natural disasters and mass attacks.
Detecting actionable posts that contain useful information requires rapid analysis of huge volumes of data in real time.
We train an adversarial neural model to remove latent event-specific biases and improve the performance on tweet importance classification.
arXiv Detail & Related papers (2020-11-02T02:03:07Z)
- On Identifying Hashtags in Disaster Twitter Data [55.17975121160699]
We construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information.
Using this dataset, we investigate Long Short Term Memory-based models within a Multi-Task Learning framework.
The best performing model achieves an F1-score as high as 92.22%.
arXiv Detail & Related papers (2020-01-05T22:37:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.