Where did you tweet from? Inferring the origin locations of tweets based
on contextual information
- URL: http://arxiv.org/abs/2211.16506v1
- Date: Fri, 18 Nov 2022 01:33:01 GMT
- Title: Where did you tweet from? Inferring the origin locations of tweets based
on contextual information
- Authors: Rabindra Lamsal, Aaron Harwood, Maria Rodriguez Read
- Abstract summary: Less than 1% of tweets are geotagged, whether with a point location or with bounding place information.
A major issue with tweets is that Twitter users can be at location A and exchange conversations specific to location B.
We propose a framework that uses machine-level natural language understanding to identify tweets that conceivably contain their origin location information.
- Score: 0.2320417845168326
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Public conversations on Twitter comprise many pertinent topics including
disasters, protests, politics, propaganda, sports, climate change,
epidemics/pandemic outbreaks, etc., that can have both regional and global
aspects. Spatial discourse analysis relies on geographical data. However, today
less than 1% of tweets are geotagged, whether with a point location or with bounding
place information. A major issue with tweets is that Twitter users can be at
location A and exchange conversations specific to location B, which we call the
Location A/B problem. The problem is considered solved if location entities can
be classified as either origin locations (Location As) or non-origin locations
(Location Bs). In this work, we propose a simple yet effective framework--the
True Origin Model--that addresses the problem by using machine-level natural
language understanding to identify tweets that conceivably contain their origin
location information. The model achieves promising accuracy at country (80%),
state (67%), city (58%), county (56%) and district (64%) levels with support
from a Location Extraction Model as basic as the CoNLL-2003-based RoBERTa. We
employ a tweet contextualizer (locBERT), which is one of the core components of
the proposed model, to investigate multiple tweets' distributions for
understanding Twitter users' tweeting behavior in terms of mentioning origin
and non-origin locations. We also highlight a major concern with the currently
regarded gold standard test set (ground truth) methodology, introduce a new
data set, and identify further research avenues for advancing the area.
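
The Location A/B problem described in the abstract can be illustrated with a toy two-stage sketch: first extract location entities from a tweet, then label each one as an origin (Location A) or non-origin (Location B) mention. This is not the True Origin Model and does not use locBERT or RoBERTa. The gazetteer, the cue phrases, and the names `extract_locations` and `classify_mentions` are all hypothetical stand-ins chosen only to make the task concrete.

```python
import re
from dataclasses import dataclass

# Hypothetical toy gazetteer. It stands in for a real Location Extraction
# Model (the paper uses a CoNLL-2003-based RoBERTa; this does not).
GAZETTEER = {"melbourne", "kathmandu", "london", "sydney"}

# Hypothetical cue phrases hinting that the author is physically at a place.
# A learned contextual classifier would replace this rule in practice.
ORIGIN_CUES = ("here in", "stuck in", "landed in", "just arrived in", "live from")


@dataclass
class LocationMention:
    text: str
    label: str  # "origin" (Location A) or "non-origin" (Location B)


def extract_locations(tweet):
    """Return gazetteer entries mentioned in the tweet, in order of appearance."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    found = []
    for token in tokens:
        if token in GAZETTEER and token not in found:
            found.append(token)
    return found


def classify_mentions(tweet):
    """Label each extracted location as origin or non-origin via cue phrases."""
    lowered = tweet.lower()
    mentions = []
    for loc in extract_locations(tweet):
        is_origin = any(f"{cue} {loc}" in lowered for cue in ORIGIN_CUES)
        label = "origin" if is_origin else "non-origin"
        mentions.append(LocationMention(loc, label))
    return mentions


if __name__ == "__main__":
    tweet = "Stuck in Melbourne traffic, watching the floods in Sydney on the news."
    for mention in classify_mentions(tweet):
        print(mention.text, mention.label)
```

In the example tweet, the author is at one place while discussing another, which is exactly the situation the abstract calls the Location A/B problem. A rule-based labeler like this one would miss most real phrasing, which is why the paper relies on learned language understanding instead.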
Related papers
- CityGuessr: City-Level Video Geo-Localization on a Global Scale [54.371452373726584]
We propose a novel problem of worldwide video geolocalization with the objective of hierarchically predicting the correct city, state/province, country, and continent, given a video.
No large-scale video datasets with extensive worldwide coverage exist to train models for solving this problem.
We introduce a new dataset, CityGuessr68k, comprising 68,269 videos from 166 cities all over the world.
arXiv Detail & Related papers (2024-11-10T03:20:00Z)
- Your Car Tells Me Where You Drove: A Novel Path Inference Attack via CAN Bus and OBD-II Data [57.22545280370174]
On Path Diagnostic - Intrusion & Inference (OPD-II) is a novel path inference attack leveraging a physical car model and a map matching algorithm.
We implement our attack on a set of four different cars and a total number of 41 tracks in different road and traffic scenarios.
arXiv Detail & Related papers (2024-06-30T04:21:46Z)
- Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates.
We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- ContCommRTD: A Distributed Content-based Misinformation-aware Community Detection System for Real-Time Disaster Reporting [0.5156484100374059]
We propose a novel distributed system that provides in near real-time information on hazard-related events and their evolution.
Our distributed disaster reporting system analyzes the social relationship among worldwide geolocated tweets.
As misinformation can lead to increased damage if propagated in hazard-related tweets, we propose a new deep learning model to detect fake news.
arXiv Detail & Related papers (2023-01-30T15:28:47Z)
- RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation [18.6505004991784]
Real-time location inference of social media users is fundamental to some spatial applications.
While tweet text is the most commonly used feature in location estimation, most of the prior works suffer from either the noise or the sparsity of textual features.
We use topic modeling as a building block to characterize the geographic topic variation and lexical variation so that "one-hot" encoding vectors will no longer be directly used.
arXiv Detail & Related papers (2021-11-12T00:57:42Z)
- Identification of Fine-Grained Location Mentions in Crisis Tweets [7.627299398469962]
We assemble two tweet crisis datasets and manually annotate them with specific location types.
The first dataset contains tweets from a mixed set of crisis events, while the second dataset contains tweets from the global COVID-19 pandemic.
We investigate the performance of state-of-the-art deep learning models for sequence tagging on these datasets, in both in-domain and cross-domain settings.
arXiv Detail & Related papers (2021-11-11T17:48:03Z)
- Fine-grained Geolocation Prediction of Tweets with Human Machine Collaboration [3.147379819740595]
Less than 1% of crawled Tweet posts come with geolocation tags.
In this research, we utilize millions of Twitter posts and end-users' domain expertise to build a set of deep neural network models.
With multiple neural architecture experiments, and a collaborative human-machine workflow design, our ongoing work on geolocation detection shows promising results.
arXiv Detail & Related papers (2021-06-25T03:51:02Z)
- Zero-Shot Multi-View Indoor Localization via Graph Location Networks [66.05980368549928]
Indoor localization is a fundamental problem in location-based applications.
We propose a novel neural-network-based architecture, Graph Location Networks (GLN), to perform infrastructure-free, multi-view image-based indoor localization.
GLN makes location predictions based on robust location representations extracted from images through message-passing networks.
We introduce a novel zero-shot indoor localization setting and tackle it by extending the proposed GLN to a dedicated zero-shot version.
arXiv Detail & Related papers (2020-08-06T07:36:55Z)
- Geosocial Location Classification: Associating Type to Places Based on Geotagged Social-Media Posts [22.313111311130662]
Associating type to locations can be used to enrich maps and can serve a plethora of geospatial applications.
We study the problem of Geosocial Location Classification, where the type of a site, e.g., a building, is discovered based on social-media posts.
arXiv Detail & Related papers (2020-02-05T16:09:52Z)
- From Topic Networks to Distributed Cognitive Maps: Zipfian Topic Universes in the Area of Volunteered Geographic Information [59.0235296929395]
We investigate how language encodes and networks geographic information on the aboutness level of texts.
Our study shows a Zipfian organization of the thematic universe in which geographical places are located in online communication.
Places, whether close to each other or not, are located in neighboring places that span similar subnetworks in the topic universe.
arXiv Detail & Related papers (2020-02-04T18:31:25Z)