Related papers: UnibucKernel: Geolocating Swiss-German Jodels Using Ensemble Learning

URL: http://arxiv.org/abs/2102.09379v2
Date: Fri, 19 Feb 2021 08:31:31 GMT
Title: UnibucKernel: Geolocating Swiss-German Jodels Using Ensemble Learning
Authors: Mihaela Gaman, Sebastian Cojocariu, Radu Tudor Ionescu
Abstract summary: We focus on the second subtask, which is based on a data set formed of approximately 30 thousand Swiss German Jodels. The dialect identification task is about accurately predicting the latitude and longitude of test samples. We frame the task as a double regression problem, employing an XGBoost meta-learner with the combined power of a variety of machine learning approaches.
Score: 15.877673959068455
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we describe our approach addressing the Social Media Variety Geolocation task featured in the 2021 VarDial Evaluation Campaign. We focus on the second subtask, which is based on a data set formed of approximately 30 thousand Swiss German Jodels. The dialect identification task is about accurately predicting the latitude and longitude of test samples. We frame the task as a double regression problem, employing an XGBoost meta-learner with the combined power of a variety of machine learning approaches to predict both latitude and longitude. The models included in our ensemble range from simple regression techniques, such as Support Vector Regression, to deep neural models, such as a hybrid neural network and a neural transformer. To minimize the prediction error, we approach the problem from a few different perspectives and consider various types of features, from low-level character n-grams to high-level BERT embeddings. The XGBoost ensemble that results from combining the aforementioned methods achieves a median distance of 23.6 km on the test data, which places us third in the ranking, 6.05 km and 2.9 km behind the submissions in first and second place, respectively.
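The abstract reports its result as a median distance in kilometres between predicted and true coordinates. A minimal sketch of how such a metric could be computed follows. It assumes the distance is the great-circle (haversine) distance on a sphere of mean Earth radius; the abstract does not state how the distance is measured, and the function names `haversine_km` and `median_distance_km` are hypothetical.

```python
import math
from statistics import median

# Assumption of this sketch: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_distance_km(y_true, y_pred):
    """Median haversine distance over paired (latitude, longitude) tuples."""
    return median(
        haversine_km(t[0], t[1], p[0], p[1]) for t, p in zip(y_true, y_pred)
    )


# Toy coordinates for illustration only; they are not data from the paper.
truth = [(47.0, 8.0), (47.0, 8.0), (47.0, 8.0)]
preds = [(47.0, 8.0), (48.0, 8.0), (49.0, 8.0)]
print(round(median_distance_km(truth, preds), 2))
```

In a double regression setup, latitude and longitude are predicted by separate regressors, and the two outputs would be paired into `(latitude, longitude)` tuples before being passed to a function like `median_distance_km`.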

Related papers

RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings [7.431269929582643]
We propose a novel retrieval-augmented strategy called RANGE. We build our method on the intuition that the visual features of a location can be estimated by combining the visual features from multiple similar-looking locations. Our results show that RANGE outperforms the existing state-of-the-art models with significant margins in most tasks.
arXiv Detail & Related papers (2025-02-27T05:45:51Z)
PEnG: Pose-Enhanced Geo-Localisation [15.324623975476348]
Cross-view geo-localisation is typically performed at a coarse granularity, because densely sampled satellite image patches overlap heavily. We develop PEnG, a 2-stage system which first predicts the most likely edges from a city-scale graph representation. It then performs relative pose estimation within these edges to determine a precise position.
arXiv Detail & Related papers (2024-11-24T07:42:50Z)
Ensemble learning for predictive uncertainty estimation with application to the correction of satellite precipitation products [3.8623569699070353]
We introduce nine quantile-based ensemble learners and present the first application of these learners to large precipitation datasets. We employ a novel feature engineering strategy, reducing predictors to distance-weighted satellite precipitation at relevant locations, combined with location elevation. Ensemble learning with QR and QRNN yielded the best results across quantile levels ranging from 0.025 to 0.975, outperforming the reference method by 3.91% to 8.95%.
arXiv Detail & Related papers (2024-03-14T17:45:56Z)
Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching [0.0]
We propose a new technique, based on graph Laplacian eigenmaps, to match point clouds by taking into account fine local structures. To deal with the order and sign ambiguity of Laplacian eigenmaps, we introduce a new operator, called Coupled Laplacian. We show that the similarity between those aligned high-dimensional spaces provides a locally meaningful score to match shapes.
arXiv Detail & Related papers (2024-02-27T10:10:12Z)
GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models. We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods. Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
Gpachov at CheckThat! 2023: A Diverse Multi-Approach Ensemble for Subjectivity Detection in News Articles [34.98368667957678]
This paper presents the solution built by the Gpachov team for the CLEF-2023 CheckThat! lab Task 2 on subjectivity detection. The three approaches are combined in a simple majority voting ensemble, resulting in 0.77 macro F1 on the test set and achieving 2nd place on the English subtask.
arXiv Detail & Related papers (2023-09-13T09:49:20Z)
Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking [61.60169764507917]
The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates. We propose an innovative framework, namely Geo-Encoder, to more effectively integrate Chinese geographical semantics into re-ranking pipelines.
arXiv Detail & Related papers (2023-09-04T13:44:50Z)
Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations. In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
Predicting the Geolocation of Tweets Using transformer models on Customized Data [17.55660062746406]
This research aims to solve the tweet/user geolocation prediction task. The suggested approach implements neural networks for natural language processing to estimate the location. The proposed models have been fine-tuned on a Twitter dataset.
arXiv Detail & Related papers (2023-03-14T12:56:47Z)
Rethinking Spatial Invariance of Convolutional Networks for Object Counting [119.83017534355842]
We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map. Inspired by previous work, we propose a low-rank approximation accompanied with translation invariance to favorably implement the approximation of massive Gaussian convolution. Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
arXiv Detail & Related papers (2022-06-10T17:51:25Z)
Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance. To combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information. Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
arXiv Detail & Related papers (2022-03-28T05:35:16Z)
Meta-Generating Deep Attentive Metric for Few-shot Classification [53.07108067253006]
We present a novel deep metric meta-generation method to generate a specific metric for a new few-shot learning task. In this study, we structure the metric using a three-layer deep attentive network that is flexible enough to produce a discriminative metric for each task. We obtain a surprisingly clear performance improvement over state-of-the-art competitors, especially in the challenging cases.
arXiv Detail & Related papers (2020-12-03T02:07:43Z)
Combining Deep Learning and String Kernels for the Localization of Swiss German Tweets [28.497747521078647]
We address the second subtask, which targets a data set composed of nearly 30 thousand Swiss German Jodels. We frame the task as a double regression problem, employing a variety of machine learning approaches to predict both latitude and longitude. Our empirical results indicate that the handcrafted model based on string kernels outperforms the deep learning approaches.
arXiv Detail & Related papers (2020-10-07T19:16:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.