Related papers: Text-to-OverpassQL: A Natural Language Interface for Complex Geodata Querying of OpenStreetMap

URL: http://arxiv.org/abs/2308.16060v1
Date: Wed, 30 Aug 2023 14:33:25 GMT
Title: Text-to-OverpassQL: A Natural Language Interface for Complex Geodata Querying of OpenStreetMap
Authors: Michael Staniek and Raphael Schumann and Maike Züfle and Stefan Riezler
Abstract summary: We present Text-to-OverpassQL, a task designed to facilitate a natural language interface for querying geodata from OpenStreetMap (OSM). Generating Overpass queries from natural language input serves multiple use-cases.
Score: 17.01783992725517
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present Text-to-OverpassQL, a task designed to facilitate a natural language interface for querying geodata from OpenStreetMap (OSM). The Overpass Query Language (OverpassQL) allows users to formulate complex database queries and is widely adopted in the OSM ecosystem. Generating Overpass queries from natural language input serves multiple use-cases. It enables novice users to utilize OverpassQL without prior knowledge, assists experienced users with crafting advanced queries, and enables tool-augmented large language models to access information stored in the OSM database. In order to assess the performance of current sequence generation models on this task, we propose OverpassNL, a dataset of 8,352 queries with corresponding natural language inputs. We further introduce task specific evaluation metrics and ground the evaluation of the Text-to-OverpassQL task by executing the queries against the OSM database. We establish strong baselines by finetuning sequence-to-sequence models and adapting large language models with in-context examples. The detailed evaluation reveals strengths and weaknesses of the considered learning strategies, laying the foundations for further research into the Text-to-OverpassQL task.

Related papers

Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling [69.84963245729826]
Large language models (LLMs) have shown compelling semantic understanding capabilities. Dense retrieval is a crucial task in Information Retrieval (IR) and is the foundation for downstream tasks such as re-ranking. We introduce an auxiliary task of query likelihood (QL) estimation to yield a better backbone for contrastive learning of a discriminative retriever.
arXiv Detail & Related papers (2025-04-07T16:03:59Z)
NAT-NL2GQL: A Novel Multi-Agent Framework for Translating Natural Language to Graph Query Language [13.661054027428868]
We propose NAT-NL2GQL, a novel framework for translating natural language to graph query language. Our framework consists of three synergistic agents: the Preprocessor agent, the Generator agent, and the Refiner agent. Given the scarcity of high-quality open-source NL2GQL datasets based on nGQL syntax, we developed StockGQL, a dataset constructed from a financial market graph database.
arXiv Detail & Related papers (2024-12-11T04:14:09Z)
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models. Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models. Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
Text2SQL is Not Enough: Unifying AI and Databases with TAG [47.45480855418987]
Table-Augmented Generation (TAG) is a paradigm for answering natural language questions over databases. We develop benchmarks to study the TAG problem and find that standard methods answer no more than 20% of queries correctly.
arXiv Detail & Related papers (2024-08-27T00:50:14Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries. To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
From Text to CQL: Bridging Natural Language and Corpus Search Engine [27.56738323943742]
Corpus Query Language (CQL) is a critical tool for linguistic research and detailed analysis within text corpora. This paper presents the first text-to-CQL task that aims to automate the translation of natural language into CQL.
arXiv Detail & Related papers (2024-02-21T12:11:28Z)
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
An In-Context Schema Understanding Method for Knowledge Base Question Answering [70.87993081445127]
Large Language Models (LLMs) have shown strong capabilities in language understanding and can be used to solve knowledge base question answering (KBQA). Existing methods bypass the challenge of schema understanding by initially employing LLMs to generate drafts of logic forms without schema-specific details. We propose a simple In-Context Schema Understanding (ICSU) method that enables LLMs to directly understand schemas by leveraging in-context learning.
arXiv Detail & Related papers (2023-10-22T04:19:17Z)
Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES. Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query. By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
Querying Large Language Models with SQL [16.383179496709737]
In many use-cases, information is stored in text but not available in structured data. With the rise of pre-trained Large Language Models (LLMs), there is now an effective solution to store and use information extracted from massive corpora of text documents. We present Galois, a prototype based on a traditional database architecture, but with new physical operators for querying the underlying LLM.
arXiv Detail & Related papers (2023-04-02T06:58:14Z)
Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition [0.5639451539396457]
A booming amount of information is continuously added to the Internet as structured and unstructured data, feeding knowledge bases such as DBpedia and Wikidata. The aim of Question Answering systems is to allow lay users to access such data using natural language without needing to write formal queries. We show that sequence-to-sequence models are a viable and promising option to transform long utterances into complex SPARQL queries.
arXiv Detail & Related papers (2020-10-21T11:12:01Z)
ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries [10.273545005890496]
We introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL). ColloQL achieves 84.9% (logical) and 90.7% (execution) accuracy on the WikiSQL dataset.
arXiv Detail & Related papers (2020-10-19T23:53:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.