GeoSEE: Regional Socio-Economic Estimation With a Large Language Model
- URL: http://arxiv.org/abs/2406.09799v1
- Date: Fri, 14 Jun 2024 07:50:22 GMT
- Title: GeoSEE: Regional Socio-Economic Estimation With a Large Language Model
- Authors: Sungwon Han, Donghyun Ahn, Seungeon Lee, Minhyuk Song, Sungwon Park, Sangyoon Park, Jihee Kim, Meeyoung Cha
- Abstract summary: We present GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM).
The system then computes target indicators via in-context learning after aggregating results from selected modules in the format of natural language-based texts.
Our method outperforms other predictive models in both unsupervised and low-shot contexts.
- Score: 17.31652821477571
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moving beyond traditional surveys, combining heterogeneous data sources with AI-driven inference models brings new opportunities to measure socio-economic conditions, such as poverty and population, over expansive geographic areas. The current research presents GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM). Presented with a diverse set of information modules, including those pre-constructed from satellite imagery, GeoSEE selects which modules to use in estimation, for each indicator and country. This selection is guided by the LLM's prior socio-geographic knowledge, which functions similarly to the insights of a domain expert. The system then computes target indicators via in-context learning after aggregating results from selected modules in the format of natural language-based texts. Comprehensive evaluation across countries at various stages of development reveals that our method outperforms other predictive models in both unsupervised and low-shot contexts. This reliable performance in data-scarce settings in under-developed or developing countries, combined with its cost-effectiveness, underscores its potential to continuously support and monitor the progress of Sustainable Development Goals, such as poverty alleviation and equitable growth, on a global scale.
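The aggregation step described in the abstract (module outputs rendered as text, then fed to the LLM for in-context learning) can be illustrated with a minimal sketch. This is not the authors' implementation: the module names, the prompt wording, and the helpers `describe_region` and `build_prompt` are hypothetical, the module outputs are placeholders, and the module selection that GeoSEE delegates to the LLM is simply passed in as a list here. No LLM is called; the sketch only assembles the prompt.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical module type: maps a region name to a short text description.
Module = Callable[[str], str]


def describe_region(region: str, modules: Dict[str, Module], selected: List[str]) -> str:
    """Render the outputs of the selected modules for one region as plain text."""
    lines = [f"Region: {region}"]
    for name in selected:
        lines.append(f"- {name}: {modules[name](region)}")
    return "\n".join(lines)


def build_prompt(
    indicator: str,
    target: str,
    examples: List[Tuple[str, float]],
    modules: Dict[str, Module],
    selected: List[str],
) -> str:
    """Assemble an in-context prompt: labeled example regions, then the target region.

    An empty `examples` list corresponds to the unsupervised (zero-shot) case.
    """
    blocks = [f"Estimate the {indicator} of the last region."]
    for region, value in examples:
        text = describe_region(region, modules, selected)
        blocks.append(f"{text}\n{indicator}: {value}")
    # The target region ends with an open slot for the model to complete.
    blocks.append(f"{describe_region(target, modules, selected)}\n{indicator}:")
    return "\n\n".join(blocks)


# Placeholder modules (assumptions for illustration only).
modules: Dict[str, Module] = {
    "nightlight": lambda r: f"placeholder nightlight summary for {r}",
    "land_cover": lambda r: f"placeholder land-cover summary for {r}",
}

# In GeoSEE the LLM chooses the modules; here the choice is hard-coded.
prompt = build_prompt(
    indicator="population",
    target="Region C",
    examples=[("Region A", 1200.0), ("Region B", 800.0)],
    modules=modules,
    selected=["nightlight"],
)
zero_shot = build_prompt("population", "Region C", [], modules, ["nightlight"])
print(prompt)
```

Passing a few labeled regions gives the low-shot setting, while an empty example list gives the unsupervised one; in both cases the resulting string would be sent to whichever LLM is in use.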
Related papers
- Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models [46.09562860220433]
We introduce GazeReward, a novel framework that integrates implicit feedback -- and specifically eye-tracking (ET) data -- into the Reward Model (RM).
Our approach significantly improves the accuracy of the RM on established human preference datasets.
arXiv Detail & Related papers (2024-10-02T13:24:56Z)
- LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs [35.86744469804952]
Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored.
We conduct a series of experiments exploring various vision capabilities of MLLMs within these domains, particularly focusing on the frontier model GPT-4V.
Our methodology involves challenging these models with a small-scale geographic benchmark consisting of a suite of visual tasks, testing their abilities across a spectrum of complexity.
arXiv Detail & Related papers (2023-11-24T18:46:02Z)
- Chatmap : Large Language Model Interaction with Cartographic Data [0.0]
OpenStreetMap (OSM) is the most ambitious open-source global initiative offering detailed urban and rural geographic data.
In this study, we demonstrate a proof of concept and detail the process of fine-tuning a relatively small-scale (1B parameters) Large Language Model (LLM) with a relatively small artificial dataset curated by a more capable teacher model.
The study aims to provide an initial guideline for such generative artificial intelligence (AI) adaptations and demonstrate early signs of useful emerging abilities in this context.
arXiv Detail & Related papers (2023-09-28T15:32:36Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- A General Purpose Neural Architecture for Geospatial Systems [142.43454584836812]
We present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias.
We envision how such a model may facilitate cooperation between members of the community.
arXiv Detail & Related papers (2022-11-04T09:58:57Z)
- Learning Economic Indicators by Aggregating Multi-Level Geospatial Information [20.0397537179667]
This research presents a deep learning model to predict economic indicators via aggregating traits observed from multiple levels of geographical units.
Our new multi-level learning model substantially outperforms strong baselines in predicting key indicators such as population, purchasing power, and energy consumption.
We discuss the multi-level model's implications for measuring inequality, which is the essential first step in policy and social science research on inequality and poverty.
arXiv Detail & Related papers (2022-05-03T13:05:39Z) - Measuring Attribution in Natural Language Generation Models [14.931889185122213]
We present a new evaluation framework entitled Attributable to Identified Sources (AIS) for assessing the output of natural language generation models.
We first define AIS and introduce a two-stage annotation pipeline for allowing annotators to appropriately evaluate model output.
arXiv Detail & Related papers (2021-12-23T22:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.