Cracking the Code: Enhancing Development finance understanding with artificial intelligence
- URL: http://arxiv.org/abs/2502.09495v1
- Date: Thu, 13 Feb 2025 17:01:45 GMT
- Title: Cracking the Code: Enhancing Development finance understanding with artificial intelligence
- Authors: Pierre Beaucoral
- Abstract summary: This research employs a novel approach that combines Machine Learning (ML) techniques, specifically Natural Language Processing (NLP), and an innovative Python topic modeling technique called BERTopic. By revealing existing yet hidden topics of development finance, this application of artificial intelligence enables a better understanding of donor priorities and overall development funding.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Analyzing development projects is crucial for understanding donors' aid strategies and recipients' priorities, and for assessing the capacity of development finance to address development issues through on-the-ground actions. In this area, the Organisation for Economic Co-operation and Development's (OECD) Creditor Reporting System (CRS) dataset is a reference data source. This dataset provides a vast collection of project narratives from various sectors (approximately 5 million projects). While the OECD CRS provides a rich source of information on development strategies, it falls short in informing project purposes because its reporting process is based on donors' self-declared main objectives and pre-defined industrial sectors. This research employs a novel approach that combines Machine Learning (ML) techniques, specifically Natural Language Processing (NLP) and an innovative Python topic modeling technique called BERTopic, to categorise (cluster) and label development projects based on their narrative descriptions. By revealing existing yet hidden topics of development finance, this application of artificial intelligence enables a better understanding of donor priorities and overall development funding and provides methods to analyse public and private project narratives.
Related papers
- AI for Climate Finance: Agentic Retrieval and Multi-Step Reasoning for Early Warning System Investments [1.3192560874022086]
This study focuses on a real-world application: tracking EWS investments in the Climate Risk and Early Warning Systems (CREWS) Fund.
We analyze 25 MDB project documents and evaluate multiple AI-driven classification methods, including zero-shot and few-shot learning.
Our results show that the agent-based RAG approach significantly outperforms other methods, achieving 87% accuracy, 89% precision, and 83% recall.
arXiv Detail & Related papers (2025-04-07T14:11:11Z)
- A Survey on Post-training of Large Language Models [185.51013463503946]
Large Language Models (LLMs) have fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration.
These challenges necessitate advanced post-training language models (PoLMs) to address shortcomings, such as restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance.
This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms.
arXiv Detail & Related papers (2025-03-08T05:41:42Z)
- An Overview of Large Language Models for Statisticians [109.38601458831545]
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI).
This paper explores potential areas where statisticians can make important contributions to the development of LLMs.
We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation.
arXiv Detail & Related papers (2025-02-25T03:40:36Z)
- Deploying Large Language Models With Retrieval Augmented Generation [0.21485350418225244]
Retrieval Augmented Generation has emerged as a key approach for integrating knowledge from data sources outside of the large language model's training set.
We present insights from the development and field-testing of a pilot project that integrates LLMs with RAG for information retrieval.
arXiv Detail & Related papers (2024-11-07T22:11:51Z)
- The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources [100.23208165760114]
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications.
To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet.
arXiv Detail & Related papers (2024-06-24T15:55:49Z)
- A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges [60.546677053091685]
Large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain.
We explore the application of LLMs on various financial tasks, focusing on their potential to transform traditional practices and drive innovation.
This survey categorizes the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications.
arXiv Detail & Related papers (2024-06-15T16:11:35Z)
- A Survey on Knowledge Distillation of Large Language Models [99.11900233108487]
Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities to open-source models.
This paper presents a comprehensive survey of KD's role within the realm of Large Language Models (LLMs).
arXiv Detail & Related papers (2024-02-20T16:17:37Z)
- FAIR Enough: How Can We Develop and Assess a FAIR-Compliant Dataset for Large Language Models' Training? [3.0406004578714008]
The rapid evolution of Large Language Models highlights the necessity for ethical considerations and data integrity in AI development.
While FAIR principles are crucial for ethical data stewardship, their specific application in the context of LLM training data remains an under-explored area.
We propose a novel framework designed to integrate FAIR principles into the LLM development lifecycle.
arXiv Detail & Related papers (2024-01-19T21:21:02Z)
- Multimodal Gen-AI for Fundamental Investment Research [2.559302299676632]
This report outlines a transformative initiative in the financial investment industry, where the conventional decision-making process is being reimagined.
We seek to evaluate the effectiveness of fine-tuning methods on a base model (Llama2) to achieve specific application-level goals.
The project encompasses a diverse corpus dataset, including research reports, investment memos, market news, and extensive time-series market data.
arXiv Detail & Related papers (2023-12-24T03:35:13Z)
- A Survey of Reasoning with Foundation Models [235.7288855108172]
Reasoning plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
We introduce seminal foundation models proposed or adaptable for reasoning.
We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models.
arXiv Detail & Related papers (2023-12-17T15:16:13Z)
- Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction [104.29108668347727]
This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models.
The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies.
We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts.
arXiv Detail & Related papers (2023-07-03T16:01:45Z)
- DINGO: an ontology for projects and grants linked data [0.0]
DINGO provides a framework to model data for semantically-enabled applications relative to projects, funding, actors, and, notably, funding policies in the research landscape.
We discuss its main features, the principles followed for its development, its community uptake, its maintenance and evolution.
arXiv Detail & Related papers (2020-06-24T02:47:40Z)
- Representation of Developer Expertise in Open Source Software [12.583969739954526]
We use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers.
We then employ Doc2Vec embeddings for vector representations of APIs, developers, and projects.
We evaluate if these embeddings reflect the postulated topology of the Skill Space.
arXiv Detail & Related papers (2020-05-20T16:36:07Z)