Aligning ESG Controversy Data with International Guidelines through Semi-Automatic Ontology Construction
- URL: http://arxiv.org/abs/2509.10922v1
- Date: Sat, 13 Sep 2025 17:49:59 GMT
- Title: Aligning ESG Controversy Data with International Guidelines through Semi-Automatic Ontology Construction
- Authors: Tsuyoshi Iwata, Guillaume Comte, Melissa Flores, Ryoma Kondo, Ryohei Hisano
- Abstract summary: We present a semi-automatic method for constructing structured knowledge representations of environmental, social, and governance events reported in the news. Our approach uses lightweight ontology design, formal pattern modeling, and large language models to convert normative principles into reusable templates. These templates are used to extract relevant information from news content and populate a structured knowledge graph that links reported incidents to specific framework principles.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing importance of environmental, social, and governance data in regulatory and investment contexts has increased the need for accurate, interpretable, and internationally aligned representations of non-financial risks, particularly those reported in unstructured news sources. However, aligning such controversy-related data with principle-based normative frameworks, such as the United Nations Global Compact or Sustainable Development Goals, presents significant challenges. These frameworks are typically expressed in abstract language, lack standardized taxonomies, and differ from the proprietary classification systems used by commercial data providers. In this paper, we present a semi-automatic method for constructing structured knowledge representations of environmental, social, and governance events reported in the news. Our approach uses lightweight ontology design, formal pattern modeling, and large language models to convert normative principles into reusable templates expressed in the Resource Description Framework. These templates are used to extract relevant information from news content and populate a structured knowledge graph that links reported incidents to specific framework principles. The result is a scalable and transparent framework for identifying and interpreting non-compliance with international sustainability guidelines.
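The template idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `http://example.org/esg#` namespace, the `PrincipleTemplate` class, the slot and predicate names, and the choice of UN Global Compact Principle 5 (abolition of child labour) as the example principle. The sketch assumes slot values have already been extracted from a news item (for instance by a language model) and are valid IRI local names; it only shows how a reusable pattern could be filled and linked to a principle as RDF-style triples.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

EX = "http://example.org/esg#"  # hypothetical namespace, not from the paper

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class PrincipleTemplate:
    """Illustrative reusable pattern for one framework principle."""
    principle_id: str             # local name of the principle node
    slots: Tuple[str, ...]        # slot names an extractor must fill
    patterns: Tuple[Triple, ...]  # triple patterns; "?name" marks a slot


def iri(local: str) -> str:
    """Wrap a local name as a full IRI in N-Triples angle-bracket form."""
    return f"<{EX}{local}>"


def instantiate(template: PrincipleTemplate, incident_id: str,
                values: Dict[str, str]) -> List[Triple]:
    """Fill the template's slots and link the incident to its principle."""
    missing = [s for s in template.slots if s not in values]
    if missing:
        raise ValueError(f"unfilled slots: {missing}")
    bindings = {f"?{name}": iri(value) for name, value in values.items()}
    bindings["?incident"] = iri(incident_id)

    def resolve(term: str) -> str:
        # Slot markers are looked up; constants become IRIs in EX.
        return bindings[term] if term.startswith("?") else iri(term)

    triples = [(resolve(s), resolve(p), resolve(o))
               for s, p, o in template.patterns]
    # The link from the reported incident to the framework principle.
    triples.append((iri(incident_id), iri("relatesToPrinciple"),
                    iri(template.principle_id)))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples in a simple N-Triples-like line format."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical template for a child-labour principle.
CHILD_LABOUR = PrincipleTemplate(
    principle_id="UNGC_Principle_5",
    slots=("company", "location"),
    patterns=(
        ("?incident", "hasIncidentType", "ChildLabourIncident"),
        ("?incident", "involvesOrganization", "?company"),
        ("?incident", "occurredIn", "?location"),
    ),
)

triples = instantiate(
    CHILD_LABOUR,
    "incident_001",
    {"company": "AcmeCorp", "location": "Exampleland"},
)
print(to_ntriples(triples))
```

Keeping the principle link as an explicit triple is one way to make the alignment between an incident and a guideline inspectable; a real pipeline would more likely use an RDF library and a curated ontology rather than string templates.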
Related papers
- The Trinity of Consistency as a Defining Principle for General World Models [106.16462830681452]
General World Models are capable of learning, simulating, and reasoning about objective physical laws.
We propose a principled theoretical framework that defines the essential properties requisite for a General World Model.
Our work establishes a principled pathway toward general world models, clarifying both the limitations of current systems and the architectural requirements for future progress.
arXiv Detail & Related papers (2026-02-26T16:15:55Z)
- Doc-PP: Document Policy Preservation Benchmark for Large Vision-Language Models [13.70855540464427]
We introduce Doc-PP, a novel benchmark constructed from real-world reports requiring reasoning across heterogeneous visual and textual elements under strict non-disclosure policies.
Our evaluation highlights a systemic Reasoning-Induced Safety Gap: models frequently leak sensitive information when answers must be inferred through complex synthesis or aggregated across modalities.
We propose DVA, a structural inference framework that decouples reasoning from policy verification.
arXiv Detail & Related papers (2026-01-07T13:45:39Z)
- FysicsWorld: A Unified Full-Modality Benchmark for Any-to-Any Understanding, Generation, and Reasoning [52.88164697048371]
We introduce FysicsWorld, the first unified full-modality benchmark that supports bidirectional input-output across image, video, audio, and text.
FysicsWorld encompasses 16 primary tasks and 3,268 curated samples, aggregated from over 40 high-quality sources.
arXiv Detail & Related papers (2025-12-14T16:41:29Z)
- Pharos-ESG: A Framework for Multimodal Parsing, Contextual Narration, and Hierarchical Labeling of ESG Report [9.026784135029034]
Pharos-ESG is a framework that transforms ESG reports into structured representations through multimodal parsing, contextual narration, and hierarchical labeling.
We release Aurora-ESG, the first large-scale public dataset of ESG reports, spanning Mainland China, Hong Kong, and U.S.
arXiv Detail & Related papers (2025-11-20T14:41:44Z)
- Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation [57.97548022208733]
We show that seemingly superficial choices in key-value extraction can induce shifts in accuracy and stability.
We introduce Contextual Normalization, a strategy that adaptively standardizes context representations before generation.
arXiv Detail & Related papers (2025-10-15T06:28:25Z)
- Data Dependency-Aware Code Generation from Enhanced UML Sequence Diagrams [54.528185120850274]
We propose a novel step-by-step code generation framework named API2Dep.
First, we introduce an enhanced Unified Modeling Language (UML) API diagram tailored for service-oriented architectures.
Second, recognizing the critical role of data flow, we introduce a dedicated data dependency inference task.
arXiv Detail & Related papers (2025-08-05T12:28:23Z)
- Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models [93.1043186636177]
We explore the hypothesis that people use a combination of distributed and symbolic representations to construct bespoke mental models tailored to novel situations.
We propose a computational implementation of this idea: a "Model Synthesis Architecture" (MSA).
We evaluate our MSA as a model of human judgments on a novel reasoning dataset.
arXiv Detail & Related papers (2025-07-16T18:01:03Z)
- Modelling Privacy Compliance in Cross-border Data Transfers with Bigraphs [0.0]
We propose a privacy framework based on Milner's Bigraphical Reactive Systems.
We demonstrate the framework's applicability by modelling WhatsApp's privacy policies.
arXiv Detail & Related papers (2025-03-26T11:50:55Z)
- The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data [50.544186914115045]
TEDUO is a novel training pipeline for offline language-conditioned policy learning in symbolic environments.
Our approach harnesses large language models (LLMs) in a dual capacity: first, as automatization tools augmenting offline datasets with richer annotations, and second, as generalizable instruction-following agents.
arXiv Detail & Related papers (2024-12-09T18:43:56Z)
- KRAG Framework for Enhancing LLMs in the Legal Domain [0.48451657575793666]
This paper introduces Knowledge Representation Augmented Generation (KRAG).
KRAG is a framework designed to enhance the capabilities of Large Language Models (LLMs) within domain-specific applications.
We present Soft PROLEG, an implementation model under KRAG, which uses inference graphs to aid LLMs in delivering structured legal reasoning.
arXiv Detail & Related papers (2024-10-10T02:48:06Z)
- Scalable Frame-based Construction of Sociocultural NormBases for Socially-Aware Dialogues [66.69453609603875]
Sociocultural norms serve as guiding principles for personal conduct in social interactions.
We propose a scalable approach for constructing a Sociocultural Norm (SCN) Base using Large Language Models (LLMs).
We construct a comprehensive and publicly accessible Chinese Sociocultural NormBase.
arXiv Detail & Related papers (2024-10-04T00:08:46Z)
- Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity [0.0]
Large Language Models (LLMs) are increasingly sophisticated and ubiquitous in natural language processing (NLP) applications.
This paper presents a novel framework for contextual grounding in textual models, with a particular emphasis on the Context Representation stage.
Our findings have significant implications for the deployment of LLMs in sensitive domains such as healthcare, legal systems, and social services.
arXiv Detail & Related papers (2024-08-07T18:12:02Z)
- AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents [74.17623527375241]
We introduce a novel framework, called AutoGuide, which automatically generates context-aware guidelines from offline experiences.
As a result, our guidelines facilitate the provision of relevant knowledge for the agent's current decision-making process.
Our evaluation demonstrates that AutoGuide significantly outperforms competitive baselines in complex benchmark domains.
arXiv Detail & Related papers (2024-03-13T22:06:03Z)
- Glitter or Gold? Deriving Structured Insights from Sustainability Reports via Large Language Models [16.231171704561714]
This study uses Information Extraction (IE) methods to extract structured insights related to ESG aspects from companies' sustainability reports.
We then leverage graph-based representations to conduct statistical analyses concerning the extracted insights.
arXiv Detail & Related papers (2023-10-09T11:34:41Z)
- 1st ICLR International Workshop on Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data (PAIR^2Struct) [28.549151517783287]
Data Privacy, Accountability, Interpretability, Robustness, and Reasoning have been recognized as fundamental principles of using machine learning (ML) technologies on decision-critical and/or privacy-sensitive applications.
By exploiting the inherently structured knowledge, one can design plausible approaches to identify and use more relevant variables to make reliable decisions.
arXiv Detail & Related papers (2022-10-07T15:12:03Z)
- The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
arXiv Detail & Related papers (2022-09-16T09:00:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.