Enabling Roll-up and Drill-down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management
- URL: http://arxiv.org/abs/2405.04929v1
- Date: Wed, 08 May 2024 09:54:55 GMT
- Title: Enabling Roll-up and Drill-down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management
- Authors: Sha Wang, Yuchen Li, Hanhua Xiao, Zhifeng Bao, Lambert Deng, Yanfei Dong,
- Abstract summary: NCEXPLORER is a framework designed with OLAP-like operations to enhance the news exploration experience.
NCEXPLORER empowers users to use roll-up operations for a broader content overview and drill-down operations for detailed insights.
This integration significantly augments exploration capabilities, offering a more comprehensive and efficient approach to unveiling the underlying structures and nuances embedded in news content.
- Score: 13.890931410223684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient news exploration is crucial in real-world applications, particularly within the financial sector, where numerous control and risk assessment tasks rely on the analysis of public news reports. The current processes in this domain predominantly rely on manual efforts, often involving keyword-based searches and the compilation of extensive keyword lists. In this paper, we introduce NCEXPLORER, a framework designed with OLAP-like operations to enhance the news exploration experience. NCEXPLORER empowers users to use roll-up operations for a broader content overview and drill-down operations for detailed insights. These operations are achieved through integration with external knowledge graphs (KGs), encompassing both fact-based and ontology-based structures. This integration significantly augments exploration capabilities, offering a more comprehensive and efficient approach to unveiling the underlying structures and nuances embedded in news content. Extensive empirical studies through master-qualified evaluators on Amazon Mechanical Turk demonstrate NCEXPLORER's superiority over existing state-of-the-art news search methodologies across an array of topic domains, using real-world news datasets.
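The roll-up and drill-down operations described in the abstract can be illustrated with a minimal sketch. This is not NCEXPLORER's implementation: the toy ontology, the article tags, and every function name below (`roll_up`, `drill_down`, `articles_for`) are hypothetical, chosen only to show how a concept hierarchy from a knowledge graph might support moving between broader and narrower views of a news collection.

```python
# Hypothetical toy ontology: child concept -> parent concept.
PARENT = {
    "Money laundering": "Financial crime",
    "Fraud": "Financial crime",
    "Financial crime": "Crime",
}

# Hypothetical news articles, each tagged with ontology concepts.
ARTICLES = {
    "a1": {"Money laundering"},
    "a2": {"Fraud"},
    "a3": {"Financial crime"},
}


def drill_down(concept):
    """Return the direct child concepts (a narrower view)."""
    return sorted(c for c, p in PARENT.items() if p == concept)


def roll_up(concept):
    """Return the parent concept (a broader view), or None at the root."""
    return PARENT.get(concept)


def descendants(concept):
    """Return the concept together with everything below it."""
    found = {concept}
    stack = [concept]
    while stack:
        for child in drill_down(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def articles_for(concept):
    """Return articles tagged with the concept or any of its descendants."""
    scope = descendants(concept)
    return sorted(a for a, tags in ARTICLES.items() if tags & scope)


print(roll_up("Fraud"))                 # broader concept
print(drill_down("Financial crime"))    # narrower concepts
print(articles_for("Financial crime"))  # articles under the broader concept
```

In this sketch, rolling up from "Fraud" widens the result set to every article under "Financial crime", while drilling down narrows it to one sub-concept at a time.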
Related papers
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability [60.81964498221952]
WebThinker is a deep research agent that empowers large reasoning models to autonomously search the web, navigate web pages, and draft research reports during the reasoning process.
It also employs an Autonomous Think-Search-and-Draft strategy, allowing the model to seamlessly interleave reasoning, information gathering, and report writing in real time.
Our approach enhances LRM reliability and applicability in complex scenarios, paving the way for more capable and versatile deep research systems.
arXiv Detail & Related papers (2025-04-30T16:25:25Z)
- A Comprehensive Survey on Composed Image Retrieval [54.54527281731775]
Composed Image Retrieval (CIR) is an emerging yet challenging task that allows users to search for target images using a multimodal query.
There is currently no comprehensive review of CIR to provide a timely overview of this field.
We synthesize insights from over 120 publications in top conferences and journals, including ACM TOIS, SIGIR, and CVPR.
arXiv Detail & Related papers (2025-02-19T01:37:24Z)
- RedStone: Curating General, Code, Math, and QA Data for Large Language Models [134.49774529790693]
This study explores the untapped potential of Common Crawl as a comprehensive and flexible resource for pre-training Large Language Models.
We introduce RedStone, an innovative and scalable pipeline engineered to extract and process data from Common Crawl.
arXiv Detail & Related papers (2024-12-04T15:27:39Z)
- From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents [7.41862656697588]
This study aims to analyze the various contextual elements crucial to the functioning of web navigation agents.
We focus on the influence of interaction history and web page representation.
Our work highlights improved agent performance across out-of-distribution scenarios.
arXiv Detail & Related papers (2024-10-31T01:51:41Z)
- Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [46.55831783809377]
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs).
We develop PruningRAG, a plug-and-play RAG framework that uses multi-granularity pruning strategies to more effectively incorporate relevant context and mitigate the negative impact of misleading information.
arXiv Detail & Related papers (2024-09-03T03:31:37Z)
- Fake News Detection: It's All in the Data! [0.06749750044497731]
The survey meticulously outlines the key features of datasets, various labeling systems employed, and prevalent biases that can impact model performance.
A GitHub repository consolidates publicly accessible datasets into a single, user-friendly portal.
arXiv Detail & Related papers (2024-07-02T10:12:06Z)
- Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks [12.400599440431188]
Information Extraction (IE) plays a crucial role in Natural Language Processing (NLP).
Recent experiments focusing on English IE tasks have shed light on the challenges faced by Large Language Models (LLMs) in achieving optimal performance.
arXiv Detail & Related papers (2024-06-04T08:00:40Z)
- Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application [54.984348122105516]
Large Language Models (LLMs) pretrained on massive text corpora present a promising avenue for enhancing recommender systems.
We propose an Llm-driven knowlEdge Adaptive RecommeNdation (LEARN) framework that synergizes open-world knowledge with collaborative knowledge.
arXiv Detail & Related papers (2024-05-07T04:00:30Z)
- WESE: Weak Exploration to Strong Exploitation for LLM Agents [95.6720931773781]
This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks.
WESE involves decoupling the exploration and exploitation process, employing a cost-effective weak agent to perform exploration tasks for global knowledge.
A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, enhancing the stronger agent in success rate and efficiency for the exploitation task.
arXiv Detail & Related papers (2024-04-11T03:31:54Z)
- Towards a RAG-based Summarization Agent for the Electron-Ion Collider [0.5504260452953508]
A Retrieval Augmented Generation (RAG)-based Summarization AI for EIC (RAGS4EIC) is under development.
This AI-Agent not only condenses information but also effectively references relevant responses, offering substantial advantages for collaborators.
Our project involves a two-step approach: first, querying a comprehensive vector database containing all pertinent experiment information; second, utilizing a Large Language Model (LLM) to generate concise summaries enriched with citations based on user queries and retrieved data.
arXiv Detail & Related papers (2024-03-23T05:32:46Z)
- WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many downstream tasks such as open-domain question answering (QA).
We propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on different information units of the query.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z)