Online Digital Investigative Journalism using SociaLens
- URL: http://arxiv.org/abs/2410.11890v1
- Date: Sun, 13 Oct 2024 07:20:47 GMT
- Title: Online Digital Investigative Journalism using SociaLens
- Authors: Hasan M. Jamil, Sajratul Y. Rubaiat
- Abstract summary: We introduce a versatile and autonomous investigative journalism tool, called SociaLens, for identifying and extracting query-specific data from online sources.
We envision its use in investigative journalism, law enforcement and social policy planning.
We illustrate the functionality of SociaLens using a focused case study on rape incidents in a developing country.
- Abstract: Media companies witnessed a significant transformation with the rise of the internet, big data, machine learning (ML) and AI. The recent emergence of large language models (LLMs) has added another aspect to this transformation. Researchers believe that with the help of these technologies, investigative digital journalism will enter a new era. Using a smart set of data gathering and analysis tools, journalists will be able to create data-driven content and insights in unprecedented ways. In this paper, we introduce a versatile and autonomous investigative journalism tool, called SociaLens, for identifying and extracting query-specific data from online sources, responding to probing queries and drawing conclusions entailed by large volumes of data using ML analytics fully autonomously. We envision its use in investigative journalism, law enforcement and social policy planning. The proposed system capitalizes on the integration of ML technology with LLMs and advanced big data search techniques. We illustrate the functionality of SociaLens using a focused case study on rape incidents in a developing country and demonstrate that journalists can gain nuanced insights without requiring coding expertise they might lack. SociaLens is designed as a ChatBot that is capable of contextual conversation, finding and collecting data relevant to queries, initiating ML tasks to respond to queries, and generating textual and visual reports, all fully autonomously within the ChatBot environment.
Related papers
- A Complete Survey on LLM-based AI Chatbots [46.18523139094807]
The past few decades have witnessed an upsurge in data, forming the foundation for data-hungry, learning-based AI technology.
Conversational agents, often referred to as AI chatbots, rely heavily on such data to train large language models (LLMs) and generate new content (knowledge) in response to user prompts.
This paper presents a complete survey of the evolution and deployment of LLM-based chatbots in various sectors.
arXiv Detail & Related papers (2024-06-17T09:39:34Z) - Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT [2.8000537365271367]
Large language models (LLMs) have emerged as a vibrant research topic.
LLMs face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations.
This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources.
arXiv Detail & Related papers (2024-04-14T16:34:31Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - AutoConv: Automatically Generating Information-seeking Conversations
with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z) - ChatGPT as your Personal Data Scientist [0.9689893038619583]
This paper introduces a ChatGPT-based conversational data-science framework to act as a "personal data scientist".
Our model pivots around four dialogue states: Data visualization, Task Formulation, Prediction Engineering, and Result Summary and Recommendation.
In summary, we developed an end-to-end system that not only proves the viability of the novel concept of conversational data science but also underscores the potency of LLMs in solving complex tasks.
arXiv Detail & Related papers (2023-05-23T04:00:16Z) - A Vision for Semantically Enriched Data Science [19.604667287258724]
Key areas such as utilizing domain knowledge and data semantics are areas where we have seen little automation.
We envision how leveraging "semantic" understanding and reasoning on data in combination with novel tools for data science automation can help with consistent and explainable data augmentation and transformation.
arXiv Detail & Related papers (2023-03-02T16:03:12Z) - A Survey of Machine Unlearning [56.017968863854186]
Recent regulations now require that, on request, private information about a user must be removed from computer systems.
ML models often 'remember' the old data.
Recent works on machine unlearning have not been able to completely solve the problem.
arXiv Detail & Related papers (2022-09-06T08:51:53Z) - Text Mining for Processing Interview Data in Computational Social Science [0.6820436130599382]
We use commercially available text analysis technology to process interview text data from a computational social science study.
We find that topical clustering and terminological enrichment provide for convenient exploration and quantification of the responses.
We encourage studies in social science to use text analysis, especially for exploratory open-ended studies.
arXiv Detail & Related papers (2020-11-28T00:44:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.