Dataset of News Articles with Provenance Metadata for Media Relevance Assessment
- URL: http://arxiv.org/abs/2506.09847v1
- Date: Wed, 11 Jun 2025 15:21:05 GMT
- Title: Dataset of News Articles with Provenance Metadata for Media Relevance Assessment
- Authors: Tomas Peterka, Matyas Bohacek
- Abstract summary: Out-of-context and misattributed imagery is the leading form of media manipulation in today's misinformation and disinformation landscape. We introduce the News Media Provenance Dataset, a dataset of news articles with provenance-tagged images. We formulate two tasks on this dataset, location of origin relevance (LOR) and date and time of origin relevance (DTOR), and present baseline results on six large language models (LLMs). We find that, while the zero-shot performance on LOR is promising, the performance on DTOR lags behind, leaving room for specialized architectures and future work.
- Score: 0.7366405857677227
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Out-of-context and misattributed imagery is the leading form of media manipulation in today's misinformation and disinformation landscape. Existing methods for detecting this practice often consider only whether the semantics of the imagery correspond to the text narrative, and so miss manipulation whenever the depicted objects or scenes loosely match the narrative at hand. To tackle this, we introduce the News Media Provenance Dataset, a dataset of news articles with provenance-tagged images. We formulate two tasks on this dataset, location of origin relevance (LOR) and date and time of origin relevance (DTOR), and present baseline results on six large language models (LLMs). We find that, while the zero-shot performance on LOR is promising, the performance on DTOR lags behind, leaving room for specialized architectures and future work.
Related papers
- Verifying Cross-modal Entity Consistency in News using Vision-language Models [6.870504041093726]
The identification of inconsistent cross-modal information is critical to detect disinformation. We propose a framework for validating entity consistency between images and text in news articles. Our results show the potential of LVLMs for automating cross-modal entity verification.
arXiv Detail & Related papers (2025-01-20T11:06:05Z)
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Exploring Saliency Bias in Manipulation Detection [2.156234249946792]
The social-media-fuelled explosion of fake news and misinformation supported by tampered images has led to growth in the development of models and datasets for image manipulation detection. Existing detection methods mostly treat media objects in isolation, without considering the impact of specific manipulations on viewer perception. We propose a framework to analyze the trends of visual and semantic saliency in popular image manipulation datasets and their impact on detection.
arXiv Detail & Related papers (2024-02-12T00:08:51Z)
- Prompt me a Dataset: An investigation of text-image prompting for historical image dataset creation using foundation models [0.9065034043031668]
We present a pipeline for image extraction from historical documents using foundation models.
We evaluate text-image prompts and their effectiveness on humanities datasets of varying levels of complexity.
arXiv Detail & Related papers (2023-09-04T15:37:03Z)
- Visually-Aware Context Modeling for News Image Captioning [54.31708859631821]
News Image Captioning aims to create captions from news articles and images.
We propose a face-naming module for learning better name embeddings.
We use CLIP to retrieve sentences that are semantically close to the image.
arXiv Detail & Related papers (2023-08-16T12:39:39Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information. To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles. Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- AutoSplice: A Text-prompt Manipulated Image Dataset for Media Forensics [31.714342131823987]
This paper aims to investigate the level of challenge that language-image generation models pose to media forensics.
We propose a new approach that leverages the DALL-E2 language-image model to automatically generate and splice masked regions guided by a text prompt.
This approach has resulted in the creation of a new image dataset called AutoSplice, containing 5,894 manipulated and authentic images.
arXiv Detail & Related papers (2023-04-14T00:14:08Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Focus! Relevant and Sufficient Context Selection for News Image Captioning [69.36678144800936]
News Image Captioning requires describing an image by leveraging additional context from a news article.
We propose to use the pre-trained vision and language retrieval model CLIP to localize the visually grounded entities in the news article.
Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models.
arXiv Detail & Related papers (2022-12-01T20:00:27Z)
- News Article Retrieval in Context for Event-centric Narrative Creation [45.50837121213255]
Given an incomplete narrative, we aim to retrieve news articles that discuss relevant events that would enable the continuation of the narrative.
Experiments show that state-of-the-art lexical and semantic rankers are not sufficient for this task.
We show that combining those with a ranker that ranks articles by reverse chronological order outperforms those rankers alone.
arXiv Detail & Related papers (2021-06-30T13:27:54Z)
- DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video [98.54696229182335]
We study the task of temporal moment localization in a long untrimmed video using a natural language query.
Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm.
A temporal sub-graph captures the activities within the video through time.
arXiv Detail & Related papers (2020-10-13T09:50:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.