An Innovative Tool for Uploading/Scraping Large Image Datasets on Social
Networks
- URL: http://arxiv.org/abs/2311.09237v1
- Date: Wed, 1 Nov 2023 23:27:37 GMT
- Title: An Innovative Tool for Uploading/Scraping Large Image Datasets on Social
Networks
- Authors: Nicolò Fabio Arceri, Oliver Giudice, Sebastiano Battiato
- Abstract summary: We propose an automated approach based on a digital tool that we built for this purpose.
The tool is capable of automatically uploading an entire image dataset to the desired digital platform and then downloading all the uploaded pictures.
- Score: 9.27070946719462
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Nowadays, people can retrieve and share digital information
through well-known digital platforms with increasing ease and speed. This
includes sensitive data, inappropriate or illegal content, and, in general,
information that might serve as probative evidence in court. Consequently, to
address forensic issues, we need to figure out how to trace the posting chain
of a piece of digital evidence (e.g., a picture or an audio file) across the
platforms involved; this is what Digital (also Forensics) Ballistics basically
deals with. With the arrival of Machine Learning as a tool of the trade in many
research areas, the need for vast amounts of data has increased dramatically
over the last few years. However, collecting or simply finding the "right"
datasets that properly enable data-driven research can turn out to be
nontrivial, if not extremely challenging, especially when it comes to highly
specialized tasks, such as creating datasets to be analyzed to detect the
source platform of a given piece of digital media. In this paper we propose an
automated approach based on a digital tool that we built for this purpose. The
tool is capable of automatically uploading an entire image dataset to the
desired digital platform and then downloading all the uploaded pictures, thus
shortening the overall time required to produce the final dataset to be
analyzed.
Related papers
- Harnessing the Power of Text-image Contrastive Models for Automatic
Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model to explore contrastive learning in the domain of misinformation identification.
Our model shows the superior performance of non-matched image-text pair detection when the training data is insufficient.
arXiv Detail & Related papers (2023-04-19T02:53:59Z)
- Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for
Longer-Range Object Tracking Applications [3.9776693020673677]
Digital twin is a problem of augmenting real objects with their digital counterparts.
A critical component in a good digital-twin system is real-time, accurate 3D object tracking.
In this work, we create a novel RGB-D dataset, called Digital Twin Tracking dataset (DTTD).
arXiv Detail & Related papers (2023-02-12T20:06:07Z)
- Fighting Malicious Media Data: A Survey on Tampering Detection and
Deepfake Detection [115.83992775004043]
Recent advances in deep learning, particularly deep generative models, open the doors for producing perceptually convincing images and videos at a low cost.
This paper provides a comprehensive review of the current media tampering detection approaches, and discusses the challenges and trends in this field for future research.
arXiv Detail & Related papers (2022-12-12T02:54:08Z)
- Synthetic Data for Object Classification in Industrial Applications [53.180678723280145]
In object classification, capturing a large number of images per object and in different conditions is not always possible.
This work explores the creation of artificial images using a game engine to cope with limited data in the training dataset.
arXiv Detail & Related papers (2022-12-09T11:43:04Z)
- Comprehensive Dataset of Face Manipulations for Development and
Evaluation of Forensic Tools [0.6091702876917281]
We create a challenge dataset of edited facial images to assist the research community in developing novel approaches to address and classify the authenticity of digital media.
The goals of our dataset are to address the following challenge questions: (1) Can we determine the authenticity of a given image (edit detection)?
Our hope is that our prepared evaluation protocol will assist researchers in improving the state-of-the-art in image forensics as they pertain to these challenges.
arXiv Detail & Related papers (2022-08-24T21:17:28Z)
- CapillaryX: A Software Design Pattern for Analyzing Medical Images in
Real-time using Deep Learning [0.688204255655161]
This paper provides a computing architecture that locally and in parallel can analyze medical images in real-time.
We focus on a specific medical-industrial case study, namely the quantifying of blood vessels in microcirculation images.
Our results show that our system is approximately 78% faster than its serial system counterpart and 12% faster than a master-slave parallel system architecture.
arXiv Detail & Related papers (2022-04-13T18:47:04Z)
- Automatic Image Content Extraction: Operationalizing Machine Learning in
Humanistic Photographic Studies of Large Visual Archives [81.88384269259706]
We introduce Automatic Image Content Extraction framework for machine learning-based search and analysis of large image archives.
The proposed framework can be applied in several domains in humanities and social sciences.
arXiv Detail & Related papers (2022-04-05T12:19:24Z)
- Automated Artefact Relevancy Determination from Artefact Metadata and
Associated Timeline Events [7.219077740523683]
Case-hindering, multi-year digital forensic evidence backlogs have become commonplace in law enforcement agencies throughout the world.
This is due to an ever-growing number of cases requiring digital forensic investigation coupled with the growing volume of data to be processed per case.
Leveraging previously processed digital forensic cases and their component artefact relevancy classifications can facilitate an opportunity for training automated artificial intelligence based evidence processing systems.
arXiv Detail & Related papers (2020-12-02T14:14:26Z)
- Deep Traffic Sign Detection and Recognition Without Target Domain Real
Images [52.079665469286496]
We propose a novel database generation method that requires only (i) arbitrary natural images, i.e., no real image from the target domain, and (ii) templates of the traffic signs.
The method does not aim to outperform training with real data, but to be a compatible alternative when real data is not available.
On large data sets, training with a fully synthetic data set almost matches the performance of training with a real one.
arXiv Detail & Related papers (2020-07-30T21:06:47Z)
- From ImageNet to Image Classification: Contextualizing Progress on
Benchmarks [99.19183528305598]
We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset.
Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for.
arXiv Detail & Related papers (2020-05-22T17:39:16Z)
- A Feature Comparison of Modern Digital Forensic Imaging Software [0.0]
Fundamental processes in digital forensic investigation, such as disk imaging, were developed when digital investigation was relatively young.
We show the weakness in current digital investigation fundamental software development and maintenance over time.
arXiv Detail & Related papers (2020-01-02T02:42:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.