Leaking Sensitive Financial Accounting Data in Plain Sight using Deep
Autoencoder Neural Networks
- URL: http://arxiv.org/abs/2012.07110v1
- Date: Sun, 13 Dec 2020 17:29:53 GMT
- Title: Leaking Sensitive Financial Accounting Data in Plain Sight using Deep
Autoencoder Neural Networks
- Authors: Marco Schreyer, Christian Schulze, Damian Borth
- Abstract summary: We introduce a real-world `threat model' designed to leak sensitive accounting data.
We show that a deep steganographic process, constituted by three neural networks, can be trained to hide such data in unobtrusive `day-to-day' images.
- Score: 1.9659095632676094
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Nowadays, organizations collect vast quantities of sensitive information in
`Enterprise Resource Planning' (ERP) systems, such as accounting relevant
transactions, customer master data, or strategic sales price information. The
leakage of such information poses a severe threat to companies as the number
of incidents and the reputational damage to those experiencing them continue to
increase. At the same time, discoveries in deep learning research revealed that
machine learning models could be maliciously misused to create new attack
vectors. Understanding the nature of such attacks becomes increasingly
important for the (internal) audit and fraud examination practice. Such
awareness is particularly relevant for fraudulent data leakage that uses
deep learning-based steganographic techniques, which might remain undetected by
state-of-the-art `Computer Assisted Audit Techniques' (CAATs). In this work, we
introduce a real-world `threat model' designed to leak sensitive accounting
data. In addition, we show that a deep steganographic process, constituted by
three neural networks, can be trained to hide such data in unobtrusive
`day-to-day' images. Finally, we provide qualitative and quantitative
evaluations on two publicly available real-world payment datasets.
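The paper's scheme trains three neural networks (a preparation, a hiding, and a reveal network) to embed records in images. As an illustration of the underlying idea — payload bits riding inside the pixel data of a visually unchanged image — here is a minimal classical least-significant-bit (LSB) sketch in NumPy. It is not the paper's learned method, and the payload string is a made-up accounting record:

```python
import numpy as np

def hide_lsb(cover: np.ndarray, payload: bytes) -> np.ndarray:
    """Embed payload bytes into the least-significant bits of a uint8 image."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = cover.flatten()
    if bits.size > flat.size:
        raise ValueError("payload too large for cover image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite LSBs only
    return flat.reshape(cover.shape)

def reveal_lsb(stego: np.ndarray, n_bytes: int) -> bytes:
    """Recover n_bytes of payload from the image's least-significant bits."""
    bits = stego.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in 'day-to-day' image
secret = b"vendor=ACME;price=19.99"  # hypothetical accounting record
stego = hide_lsb(cover, secret)
recovered = reveal_lsb(stego, len(secret))
max_pixel_change = int(np.abs(stego.astype(int) - cover.astype(int)).max())
```

Each pixel changes by at most 1 intensity level, so the stego image is visually indistinguishable from the cover — the property that makes such channels hard for rule-based audit tooling to spot.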
Related papers
- Releasing Malevolence from Benevolence: The Menace of Benign Data on Machine Unlearning [28.35038726318893]
Machine learning models trained on vast amounts of real or synthetic data often achieve outstanding predictive performance across various domains.
To address privacy concerns, machine unlearning has been proposed to erase specific data samples from models.
We introduce the Unlearning Usability Attack to distill data distribution information into a small set of benign data.
arXiv Detail & Related papers (2024-07-06T15:42:28Z)
- A Customer Level Fraudulent Activity Detection Benchmark for Enhancing Machine Learning Model Research and Evaluation [0.4681661603096334]
This study introduces a benchmark that contains structured datasets specifically designed for customer-level fraud detection.
The benchmark not only adheres to strict privacy guidelines to ensure user confidentiality but also provides a rich source of information by encapsulating customer-centric features.
arXiv Detail & Related papers (2024-04-23T04:57:44Z)
- Root causes, ongoing difficulties, proactive prevention techniques, and emerging trends of enterprise data breaches [0.0]
Businesses now consider data to be a crucial asset, and any breach of this data can have dire repercussions.
Enterprises now place a high premium on detecting and preventing data loss due to the growing amount of data and the increasing frequency of data breaches.
This review highlights promising research directions and offers insight into the risks that businesses face from data leaks.
arXiv Detail & Related papers (2023-11-27T20:34:10Z)
- Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market [0.0]
This research article aims to evaluate the efficacy of a deep semi-supervised anomaly detection technique, called Deep SAD, for detecting fraud in high-frequency financial data.
We use proprietary limit order book data from the TMX exchange in Montréal, with a small set of true labeled instances of fraud, to evaluate Deep SAD.
We show that incorporating a small amount of labeled data into an unsupervised anomaly detection framework can greatly improve its accuracy.
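Deep SAD's core idea is a hypersphere objective: embeddings of unlabeled (mostly normal) points are pulled toward a center, while embeddings of known anomalies are pushed away via an inverse-distance term. A minimal NumPy sketch of that loss, assuming the embeddings were already produced by some encoder (all names here are illustrative, not from the authors' code):

```python
import numpy as np

def deep_sad_loss(z: np.ndarray, y: np.ndarray, c: np.ndarray, eta: float = 1.0) -> float:
    """Deep SAD objective on embedded points z.

    y = 0: unlabeled (assumed mostly normal) -> minimize squared distance to center c.
    y = 1: labeled anomaly                   -> minimize inverse squared distance,
                                               i.e. push the point away from c.
    """
    d2 = np.sum((z - c) ** 2, axis=1) + 1e-6  # squared distances to the center
    per_point = np.where(y == 1, eta / d2, d2)
    return float(per_point.mean())

rng = np.random.default_rng(1)
c = np.zeros(2)
normal = rng.normal(0.0, 0.1, size=(50, 2))   # unlabeled points, clustered near c
anomaly = rng.normal(3.0, 0.1, size=(5, 2))   # labeled frauds, far from c
z = np.vstack([normal, anomaly])
y = np.array([0] * 50 + [1] * 5)
loss = deep_sad_loss(z, y, c)
```

This is what lets a small amount of labeled fraud reshape an otherwise unsupervised objective: the handful of y = 1 points carry an explicit repulsive term.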
arXiv Detail & Related papers (2023-08-31T19:07:50Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
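The window-based weak supervision can be sketched without LogLAB's attention model: given failure time windows estimated by monitoring systems, mark every log message whose timestamp falls inside a window as anomalous (a simplified stand-in for the paper's approach; the data below is invented):

```python
from datetime import datetime, timedelta

def label_logs(logs, failure_windows):
    """Retrospectively label log messages: 1 if the timestamp falls inside
    any estimated failure time window reported by monitoring, else 0."""
    return [
        int(any(start <= ts <= end for start, end in failure_windows))
        for ts, _msg in logs
    ]

t0 = datetime(2021, 11, 2, 15, 0)
logs = [(t0 + timedelta(seconds=s), f"msg {s}") for s in range(0, 50, 10)]
windows = [(t0 + timedelta(seconds=15), t0 + timedelta(seconds=35))]
labels = label_logs(logs, windows)  # messages at t+20 and t+30 fall inside the window
```

The wider the estimated window, the noisier these labels get — which is why the paper's reported F1 above 0.98 at large windows is the notable result.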
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
- Insights into Data through Model Behaviour: An Explainability-driven Strategy for Data Auditing for Responsible Computer Vision Applications [70.92379567261304]
This study explores an explainability-driven strategy to data auditing.
We demonstrate this strategy by auditing two popular medical benchmark datasets.
We discover hidden data quality issues that lead deep learning models to make predictions for the wrong reasons.
arXiv Detail & Related papers (2021-06-16T23:46:39Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Adversarial Attacks on Machine Learning Systems for High-Frequency Trading [55.30403936506338]
We study valuation models for algorithmic trading from the perspective of adversarial machine learning.
We introduce new attacks specific to this domain with size constraints that minimize attack costs.
We discuss how these attacks can be used as an analysis tool to study and evaluate the robustness properties of financial models.
arXiv Detail & Related papers (2020-02-21T22:04:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.