Scrutinizing Shipment Records To Thwart Illegal Timber Trade
- URL: http://arxiv.org/abs/2208.00493v1
- Date: Sun, 31 Jul 2022 18:54:52 GMT
- Title: Scrutinizing Shipment Records To Thwart Illegal Timber Trade
- Authors: Debanjan Datta, Sathappan Muthiah, John Simeone, Amelia Meadows, Naren Ramakrishnan
- Abstract summary: Grey and black market activities in the wood and forest products sector are not limited to the countries where the wood was harvested, but extend throughout the global supply chain.
Existing approaches suffer from certain shortcomings in their applicability to large-scale trade data.
We propose Contrastive Learning based Heterogeneous Anomaly Detection (CHAD), which is generally applicable to large-scale heterogeneous data.
- Score: 14.559268536152926
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Timber and forest products made from wood, like furniture, are valuable
commodities, and like the global trade of many highly-valued natural resources,
face challenges of corruption, fraud, and illegal harvesting. These grey and
black market activities in the wood and forest products sector are not limited
to the countries where the wood was harvested, but extend throughout the global
supply chain and have been tied to illicit financial flows, like trade-based
money laundering, document fraud, species mislabeling, and other illegal
activities. The task of finding such fraudulent activities using trade data, in
the absence of ground truth, can be modelled as an unsupervised anomaly
detection problem. However, existing approaches suffer from certain shortcomings
in their applicability to large-scale trade data. Trade data is
heterogeneous, with both categorical and numerical attributes in a tabular
format. The overall challenge lies in the complexity, volume and velocity of
data, with a large number of entities and a lack of ground truth labels. To
mitigate these, we propose a novel unsupervised anomaly detection method,
Contrastive Learning based Heterogeneous Anomaly Detection (CHAD), that is
generally applicable to large-scale heterogeneous tabular data. We demonstrate
that our model, CHAD, performs favorably against multiple comparable baselines on
public benchmark datasets, and outperforms them in the case of trade data. More
importantly, we demonstrate that our approach reduces the assumptions and effort
required for hyperparameter tuning, which is a key challenge in an
unsupervised training paradigm. Specifically, our overarching objective
pertains to detecting suspicious timber shipments and patterns using Bill of
Lading trade record data. Detecting anomalous transactions in shipment records
can enable further investigation by government agencies and supply chain
constituents.
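To make the problem setting concrete, the following is a minimal sketch of unsupervised anomaly scoring on heterogeneous (categorical plus numerical) tabular records. It is not CHAD and not the authors' method: it is a simple rarity-plus-robust-deviation baseline, swapped in purely for illustration. The field names (`origin`, `hs_code`, `weight_kg`), the sample records, and the function `score_records` are all hypothetical.

```python
import math
from collections import Counter
from statistics import median


def score_records(records, categorical, numerical):
    """Return one anomaly score per record; higher means more unusual.

    Illustrative baseline only (NOT the CHAD model from the paper):
    categorical fields are scored by rarity, numerical fields by a
    robust z-score based on the median absolute deviation (MAD).
    """
    n = len(records)
    # Frequency of each observed value, per categorical field.
    counts = {f: Counter(r[f] for r in records) for f in categorical}
    # Median and MAD per numerical field.
    stats = {}
    for f in numerical:
        values = [r[f] for r in records]
        med = median(values)
        mad = median([abs(v - med) for v in values])
        stats[f] = (med, mad if mad > 0 else 1.0)  # avoid division by zero

    scores = []
    for r in records:
        # Rare categorical values contribute a large -log(relative frequency).
        cat = sum(-math.log(counts[f][r[f]] / n) for f in categorical)
        # Values far from the median contribute a large robust z-score.
        num = sum(abs(r[f] - stats[f][0]) / stats[f][1] for f in numerical)
        scores.append(cat + num)
    return scores


# Hypothetical shipment records: five similar ones and one outlier (index 5).
records = [
    {"origin": "BR", "hs_code": "4407", "weight_kg": 1000.0},
    {"origin": "BR", "hs_code": "4407", "weight_kg": 1100.0},
    {"origin": "BR", "hs_code": "4407", "weight_kg": 900.0},
    {"origin": "BR", "hs_code": "4407", "weight_kg": 1050.0},
    {"origin": "BR", "hs_code": "4407", "weight_kg": 950.0},
    {"origin": "XX", "hs_code": "9403", "weight_kg": 9000.0},
]
scores = score_records(records, ["origin", "hs_code"], ["weight_kg"])

# Rank records from most to least suspicious.
ranking = sorted(range(len(records)), key=lambda i: scores[i], reverse=True)
print(ranking[0], records[ranking[0]])
```

A baseline like this treats every attribute independently, so it cannot flag records whose individual values are common but whose combination is unusual. That limitation is one reason learned approaches for heterogeneous data, such as the one the abstract proposes, are of interest.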
Related papers
- Evaluating Fairness in Transaction Fraud Models: Fairness Metrics, Bias Audits, and Challenges [3.499319293058353]
Despite extensive research on algorithmic fairness, there is a notable gap in the study of bias in fraud detection models.
These challenges include the need for fairness metrics that account for fraud data's imbalanced nature and the tradeoff between fraud protection and service quality.
We present a comprehensive fairness evaluation of transaction fraud models using public synthetic datasets.
arXiv Detail & Related papers (2024-09-06T16:08:27Z)
- Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods.
We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results.
We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms.
We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation.
Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
arXiv Detail & Related papers (2023-10-22T16:07:06Z)
- Deep Semi-Supervised Anomaly Detection for Finding Fraud in the Futures Market [0.0]
This research article aims to evaluate the efficacy of a deep semi-supervised anomaly detection technique, called Deep SAD, for detecting fraud in high-frequency financial data.
We use exclusive proprietary limit order book data from the TMX exchange in Montréal, with a small set of true labeled instances of fraud, to evaluate Deep SAD.
We show that incorporating a small amount of labeled data into an unsupervised anomaly detection framework can greatly improve its accuracy.
arXiv Detail & Related papers (2023-08-31T19:07:50Z)
- Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z)
- Toward the Automated Construction of Probabilistic Knowledge Graphs for the Maritime Domain [60.76554773885988]
International maritime crime is becoming increasingly sophisticated, often associated with wider criminal networks.
This has led to research and development efforts aimed at combining hard data with other types of data.
We propose Maritime DeepDive, an initial prototype for the automated construction of probabilistic knowledge graphs.
arXiv Detail & Related papers (2023-05-04T00:24:30Z)
- A machine learning model to identify corruption in México's public procurement contracts [0.0]
This paper proposes a machine learning model to identify and predict corrupt contracts in México's public procurement data.
We found that the most critical predictors considered in the model are those related to the relationship between buyers and suppliers.
Our work presents a tool that can help in the decision-making process to identify, predict and analyze corruption in public procurement contracts.
arXiv Detail & Related papers (2022-10-25T01:22:41Z)
- Customs Import Declaration Datasets [12.306592823750385]
We introduce an import declaration dataset to facilitate the collaboration between domain experts in customs administrations and researchers from diverse domains.
The dataset contains 54,000 artificially generated trades with 22 key attributes.
We empirically show that more advanced algorithms can better detect fraud.
arXiv Detail & Related papers (2022-08-04T06:20:20Z)
- Towards Real-World Prohibited Item Detection: A Large-Scale X-ray Benchmark [53.9819155669618]
This paper presents a large-scale dataset, named PIDray, which covers various cases in real-world scenarios for prohibited item detection.
With an intensive amount of effort, our dataset contains 12 categories of prohibited items in 47,677 X-ray images with high-quality annotated segmentation masks and bounding boxes.
The proposed method performs favorably against the state-of-the-art methods, especially for detecting the deliberately hidden items.
arXiv Detail & Related papers (2021-08-16T11:14:16Z)
- Detecting Anomalies Through Contrast in Heterogeneous Data [21.56932906044264]
We propose Contrastive Learning based Heterogeneous Anomaly Detector to address shortcomings of prior models.
Our model uses an asymmetric autoencoder that can effectively handle large arity categorical variables.
We provide a qualitative study to showcase the effectiveness of our model in detecting anomalies in timber trade.
arXiv Detail & Related papers (2021-04-02T17:21:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.