Pattern Recognition of Scrap Plastic Misclassification in Global Trade Data
- URL: http://arxiv.org/abs/2511.08638v1
- Date: Thu, 13 Nov 2025 01:01:16 GMT
- Title: Pattern Recognition of Scrap Plastic Misclassification in Global Trade Data
- Authors: Muhammad Sukri Bin Ramli
- Abstract summary: Our system analyzes trade data to find a novel inverse price-volume signature, a pattern where reported volumes increase as average unit prices decrease. The model achieves 0.9375 accuracy and was validated by comparing large-scale UN data with detailed firm-level data. This scalable tool provides customs authorities with a transparent, data-driven method to shift from conventional to priority-based inspection protocols.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an interpretable machine learning framework to help identify trade data discrepancies that are challenging to detect with traditional methods. Our system analyzes trade data to find a novel inverse price-volume signature, a pattern where reported volumes increase as average unit prices decrease. The model achieves 0.9375 accuracy and was validated by comparing large-scale UN data with detailed firm-level data, confirming that the risk signatures are consistent. This scalable tool provides customs authorities with a transparent, data-driven method to shift from conventional to priority-based inspection protocols, translating complex data into actionable intelligence to support international environmental policies.
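The inverse price-volume signature described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: it assumes that a plain Pearson correlation between reported volume and average unit price is an acceptable proxy for the pattern, and the function names, the -0.5 cutoff, and the toy figures are all hypothetical.

```python
from math import sqrt


def unit_prices(values_usd, quantities_kg):
    """Average unit price (USD per kg) for each reporting period."""
    return [v / q for v, q in zip(values_usd, quantities_kg)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def has_inverse_signature(values_usd, quantities_kg, threshold=-0.5):
    """Flag a trade flow whose volume rises while its unit price falls.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    prices = unit_prices(values_usd, quantities_kg)
    return pearson(quantities_kg, prices) <= threshold


# Hypothetical toy series: volume grows while value grows much more slowly,
# so the average unit price declines from period to period.
quantities = [100.0, 150.0, 220.0, 300.0]
values = [50.0, 60.0, 66.0, 75.0]
flagged = has_inverse_signature(values, quantities)
```

A flow flagged this way would only be a candidate for priority inspection; a correlation on its own does not establish misclassification.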
Related papers
- Puzzle it Out: Local-to-Global World Model for Offline Multi-Agent Reinforcement Learning [22.038062200642162]
Offline multi-agent reinforcement learning (MARL) aims to solve cooperative decision-making problems in multi-agent systems using pre-collected datasets. We introduce an uncertainty-aware sampling mechanism that adaptively weights synthetic data by prediction uncertainty, reducing approximation error propagation to policies.
arXiv Detail & Related papers (2026-01-12T12:17:11Z)
- Sell Data to AI Algorithms Without Revealing It: Secure Data Valuation and Sharing via Homomorphic Encryption [10.12846924939717]
We introduce the Trustworthy Influence Protocol (TIP), a privacy-preserving framework that enables buyers to quantify the utility of external data without decrypting the raw assets. By integrating Homomorphic Encryption with gradient-based influence functions, our approach allows for the precise, blinded scoring of data points against a buyer's specific AI model. Empirical simulations in healthcare and generative AI domains validate the framework's economic potential.
arXiv Detail & Related papers (2025-12-04T16:35:09Z)
- Semi-Supervised Federated Learning via Dual Contrastive Learning and Soft Labeling for Intelligent Fault Diagnosis [30.60728200709919]
This paper proposes a semi-supervised federated learning framework, SSFL-DCSL. It integrates dual contrastive loss and soft labeling to address data and label scarcity for distributed clients. It can improve accuracy by 1.15% to 7.85% over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-12T10:54:23Z)
- Robust Molecular Property Prediction via Densifying Scarce Labeled Data [53.24886143129006]
In drug discovery, compounds most critical for advancing research often lie beyond the training set. We propose a novel bilevel optimization approach that leverages unlabeled data to interpolate between in-distribution (ID) and out-of-distribution (OOD) data.
arXiv Detail & Related papers (2025-06-13T15:27:40Z)
- Optimizing Product Provenance Verification using Data Valuation Methods [24.59951827145763]
We introduce a novel data valuation framework designed to enhance the selection and utilization of training data for machine learning models applied in Stable Isotope Ratio Analysis (SIRA). We validate our methodology with extensive experiments, demonstrating its potential to significantly enhance provenance verification, mitigate fraudulent trade practices, and strengthen regulatory enforcement of global supply chains.
arXiv Detail & Related papers (2025-02-21T03:16:19Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- Harnessing Administrative Data Inventories to Create a Reliable Transnational Reference Database for Crop Type Monitoring [0.0]
We showcase EuroCrops, a reference dataset for crop type classification that aggregates and harmonizes administrative data surveyed in different countries with the goal of transnational interoperability.
arXiv Detail & Related papers (2023-10-10T07:57:00Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- CDKT-FL: Cross-Device Knowledge Transfer using Proxy Dataset in Federated Learning [27.84845136697669]
We develop a novel knowledge distillation-based approach to study the extent of knowledge transfer between the global model and local models.
We show the proposed method achieves significant speedups and high personalized performance of local models.
arXiv Detail & Related papers (2022-04-04T14:49:19Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)