Neurosymbolic Feature Extraction for Identifying Forced Labor in Supply Chains
- URL: http://arxiv.org/abs/2507.07217v1
- Date: Wed, 09 Jul 2025 18:44:48 GMT
- Title: Neurosymbolic Feature Extraction for Identifying Forced Labor in Supply Chains
- Authors: Zili Wang, Frank Montabon, Kristin Yvonne Rozier
- Abstract summary: Illicit supply chains are characterized by very sparse data. We propose a question tree approach for querying a large language model (LLM) to identify and quantify the relevance of articles. This enables a systematic evaluation of the differences between human and machine classification of news articles related to forced labor in supply chains.
- Score: 3.938057685137866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supply chain networks are complex systems that are challenging to analyze; this problem is exacerbated when there are illicit activities involved in the supply chain, such as counterfeit parts, forced labor, or human trafficking. While machine learning (ML) can find patterns in complex systems like supply chains, traditional ML techniques require large training data sets. However, illicit supply chains are characterized by very sparse data, and the data that is available is often (purposely) corrupted or unreliable in order to hide the nature of the activities. We need to be able to automatically detect new patterns that correlate with such illegal activity over complex, even temporal data, without requiring large training data sets. We explore neurosymbolic methods for identifying instances of illicit activity in supply chains and compare the effectiveness of manual and automated feature extraction from news articles accurately describing illicit activities uncovered by authorities. We propose a question tree approach for querying a large language model (LLM) to identify and quantify the relevance of articles. This enables a systematic evaluation of the differences between human and machine classification of news articles related to forced labor in supply chains.
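The question tree idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tree shape, the questions, the weights, the scoring rule (sum of weights over "yes" answers, descending only after a "yes"), and every name below (`QuestionNode`, `score_article`, `keyword_ask`) are assumptions made for illustration. The LLM call is replaced by a keyword-matching stand-in so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: takes (question, article_text), returns True for "yes".
# In practice this would wrap a call to an LLM; here it is left pluggable.
AskFn = Callable[[str, str], bool]


@dataclass
class QuestionNode:
    """One yes/no question; children are asked only if the answer is yes."""
    question: str
    weight: float = 1.0  # assumed contribution of a "yes" to the relevance score
    children: List["QuestionNode"] = field(default_factory=list)


def score_article(node: QuestionNode, article: str, ask: AskFn) -> float:
    """Sum the weights of all questions answered yes, descending only on yes."""
    if not ask(node.question, article):
        return 0.0
    return node.weight + sum(score_article(c, article, ask) for c in node.children)


def keyword_ask(question: str, article: str) -> bool:
    """Stand-in for an LLM: answers yes if the bracketed keyword occurs in the article."""
    keyword = question[question.index("[") + 1 : question.index("]")]
    return keyword.lower() in article.lower()


# Illustrative tree; the questions and weights are invented for this sketch.
tree = QuestionNode(
    "Does the article mention [forced labor]?",
    weight=1.0,
    children=[
        QuestionNode("Does it name a [supplier]?", weight=0.5),
        QuestionNode("Does it report action by [authorities]?", weight=0.5),
    ],
)

article = "Authorities found forced labor at a garment supplier."
relevance = score_article(tree, article, keyword_ask)
print(relevance)
```

Because the follow-up questions are only asked after a "yes" at the parent, an article that fails the root question scores zero without any further queries, which keeps the number of model calls small for irrelevant articles.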
Related papers
- Feature Selection for Network Intrusion Detection [3.7414804164475983]
We present a novel information-theoretic method that facilitates the exclusion of non-informative features when detecting network intrusions.
The proposed method is based on function approximation using a neural network, which enables a version of our approach that incorporates a recurrent layer.
arXiv Detail & Related papers (2024-11-18T14:25:55Z) - Enhancing Supply Chain Visibility with Knowledge Graphs and Large Language Models [49.898152180805454]
This paper presents a novel framework leveraging Knowledge Graphs (KGs) and Large Language Models (LLMs) to enhance supply chain visibility.
Our zero-shot, LLM-driven approach automates the extraction of supply chain information from diverse public sources.
With high accuracy in NER and RE tasks, it provides an effective tool for understanding complex, multi-tiered supply networks.
arXiv Detail & Related papers (2024-08-05T17:11:29Z) - Identifying contributors to supply chain outcomes in a multi-echelon setting: a decentralised approach [47.00450933765504]
We propose the use of explainable artificial intelligence for decentralised computing of estimated contributions to a metric of interest.
This approach mitigates the need to convince supply chain actors to share data, as all computations occur in a decentralised manner.
Results demonstrate the effectiveness of our approach in detecting the source of quality variations compared to a centralised approach.
arXiv Detail & Related papers (2023-07-22T20:03:16Z) - Scrutinizing Shipment Records To Thwart Illegal Timber Trade [14.559268536152926]
Grey and black market activities in the wood and forest products sector are not limited to the countries where the wood was harvested, but extend throughout the global supply chain.
Existing approaches suffer from certain shortcomings in their applicability towards large scale trade data.
We propose Contrastive Learning based Heterogeneous Anomaly Detection (CHAD) that is generally applicable for large-scale heterogeneous data.
arXiv Detail & Related papers (2022-07-31T18:54:52Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - Unsupervised Abnormal Traffic Detection through Topological Flow Analysis [1.933681537640272]
The topological connectivity component of a malicious flow is less exploited.
We present a simple method that facilitates the use of connectivity graph features in unsupervised anomaly detection algorithms.
arXiv Detail & Related papers (2022-05-14T18:52:49Z) - Data Considerations in Graph Representation Learning for Supply Chain Networks [64.72135325074963]
We present a graph representation learning approach to uncover hidden dependency links.
We demonstrate that our representation facilitates state-of-the-art performance on link prediction of a global automotive supply chain network.
arXiv Detail & Related papers (2021-07-22T12:28:15Z) - Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z) - Detecting Anomalies Through Contrast in Heterogeneous Data [21.56932906044264]
We propose Contrastive Learning based Heterogeneous Anomaly Detector to address shortcomings of prior models.
Our model uses an asymmetric autoencoder that can effectively handle large arity categorical variables.
We provide a qualitative study to showcase the effectiveness of our model in detecting anomalies in timber trade.
arXiv Detail & Related papers (2021-04-02T17:21:12Z) - Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks [61.950353376870154]
Joint-event-extraction is a sequence-to-sequence labeling task with a tag set composed of tags of triggers and entities.
We propose a Cross-Supervised Mechanism (CSM) to alternately supervise the extraction of triggers or entities.
Our approach outperforms the state-of-the-art methods in both entity and trigger extraction.
arXiv Detail & Related papers (2020-10-13T11:51:17Z) - A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.