Fine-Grained Element Identification in Complaint Text of Internet Fraud
- URL: http://arxiv.org/abs/2108.08676v1
- Date: Thu, 19 Aug 2021 13:33:09 GMT
- Title: Fine-Grained Element Identification in Complaint Text of Internet Fraud
- Authors: Tong Liu, Siyuan Wang, Jingchao Fu, Lei Chen, Zhongyu Wei, Yaqi Liu,
Heng Ye, Liaosa Xu, Weiqiang Wan, Xuanjing Huang
- Abstract summary: We propose to analyse the complaint text of internet fraud in a fine-grained manner.
Considering the complaint text includes multiple clauses with various functions, we propose to identify the role of each clause and classify them into different types of fraud element.
- Score: 47.249423877146604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing system dealing with online complaint provides a final decision
without explanations. We propose to analyse the complaint text of internet
fraud in a fine-grained manner. Considering the complaint text includes
multiple clauses with various functions, we propose to identify the role of
each clause and classify them into different types of fraud element. We
construct a large labeled dataset originated from a real finance service
platform. We build an element identification model on top of BERT and propose
additional two modules to utilize the context of complaint text for better
element label classification, namely, global context encoder and label refiner.
Experimental results show the effectiveness of our model.
Related papers
- Identifying Banking Transaction Descriptions via Support Vector Machine Short-Text Classification Based on a Specialized Labelled Corpus [7.046417074932257]
We describe a novel system that combines Natural Language Processing techniques with Machine Learning algorithms to classify banking transaction descriptions.
Motivated by existing solutions in spam detection, we also propose a short text similarity detector to reduce training set size based on the Jaccard distance.
We present a use case with a personal finance application, CoinScrap, which is available at Google Play and App Store.
arXiv Detail & Related papers (2024-03-29T13:15:46Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Like a Good Nearest Neighbor: Practical Content Moderation and Text
Classification [66.02091763340094]
Like a Good Nearest Neighbor (LaGoNN) is a modification to SetFit that introduces no learnable parameters but alters input text with information from its nearest neighbor.
LaGoNN is effective at flagging undesirable content and text classification, and improves the performance of SetFit.
arXiv Detail & Related papers (2023-02-17T15:43:29Z) - IDPS Signature Classification with a Reject Option and the Incorporation
of Expert Knowledge [3.867363075280544]
We propose and evaluate a machine learning signature classification model with a reject option (RO) to reduce the cost of setting up an intrusion detection and prevention system (IDPS)
To train the proposed model, it is essential to design features that are effective for signature classification.
The effectiveness of the proposed classification model is evaluated in experiments with two real datasets composed of signatures labeled by experts.
arXiv Detail & Related papers (2022-07-19T06:09:33Z) - Span Classification with Structured Information for Disfluency Detection
in Spoken Utterances [47.05113261111054]
We propose a novel architecture for detecting disfluencies in transcripts from spoken utterances.
Our proposed model achieves state-of-the-art results on the widely used English Switchboard for disfluency detection.
arXiv Detail & Related papers (2022-03-30T03:22:29Z) - Assessing Effectiveness of Using Internal Signals for Check-Worthy Claim
Identification in Unlabeled Data for Automated Fact-Checking [6.193231258199234]
This paper explores methodology to identify check-worthy claim sentences from fake news articles.
We leverage two internal supervisory signals - headline and the abstractive summary - to rank the sentences.
We show that while the headline has more gisting similarity with how a fact-checking website writes a claim, the summary-based pipeline is the most promising for an end-to-end fact-checking system.
arXiv Detail & Related papers (2021-11-02T16:17:20Z) - MGD-GAN: Text-to-Pedestrian generation through Multi-Grained
Discrimination [96.91091607251526]
We propose the Multi-Grained Discrimination enhanced Generative Adversarial Network, that capitalizes a human-part-based Discriminator and a self-cross-attended Discriminator.
A fine-grained word-level attention mechanism is employed in the HPD module to enforce diversified appearance and vivid details.
The substantial improvement over the various metrics demonstrates the efficacy of MGD-GAN on the text-to-pedestrian synthesis scenario.
arXiv Detail & Related papers (2020-10-02T12:24:48Z) - DFraud3- Multi-Component Fraud Detection freeof Cold-start [50.779498955162644]
The Cold-start is a significant problem referring to the failure of a detection system to recognize the authenticity of a new user.
In this paper, we model a review system as a Heterogeneous InformationNetwork (HIN) which enables a unique representation to every component.
HIN with graph induction helps to address the camouflage issue (fraudsterswith genuine reviews) which has shown to be more severe when it is coupled with cold-start, i.e., new fraudsters with genuine first reviews.
arXiv Detail & Related papers (2020-06-10T08:20:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.