Automated Text Identification Using CNN and Training Dynamics
- URL: http://arxiv.org/abs/2405.11212v1
- Date: Sat, 18 May 2024 07:37:17 GMT
- Title: Automated Text Identification Using CNN and Training Dynamics
- Authors: Claudiu Creanga, Liviu Petrisor Dinu
- Abstract summary: We characterized the samples across 3 dimensions: confidence, variability and correctness.
This shows the presence of 3 regions: easy-to-learn, ambiguous and hard-to-learn examples.
We found that training the model only on a subset of ambiguous examples improves the model's out-of-distribution generalization.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We used Data Maps to model and characterize the AuTexTification dataset. This provides insights into the behaviour of individual samples during training across epochs (training dynamics). We characterized the samples across 3 dimensions: confidence, variability and correctness. This reveals 3 regions: easy-to-learn, ambiguous and hard-to-learn examples. We used a classic CNN architecture and found that training the model only on a subset of ambiguous examples improves the model's out-of-distribution generalization.
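As a concrete illustration of the three training-dynamics statistics, here is a minimal NumPy sketch, not the paper's code: all names and the dummy data are assumptions. Confidence is the mean probability assigned to the gold label across epochs, variability its standard deviation, and correctness the fraction of epochs in which the sample was classified correctly.
```python
import numpy as np

def data_map_stats(gold_probs, predictions, gold_labels):
    """Compute Data Map statistics from per-epoch training dynamics.

    gold_probs:  (n_epochs, n_samples) probability the model assigned
                 to the gold label at each epoch
    predictions: (n_epochs, n_samples) predicted label at each epoch
    gold_labels: (n_samples,) gold label for each sample
    """
    confidence = gold_probs.mean(axis=0)    # mean P(gold) across epochs
    variability = gold_probs.std(axis=0)    # spread of P(gold) across epochs
    correctness = (predictions == gold_labels).mean(axis=0)  # fraction of epochs correct
    return confidence, variability, correctness

# Illustrative selection of the ambiguous region (dummy data):
rng = np.random.default_rng(0)
n_epochs, n_samples = 10, 100
gold_labels = rng.integers(0, 2, size=n_samples)
predictions = rng.integers(0, 2, size=(n_epochs, n_samples))
gold_probs = rng.random((n_epochs, n_samples))
_, variability, _ = data_map_stats(gold_probs, predictions, gold_labels)
ambiguous_idx = np.argsort(variability)[-n_samples // 3:]  # highest-variability third
```
Easy-to-learn samples sit at high confidence and low variability, hard-to-learn ones at low confidence and low variability, and the ambiguous region is the high-variability band the abstract trains on.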
Related papers
- Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration [2.814748676983944]
We propose a graph neural network model embedded with a local Spherical Euclidean 3D equivariance property through SE(3) message-passing propagation.
Our model is composed mainly of a descriptor module, equivariant graph layers, match similarity, and the final regression layers.
Experiments conducted on the 3DMatch and KITTI datasets exhibit the compelling and robust performance of our model compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-08T06:48:01Z)
- Complementary Learning for Real-World Model Failure Detection [15.779651238128562]
We introduce complementary learning, where we use learned characteristics from different training paradigms to detect model errors.
We demonstrate our approach by learning semantic and predictive motion labels in point clouds in a supervised and self-supervised manner.
We perform a large-scale qualitative analysis and present LidarCODA, the first dataset with labeled anomalies in lidar point clouds.
arXiv Detail & Related papers (2024-07-19T13:36:35Z)
- 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
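Read as a sketch, that step is gradient ascent on the task loss with respect to the shared deformation vectors. A hedged PyTorch illustration, with all names and the update rule (a sign-gradient step) as assumptions rather than the paper's method:
```python
import torch

def adversarial_augment(model, points, labels, vectors, loss_fn, step_size=0.01):
    """One gradient-ascent step on shared deformation vectors.

    points:  (batch, n_points, 3) object point clouds
    vectors: (n_points, 3) sample-independent deformation vectors
    """
    vectors = vectors.detach().requires_grad_(True)
    loss = loss_fn(model(points + vectors), labels)  # loss on deformed objects
    loss.backward()
    with torch.no_grad():
        # Ascend the loss so the deformation becomes adversarial.
        vectors = vectors + step_size * vectors.grad.sign()
    return (points + vectors).detach()  # augmented inputs for the training step
```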
arXiv Detail & Related papers (2023-08-29T17:58:55Z)
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixing up.
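That instance-specific smoothing reduces to a per-sample linear interpolation between the one-hot target and the model's own output distribution. A minimal PyTorch sketch, with all names and the interpolation weight as assumptions:
```python
import torch
import torch.nn.functional as F

def instance_soft_labels(logits, labels, num_classes, lam=0.9):
    """Interpolate one-hot targets with the model's own output distribution.

    logits: (batch, num_classes) model outputs for the original samples
    labels: (batch,) integer class labels
    lam:    weight on the one-hot target (illustrative value)
    """
    one_hot = F.one_hot(labels, num_classes).float()
    probs = logits.softmax(dim=-1).detach()       # model's current belief
    return lam * one_hot + (1.0 - lam) * probs    # soft labels for mixing up
```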
arXiv Detail & Related papers (2023-05-22T23:43:23Z)
- Forgetting Data from Pre-trained GANs [28.326418377665345]
We investigate how to post-edit a model after training so that it forgets certain kinds of samples.
We provide three different algorithms for GANs that differ on how the samples to be forgotten are described.
Our algorithms are capable of forgetting data while retaining high generation quality at a fraction of the cost of full re-training.
arXiv Detail & Related papers (2022-06-29T03:46:16Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Label-Free Model Evaluation with Semi-Structured Dataset Representations [78.54590197704088]
Label-free model evaluation, or AutoEval, estimates model accuracy on unlabeled test sets.
In the absence of image labels, we estimate model performance for AutoEval with regression on dataset representations.
We propose a new semi-structured dataset representation that is manageable for regression learning while containing rich information for AutoEval.
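In other words, AutoEval here fits a regressor from a dataset-level representation to measured accuracy. A toy sketch of that setup (synthetic data; all names are assumptions, not the paper's code):
```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Each row represents one unlabeled test set; the target is the model's
# measured accuracy on that set (synthetic values for illustration).
meta_train_repr = rng.random((50, 8))
meta_train_acc = rng.random(50)

regressor = LinearRegression().fit(meta_train_repr, meta_train_acc)
estimated_acc = regressor.predict(rng.random((1, 8)))  # accuracy estimate for a new test set
```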
arXiv Detail & Related papers (2021-12-01T18:15:58Z)
- Training Dynamics for Text Summarization Models [45.62439188988816]
We analyze the training dynamics for generation models, focusing on news summarization.
Across different datasets (CNN/DM, XSum, MediaSum) and summary properties, we study what the model learns at different stages of its fine-tuning process.
We find that properties such as copy behavior are learnt earlier in the training process and these observations are robust across domains.
On the other hand, factual errors, such as hallucination of unsupported facts, are learnt in the later stages, and this behavior is more varied across domains.
arXiv Detail & Related papers (2021-10-15T21:13:41Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.