Embedding Compression for Text Classification Using Dictionary Screening
- URL: http://arxiv.org/abs/2211.12715v1
- Date: Wed, 23 Nov 2022 05:32:13 GMT
- Title: Embedding Compression for Text Classification Using Dictionary Screening
- Authors: Jing Zhou, Xinru Jing, Muyu Liu, Hansheng Wang
- Abstract summary: We propose a dictionary screening method for embedding compression in text classification tasks.
The proposed method leads to significant reductions in the number of parameters, the average text sequence length, and the dictionary size.
- Score: 8.308609870092884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a dictionary screening method for embedding
compression in text classification tasks. The key purpose of this method is to
evaluate the importance of each keyword in the dictionary. To this end, we
first train a pre-specified recurrent neural network-based model using a full
dictionary. This leads to a benchmark model, which we then use to obtain the
predicted class probabilities for each sample in a dataset. Next, to evaluate
the impact of each keyword in affecting the predicted class probabilities, we
develop a novel method for assessing the importance of each keyword in a
dictionary. Consequently, each keyword can be screened, and only the most
important keywords are reserved. With these screened keywords, a new dictionary
with a considerably reduced size can be constructed. Accordingly, the original
text sequence can be substantially compressed. The proposed method leads to
significant reductions in the number of parameters, the average text sequence length, and the
dictionary size. Meanwhile, the prediction power remains very competitive
compared to the benchmark model. Extensive numerical studies are presented to
demonstrate the empirical performance of the proposed method.
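To make the screening step concrete, the following is a minimal Python sketch under stated assumptions: the benchmark model exposes a `predict` method that returns class probabilities, and a keyword's importance is taken to be the average absolute shift in those probabilities when the keyword is removed from every text. The paper's exact importance measure and interfaces may differ.

```python
# Minimal sketch of dictionary screening. `model.predict` is an assumed
# interface; the importance measure (mean absolute probability shift under
# keyword removal) is illustrative, not the paper's exact definition.
import numpy as np

def pad(seqs, pad_id=0):
    max_len = max(len(s) for s in seqs)
    return np.array([s + [pad_id] * (max_len - len(s)) for s in seqs])

def screen_dictionary(model, sequences, dict_size, keep_ratio=0.1, pad_id=0):
    base = model.predict(pad(sequences, pad_id))   # benchmark probabilities
    importance = np.zeros(dict_size)
    for k in range(dict_size):                     # one model pass per keyword
        masked = [[t for t in s if t != k] or [pad_id] for s in sequences]
        probs = model.predict(pad(masked, pad_id))
        importance[k] = np.abs(probs - base).mean()
    n_keep = max(1, int(keep_ratio * dict_size))
    return np.argsort(importance)[::-1][:n_keep]   # ids of reserved keywords

def compress(sequences, kept_ids):
    """Rewrite texts over the screened dictionary; sequences get shorter."""
    kept = set(int(i) for i in kept_ids)
    return [[t for t in s if t in kept] for s in sequences]
```

With the screened dictionary, the embedding matrix shrinks to the kept rows and every text sequence drops its screened-out tokens, which is where the parameter and sequence-length reductions come from.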
Related papers
- Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression [15.460141768587663]
We propose a lightweight supervised dictionary learning framework for text classification based on data compression and representation.
We evaluate our algorithm's information-theoretic performance using information bottleneck principles and introduce the information plane area rank (IPAR) as a novel metric to quantify it.
arXiv Detail & Related papers (2024-04-28T10:11:52Z) - Quantization of Large Language Models with an Overdetermined Basis [73.79368761182998]
We introduce an algorithm for data quantization based on the principles of Kashin representation.
Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance.
arXiv Detail & Related papers (2024-04-15T12:38:46Z) - Automatic Counterfactual Augmentation for Robust Text Classification
Based on Word-Group Search [12.894936637198471]
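The summary above is terse, so here is a hedged sketch of the underlying idea of quantizing in an overdetermined basis: a Parseval tight frame spreads a vector's energy over twice as many coefficients, each individually small, so coarse uniform quantization perturbs the reconstruction less. The full Kashin algorithm adds iterative truncation to uniformly bound every coefficient; the simplified frame version below is an assumption for illustration.

```python
# Sketch: quantize in an overdetermined (tight frame) basis. The frame
# F = [I | Q] / sqrt(2), with Q orthogonal, satisfies F @ F.T = I, so
# coefficients a = F.T @ x reconstruct x exactly via x = F @ a while
# spreading the energy over 2n entries.
import numpy as np

def random_orthogonal(n, seed=0):
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def frame(n, seed=0):
    return np.hstack([np.eye(n), random_orthogonal(n, seed)]) / np.sqrt(2)

def quantize(x, bits=4, seed=0):
    coeffs = frame(x.shape[0], seed).T @ x
    scale = np.abs(coeffs).max() / (2 ** (bits - 1) - 1)
    return np.round(coeffs / scale).astype(np.int8), scale

def dequantize(q, scale, seed=0):
    return frame(q.shape[0] // 2, seed) @ (q.astype(np.float64) * scale)
```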
- Automatic Counterfactual Augmentation for Robust Text Classification Based on Word-Group Search [12.894936637198471]
In general, a keyword is regarded as a shortcut if it creates a superficial association with the label, resulting in a false prediction.
We propose a new Word-Group mining approach, which captures the causal effect of any keyword combination and ranks the combinations by how strongly they affect the prediction.
Our approach is based on effective post-hoc analysis and beam search, which ensures the mining effect and reduces the complexity.
arXiv Detail & Related papers (2023-07-01T02:26:34Z) - Textual Entailment Recognition with Semantic Features from Empirical
Text Representation [60.31047947815282]
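A schematic sketch of the word-group mining step described above: score a candidate keyword group by the drop in the predicted label probability when the group is masked, and grow groups with beam search. The `model.prob` interface and the scoring rule are assumptions for illustration.

```python
def causal_effect(model, text_ids, group, label, mask_id):
    """Drop in the predicted label probability when `group` is masked."""
    masked = [mask_id if t in group else t for t in text_ids]
    return model.prob(text_ids, label) - model.prob(masked, label)

def mine_word_groups(model, text_ids, label, mask_id, beam=5, max_size=3):
    vocab = sorted(set(text_ids))
    beams = [frozenset()]
    for _ in range(max_size):          # grow candidate groups one word at a time
        candidates = {b | {w} for b in beams for w in vocab if w not in b}
        ranked = sorted(candidates, reverse=True,
                        key=lambda g: causal_effect(model, text_ids, g,
                                                    label, mask_id))
        beams = ranked[:beam]          # keep the highest-impact groups
    return beams
```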
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z) - Better Language Model with Hypernym Class Prediction [101.8517004687825]
- Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been devised to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z) - Weakly-supervised Text Classification Based on Keyword Graph [30.57722085686241]
- Weakly-supervised Text Classification Based on Keyword Graph [30.57722085686241]
We propose a novel framework, called ClassKG, to explore keyword-keyword correlations on a keyword graph with a GNN.
Our framework is an iterative process: in each iteration, we first construct a keyword graph, so the task of assigning pseudo labels is transformed into annotating keyword subgraphs.
With the pseudo labels generated by the subgraph annotator, we then train a text classifier to classify the unlabeled texts.
arXiv Detail & Related papers (2021-10-06T08:58:02Z) - Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram- and Unigram-based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success and semantics-preserving rates while changing the smallest number of words, compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z) - FRAKE: Fusional Real-time Automatic Keyword Extraction [1.332091725929965]
- FRAKE: Fusional Real-time Automatic Keyword Extraction [1.332091725929965]
Keyword extraction is the task of identifying the words or phrases that best express the main concepts of a text.
We use a combined approach that fuses two kinds of features: graph centrality features and textual features.
arXiv Detail & Related papers (2021-04-10T18:30:17Z) - MASKER: Masked Keyword Regularization for Reliable Text Classification [73.90326322794803]
- MASKER: Masked Keyword Regularization for Reliable Text Classification [73.90326322794803]
We propose a fine-tuning method, coined masked keyword regularization (MASKER), that facilitates context-based prediction.
MASKER regularizes the model to reconstruct the keywords from the rest of the words and make low-confidence predictions without enough context.
We demonstrate that MASKER improves OOD detection and cross-domain generalization without degrading classification accuracy.
arXiv Detail & Related papers (2020-12-17T04:54:16Z) - Accelerating Text Mining Using Domain-Specific Stop Word Lists [57.76576681191192]
- Accelerating Text Mining Using Domain-Specific Stop Word Lists [57.76576681191192]
We present a novel approach, called the hyperplane-based approach, for the automatic extraction of domain-specific stop words.
The hyperplane-based approach can significantly reduce text dimensionality by eliminating irrelevant features.
Results indicate that the hyperplane-based approach can reduce the dimensionality of the corpus by 90% and outperforms mutual information.
arXiv Detail & Related papers (2020-11-18T17:42:32Z) - PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding [16.531103175919924]
- PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding [16.531103175919924]
We look into the task of generalizing word embeddings: given a set of pre-trained word vectors over a finite vocabulary, the goal is to predict embedding vectors for out-of-vocabulary words.
We propose a model, along with an efficient algorithm, that simultaneously models subword segmentation and computes subword-based compositional word embedding.
arXiv Detail & Related papers (2020-10-21T08:11:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.