Effective Black Box Testing of Sentiment Analysis Classification Networks
- URL: http://arxiv.org/abs/2407.20884v1
- Date: Tue, 30 Jul 2024 14:58:11 GMT
- Title: Effective Black Box Testing of Sentiment Analysis Classification Networks
- Authors: Parsa Karbasizadeh, Fathiyeh Faghih, Pouria Golshanrad,
- Abstract summary: Transformer-based neural networks have demonstrated remarkable performance in natural language processing tasks such as sentiment analysis.
This paper presents a collection of coverage criteria specifically designed to assess test suites created for transformer-based sentiment analysis networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based neural networks have demonstrated remarkable performance in natural language processing tasks such as sentiment analysis. Nevertheless, the issue of ensuring the dependability of these complicated architectures through comprehensive testing is still open. This paper presents a collection of coverage criteria specifically designed to assess test suites created for transformer-based sentiment analysis networks. Our approach utilizes input space partitioning, a black-box method, by considering emotionally relevant linguistic features such as verbs, adjectives, adverbs, and nouns. In order to effectively produce test cases that encompass a wide range of emotional elements, we utilize the k-projection coverage metric. This metric minimizes the complexity of the problem by examining subsets of k features at the same time, hence reducing dimensionality. Large language models are employed to generate sentences that display specific combinations of emotional features. The findings from experiments obtained from a sentiment analysis dataset illustrate that our criteria and generated tests have led to an average increase of 16\% in test coverage. In addition, there is a corresponding average decrease of 6.5\% in model accuracy, showing the ability to identify vulnerabilities. Our work provides a foundation for improving the dependability of transformer-based sentiment analysis systems through comprehensive test evaluation.
Related papers
- StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs [78.84060166851805]
StructTest is a novel benchmark that evaluates large language models on their ability to produce structured outputs.
We demonstrate that StructTest serves as a good proxy for general reasoning abilities.
arXiv Detail & Related papers (2024-12-23T22:08:40Z) - SentiQNF: A Novel Approach to Sentiment Analysis Using Quantum Algorithms and Neuro-Fuzzy Systems [2.9565647484584496]
We propose a novel hybrid approach for sentiment analysis called the Quantum Fuzzy Neural Network (QFNN)
QFNN leverages quantum properties and incorporates a fuzzy layer to overcome the limitations of classical sentiment analysis algorithms.
The proposed approach expedites sentiment data processing and precisely analyses different forms of textual data.
arXiv Detail & Related papers (2024-12-17T09:54:17Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.
We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.
By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z) - Single Ground Truth Is Not Enough: Adding Flexibility to Aspect-Based Sentiment Analysis Evaluation [41.66053021998106]
Aspect-based sentiment analysis (ABSA) is a challenging task.
Traditional evaluation methods often constrain ground truths (GT) to a single term.
We propose a novel and fully automated pipeline that expands existing evaluation sets by adding alternative valid terms for aspect and opinion.
arXiv Detail & Related papers (2024-10-13T11:48:09Z) - Aspect-Based Sentiment Analysis using Local Context Focus Mechanism with
DeBERTa [23.00810941211685]
Aspect-Based Sentiment Analysis (ABSA) is a fine-grained task in the field of sentiment analysis.
Recent DeBERTa model (Decoding-enhanced BERT with disentangled attention) to solve Aspect-Based Sentiment Analysis problem.
arXiv Detail & Related papers (2022-07-06T03:50:31Z) - Sentiment Analysis on Brazilian Portuguese User Reviews [0.0]
This work analyzes the predictive performance of a range of document embedding strategies, assuming the polarity as the system outcome.
This analysis includes five sentiment analysis datasets in Brazilian Portuguese, unified in a single dataset, and a reference partitioning in training, testing, and validation sets, both made publicly available through a digital repository.
arXiv Detail & Related papers (2021-12-10T11:18:26Z) - AES Systems Are Both Overstable And Oversensitive: Explaining Why And
Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity and overstability causing samples with high accuracies.
arXiv Detail & Related papers (2021-09-24T03:49:38Z) - Syntactic Perturbations Reveal Representational Correlates of
Hierarchical Phrase Structure in Pretrained Language Models [22.43510769150502]
It is not entirely clear what aspects of sentence-level syntax are captured by vector-based language representations.
We show that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process.
arXiv Detail & Related papers (2021-04-15T16:30:31Z) - Toward Scalable and Unified Example-based Explanation and Outlier
Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks beyond similarity kernels deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.