Enhancing Sentiment Classification with Machine Learning and Combinatorial Fusion
- URL: http://arxiv.org/abs/2510.27014v1
- Date: Thu, 30 Oct 2025 21:30:30 GMT
- Title: Enhancing Sentiment Classification with Machine Learning and Combinatorial Fusion
- Authors: Sean Patten, Pin-Yu Chen, Christina Schweikert, D. Frank Hsu,
- Abstract summary: This paper presents a novel approach to sentiment classification using the application of Combinatorial Fusion Analysis (CFA)<n>CFA leverages the concept of cognitive diversity, which utilizes rank-score characteristic functions to quantify the dissimilarity between models and strategically combine their predictions.<n> Experimental results also indicate that CFA outperforms traditional ensemble methods by effectively computing and employing model diversity.
- Score: 41.99844472131922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a novel approach to sentiment classification using the application of Combinatorial Fusion Analysis (CFA) to integrate an ensemble of diverse machine learning models, achieving state-of-the-art accuracy on the IMDB sentiment analysis dataset of 97.072\%. CFA leverages the concept of cognitive diversity, which utilizes rank-score characteristic functions to quantify the dissimilarity between models and strategically combine their predictions. This is in contrast to the common process of scaling the size of individual models, and thus is comparatively efficient in computing resource use. Experimental results also indicate that CFA outperforms traditional ensemble methods by effectively computing and employing model diversity. The approach in this paper implements the combination of a transformer-based model of the RoBERTa architecture with traditional machine learning models, including Random Forest, SVM, and XGBoost.
Related papers
- Enhancing Time Series Classification with Diversity-Driven Neural Network Ensembles [0.776514389034479]
We introduce a diversity-driven ensemble learning framework that explicitly encourages feature diversity among neural network ensemble members.<n>We evaluate our framework on 128 datasets from the UCR archive and show that it achieves SOTA performance with fewer models.
arXiv Detail & Related papers (2026-02-07T15:05:04Z) - A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z) - Regularized Neural Ensemblers [55.15643209328513]
In this study, we explore employing regularized neural networks as ensemble methods.<n>Motivated by the risk of learning low-diversity ensembles, we propose regularizing the ensembling model by randomly dropping base model predictions.<n>We demonstrate this approach provides lower bounds for the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z) - High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z) - Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-09-30T15:01:35Z) - The effectiveness of factorization and similarity blending [0.0]
Collaborative Filtering (CF) is a technique which allows to leverage past users' preferences data to identify behavioural patterns and exploit them to predict custom recommendations.
We show that blending factorization-based and similarity-based approaches can lead to a significant error decrease (-9.4%) on stand-alone models.
We propose a novel extension of a similarity model, SCSR, which consistently reduce the complexity of the original algorithm.
arXiv Detail & Related papers (2022-09-16T13:11:27Z) - A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A
Transformer-based Approach [0.0]
We perform semantic similarity analysis and modeling on the U.S. Patent Phrase to Phrase Matching dataset using both traditional and transformer-based techniques.
The experimental results demonstrate our methodology's enhanced performance compared to traditional techniques, with an average Pearson correlation score of 0.79.
arXiv Detail & Related papers (2022-07-24T11:06:56Z) - Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z) - Ensemble Learning-Based Approach for Improving Generalization Capability
of Machine Reading Comprehension Systems [0.7614628596146599]
Machine Reading (MRC) is an active field in natural language processing with many successful developed models in recent years.
Despite their high in-distribution accuracy, these models suffer from two issues: high training cost and low out-of-distribution accuracy.
In this paper, we investigate the effect of ensemble learning approach to improve generalization of MRC systems without retraining a big model.
arXiv Detail & Related papers (2021-07-01T11:11:17Z) - Hybrid Method Based on NARX models and Machine Learning for Pattern
Recognition [0.0]
This work presents a novel technique that integrates the methodologies of machine learning and system identification to solve multiclass problems.
The efficiency of the method was tested by running case studies investigated in machine learning, obtaining better absolute results when compared with classical classification algorithms.
arXiv Detail & Related papers (2021-06-08T00:17:36Z) - Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques.<n>In medical genomics applications, our approach identifies critical biomarkers overlooked by competing methods.
arXiv Detail & Related papers (2021-02-17T05:57:26Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.