On a Guided Nonnegative Matrix Factorization
- URL: http://arxiv.org/abs/2010.11365v2
- Date: Fri, 5 Feb 2021 16:56:22 GMT
- Title: On a Guided Nonnegative Matrix Factorization
- Authors: Joshua Vendrow, Jamie Haddock, Elizaveta Rebrova, Deanna Needell
- Abstract summary: We propose an approach based upon the nonnegative matrix factorization (NMF) model, deemed Guided NMF, that incorporates user-designed seed word supervision.
Our experimental results demonstrate the promise of this model and illustrate that it is competitive with other methods of this ilk with only very little supervision information.
- Score: 9.813862201223973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully unsupervised topic models have found fantastic success in document
clustering and classification. However, these models often suffer from the
tendency to learn less-than-meaningful or even redundant topics when the data
is biased towards a set of features. For this reason, we propose an approach
based upon the nonnegative matrix factorization (NMF) model, deemed
Guided NMF, that incorporates user-designed seed word supervision. Our
experimental results demonstrate the promise of this model and illustrate that
it is competitive with other methods of this ilk with only very little
supervision information.
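To make the idea of seed word supervision concrete, here is a minimal sketch of one way seed words could be folded into an NMF objective. The abstract does not state the model's formulation, so everything below is an assumption for illustration: the objective ||X - WH||_F^2 + lam * ||Y - WB||_F^2, the seed matrix Y (words by seed topics), the function name guided_nmf_sketch, the parameter lam, and the toy data are all hypothetical and are not taken from the paper. The updates are standard multiplicative NMF updates applied to this joint objective.

```python
import numpy as np


def guided_nmf_sketch(X, Y, rank, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Illustrative seed-guided NMF (assumed objective, not the paper's code).

    Minimizes ||X - W H||_F^2 + lam * ||Y - W B||_F^2 with W, H, B >= 0.
    X: (n_words, n_docs) nonnegative word-document matrix.
    Y: (n_words, n_seed_topics) nonnegative seed matrix; Y[w, t] > 0 marks
       word w as a seed word for seed topic t (hypothetical encoding).
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    n_seed_topics = Y.shape[1]

    # Random nonnegative initialization keeps all factors nonnegative
    # under multiplicative updates.
    W = rng.random((n_words, rank))
    H = rng.random((rank, n_docs))
    B = rng.random((rank, n_seed_topics))

    losses = []
    for _ in range(n_iter):
        # W is shared by both terms, so both contribute to its update.
        W *= (X @ H.T + lam * Y @ B.T) / (W @ (H @ H.T + lam * B @ B.T) + eps)
        # H and B follow the usual Lee-Seung style update for their own term.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        B *= (W.T @ Y) / (W.T @ W @ B + eps)
        losses.append(
            np.linalg.norm(X - W @ H) ** 2 + lam * np.linalg.norm(Y - W @ B) ** 2
        )
    return W, H, B, losses


# Toy data (made up): 6 words, 4 documents, two loose word groups.
X = np.array(
    [
        [3, 2, 0, 0],
        [2, 3, 0, 1],
        [1, 2, 0, 0],
        [0, 0, 3, 2],
        [0, 1, 2, 3],
        [0, 0, 1, 2],
    ],
    dtype=float,
)

# One seed word per seed topic: word 0 for topic 0, word 3 for topic 1.
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

W, H, B, losses = guided_nmf_sketch(X, Y, rank=2, lam=1.0)
print("first loss:", losses[0], "last loss:", losses[-1])
```

In this sketch the columns of W act as topics over words, H gives the topic weights per document, and B links each learned topic to the seed topics. A larger lam pulls the learned topics harder toward the seed words.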
Related papers
- Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort [31.992947353231564]
Concept Bottleneck Models (CBMs) can provide a principled way of disclosing and guiding model behaviors through human-understandable concepts.
We propose a novel framework designed to exploit pre-trained models while being immune to these biases, thereby reducing vulnerability to spurious correlations.
We evaluate the proposed method on multiple datasets, and the results demonstrate its effectiveness in reducing model reliance on spurious correlations while preserving its interpretability.
arXiv Detail & Related papers (2024-07-12T03:07:28Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Probing of Quantitative Values in Abstractive Summarization Models [0.0]
We evaluate how well abstractive summarization models represent the quantitative values found in the input text.
Our results show that in most cases, the encoders of recent SOTA-performing models struggle to provide embeddings that adequately represent quantitative values.
arXiv Detail & Related papers (2022-10-03T00:59:50Z)
- MRCLens: an MRC Dataset Bias Detection Toolkit [82.44296974850639]
We introduce MRCLens, a toolkit that detects whether biases exist before users train the full model.
For the convenience of introducing the toolkit, we also provide a categorization of common biases in MRC.
arXiv Detail & Related papers (2022-07-18T21:05:39Z)
- Flexible and Hierarchical Prior for Bayesian Nonnegative Matrix Factorization [4.913248451323163]
We introduce a probabilistic model for learning nonnegative matrix factorization (NMF).
We evaluate the model on several real-world datasets including MovieLens 100K and MovieLens 1M with different sizes and dimensions.
arXiv Detail & Related papers (2022-05-23T03:51:55Z)
- Semi-supervised Nonnegative Matrix Factorization for Document Classification [6.577559557980527]
We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification.
We derive training methods using multiplicative updates for each new model, and demonstrate the application of these models to single-label and multi-label document classification.
arXiv Detail & Related papers (2022-02-28T19:00:49Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Towards Debiasing NLU Models from Unknown Biases [70.31427277842239]
NLU models often exploit biases to achieve high dataset-specific performance without properly learning the intended task.
We present a self-debiasing framework that prevents models from mainly utilizing biases without knowing them in advance.
arXiv Detail & Related papers (2020-09-25T15:49:39Z)
- Explainable Matrix -- Visualization for Global and Local Interpretability of Random Forest Classification Ensembles [78.6363825307044]
We propose Explainable Matrix (ExMatrix), a novel visualization method for Random Forest (RF) interpretability.
It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates.
ExMatrix's applicability is confirmed via different examples, showing how it can be used in practice to promote the interpretability of RF models.
arXiv Detail & Related papers (2020-05-08T21:03:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.