Enhancing Topic Extraction in Recommender Systems with Entropy Regularization
- URL: http://arxiv.org/abs/2306.07403v1
- Date: Mon, 12 Jun 2023 20:05:09 GMT
- Title: Enhancing Topic Extraction in Recommender Systems with Entropy Regularization
- Authors: Xuefei Jiang, Dairui Liu, Ruihai Dong
- Abstract summary: This paper introduces a novel approach called entropy regularization to address the issue of low explainability of recommender systems.
Experiment results show a significant improvement in topic coherence, which is quantified by cosine similarity on word embeddings.
- Score: 2.7286395031146062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, many recommender systems have utilized textual data for
topic extraction to enhance interpretability. However, our findings reveal a
noticeable deficiency in the coherence of keywords within topics, resulting in
low explainability of the model. This paper introduces a novel approach called
entropy regularization to address the issue, leading to more interpretable
topics extracted from recommender systems, while ensuring that the performance
of the primary task stays competitively strong. The effectiveness of the
strategy is validated through experiments on a variation of the probabilistic
matrix factorization model that utilizes textual data to extract item
embeddings. The experiment results show a significant improvement in topic
coherence, which is quantified by cosine similarity on word embeddings.
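The two ingredients named in the abstract, an entropy penalty on topic-word distributions and a coherence score based on cosine similarity of word embeddings, can be sketched as follows. This is a minimal illustration under assumed interfaces (a `topic_word` weight matrix and pre-computed keyword embeddings), not the paper's actual implementation, which embeds the regularizer in a probabilistic matrix factorization model.

```python
import numpy as np

def entropy_regularizer(topic_word, eps=1e-12):
    """Mean Shannon entropy of each topic's word distribution.

    Adding this term (scaled by some weight) to the training loss
    pushes each topic toward a peaked, low-entropy distribution over
    words, which tends to yield more coherent keyword sets.
    """
    p = topic_word / topic_word.sum(axis=1, keepdims=True)
    return float(np.mean(-np.sum(p * np.log(p + eps), axis=1)))

def topic_coherence(keyword_vecs):
    """Mean pairwise cosine similarity among one topic's top-keyword
    embeddings; higher means the keywords are semantically closer."""
    v = keyword_vecs / np.linalg.norm(keyword_vecs, axis=1, keepdims=True)
    sims = v @ v.T
    iu = np.triu_indices(len(v), k=1)  # upper triangle: distinct pairs only
    return float(sims[iu].mean())
```

A peaked topic (most mass on a few words) yields a lower regularizer value than a uniform one, so minimizing the combined loss trades a little task performance for sharper, more interpretable topics.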
Related papers
- From Words to Worth: Newborn Article Impact Prediction with LLM [69.41680520058418]
This paper introduces a promising approach, leveraging the capabilities of fine-tuned LLMs to predict the future impact of newborn articles.
A comprehensive dataset has been constructed and released for fine-tuning the LLM, containing over 12,000 entries with corresponding titles, abstracts, and TNCSI_SP.
arXiv Detail & Related papers (2024-08-07T17:52:02Z)
- Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency [87.16283281290053]
Previous entity disambiguation (ED) methods adopt a discriminative paradigm, where prediction is made based on matching scores between mention context and candidate entities.
We propose CoherentED, an ED system equipped with novel designs aimed at enhancing the coherence of entity predictions.
We achieve new state-of-the-art results on popular ED benchmarks, with an average improvement of 1.3 F1 points.
arXiv Detail & Related papers (2023-11-06T16:40:13Z)
- Topic-DPR: Topic-based Prompts for Dense Passage Retrieval [6.265789210037749]
We present Topic-DPR, a dense passage retrieval model that uses topic-based prompts.
We introduce a novel positive and negative sampling strategy, leveraging semi-structured data to boost dense retrieval efficiency.
arXiv Detail & Related papers (2023-10-10T13:45:24Z)
- Boosting Event Extraction with Denoised Structure-to-Text Augmentation [52.21703002404442]
Event extraction aims to recognize pre-defined event triggers and arguments from texts.
Recent data augmentation methods often neglect the problem of grammatical incorrectness.
We propose DAEE, a denoised structure-to-text augmentation framework for event extraction.
arXiv Detail & Related papers (2023-05-16T16:52:07Z)
- Extractive Summarization via ChatGPT for Faithful Summary Generation [12.966825834765814]
This paper presents a thorough evaluation of ChatGPT's performance on extractive summarization.
We find that ChatGPT exhibits inferior extractive summarization performance in terms of ROUGE scores compared to existing supervised systems.
Applying an extract-then-generate pipeline with ChatGPT yields significant performance improvements over abstractive baselines in terms of summary faithfulness.
arXiv Detail & Related papers (2023-04-09T08:26:04Z)
- A New Sentence Extraction Strategy for Unsupervised Extractive Summarization Methods [26.326800624948344]
We model the task of extractive text summarization from the perspective of Information Theory.
To improve the feature distribution and to decrease the mutual information of summarization sentences, we propose a new sentence extraction strategy.
arXiv Detail & Related papers (2021-12-06T18:00:02Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Obtaining Better Static Word Embeddings Using Contextual Embedding Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z)
- Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models [1.3286165491120467]
We describe a new framework, NewsSumm, that includes many existing and new approaches for summarization including ILP and title-driven approaches.
We show that the new title-driven reduction idea leads to improvement in performance for both unsupervised and supervised approaches considered.
arXiv Detail & Related papers (2020-08-01T01:05:55Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
- Heavy-tailed Representations, Text Polarity Classification & Data Augmentation [11.624944730002298]
We develop a novel method to learn a heavy-tailed embedding with desirable regularity properties.
A classifier dedicated to the tails of the proposed embedding is obtained, whose performance outperforms the baseline.
Numerical experiments on synthetic and real text data demonstrate the relevance of the proposed framework.
arXiv Detail & Related papers (2020-03-25T19:24:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.