Generating Diversified Comments via Reader-Aware Topic Modeling and
Saliency Detection
- URL: http://arxiv.org/abs/2102.06856v1
- Date: Sat, 13 Feb 2021 03:50:31 GMT
- Title: Generating Diversified Comments via Reader-Aware Topic Modeling and
Saliency Detection
- Authors: Wei Wang, Piji Li, Hai-Tao Zheng
- Abstract summary: We propose a reader-aware topic modeling and saliency information detection framework to enhance the quality of generated comments.
For reader-aware topic modeling, we design a variational generative clustering algorithm for latent semantic learning and topic mining from reader comments.
- For saliency information detection, we introduce Bernoulli distribution estimation over news content to select salient information.
- Score: 25.16392119801612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic comment generation is a special and challenging task for
assessing a model's ability in news content comprehension and language
generation. Comments not only convey salient and interesting information in
news articles, but also imply varied reader characteristics, which we treat as
essential clues for diversity. However, most comment generation approaches
focus only on saliency information extraction, while the reader-aware factors
implied by comments are neglected. To address this issue,
we propose a unified reader-aware topic modeling and saliency information
detection framework to enhance the quality of generated comments. For
reader-aware topic modeling, we design a variational generative clustering
algorithm for latent semantic learning and topic mining from reader comments.
For saliency information detection, we introduce Bernoulli distribution
estimation over news content to select salient information. The obtained topic
representations as well as the selected saliency information are incorporated
into the decoder to generate diversified and informative comments. Experimental
results on three datasets show that our framework outperforms existing baseline
methods in terms of both automatic metrics and human evaluation. The potential
ethical issues are also discussed in detail.
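The saliency-selection idea described in the abstract — estimating a per-token Bernoulli distribution over the news content and keeping the tokens it selects — can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the linear scorer, the fixed threshold (a deterministic stand-in for sampling during training), and the toy embeddings are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def saliency_gate(token_embs, w, b=0.0, threshold=0.5):
    """Score each token of the news content with a linear layer,
    map the scores to Bernoulli selection probabilities with a
    sigmoid, and keep tokens whose probability exceeds the
    threshold."""
    probs = sigmoid(token_embs @ w + b)   # (seq_len,) selection probabilities
    mask = probs > threshold              # which tokens count as salient
    return probs, mask

# Toy news content: 5 tokens with 4-dimensional embeddings.
token_embs = rng.normal(size=(5, 4))
w = rng.normal(size=4)                    # hypothetical scorer weights
probs, mask = saliency_gate(token_embs, w)

# The selected token embeddings would then be fed to the decoder
# together with a topic representation to guide comment generation.
salient = token_embs[mask]
```

In the paper's framework the selected saliency information is concatenated with the learned topic representation before decoding; here `salient` simply stands in for that selected content.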
Related papers
- Multi-Review Fusion-in-Context [20.681734117825822]
Grounded text generation requires both content selection and content consolidation.
Recent works have proposed a modular approach, with separate components for each step.
This study lays the groundwork for further exploration of modular text generation in the multi-document setting.
arXiv Detail & Related papers (2024-03-22T17:06:05Z)
- Topic Modelling: Going Beyond Token Outputs [3.072340427031969]
This paper presents a novel approach towards extending the output of traditional topic modelling methods beyond a list of isolated tokens.
To measure the interpretability of the proposed outputs against those of the traditional topic modelling approach, independent annotators manually scored each output.
arXiv Detail & Related papers (2024-01-16T16:05:54Z)
- JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z)
- Discord Questions: A Computational Approach To Diversity Analysis in News Coverage [84.55145223950427]
We propose a new framework to assist readers in identifying source differences and gaining an understanding of news coverage diversity.
The framework is based on the generation of Discord Questions: questions with a diverse answer pool.
arXiv Detail & Related papers (2022-11-09T16:37:55Z)
- Not All Comments are Equal: Insights into Comment Moderation from a Topic-Aware Model [8.28576076054666]
We make our models topic-aware, incorporating semantic features from a topic model into the classification decision.
Our results show that topic information improves the performance of the model, increases its confidence in correct outputs, and helps us understand the model's outputs.
arXiv Detail & Related papers (2021-09-21T08:57:17Z)
- Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders [59.038157066874255]
We propose a novel framework called RankAE to perform chat summarization without employing manually labeled data.
RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously.
A denoising auto-encoder is designed to generate succinct but context-informative summaries based on the selected utterances.
arXiv Detail & Related papers (2020-12-14T07:31:17Z)
- Generating Pertinent and Diversified Comments with Topic-aware Pointer-Generator Networks [5.046104800241757]
We propose a novel generation model based on Topic-aware Pointer-Generator Networks (TPGN).
We design a keyword-level and topic-level encoder attention mechanism to capture topic information in the articles.
We integrate the topic information into pointer-generator networks to guide comment generation.
arXiv Detail & Related papers (2020-05-09T09:04:09Z)
- Exploring Explainable Selection to Control Abstractive Summarization [51.74889133688111]
We develop a novel framework that focuses on explainability.
A novel pair-wise matrix captures the sentence interactions, centrality, and attribute scores.
A sentence-deployed attention mechanism in the abstractor ensures the final summary emphasizes the desired content.
arXiv Detail & Related papers (2020-04-24T14:39:34Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
- ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension [53.037401638264235]
We present an evaluation server, ORB, that reports performance on seven diverse reading comprehension datasets.
The evaluation server places no restrictions on how models are trained, so it is a suitable test bed for exploring training paradigms and representation learning.
arXiv Detail & Related papers (2019-12-29T07:27:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.