Book Success Prediction with Pretrained Sentence Embeddings and
Readability Scores
- URL: http://arxiv.org/abs/2007.11073v2
- Date: Tue, 5 Oct 2021 16:24:51 GMT
- Title: Book Success Prediction with Pretrained Sentence Embeddings and
Readability Scores
- Authors: Muhammad Khalifa and Aminul Islam
- Abstract summary: We propose a model that leverages pretrained sentence embeddings along with various readability scores for book success prediction.
Our proposed model outperforms strong baselines for this task by as much as 6.4% in F1 score.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the potential success of a book in advance is vital in many
applications. It could help publishers and readers decide whether a book is
worth publishing or reading, respectively. In this paper, we propose a model
that leverages pretrained
sentence embeddings along with various readability scores for book success
prediction. Unlike previous methods, the proposed method requires no
count-based, lexical, or syntactic features. Instead, we use a convolutional
neural network over pretrained sentence embeddings and leverage different
readability scores through a simple concatenation operation. Our proposed model
outperforms strong baselines for this task by as much as 6.4% in F1 score.
Moreover, our experiments show that, according to our model, only the first 1K
sentences of a book suffice to predict its potential success.
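The pipeline described in the abstract (a CNN over pretrained sentence embeddings, global max pooling, and a simple concatenation with readability scores before the output layer) can be illustrated with a minimal NumPy sketch. This is a toy with random weights, not the authors' implementation; all dimensions, names, and the single-filter-width convolution are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(X, W, b):
    """1D convolution with ReLU over the sentence axis.
    X: (n_sentences, d) pretrained sentence embeddings.
    W: (k, d, f) filters of width k producing f feature maps."""
    k, d, f = W.shape
    n = X.shape[0] - k + 1
    out = np.empty((n, f))
    for i in range(n):
        # Each window of k consecutive sentence embeddings -> f activations.
        out[i] = np.tensordot(X[i:i + k], W, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

def predict_success(sent_embs, readability, Wc, bc, Wo, bo):
    """Toy book-success probability: CNN features max-pooled over
    sentences, concatenated with readability scores, then a sigmoid."""
    h = conv1d_relu(sent_embs, Wc, bc).max(axis=0)  # global max pooling -> (f,)
    z = np.concatenate([h, readability])            # simple concatenation
    logit = z @ Wo + bo
    return 1.0 / (1.0 + np.exp(-logit))             # P(success)

# Hypothetical sizes: embedding dim, filter width, n filters, n readability scores.
d, k, f, r = 16, 3, 8, 4
sent_embs = rng.normal(size=(100, d))    # stand-in for the book's sentence embeddings
readability = rng.normal(size=r)         # stand-in for e.g. Flesch, Gunning fog scores
Wc = rng.normal(size=(k, d, f)) * 0.1
bc = np.zeros(f)
Wo = rng.normal(size=(f + r,)) * 0.1
bo = 0.0
p = predict_success(sent_embs, readability, Wc, bc, Wo, bo)
```

Feeding only the first 1K sentence embeddings into `sent_embs` corresponds to the truncation experiment the abstract reports; in practice the weights would be learned and the sentence embeddings taken from a pretrained encoder.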
Related papers
- Ensembling Finetuned Language Models for Text Classification [55.15643209328513]
Finetuning is a common practice across different communities to adapt pretrained models to particular tasks.
Ensembles of neural networks are typically used to boost performance and provide reliable uncertainty estimates.
We present a metadataset with predictions from five large finetuned models on six datasets and report results of different ensembling strategies.
arXiv Detail & Related papers (2024-10-25T09:15:54Z) - Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition [5.575078692353885]
We propose a new model for multi-token prediction in transformers, aiming to enhance sampling efficiency without compromising accuracy.
By generalizing the standard single-token prediction head to a rank-$r$ canonical probability decomposition, we develop an improved model that predicts multiple tokens simultaneously.
arXiv Detail & Related papers (2024-10-23T11:06:36Z) - Using Full-Text Content to Characterize and Identify Best Seller Books [0.6442904501384817]
We consider the task of predicting whether a book will become a best seller from the standpoint of literary works.
Unlike previous approaches, we focus on the full content of books and consider both visualization and classification tasks.
Our results show that it is infeasible to predict the success of books with high accuracy using only the full content of the texts.
arXiv Detail & Related papers (2022-10-05T15:40:25Z) - Checklist Models for Improved Output Fluency in Piano Fingering
Prediction [33.52847881359949]
We present a new approach for the task of predicting fingerings for piano music.
We put forward a checklist system, trained via reinforcement learning, that maintains a representation of recent predictions.
We demonstrate significant gains in performability, directly attributable to improvements on fluency-related metrics.
arXiv Detail & Related papers (2022-09-12T21:27:52Z) - From Good to Best: Two-Stage Training for Cross-lingual Machine Reading
Comprehension [51.953428342923885]
We develop a two-stage approach to enhance the model performance.
The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer.
The second stage focuses on precision: an answer-aware contrastive learning mechanism is developed to learn the fine difference between the accurate answer and other candidates.
arXiv Detail & Related papers (2021-12-09T07:31:15Z) - A framework for predicting, interpreting, and improving Learning
Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student.
ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z) - Toward Better Storylines with Sentence-Level Language Models [54.91921545103256]
We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
arXiv Detail & Related papers (2020-05-11T16:54:19Z) - Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models
via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z) - Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
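The transductive prototype update described in the last entry can be sketched numerically. In the paper the per-query confidences are meta-learned; here they are simply given, and the function name and weighting scheme are hypothetical illustrations of confidence-weighted prototype refinement.

```python
import numpy as np

def refine_prototype(proto, queries, confidences):
    """Refine a class prototype with unlabeled query embeddings,
    weighting each query by a confidence score in [0, 1].
    proto: (d,) initial prototype (e.g. mean of support embeddings).
    queries: (q, d) unlabeled query embeddings.
    confidences: length-q confidence weights (meta-learned in the paper)."""
    conf = np.asarray(confidences, dtype=float)[:, None]          # (q, 1)
    # Confidence-weighted average of the prototype and the queries:
    return (proto + (conf * queries).sum(axis=0)) / (1.0 + conf.sum())

# Fully confident in the first query, zero confidence in the second,
# so the prototype moves halfway toward the first query only.
refined = refine_prototype(np.zeros(2),
                           np.array([[1.0, 0.0], [0.0, 1.0]]),
                           [1.0, 0.0])  # -> [0.5, 0.0]
```

A query with confidence 0 leaves the prototype untouched, while confidence 1 counts the query as strongly as the original prototype itself; meta-learning these weights is what the paper contributes.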
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.