Abstract: The amount of money that online advertisers pay to news sites is tied
to click counts. This business model has pushed some news sites to resort to
clickbait, i.e., headlines that use hyperbolic and enticing words, and sometimes
unfinished sentences, to purposefully tease readers.
Some Indonesian online news sites have also adopted clickbait, which indirectly
degrades the credibility of other, established news sites. A neural network
that uses the pre-trained multilingual language model M-BERT as an embedding
layer, followed by a 100-node hidden layer and topped with a sigmoid
classifier, was trained to detect clickbait headlines. Trained on a dataset of
6632 headlines, the classifier performed well.
Evaluated with 5-fold cross-validation, it achieved an accuracy of 0.914, an
F1-score of 0.914, a precision of 0.916, and a ROC-AUC of 0.92. The use of
multilingual BERT for an Indonesian text classification task was tested and
can be enhanced further. Future possibilities, societal impact, and
limitations of clickbait detection are discussed.
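
As a rough illustration of the classifier head described in the abstract, the sketch below passes a fixed-size headline embedding through a 100-node hidden layer and a single sigmoid output. It is a minimal sketch under stated assumptions, not the authors' implementation: the 768-dimensional input (the hidden size of BERT-base models), the ReLU hidden activation, the random weights, the random stand-in embedding, and all function names are assumptions made for illustration. In practice the embedding would come from M-BERT and the weights would be learned from the labeled headlines.

```python
import math
import random

EMBED_DIM = 768  # assumption: hidden size of a BERT-base model such as M-BERT
HIDDEN = 100     # from the abstract: 100-node hidden layer


def init_layer(n_in, n_out, rng):
    """Create an untrained dense layer: n_out rows of n_in weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine transform: one dot product plus bias per output node."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    # Assumption: the abstract does not name the hidden activation.
    return [max(0.0, v) for v in values]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def clickbait_probability(embedding, params):
    """Embedding -> 100-node hidden layer -> sigmoid probability."""
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(embedding, w1, b1))
    logit = dense(hidden, w2, b2)[0]
    return sigmoid(logit)


rng = random.Random(0)
params = (init_layer(EMBED_DIM, HIDDEN, rng), init_layer(HIDDEN, 1, rng))

# Stand-in for an M-BERT headline embedding (hypothetical random vector).
fake_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]

p = clickbait_probability(fake_embedding, params)
label = "clickbait" if p >= 0.5 else "non-clickbait"
print(f"p(clickbait) = {p:.3f} -> {label}")
```

The 0.5 decision threshold is likewise an assumption; with untrained weights the printed probability carries no meaning and only demonstrates the data flow.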