SentimentArcs: A Novel Method for Self-Supervised Sentiment Analysis of
Time Series Shows SOTA Transformers Can Struggle Finding Narrative Arcs
- URL: http://arxiv.org/abs/2110.09454v1
- Date: Mon, 18 Oct 2021 16:45:31 GMT
- Title: SentimentArcs: A Novel Method for Self-Supervised Sentiment Analysis of
Time Series Shows SOTA Transformers Can Struggle Finding Narrative Arcs
- Authors: Jon Chun
- Abstract summary: This paper introduces SentimentArcs, a new self-supervised time series sentiment analysis methodology.
A large ensemble of diverse models provides a synthetic ground truth for self-supervised learning.
Simple visualizations exploit the temporal structure in narratives so domain experts can quickly spot trends.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: SOTA Transformer and DNN short text sentiment classifiers report over 97%
accuracy on narrow domains like IMDB movie reviews. Real-world performance is
significantly lower because traditional models overfit benchmarks and
generalize poorly to different or more open domain texts. This paper introduces
SentimentArcs, a new self-supervised time series sentiment analysis methodology
that addresses the two main limitations of traditional supervised sentiment
analysis: limited labeled training datasets and poor generalization. A large
ensemble of diverse models provides a synthetic ground truth for
self-supervised learning. Novel metrics jointly optimize an exhaustive search
across every possible corpus:model combination. The joint optimization over
both the corpus and model solves the generalization problem. Simple
visualizations exploit the temporal structure in narratives so domain experts
can quickly spot trends, identify key features, and note anomalies over
hundreds of arcs and millions of data points. To our knowledge, this is the
first self-supervised method for time series sentiment analysis and the largest
survey directly comparing real-world model performance on long-form narratives.
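Concretely, the core idea is: score a narrative sentence by sentence with several independent sentiment models, smooth each series into an arc, and treat the ensemble consensus as a synthetic ground truth against which each individual model is ranked. The snippet below is a minimal sketch of that idea, not the paper's code; it assumes two off-the-shelf lexical models (NLTK's VADER and TextBlob) standing in for the much larger, more diverse ensemble SentimentArcs actually uses.

```python
# Minimal sketch of the ensemble-as-synthetic-ground-truth idea (NOT the
# paper's implementation). VADER and TextBlob stand in for the paper's
# much larger, more diverse model ensemble.
import numpy as np
from nltk.sentiment import SentimentIntensityAnalyzer  # needs nltk.download("vader_lexicon")
from textblob import TextBlob


def sentiment_arcs(sentences, window=25):
    """Score every sentence with each model, then smooth each series into an arc."""
    vader = SentimentIntensityAnalyzer()
    raw = {
        "vader": [vader.polarity_scores(s)["compound"] for s in sentences],
        "textblob": [TextBlob(s).sentiment.polarity for s in sentences],
    }
    kernel = np.ones(window) / window  # simple moving-average smoothing
    return {name: np.convolve(vals, kernel, mode="same") for name, vals in raw.items()}


def synthetic_ground_truth(arcs):
    """The ensemble median arc serves as the self-supervised reference."""
    return np.median(np.vstack(list(arcs.values())), axis=0)


def rank_models(arcs, reference):
    """Rank models by mean absolute deviation from the ensemble reference arc."""
    return sorted((float(np.abs(arc - reference).mean()), name) for name, arc in arcs.items())
```

In the paper, this kind of ranking is optimized jointly over every corpus:model combination rather than one corpus at a time, and the resulting arcs are plotted so a domain expert can visually confirm peaks, crises, and resolutions against the narrative.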
Related papers
- Personality Analysis from Online Short Video Platforms with Multi-domain Adaptation [16.555668668581237]
Personality analysis from online short videos has gained prominence due to its applications in personalized recommendation systems, sentiment analysis, and human-computer interaction.
Traditional assessment methods, such as questionnaires based on the Big Five Personality Framework, are limited by self-report biases and are impractical for large-scale or real-time analysis.
We propose a novel multi-modal personality analysis framework that addresses challenges by synchronizing and integrating features from multiple modalities.
arXiv Detail & Related papers (2024-10-26T03:29:32Z)
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- Time Series Representation Models [2.724184832774005]
Time series analysis remains a major challenge due to its sparse characteristics, high dimensionality, and inconsistent data quality.
Recent advancements in transformer-based techniques have enhanced capabilities in forecasting and imputation.
We propose a new architectural concept for time series analysis based on introspection.
arXiv Detail & Related papers (2024-05-28T13:25:31Z)
- TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis [32.854449155765344]
We propose a simple tokenizer architecture that embeds time series data from varying domains using a discrete vectorized representation learned in a self-supervised manner.
We study the efficacy of TOTEM with an extensive evaluation on 17 real world time series datasets across 3 tasks.
arXiv Detail & Related papers (2024-02-26T09:11:12Z)
- Towards Debiasing Frame Length Bias in Text-Video Retrieval via Causal Intervention [72.12974259966592]
We present a unique and systematic study of a temporal bias due to frame length discrepancy between training and test sets of trimmed video clips.
We propose a causal debiasing approach and perform extensive experiments and ablation studies on the Epic-Kitchens-100, YouCook2, and MSR-VTT datasets.
arXiv Detail & Related papers (2023-09-17T15:58:27Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further elaborate the robustness metric: a model is judged robust only if its performance is consistently accurate across the overall cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- When Neural Networks Fail to Generalize? A Model Sensitivity Perspective [82.36758565781153]
Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions.
This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG).
We empirically ascertain a property of a model that correlates strongly with its generalization, which we coin "model sensitivity".
We propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies.
arXiv Detail & Related papers (2022-12-01T20:15:15Z)
- Ensemble Creation via Anchored Regularization for Unsupervised Aspect Extraction [1.8591803874887636]
Unsupervised aspect-based sentiment analysis allows us to generate insights without investing time or money in generating labels.
One of the models we improve upon is ABAE, which reconstructs sentences as a linear combination of the aspect terms present in them.
In this research we explore how we can use information from another unsupervised model to regularize ABAE, leading to better performance.
arXiv Detail & Related papers (2022-10-13T08:23:56Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only on the source dataset and unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Using Human Psychophysics to Evaluate Generalization in Scene Text Recognition Models [7.294729862905325]
We characterize two important scene text recognition models by measuring their domains.
These domains specify the readers' ability to generalize to different word lengths, fonts, and amounts of occlusion.
arXiv Detail & Related papers (2020-06-30T19:51:26Z)