Denoised Labels for Financial Time-Series Data via Self-Supervised
Learning
- URL: http://arxiv.org/abs/2112.10139v1
- Date: Sun, 19 Dec 2021 12:54:20 GMT
- Title: Denoised Labels for Financial Time-Series Data via Self-Supervised
Learning
- Authors: Yanqing Ma, Carmine Ventre, Maria Polukarov
- Abstract summary: This work takes inspiration from the application of image classification to trading and from the success of self-supervised learning.
We investigate the idea of applying computer vision techniques to financial time series to reduce noise exposure.
Our results show that our denoised labels improve the performance of the downstream learning algorithm.
- Score: 5.743034166791607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The introduction of electronic trading platforms effectively changed the
organisation of traditional systemic trading from quote-driven markets into
order-driven markets. Its convenience led to an exponentially increasing amount
of financial data, which is however hard to use for the prediction of future
prices, due to the low signal-to-noise ratio and the non-stationarity of
financial time series. Simpler classification tasks -- where the goal is to
predict the direction of future price movements -- need sufficiently reliable
labels for supervised learning algorithms to generalise well. Labelling
financial data is, however, less well defined than in other domains: did the
price go up because of noise or because of signal? The existing labelling methods
have limited countermeasures against noise and limited effects in improving
learning algorithms. This work takes inspiration from the application of image
classification to trading and from the success of self-supervised learning. We
investigate the idea of applying computer vision techniques to financial time
series to reduce the
noise exposure and hence generate correct labels. We look at the label
generation as the pretext task of a self-supervised learning approach and
compare the naive (and noisy) labels, commonly used in the literature, with the
labels generated by a denoising autoencoder for the same downstream
classification task. Our results show that our denoised labels improve the
performance of the downstream learning algorithm, for both small and large
datasets. We further show that the signals we obtain can be used to effectively
trade with binary strategies. We suggest that, with the proposed techniques,
self-supervised learning constitutes a powerful framework for generating
"better" financial labels that are useful for studying the underlying patterns
of the market.
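To make the comparison of labelling schemes concrete, the following is a minimal sketch, under assumed settings, of the two approaches: a naive directional label taken as the sign of the raw forward return, and a label read off the output of a small denoising autoencoder trained as a pretext task on price windows. The window length, horizon, architecture and noise model are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: naive directional labels vs. labels read off a
# denoising-autoencoder reconstruction of the price series (assumed setup).
import numpy as np
import torch
import torch.nn as nn

WINDOW, HORIZON = 32, 5  # assumed lookback window and label horizon


def naive_labels(prices: np.ndarray, horizon: int = HORIZON) -> np.ndarray:
    """Noisy baseline: 1 if the raw price is higher `horizon` steps ahead."""
    return (prices[horizon:] > prices[:-horizon]).astype(np.int64)


def make_windows(prices: np.ndarray, window: int = WINDOW) -> torch.Tensor:
    """Overlapping, per-window standardised price segments."""
    segs = np.lib.stride_tricks.sliding_window_view(prices, window).copy()
    segs = (segs - segs.mean(1, keepdims=True)) / (segs.std(1, keepdims=True) + 1e-8)
    return torch.tensor(segs, dtype=torch.float32)


class DenoisingAE(nn.Module):
    """Small MLP autoencoder; the pretext task is reconstructing clean windows."""

    def __init__(self, window: int = WINDOW, latent: int = 8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(window, 64), nn.ReLU(), nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, window))

    def forward(self, x):
        return self.dec(self.enc(x))


def train_dae(windows: torch.Tensor, epochs: int = 200, noise_std: float = 0.1) -> DenoisingAE:
    """Denoising pretext: reconstruct each window from an additively corrupted copy."""
    model = DenoisingAE(windows.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        noisy = windows + noise_std * torch.randn_like(windows)
        loss = nn.functional.mse_loss(model(noisy), windows)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


def denoised_labels(prices: np.ndarray, horizon: int = HORIZON) -> np.ndarray:
    """Label for time t = direction of the denoised move from t to t + horizon.

    Each window ends at t + horizon, so comparing the reconstructed endpoint with
    the reconstructed point `horizon` steps earlier looks ahead exactly like the
    naive label, but on the autoencoder's denoised view of the series.
    """
    windows = make_windows(prices)
    model = train_dae(windows)
    with torch.no_grad():
        recon = model(windows).numpy()
    return (recon[:, -1] > recon[:, -1 - horizon]).astype(np.int64)
```

Both label sets would then feed the same downstream classifier; the claim above is that the denoised variant generalises better on both small and large datasets.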
Related papers
- ERASE: Error-Resilient Representation Learning on Graphs for Label Noise
Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method outperforms multiple baselines by clear margins across a broad range of noise levels and enjoys great scalability.
arXiv Detail & Related papers (2023-12-13T17:59:07Z)
- Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels [65.92994348757743]
We demonstrate that a simple baseline using cross-entropy loss, combined with widely used regularization strategies, can outperform state-of-the-art methods.
Our findings suggest that employing a combination of regularization strategies can be more effective than intricate algorithms in tackling the challenges of learning with noisy labels.
arXiv Detail & Related papers (2023-07-11T05:58:20Z)
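As a rough illustration of the finding above (cross-entropy plus common regularisers), the sketch below pairs plain cross-entropy with label smoothing and mixup. These two regularisers are chosen here as representative examples; the exact combination studied in the paper may differ.

```python
import torch
import torch.nn as nn


def mixup(x, y, num_classes: int, alpha: float = 0.2):
    """Mixup: convex combinations of inputs and one-hot targets."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    y_onehot = nn.functional.one_hot(y, num_classes).float()
    return lam * x + (1 - lam) * x[perm], lam * y_onehot + (1 - lam) * y_onehot[perm]


def smoothed_cross_entropy(logits, soft_targets, smoothing: float = 0.1):
    """Cross-entropy against label-smoothed soft targets."""
    n = logits.size(-1)
    soft = soft_targets * (1.0 - smoothing) + smoothing / n
    log_probs = nn.functional.log_softmax(logits, dim=-1)
    return -(soft * log_probs).sum(dim=-1).mean()


def train_step(model, optimizer, x, y, num_classes: int) -> float:
    """One step of the 'simple baseline': cross-entropy + label smoothing + mixup."""
    x_mix, y_mix = mixup(x, y, num_classes)
    loss = smoothed_cross_entropy(model(x_mix), y_mix)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```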
- AutoWS: Automated Weak Supervision Framework for Text Classification [1.748907524043535]
We propose a novel framework for increasing the efficiency of the weak supervision process while decreasing the dependency on domain experts.
Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to numerous unlabeled data.
arXiv Detail & Related papers (2023-02-07T07:12:05Z)
- Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmable weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL) as it creates losses directly from labeling functions without going through the intermediate step of a label.
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
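A heavily simplified, hypothetical reading of the Losses over Labels idea: each labeling function contributes its own loss term on the examples it covers, so the model trains against the functions' raw votes rather than a single aggregated label. The abstention convention (vote of -1) and the uniform weights below are assumptions for illustration, not the paper's construction.

```python
import torch
import torch.nn as nn


def losses_over_lf_votes(logits, lf_votes, lf_weights=None):
    """Build a training loss directly from labeling-function votes.

    logits:   (batch, num_classes) model outputs.
    lf_votes: (batch, num_lfs) class index voted by each LF, -1 where it abstains.
    """
    num_lfs = lf_votes.size(1)
    weights = torch.ones(num_lfs) if lf_weights is None else lf_weights
    total = torch.zeros(())
    weight_sum = torch.zeros(())
    for j in range(num_lfs):
        covered = lf_votes[:, j] >= 0  # examples this labeling function labels
        if covered.any():
            ce = nn.functional.cross_entropy(logits[covered], lf_votes[covered, j].long())
            total = total + weights[j] * ce
            weight_sum = weight_sum + weights[j]
    return total / weight_sum.clamp(min=1e-8)
```

Aggregation-based weak supervision would instead first collapse the votes into one pseudo-label per example and apply a standard loss to it.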
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- Investigating Power laws in Deep Representation Learning [4.996066540156903]
We propose a framework to evaluate the quality of representations in unlabelled datasets.
We estimate the coefficient of the power law, $\alpha$, across three key attributes which influence representation learning.
Notably, $\alpha$ is computable from the representations without knowledge of any labels, thereby offering a framework to evaluate the quality of representations in unlabelled datasets.
arXiv Detail & Related papers (2022-02-11T18:11:32Z)
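One common, label-free way to estimate such a power-law coefficient, and a plausible reading of the $\alpha$ above, is to fit the decay of the representation covariance eigenspectrum on a log-log scale. The sketch below does that; the rank range and estimator details are assumptions, and the paper's exact procedure may differ.

```python
import numpy as np


def powerlaw_alpha(reps: np.ndarray, k_min: int = 10, k_max: int = 500) -> float:
    """Estimate the eigenspectrum decay exponent of a representation matrix.

    reps: (num_samples, dim) features extracted from an unlabelled dataset.
    Fits eigenvalue_k ~ k^(-alpha) over ranks [k_min, k_max] by least squares
    in log-log space and returns alpha.
    """
    centred = reps - reps.mean(axis=0, keepdims=True)
    cov = centred.T @ centred / (len(centred) - 1)
    eig = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues in descending order
    ranks = np.arange(1, len(eig) + 1)
    sel = slice(k_min, min(k_max, int((eig > 1e-12).sum())))
    slope, _ = np.polyfit(np.log(ranks[sel]), np.log(eig[sel]), 1)
    return -slope  # positive alpha: spectrum decays roughly like k^(-alpha)
```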
- Data Consistency for Weakly Supervised Learning [15.365232702938677]
Training machine learning models involves using large amounts of human-annotated data.
We propose a novel weak supervision algorithm that processes noisy labels, i.e., weak signals.
We show that it significantly outperforms state-of-the-art weak supervision methods on both text and image classification tasks.
arXiv Detail & Related papers (2022-02-08T16:48:19Z)
- Trade When Opportunity Comes: Price Movement Forecasting via Locality-Aware Attention and Iterative Refinement Labeling [11.430440350359993]
We propose LARA, a novel price movement forecasting framework with two main components.
LA-Attention extracts potentially profitable samples through a masked attention scheme.
RA-Labeling refines the noisy labels of potentially profitable samples.
LARA significantly outperforms several machine learning based methods on the Qlib quantitative investment platform.
arXiv Detail & Related papers (2021-07-26T05:52:42Z)
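Without reproducing LA-Attention or RA-Labeling themselves, the loop below is a generic, hypothetical sketch of iterative label refinement in a similar spirit: retrain a classifier and overwrite labels only where it confidently disagrees with the current noisy label. The classifier, confidence threshold and round count are all assumptions, not LARA's procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def refine_labels(X: np.ndarray, noisy_y: np.ndarray, rounds: int = 3, conf: float = 0.9):
    """Generic iterative label refinement (labels assumed to be 0..K-1 integers)."""
    y = noisy_y.copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf.fit(X, y)
        proba = clf.predict_proba(X)
        pred = proba.argmax(axis=1)
        flip = (proba.max(axis=1) >= conf) & (pred != y)  # confident disagreements
        if not flip.any():
            break
        y[flip] = pred[flip]
    return y, clf
```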
- Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise arising from auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
arXiv Detail & Related papers (2021-05-10T14:43:11Z)
- Active Learning for Noisy Data Streams Using Weak and Strong Labelers [3.9370369973510746]
We consider a novel weak and strong labeler problem inspired by humans' natural ability for labeling.
We propose an on-line active learning algorithm that consists of four steps: filtering, adding diversity, informative sample selection, and labeler selection.
We derive a decision function that measures the information gain by combining the informativeness of individual samples and model confidence.
arXiv Detail & Related papers (2020-10-27T09:18:35Z)
- Learning Not to Learn in the Presence of Noisy Labels [104.7655376309784]
We show that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption.
We show that training with this loss function encourages the model to "abstain" from learning on the data points with noisy labels.
arXiv Detail & Related papers (2020-02-16T09:12:27Z)
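The gambler's loss behind this result is compact enough to write out: the network predicts one extra "abstain" output, and each example contributes -log(p_y + p_abstain / o) for a payoff hyperparameter o with 1 < o <= num_classes. The sketch below is a straightforward PyTorch rendering; the default payoff value is an assumption rather than the paper's setting.

```python
import torch
import torch.nn.functional as F


def gamblers_loss(logits: torch.Tensor, targets: torch.Tensor, payoff: float = 2.0) -> torch.Tensor:
    """Gambler's loss with an explicit abstention output.

    logits:  (batch, num_classes + 1); the last column is the 'abstain' option.
    targets: (batch,) integer class labels in [0, num_classes).
    payoff:  reward o for betting on a class, 1 < o <= num_classes (assumed value).
    """
    probs = F.softmax(logits, dim=-1)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # p_y
    p_abstain = probs[:, -1]
    return -torch.log(p_true + p_abstain / payoff + 1e-12).mean()
```

When a label looks wrong, the cheapest way to lower this loss is to shift probability mass onto the abstention output, which is the "abstain from learning on noisy points" behaviour described above.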
This list is automatically generated from the titles and abstracts of the papers on this site.