Prior knowledge distillation based on financial time series
- URL: http://arxiv.org/abs/2006.09247v5
- Date: Thu, 26 Nov 2020 05:58:52 GMT
- Title: Prior knowledge distillation based on financial time series
- Authors: Jie Fang and Jianwu Lin
- Abstract summary: We propose to use neural networks to represent indicators and train a large network constructed of smaller networks as feature layers.
In numerical experiments, we find that our algorithm is faster and more accurate than traditional methods on real financial datasets.
- Score: 0.8756822885568589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the major characteristics of financial time series is that they
contain a large amount of non-stationary noise, which is challenging for deep
neural networks. People normally use various features to address this problem.
However, the performance of these features depends on the choice of
hyper-parameters. In this paper, we propose to use neural networks to represent
these indicators and train a large network constructed of smaller networks as
feature layers to fine-tune the prior knowledge represented by the indicators.
During back propagation, prior knowledge is transferred from human logic to
machine logic via gradient descent. Prior knowledge becomes the deep belief of
the neural network and teaches the network not to be affected by non-stationary
noise. Moreover, co-distillation is applied to distill the structure into a
much smaller size to reduce redundant features and the risk of overfitting. In
addition, under gradient descent the decisions of the smaller networks are more
robust and cautious than those of large networks. In numerical
experiments, we find that our algorithm is faster and more accurate than
traditional methods on real financial datasets. We also conduct experiments to
verify and comprehend the method.
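As a rough illustration of the idea, the following PyTorch sketch encodes a hand-crafted indicator as a tiny sub-network initialised from its formula, stacks several such sub-networks as the feature layer of a larger model, and adds a schematic distillation step toward a much smaller student. The indicator choice (a simple moving average), window length, layer sizes, and class names are all assumptions made for illustration; this is not the authors' implementation.
```python
import torch
import torch.nn as nn

WINDOW = 10  # hypothetical look-back window for the indicator


class IndicatorNet(nn.Module):
    """A hand-crafted indicator (here: a simple moving average) encoded as a tiny network."""

    def __init__(self, window: int = WINDOW):
        super().__init__()
        self.fc = nn.Linear(window, 1, bias=False)
        with torch.no_grad():                       # prior knowledge: start exactly at the
            self.fc.weight.fill_(1.0 / window)      # equal-weight moving-average formula

    def forward(self, x):                           # x: (batch, window) of past prices/returns
        return self.fc(x)                           # (batch, 1) indicator value


class PriorKnowledgeModel(nn.Module):
    """Large network whose feature layer is built from indicator sub-networks."""

    def __init__(self, n_indicators: int = 8, window: int = WINDOW):
        super().__init__()
        self.indicators = nn.ModuleList([IndicatorNet(window) for _ in range(n_indicators)])
        self.head = nn.Sequential(nn.Linear(n_indicators, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        feats = torch.cat([ind(x) for ind in self.indicators], dim=1)
        return self.head(feats)


model = PriorKnowledgeModel()
x = torch.randn(64, WINDOW)                         # toy batch of price windows
y = torch.randn(64, 1)                              # toy targets, e.g. next-period returns

# Back-propagation fine-tunes the indicator weights, i.e. the hand-crafted prior.
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Schematic distillation toward a much smaller student to reduce redundant features.
student = nn.Sequential(nn.Linear(WINDOW, 8), nn.ReLU(), nn.Linear(8, 1))
student_out = student(x)
teacher_out = model(x).detach()                     # teacher targets, no gradient to the teacher
distill_loss = (nn.functional.mse_loss(student_out, y)
                + nn.functional.mse_loss(student_out, teacher_out))
distill_loss.backward()
```
Because the indicator weights are ordinary parameters, gradient descent can adjust the hand-crafted prior, which is the sense in which human logic is transferred to machine logic in the abstract above.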
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation [36.41451383422967]
In scientific applications, the scale of neural networks is generally moderate, mainly to ensure the speed of inference.
Existing work has found that the powerful capabilities of neural networks are primarily due to their non-linearity.
We propose a condensation reduction algorithm to verify the feasibility of this idea in practical problems.
arXiv Detail & Related papers (2024-05-02T06:53:40Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions [3.04585143845864]
We show that as networks get deeper and deeper, they are more susceptible to becoming degenerate.
We use a simple algorithm that can accurately predict the level of degeneracy for any given fully connected ReLU network architecture.
arXiv Detail & Related papers (2023-06-02T13:02:52Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Rewiring Networks for Graph Neural Network Training Using Discrete Geometry [0.0]
Information over-squashing is a problem that significantly impacts the training of graph neural networks (GNNs).
In this paper, we investigate the use of discrete analogues of classical geometric notions of curvature to model information flow on networks and rewire them.
We show that these classical notions achieve state-of-the-art performance in GNN training accuracy on a variety of real-world network datasets.
arXiv Detail & Related papers (2022-07-16T21:50:39Z) - Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework of neural networks with regularization and proves its consistency.
Two types of activation functions, the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU), are considered.
arXiv Detail & Related papers (2022-06-22T23:33:39Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z) - Differentiable Sparsification for Deep Neural Networks [0.0]
We propose a fully differentiable sparsification method for deep neural networks.
The proposed method can learn both the sparsified structure and weights of a network in an end-to-end manner.
To the best of our knowledge, this is the first fully differentiable sparsification method.
arXiv Detail & Related papers (2019-10-08T03:57:04Z)
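As a side note on the last entry, differentiable sparsification is commonly realised with learnable gates that are relaxed to continuous values and driven toward zero by a sparsity penalty, so structure and weights are learned end-to-end. The sketch below is a generic, hypothetical PyTorch illustration of that mechanism, not code from the cited paper; the GatedLinear class, the sigmoid gate parameterisation, and the penalty weight are all assumptions.
```python
import torch
import torch.nn as nn


class GatedLinear(nn.Module):
    """Linear layer whose output units can be switched off by learnable gates."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.gate_logits = nn.Parameter(torch.zeros(out_features))  # one gate per unit

    def gates(self):
        return torch.sigmoid(self.gate_logits)      # soft, differentiable gates in (0, 1)

    def forward(self, x):
        return self.linear(x) * self.gates()


layer = GatedLinear(16, 32)
x = torch.randn(8, 16)
task_loss = layer(x).pow(2).mean()                  # placeholder task loss
sparsity_loss = layer.gates().sum()                 # L1-style pressure drives gates toward 0
(task_loss + 1e-2 * sparsity_loss).backward()       # structure and weights trained end-to-end
# After training, units whose gates are near zero can be pruned from the network.
```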
This list is automatically generated from the titles and abstracts of the papers on this site.