Accurate Multivariate Stock Movement Prediction via Data-Axis Transformer with Multi-Level Contexts


How can we efficiently correlate multiple stocks for accurate stock movement prediction? Stock movement prediction has received growing interest in the data mining and machine learning communities due to its substantial impact on financial markets. One way to improve prediction accuracy is to exploit the correlations between multiple stocks, obtaining reliable evidence despite the random noise in individual prices. However, acquiring accurate correlations between stocks has been challenging because they are asymmetric and dynamic, and are also influenced by the global movement of the market. In this work, we propose DTML (Data-axis Transformer with Multi-Level contexts), a novel approach for stock movement prediction that learns the correlations between stocks in an end-to-end way. DTML captures asymmetric and dynamic correlations by a) learning temporal correlations within each stock, b) generating multi-level contexts based on a global market context, and c) utilizing a transformer encoder to learn inter-stock correlations. DTML achieves state-of-the-art accuracy on six datasets collected from stock markets in the US, China, Japan, and the UK, yielding up to 13.8%p higher profits than the best competitors and an annualized return of 44.4% in investment simulation.
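The three steps a) to c) can be illustrated with a small, dependency-free sketch. It is not the paper's implementation. It assumes raw feature vectors stand in for learned hidden states, that the global market context is summarized from a market index series, and that a hypothetical weight `beta` mixes that context into each stock. All function and variable names are illustrative.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(W, v):
    return [dot(row, v) for row in W]

def temporal_context(series):
    """(a) Attend over one series' time steps, using the last step as the query."""
    query = series[-1]
    weights = softmax([dot(h, query) for h in series])
    return [sum(w * h[k] for w, h in zip(weights, series)) for k in range(len(query))]

def multi_level_contexts(stock_contexts, market_context, beta=0.1):
    """(b) Mix the global market context into every stock context (beta is assumed)."""
    return [[s + beta * m for s, m in zip(ctx, market_context)] for ctx in stock_contexts]

def inter_stock_attention(contexts, Wq, Wk, Wv):
    """(c) One self-attention head across stocks rather than across time."""
    Q = [matvec(Wq, c) for c in contexts]
    K = [matvec(Wk, c) for c in contexts]
    V = [matvec(Wv, c) for c in contexts]
    scale = math.sqrt(len(Q[0]))
    attn = [softmax([dot(q, k) / scale for k in K]) for q in Q]
    out = [[sum(a * v[j] for a, v in zip(row, V)) for j in range(len(V[0]))] for row in attn]
    return out, attn

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Toy data: 3 stocks and 1 market index, 5 days, 4 features per day.
rng = random.Random(0)
n_stocks, n_days, dim = 3, 5, 4
stocks = [random_matrix(rng, n_days, dim) for _ in range(n_stocks)]
market = random_matrix(rng, n_days, dim)

stock_ctx = [temporal_context(s) for s in stocks]
market_ctx = temporal_context(market)
multi_ctx = multi_level_contexts(stock_ctx, market_ctx)
Wq, Wk, Wv = (random_matrix(rng, dim, dim) for _ in range(3))
outputs, attention = inter_stock_attention(multi_ctx, Wq, Wk, Wv)
```

Because the query and key projections differ, `attention[i][j]` is generally not equal to `attention[j][i]`, which is one way a model can express asymmetric correlations. Because the weights are recomputed from each input window, they can also change over time.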



We provide the six datasets used in the experiments of the paper. ACL18 and KDD17 come from previous work and are available in a GitHub repository. We collected and preprocessed the remaining four datasets following the data format of ACL18 and KDD17; the attached file contains these four datasets.

Name      Country   Stocks   Days    From         To           Download
ACL18     US        87       504     2014-01-01   2015-12-31   GitHub
KDD17     US        50       2,518   2007-01-01   2016-12-31   GitHub
NDX100    US        95       1,259   2013-01-01   2017-12-31   Attached file
CSI300    China     219      1,119   2015-06-01   2019-12-31   Attached file
NI225     Japan     51       856     2016-07-01   2019-12-31   Attached file
FTSE100   UK        24       1,134   2014-01-01   2018-06-30   Attached file
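For scripting against these datasets, it can help to keep the table in machine-readable form. The sketch below copies the table values into a list and runs a simple sanity check: the number of trading days per calendar year for each dataset. It assumes that From and To are inclusive calendar dates. The names `DATASETS` and `trading_days_per_year` are illustrative and are not part of the released files.

```python
from datetime import date

# (name, country, stocks, trading days, first date, last date), copied from the table.
DATASETS = [
    ("ACL18", "US", 87, 504, date(2014, 1, 1), date(2015, 12, 31)),
    ("KDD17", "US", 50, 2518, date(2007, 1, 1), date(2016, 12, 31)),
    ("NDX100", "US", 95, 1259, date(2013, 1, 1), date(2017, 12, 31)),
    ("CSI300", "China", 219, 1119, date(2015, 6, 1), date(2019, 12, 31)),
    ("NI225", "Japan", 51, 856, date(2016, 7, 1), date(2019, 12, 31)),
    ("FTSE100", "UK", 24, 1134, date(2014, 1, 1), date(2018, 6, 30)),
]

def trading_days_per_year(days, start, end):
    """Average trading days per calendar year over an inclusive date range."""
    calendar_days = (end - start).days + 1
    return days * 365.25 / calendar_days

for name, country, stocks, days, start, end in DATASETS:
    rate = trading_days_per_year(days, start, end)
    print(f"{name:8s} {country:6s} stocks={stocks:3d} days/year={rate:.1f}")
```

A rate far from the usual number of exchange sessions per year would suggest a mismatch between the listed date range and the downloaded files.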


Please cite our paper if you use the datasets [BibTeX].