This repository contains a study on forecasting the trade direction of a stock from tick data, using two different models:
- Linear regression, which is used as an example to show that it is not well suited to this kind of problem. As a regression model, it predicts a quantity, which must then be compared against thresholds, rather than classifying the result directly with a label.
- LSTM neural network, which shows good results in this study when applied with a large set of features.
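The thresholding step mentioned for linear regression can be illustrated with a minimal sketch. The function name `direction_from_prediction`, the symmetric `threshold` parameter, and the three-way labelling (up, flat, down) are assumptions made for illustration; the thresholds actually used in `lin_reg.py` may differ.

```python
def direction_from_prediction(predicted_change, threshold=0.0):
    """Map a predicted price change to a direction label.

    Returns 1 (up) above +threshold, -1 (down) below -threshold,
    and 0 (flat) otherwise. The symmetric threshold is an assumption.
    """
    if predicted_change > threshold:
        return 1
    if predicted_change < -threshold:
        return -1
    return 0


# Hypothetical regression outputs (predicted price changes).
predictions = [0.04, -0.002, -0.03, 0.0]
labels = [direction_from_prediction(p, threshold=0.01) for p in predictions]
print(labels)  # [1, 0, -1, 0]
```

The label depends on where the threshold is set, which is one reason a regression model is an awkward fit for a classification task.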
Based on the information provided by the order book, several important features, such as Volume Order Imbalance, bid-ask spread, and mid-price basis, are computed to capture the imbalance between buy and sell orders that drives the price up or down.
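As a rough illustration of such features, the sketch below computes the bid-ask spread, the mid-price, and one common formulation of Volume Order Imbalance (VOI) from two consecutive top-of-book quotes. The dictionary keys (`bid_price`, `bid_size`, `ask_price`, `ask_size`) and the function names are hypothetical, and the exact definitions used in `functions.py` and in the report may differ.

```python
def bid_ask_spread(bid_price, ask_price):
    """Difference between the best ask and the best bid."""
    return ask_price - bid_price


def mid_price(bid_price, ask_price):
    """Average of the best bid and the best ask."""
    return (bid_price + ask_price) / 2.0


def volume_order_imbalance(prev, curr):
    """VOI between two consecutive quotes (one common formulation).

    Each quote is a dict with the hypothetical keys
    bid_price, bid_size, ask_price and ask_size.
    """
    # Bid side: a falling bid contributes nothing, an unchanged bid
    # contributes the change in size, a rising bid contributes its full size.
    if curr["bid_price"] < prev["bid_price"]:
        delta_bid = 0
    elif curr["bid_price"] == prev["bid_price"]:
        delta_bid = curr["bid_size"] - prev["bid_size"]
    else:
        delta_bid = curr["bid_size"]

    # Ask side: mirror image of the bid side.
    if curr["ask_price"] > prev["ask_price"]:
        delta_ask = 0
    elif curr["ask_price"] == prev["ask_price"]:
        delta_ask = curr["ask_size"] - prev["ask_size"]
    else:
        delta_ask = curr["ask_size"]

    return delta_bid - delta_ask


# Two made-up consecutive quotes.
prev_quote = {"bid_price": 100.0, "bid_size": 300, "ask_price": 100.5, "ask_size": 200}
curr_quote = {"bid_price": 100.0, "bid_size": 500, "ask_price": 100.5, "ask_size": 150}

print(bid_ask_spread(curr_quote["bid_price"], curr_quote["ask_price"]))  # 0.5
print(mid_price(curr_quote["bid_price"], curr_quote["ask_price"]))       # 100.25
print(volume_order_imbalance(prev_quote, curr_quote))                    # 250
```

In this made-up example the bid size grows while the ask size shrinks, so the VOI is positive, which is read as buying pressure.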
As described in the full study report 'Predicting tick's direction of a stock.pdf', the LSTM neural network was trained and tested under several conditions to find which features are most important to feed to the model, as well as the lookback period to use for predicting the next prices.
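To make the lookback idea concrete, here is a minimal sketch of turning a sequence of per-tick feature rows into fixed-length windows, the usual input layout for an LSTM (samples, lookback, features). The helper `make_windows` is hypothetical, and pairing each window with the label of the tick immediately after it is an assumption; `lstm_rnn.py` may build its sequences differently.

```python
def make_windows(rows, labels, lookback):
    """Build (window, target) pairs from per-tick feature rows.

    Each window holds `lookback` consecutive rows and is paired with
    the label of the tick that immediately follows it (an assumption).
    """
    windows, targets = [], []
    for end in range(lookback, len(rows)):
        windows.append(rows[end - lookback:end])
        targets.append(labels[end])
    return windows, targets


# Five ticks with a single made-up feature each, and made-up direction labels.
feature_rows = [[1], [2], [3], [4], [5]]
directions = [0, 1, 0, 1, 1]

windows, targets = make_windows(feature_rows, directions, lookback=3)
print(windows)  # [[[1], [2], [3]], [[2], [3], [4]]]
print(targets)  # [1, 1]
```

A longer lookback gives the model more history per sample but yields fewer training windows from the same data.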
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
You need Python 3.x to run the code. Multiple Python versions (2.x and 3.x) can be installed on the same system without problems. Install Python first, then SciPy and pymysql, since these packages depend on it.
In Ubuntu, Mint and Debian you can install Python 3 like this:
sudo apt-get install python3 python3-pip
Alongside Python, the SciPy packages are also required. In Ubuntu and Debian, the SciPy ecosystem can be installed with:
sudo apt-get install python3-numpy python3-scipy python3-matplotlib ipython3 jupyter-notebook python3-pandas python3-sympy python3-nose
The latest release of the scikit-learn machine learning package can be installed with pip:
pip3 install -U scikit-learn
Finally, the TensorFlow package for neural network modelling can also be installed with pip:
pip3 install tensorflow
For other Linux flavors, OS X and Windows, packages are available at:
http://www.python.org/getit/
https://www.scipy.org/install.html
https://scikit-learn.org/stable/install.html
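Once everything is installed, a quick check such as the sketch below can confirm that the packages are importable from the Python 3 interpreter. The `REQUIRED` list is an assumption derived from the installation steps above (`sklearn` is the import name of scikit-learn), and `missing_packages` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Import names assumed from the installation steps above.
REQUIRED = ["numpy", "scipy", "pandas", "matplotlib", "sklearn", "tensorflow"]


def missing_packages(names):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Running it with `python3` rather than `python` helps ensure that the Python 3 installation is the one being checked.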
- 'Predicting tick's direction of a stock.pdf': the report on the full study.
- 'lin_reg.py': one of the two main scripts, which runs the linear regression on the tick data.
- 'lstm_rnn.py': the other main script, for training and testing the LSTM neural network. Some of the code needs to be commented or uncommented depending on the data and the step being run.
- 'functions.py': helper functions that compute the various new features.
- 'TRAIN_DATA.csv': the original dataset for this project.
- 'PreData': a directory containing training statistics for both the linear regression and LSTM models.
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
We use SemVer for versioning. For the versions available, see the tags on this repository.
- David Cicoria - Initial work - DavidCico
See also the list of contributors who participated in this project.