Stock Price Forecasting with LSTM

What Is an LSTM Network?

A long short-term memory (LSTM) network is a type of recurrent neural network (RNN). An RNN is a type of artificial neural network designed for sequential or time series data. Unlike a feed-forward neural network, an RNN carries a hidden state from one time step to the next, so there is a time relationship (memory) between samples.

LSTMs are an upgraded variant of the RNN. Plain RNNs suffer from vanishing and exploding gradient problems, which occur when the gradients become either too small or too large as they are propagated back through many time steps during backpropagation.
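As a rough illustration of why this happens, the sketch below treats the gradient as a single number that is multiplied by about the same factor at every time step. This is a scalar simplification, not how a real RNN computes its gradients, and the factors 0.9 and 1.1 and the 50 steps are arbitrary assumptions.

```python
# Simplified scalar view of backpropagation through time: the gradient is
# multiplied by roughly the same factor once per time step.
steps = 50  # assumed sequence length

shrinking = 0.9 ** steps  # factor below 1: the gradient shrinks toward zero
growing = 1.1 ** steps    # factor above 1: the gradient grows very large

print(f"0.9^{steps} = {shrinking:.5f}")
print(f"1.1^{steps} = {growing:.2f}")
```

Even factors close to 1 lead to very small or very large values once they are compounded over many steps.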

LSTM can refer to:

An LSTM unit (or neuron),
An LSTM layer (many LSTM units), or
An LSTM neural network (a neural network with LSTM units or layers).

To implement an LSTM layer using Keras, this is the basic code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))

The layer has 32 units (neurons). In input_shape, the 10 is the number of timesteps and the 64 is the number of features per timestep.
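To check these numbers, the following sketch builds the same layer and inspects its output shape and parameter count. It assumes TensorFlow's bundled Keras and declares the input with an Input object instead of the input_shape argument, which should be equivalent here.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(10, 64)),  # 10 timesteps, 64 features per timestep
    LSTM(32),               # 32 units: one output vector of length 32
])

output_shape = tuple(model.output_shape)  # first entry is the batch dimension
n_params = model.count_params()
print(output_shape, n_params)
```

The parameter count follows from the four internal weight sets of an LSTM unit: 4 * (units * (features + units) + units).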

LSTM Model for Stock Price Forecasting

In this section, I explain a very basic LSTM model for stock price forecasting. It uses only the price as a feature, and the number of timesteps is one. You can get the Jupyter notebook file from my repository.

coding_for_finance/lstm_stock_forecast_0.ipynb at main · weenslab/coding_for_finance (GitHub)

Import Library

To begin our project, we need to install the yfinance library to get stock price data. Then, we need to import several libraries.

Download Dataset

I use the stock price of BBCA, one of the four big banks in Indonesia. To download the dataset, you can specify the start date of the data along with the time zone. The stock price data has a timeframe of 1 day, so there is one row per trading day.
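A minimal sketch of the download step is shown below. The helper name download_daily_prices, the Yahoo Finance ticker "BBCA.JK", and the start date are assumptions for illustration, and time zone handling is left out because the original notebook cell is not reproduced here.

```python
def download_daily_prices(ticker="BBCA.JK", start="2015-01-01"):
    """Download daily OHLCV data from Yahoo Finance (needs network access)."""
    # Imported here so that defining the function does not require yfinance.
    import yfinance as yf

    # interval="1d" requests one row per trading day.
    return yf.download(ticker, start=start, interval="1d")
```

Calling df = download_daily_prices() should return a pandas DataFrame indexed by date, provided yfinance is installed and the network is reachable.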

Remove Unnecessary Columns

We want to do time series forecasting with only one feature, which is price. Therefore, we need to remove other features from the dataset.
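The column selection can be sketched as follows. The rows are made up, and using the "Close" column as the price is an assumption, since the text only says "price".

```python
import pandas as pd

# Hypothetical rows shaped like a daily OHLCV download.
df = pd.DataFrame(
    {
        "Open": [100.0, 102.0, 101.0],
        "High": [103.0, 104.0, 105.0],
        "Low": [99.0, 100.0, 100.0],
        "Close": [102.0, 101.0, 104.0],
        "Volume": [1000, 1200, 900],
    },
    index=pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
)

# Keep only the price column; double brackets keep the result a DataFrame.
prices = df[["Close"]]
print(prices)
```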

Data Normalization

Normalization converts the values of numeric columns in the dataset to a common scale, which usually helps the model train more reliably. To scale the training dataset, we use scikit-learn's MinMaxScaler with an output range of zero to one.
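As a small illustration with made-up prices, MinMaxScaler maps the smallest value to 0 and the largest to 1:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

prices = np.array([[100.0], [150.0], [200.0]])  # made-up closing prices

scaler = MinMaxScaler(feature_range=(0, 1))
# Each value becomes (x - min) / (max - min), computed per column.
scaled = scaler.fit_transform(prices)
print(scaled.ravel())
```

The fitted scaler should be kept, because it is needed later to convert predictions back to prices.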

Incorporate Timesteps into Data

Before an LSTM can be used, the time series dataset must be reframed as a supervised learning dataset: from a single sequence to pairs of input and output sequences. The function series_to_supervised() is used to convert the data.

In time series forecasting terminology, the current time $(t)$ and future times $(t+1, \dots, t+n)$ are the forecast times, and past observations $(t-1, \dots, t-n)$ are used to make the forecasts.
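The sketch below shows one way such a function could look for a single price series. It is a simplified version written for illustration, not necessarily the exact function used in the notebook, and the column names are assumptions.

```python
import pandas as pd


def series_to_supervised(values, n_in=1, n_out=1):
    """Frame a univariate series as rows of past inputs and forecast outputs."""
    series = pd.Series(values, dtype="float64")
    columns = {}
    # Input columns: t-n_in, ..., t-1 (past observations).
    for lag in range(n_in, 0, -1):
        columns[f"t-{lag}"] = series.shift(lag)
    # Output columns: t, t+1, ..., t+n_out-1 (values to forecast).
    for step in range(n_out):
        name = "t" if step == 0 else f"t+{step}"
        columns[name] = series.shift(-step)
    frame = pd.DataFrame(columns)
    # Shifting leaves NaN at the edges of the series; drop those rows.
    return frame.dropna().reset_index(drop=True)


supervised = series_to_supervised([10, 20, 30, 40, 50], n_in=1, n_out=1)
print(supervised)
```

With one input step and one output step, each row pairs the previous price (column t-1) with the price to predict (column t).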

Train-Test Split

Here, we divide the data into a training set and a test set.
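A chronological split could be sketched like this. The 80/20 ratio is an assumption, since the text does not state the ratio, and the rows are kept in time order so the test set lies strictly after the training set.

```python
import numpy as np

# Hypothetical supervised data: 10 rows of [input, target].
data = np.arange(20, dtype=float).reshape(10, 2)

train_fraction = 0.8  # assumed ratio
split_index = int(len(data) * train_fraction)

# No shuffling: earlier rows train the model, later rows test it.
train, test = data[:split_index], data[split_index:]
X_train, y_train = train[:, :-1], train[:, -1]
X_test, y_test = test[:, :-1], test[:, -1]
print(X_train.shape, X_test.shape)
```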

Convert into 3D Arrays

An LSTM layer takes three-dimensional tensors of shape (batch, timesteps, features) as input. So, we need to convert the data into 3D arrays.
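With one timestep and one feature, the conversion is a plain NumPy reshape. The values below are made up.

```python
import numpy as np

X_train = np.array([[0.1], [0.2], [0.3], [0.4]])  # shape (samples, features)
timesteps = 1

# Insert the timesteps axis: (samples, features) -> (samples, timesteps, features).
X_train_3d = X_train.reshape((X_train.shape[0], timesteps, X_train.shape[1]))
print(X_train_3d.shape)
```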

Create LSTM Model

The LSTM layer is added with the following arguments:

The 150 units are the number of LSTM neurons, which is also the dimensionality of the output space.
The input_shape is the shape of one training sample (timesteps, features).
return_sequences=True is necessary when stacking LSTM layers, so that the subsequent LSTM layer receives a three-dimensional sequence input.

We add a Dropout layer with a rate of 0.3, which means that 30% of the layer's input units are randomly set to zero at each training step. Finally, we add a Dense layer with a single output unit.
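Putting these pieces together, a model along these lines could be sketched as below. The number of stacked LSTM layers, the second Dropout layer, and the optimizer and loss are assumptions, since the original notebook cell is not shown here. The input is declared with an Input object, which should be equivalent to passing input_shape.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

timesteps, features = 1, 1  # one past price per sample, as in this article

model = Sequential([
    Input(shape=(timesteps, features)),
    LSTM(150, return_sequences=True),  # returns a sequence for the next LSTM
    Dropout(0.3),
    LSTM(150),                         # assumed second layer; returns a vector
    Dropout(0.3),
    Dense(1),                          # one output: the predicted price
])
model.compile(optimizer="adam", loss="mean_squared_error")  # assumed settings
model.summary()
```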

Training

We train the model for 20 epochs with a batch size of 32. The number of epochs is the number of times the learning algorithm works through the entire training set. The batch size is the number of samples processed before the model weights are updated.
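To make the relationship concrete, the arithmetic below counts the weight updates for these settings. The training set size of 1000 samples is a hypothetical number.

```python
import math

n_samples = 1000  # hypothetical training set size
epochs = 20
batch_size = 32

# The last batch of an epoch may be smaller, so round up.
batches_per_epoch = math.ceil(n_samples / batch_size)
total_updates = epochs * batches_per_epoch
print(batches_per_epoch, total_updates)
```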

Prediction

We make predictions on the test set.

Inverse Normalization

The result is still on the normalized scale, so we have to invert it back to the original price scale.
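A minimal sketch of this step, assuming the same kind of MinMaxScaler that was fitted during normalization and using made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Scaler fitted on made-up prices between 100 and 200.
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(np.array([[100.0], [150.0], [200.0]]))

# Hypothetical model outputs on the 0-1 scale.
scaled_predictions = np.array([[0.25], [0.75]])

# inverse_transform maps scaled values back to the price scale.
predicted_prices = scaler.inverse_transform(scaled_predictions)
print(predicted_prices.ravel())
```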

Comparison with Ground Truth

We compare our model prediction result with the ground truth. We calculate the root mean squared error between the true price and the predicted price.
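The root mean squared error can be computed directly with NumPy, as in this sketch with made-up prices:

```python
import numpy as np

true_price = np.array([100.0, 102.0, 104.0])       # made-up ground truth
predicted_price = np.array([101.0, 101.0, 106.0])  # made-up predictions

# Square the errors, average them, then take the square root.
rmse = float(np.sqrt(np.mean((true_price - predicted_price) ** 2)))
print(f"RMSE: {rmse:.4f}")
```

The RMSE is in the same unit as the price, which makes it easy to relate to the chart.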

If you look at the comparison between the predicted price and the true price, the model seems to do a pretty good job. But if you look closely, the predicted price is just a delayed version of the true price. The prediction behaves like a moving average indicator with a very small window, so the model is not yet useful for real-life trading.
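One way to check for this lag is to compare the model with a naive baseline that always predicts the previous day's price. The sketch below uses made-up numbers in which the hypothetical model output simply echoes the previous day; it is an illustration, not the notebook's result.

```python
import numpy as np


def rmse(a, b):
    """Root mean squared error between two equally long sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


true_price = np.array([100.0, 103.0, 101.0, 106.0, 104.0, 109.0])
# Hypothetical model output that echoes the previous day's price.
predicted = np.array([99.0, 100.0, 103.0, 101.0, 106.0, 104.0])

# Usual test error: prediction for day t against the truth on day t.
same_day_rmse = rmse(true_price[1:], predicted[1:])
# Naive baseline: the forecast for day t is the true price on day t-1.
naive_rmse = rmse(true_price[1:], true_price[:-1])
# Prediction for day t against the truth on day t-1: near zero means lag.
lagged_rmse = rmse(true_price[:-1], predicted[1:])
print(same_day_rmse, naive_rmse, lagged_rmse)
```

If the model's error is no better than the naive baseline, and its predictions match the previous day's prices almost exactly, it has mostly learned to copy its input.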

References

Types of neural networks: Recurrent Neural Networks, https://medium.com/@shekhawatsamvardhan/types-of-neural-networks-recurrent-neural-networks-7c43bd73e033
What is the difference between LSTM and RNN?, https://ai.stackexchange.com/questions/18198/what-is-the-difference-between-lstm-and-rnn
Difference between a single unit LSTM and 3-unit LSTM neural network, https://stats.stackexchange.com/questions/365428/difference-between-a-single-unit-lstm-and-3-unit-lstm-neural-network
What is the architecture behind the Keras LSTM Layer implementation?, https://stackoverflow.com/questions/49892528/what-is-the-architecture-behind-the-keras-lstm-layer-implementation
How to Convert a Time Series to a Supervised Learning Problem in Python, https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/

