Stock Price Forecasting with LSTM
A long short-term memory (LSTM) network is a type of recurrent neural network (RNN). An RNN is an artificial neural network designed for sequential or time series data. Unlike a feed-forward neural network, an RNN carries a memory of earlier samples, so there is a time relationship between the samples in a sequence.
LSTMs are an improved variant of the plain RNN. Plain RNNs suffer from the vanishing and exploding gradient problems, which occur when the gradients become either too small or too large as they are propagated back through many time steps. LSTMs were designed to mitigate this, the vanishing gradient problem in particular.
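To get a feel for the scale of the problem, here is a small illustrative sketch. It assumes the gradient is multiplied by the same hypothetical factor at each of 50 time steps, which is a deliberate simplification of real backpropagation through time.

```python
def repeated_gradient(factor: float, steps: int) -> float:
    """Multiply an initial gradient of 1.0 by `factor` once per time step."""
    grad = 1.0
    for _ in range(steps):
        grad *= factor
    return grad


# Hypothetical per-step factors, chosen only for illustration.
vanishing = repeated_gradient(0.5, steps=50)  # shrinks toward zero
exploding = repeated_gradient(1.5, steps=50)  # grows very large

print(f"factor 0.5 after 50 steps: {vanishing:.3e}")
print(f"factor 1.5 after 50 steps: {exploding:.3e}")
```

A factor below one drives the gradient toward zero, so the earliest time steps barely influence learning. A factor above one makes the gradient blow up and destabilizes training.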
LSTM can refer to:
An LSTM unit (or neuron),
An LSTM layer (many LSTM units), or
An LSTM neural network (a neural network with LSTM units or layers).
In Keras, a basic LSTM layer is defined by its number of units and its input shape. For example, in a layer with 32 units and an input shape of (10, 64), the layer has 32 neurons, the 10 is the number of timesteps, and the 64 is the number of features.
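A minimal sketch of such a layer, assuming TensorFlow with its bundled Keras is installed, could look like this. The explicit `Input` layer is just one way to declare the shape; older Keras code often passes `input_shape=(10, 64)` directly to the `LSTM` layer instead.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM
from tensorflow.keras.models import Sequential

# 10 timesteps, 64 features per timestep, 32 LSTM units.
model = Sequential([
    Input(shape=(10, 64)),
    LSTM(32),
])

# The batch dimension is left open (None); the layer emits one
# 32-dimensional vector per input sequence.
print(model.output_shape)
```

Note that the number of timesteps affects the input shape but not the number of trainable weights, because the same weights are reused at every timestep.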
In this section, I walk through a very basic LSTM model for stock forecasting. It uses only the price as a feature, and the number of timesteps is one. You can get the Jupyter notebook file from my repository.
To begin our project, we need to install the yfinance library to get stock price data. Then, we need to import several libraries.
I use the stock price of BBCA, one of the four largest banks in Indonesia. When downloading the data set, you can specify the start date of the data along with the time zone. The stock price data has a timeframe of 1 day.
We want to do time series forecasting with only one feature, which is price. Therefore, we need to remove other features from the dataset.
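As a rough sketch of this step, the snippet below keeps a single price column from a small made-up table. The column names follow the usual open/high/low/close/volume layout, and the choice of `Close` as "the price" is an assumption, since the text does not say which price column is used.

```python
import pandas as pd

# Hypothetical daily data; the values are invented for illustration.
df = pd.DataFrame(
    {
        "Open": [8000.0, 8050.0, 8100.0],
        "High": [8100.0, 8150.0, 8200.0],
        "Low": [7950.0, 8000.0, 8050.0],
        "Close": [8050.0, 8100.0, 8150.0],
        "Volume": [1_000_000, 1_200_000, 900_000],
    },
    index=pd.date_range("2023-01-02", periods=3, freq="D"),
)

# Keep only the closing price. Double brackets return a DataFrame,
# which keeps the data two-dimensional for the later scaling step.
price = df[["Close"]]
print(price)
```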
Normalization converts the values of numeric columns in the dataset to a common scale, which usually makes the model easier to train. To scale the training dataset, we use Scikit-Learn's MinMaxScaler with a feature range from zero to one.
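A minimal sketch of the scaling step, assuming scikit-learn is installed and using invented prices, might look like this:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical prices as a column vector; MinMaxScaler expects 2D input.
prices = np.array([[8000.0], [8250.0], [8500.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)

# Each value becomes (x - min) / (max - min).
print(scaled.ravel())
```

Keep the fitted `scaler` object around, because the same object is needed later to turn normalized predictions back into prices.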
Before an LSTM can be used, the time series dataset must be reframed as a supervised learning dataset, that is, converted from a single sequence into pairs of input and output sequences. The function series_to_supervised() is used to convert the data.
In time series forecasting terminology, the current time (t) and future times (t+1, t+2, ...) are the forecast times, and past observations (t-1, t-2, ...) are used to make the forecasts.
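The reframing can be sketched with pandas `shift()`. The version below is a simplified take on the idea, not necessarily identical to the function in the notebook; the parameter names `n_in` (number of lag steps) and `n_out` (number of forecast steps) are assumptions.

```python
import pandas as pd


def series_to_supervised(values, n_in=1, n_out=1, dropnan=True):
    """Frame a series as lagged input columns plus forecast columns."""
    df = pd.DataFrame(values)
    n_vars = df.shape[1]
    cols, names = [], []
    # Input sequence: t-n_in, ..., t-1
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [f"var{j + 1}(t-{i})" for j in range(n_vars)]
    # Forecast sequence: t, t+1, ..., t+n_out-1
    for i in range(n_out):
        cols.append(df.shift(-i))
        suffix = "t" if i == 0 else f"t+{i}"
        names += [f"var{j + 1}({suffix})" for j in range(n_vars)]
    framed = pd.concat(cols, axis=1)
    framed.columns = names
    # Shifting creates NaN rows at the edges; drop them by default.
    return framed.dropna() if dropnan else framed


framed = series_to_supervised([10, 20, 30, 40, 50], n_in=1, n_out=1)
print(framed)
```

With one lag step, each row pairs the previous value (the input) with the current value (the target), so a five-point series yields four training pairs.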
Here, we divide the data into a training set and a test set.
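For time series, the split is normally chronological rather than shuffled, so that the test set lies strictly after the training set. A small sketch follows; the 80/20 ratio and the helper name `chronological_split` are assumptions, since the text does not state the split used.

```python
import numpy as np


def chronological_split(data, train_fraction=0.8):
    """Split rows in time order: the first part trains, the rest tests."""
    split_index = int(len(data) * train_fraction)
    return data[:split_index], data[split_index:]


# Hypothetical supervised rows: column 0 is the input, column 1 the target.
samples = np.arange(20).reshape(10, 2)
train, test = chronological_split(samples)

train_X, train_y = train[:, :-1], train[:, -1]
test_X, test_y = test[:, :-1], test[:, -1]
print(train_X.shape, test_X.shape)
```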
An LSTM takes a 3-dimensional tensor as input, with shape (batch, timesteps, feature). So, we need to convert the data into 3D arrays.
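With one timestep and one feature, the conversion amounts to inserting a timestep axis of length one. A minimal NumPy sketch with invented values:

```python
import numpy as np

# Hypothetical 2D input: 3 samples, 1 feature each.
X = np.array([[0.1], [0.2], [0.3]])

# Reshape to (samples, timesteps, features) with timesteps = 1.
X_3d = X.reshape((X.shape[0], 1, X.shape[1]))
print(X_3d.shape)
```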
The LSTM layer is added with the following arguments:
The 150 units are the number of LSTM neurons and also the dimensionality of the output space.
The input_shape is the shape of one training sample, (timesteps, feature).
The return_sequences=True is necessary for stacking LSTM layers, so that the subsequent LSTM layer receives a three-dimensional sequence input.
We add a Dropout layer with a rate of 0.3. It means that 30% of the layer's input units are randomly set to zero at each training step, which helps reduce overfitting. Finally, we add a Dense layer that specifies an output of one unit.
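Putting these pieces together, one possible architecture is sketched below, assuming TensorFlow with its bundled Keras is installed. The use of exactly two stacked LSTM layers, the mean squared error loss, and the Adam optimizer are assumptions; the text only fixes the 150 units, the 0.3 dropout rate, and the single-unit Dense output. The shape is declared with an `Input` layer here, which plays the same role as passing `input_shape` to the first LSTM layer.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential


def build_model(timesteps=1, n_features=1):
    """Stacked LSTM regressor that outputs one value per sample."""
    model = Sequential([
        Input(shape=(timesteps, n_features)),
        # return_sequences=True keeps the time axis for the next LSTM.
        LSTM(150, return_sequences=True),
        # The last LSTM returns only its final output vector.
        LSTM(150),
        Dropout(0.3),
        Dense(1),
    ])
    # Loss and optimizer are assumed, not taken from the original text.
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model


model = build_model()
model.summary()
```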
We train the model for 20 epochs with a batch size of 32. The number of epochs is the number of times the learning algorithm works through the entire training set. The batch size is the number of samples processed before the model is updated.
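These two settings together determine how many weight updates happen during training. A quick back-of-the-envelope sketch, using a hypothetical training set of 1,000 samples and assuming the final, smaller batch is still processed:

```python
import math


def count_weight_updates(n_samples, batch_size=32, epochs=20):
    """Number of weight updates over a full training run."""
    # The last batch of each epoch may be smaller than batch_size.
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * epochs


updates = count_weight_updates(1000)
print(updates)
```

Here each epoch has 32 batches (31 full batches plus one partial batch of 8 samples), so 20 epochs give 640 updates.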
We make predictions with the test set.
The predictions are still normalized, so we have to invert the scaling to get back to the original price scale.
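Inverting the scaling can be sketched with the scaler's `inverse_transform` method, assuming the same MinMaxScaler object that was fitted earlier is still available. The prices and the normalized prediction below are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical prices used to fit the scaler.
prices = np.array([[8000.0], [8200.0], [8100.0], [8500.0]])
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)

# A made-up normalized model output, shaped (samples, 1).
normalized_prediction = np.array([[0.5]])

# Map back to the price scale: x = x_scaled * (max - min) + min.
predicted_price = scaler.inverse_transform(normalized_prediction)
print(predicted_price)
```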
We compare the model's predictions with the ground truth by calculating the root mean squared error (RMSE) between the true price and the predicted price.
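The RMSE is the square root of the average squared difference between the two series. A small NumPy sketch with invented values:

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical true and predicted prices.
score = rmse([100, 110, 120], [102, 108, 123])
print(f"RMSE: {score:.3f}")
```

Because the error is computed on the inverted (unscaled) values, the result is in the same unit as the price itself.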
At first glance, the comparison between the predicted price and the true price suggests that the model does a pretty good job. But if you look closely, the predicted price is just a delayed version of the true price. Therefore, the model is not yet useful for real-life trading. The predicted price behaves like a moving average indicator with a very small window.
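One way to check for this kind of lag is to shift the prediction back by a few steps and see at which shift the error is smallest. The sketch below illustrates the idea on a synthetic random walk, not on BBCA data, and the helper `rmse_at_lag` is a hypothetical name.

```python
import numpy as np


def rmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rmse_at_lag(y_true, y_pred, lag):
    """RMSE between y_pred[t] and y_true[t - lag]."""
    if lag == 0:
        return rmse(y_true, y_pred)
    return rmse(y_true[:-lag], y_pred[lag:])


# Synthetic random-walk "price" (hypothetical data).
rng = np.random.default_rng(0)
true_price = 8000 + np.cumsum(rng.normal(0, 50, size=200))

# A "prediction" that simply repeats the previous day's price.
delayed = np.concatenate(([true_price[0]], true_price[:-1]))

errors = {lag: rmse_at_lag(true_price, delayed, lag) for lag in range(4)}
best_lag = min(errors, key=errors.get)
print(errors, best_lag)
```

If a trained model's error is also lowest at a lag of one, and its plain RMSE is no better than this repeat-yesterday baseline, then the model has effectively learned to copy the last observed price.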
Types of neural networks: Recurrent Neural Networks, https://medium.com/@shekhawatsamvardhan/types-of-neural-networks-recurrent-neural-networks-7c43bd73e033
What is the difference between LSTM and RNN?, https://ai.stackexchange.com/questions/18198/what-is-the-difference-between-lstm-and-rnn
Difference between a single unit LSTM and 3-unit LSTM neural network, https://stats.stackexchange.com/questions/365428/difference-between-a-single-unit-lstm-and-3-unit-lstm-neural-network
What is the architecture behind the Keras LSTM Layer implementation?, https://stackoverflow.com/questions/49892528/what-is-the-architecture-behind-the-keras-lstm-layer-implementation
How to Convert a Time Series to a Supervised Learning Problem in Python, https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/