An LSTM (Long Short-Term Memory) network is a type of recurrent neural network that can identify patterns in sequential data. Sequential data is data in which the value at a given timestep depends on the values at previous timesteps. Stock market prices are one example: the price on a particular day depends on the prices of previous days. Similarly, a sentence is sequential, since words that come later in a sentence often depend on words that came earlier.
In this article, you will implement an LSTM network that predicts the opening stock prices of the Toyota Motors company. You will be using the TensorFlow Keras library in Python to implement your LSTM.
Table of Contents:
- Importing the Required Libraries
- Importing and Preprocessing the Training Dataset
- Creating and Training the LSTM Model on Training Set
- Making Predictions on Test Data
- Comparing Actual and Predicted Opening Stock Values
Importing the Required Libraries
The following script imports the required libraries for this article:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter magic that renders plots inline; omit it outside a notebook
%matplotlib inline
Importing and Preprocessing the Training Dataset
For training, you will use five years of data (1 January 2015 to 31 December 2019) on the opening stock prices of the Toyota Motors company. For testing, you will use the opening stock prices of the same company for January 2020. Both the training and test sets can be downloaded for free from Yahoo Finance in CSV format.
The following script imports the training dataset and displays its first five rows:
all_train_data = pd.read_csv("E:/Datasets/tm_train.csv")
all_train_data.head()
Output:
You can see that the dataset contains different information regarding stock prices such as the date, opening and closing values, volume, etc.
Run the following script to plot the opening stock prices from the training data:
plt.figure(figsize=(8, 6))
sns.set_style("darkgrid")
sns.lineplot(data=all_train_data["Open"])
Output:
The output shows that the opening values are extremely volatile, and from the figure alone it is very difficult to predict future opening values for the stock. However, you will see that the LSTM can learn the hidden patterns in this data and predict future opening prices to a certain degree of accuracy.
Since you will be predicting the opening stock prices, let’s first filter the values from the Open column.
all_train_data_open = all_train_data[['Open']].values
The final preprocessing step is to scale the dataset since the LSTM network works best when the data is scaled. Run the following script:
from sklearn.preprocessing import MinMaxScaler
# Scale opening prices to the range [0, 1]; the scaler is fitted on the training data only
mm_scaler = MinMaxScaler(feature_range=(0, 1))
all_train_data_open_scaled = mm_scaler.fit_transform(all_train_data_open)
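To make the scaling step concrete, here is a minimal NumPy sketch of the arithmetic that min-max scaling to the range (0, 1) performs: (x - min) / (max - min). The prices are made-up values for illustration, not taken from the Toyota dataset, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical opening prices, shaped as a column like all_train_data_open
prices = np.array([[120.0], [125.0], [130.0], [140.0]])

# Min-max scaling to [0, 1]: (x - min) / (max - min)
p_min, p_max = prices.min(), prices.max()
scaled = (prices - p_min) / (p_max - p_min)

# Inverting the scaling recovers the original prices
restored = scaled * (p_max - p_min) + p_min

print(scaled.ravel())    # [0.   0.25 0.5  1.  ]
print(restored.ravel())  # [120. 125. 130. 140.]
```

Because the minimum and maximum come from the training data, any later value that falls outside the training range will be scaled to a number below 0 or above 1.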
To train the LSTM, we need to divide the data into a feature set and a label set. Each feature record consists of the opening stock prices of the previous 60 days, while the corresponding label is the opening stock price on the 61st day. Through experimentation, I found that for predicting opening stock prices, the best accuracy is achieved when the LSTM is trained on the previous 60 timesteps. You can try other window sizes to see if you get better results. Run the following script:
training_features = []
training_labels = []
for i in range(60, len(all_train_data_open_scaled)):
    # Features: the 60 scaled opening prices before day i
    training_features.append(all_train_data_open_scaled[i-60:i, 0])
    # Label: the scaled opening price on day i
    training_labels.append(all_train_data_open_scaled[i, 0])
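The windowing logic is easier to follow on a small series. The sketch below applies the same loop with a window of 3 instead of 60; the helper name `make_windows` and the toy values are assumptions made for illustration.

```python
import numpy as np

def make_windows(series, window):
    """Return (features, labels); each label is the value that follows `window` inputs."""
    features, labels = [], []
    for i in range(window, len(series)):
        features.append(series[i - window:i])
        labels.append(series[i])
    return np.array(features), np.array(labels)

toy_series = np.arange(10, 17)  # 7 values: 10, 11, ..., 16
X_toy, y_toy = make_windows(toy_series, window=3)

print(X_toy.shape, y_toy.shape)  # (4, 3) (4,)
print(X_toy[0], y_toy[0])        # [10 11 12] 13
```

A series of N values yields N - window records, so a window of 60 always costs the first 60 rows of the dataset as labels.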
Next, you need to convert the feature and label lists into NumPy arrays. Run the following script to do so:
X_train = np.array(training_features)
y_train = np.array(training_labels)
The LSTM layer expects the input feature set in a 3-dimensional format (number of records, number of timesteps, features per timestep). Similarly, if the output consists of a single value, it should be in the form of a column vector. The following script converts the training features into a 3-dimensional array and the labels into a column vector:
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
y_train = y_train.reshape(-1, 1)
print(X_train.shape)
print(y_train.shape)
Output:
(1197, 60, 1)
(1197, 1)
The output shows the shape of the feature and labels set.
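As a small illustration of the reshaping step, the sketch below uses a toy feature set with 4 records and 3 timesteps. The array names are hypothetical; `np.expand_dims` is shown only as an equivalent way to add the trailing feature axis.

```python
import numpy as np

X_toy = np.arange(12).reshape(4, 3)  # 4 records, 3 timesteps each
y_toy = np.arange(4)                 # one label per record

# Add a trailing axis: one feature (the opening price) per timestep
X_toy_3d = X_toy.reshape(X_toy.shape[0], X_toy.shape[1], 1)
# Turn the flat label array into a column vector
y_toy_col = y_toy.reshape(-1, 1)

print(X_toy_3d.shape)   # (4, 3, 1)
print(y_toy_col.shape)  # (4, 1)

# Equivalent way to add the feature axis
assert np.array_equal(X_toy_3d, np.expand_dims(X_toy, axis=-1))
```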
Creating and Training the LSTM Model on Training Set
The data has been preprocessed, and we are now ready to train the LSTM model on the training set. The following script creates the model with the TensorFlow Keras functional API. The model consists of one input layer, four LSTM layers, each followed by a dropout layer to reduce overfitting, and one dense output layer. The model is compiled with the Adam optimizer and the mean squared error loss.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Dropout, LSTM
# Input: 60 timesteps with 1 feature per timestep
ip_layer = Input(shape=(X_train.shape[1], X_train.shape[2]))
# The first three LSTM layers return full sequences so the next LSTM layer receives 3-D input
lstm_layer1 = LSTM(120, activation='relu', return_sequences=True)(ip_layer)
do_layer1 = Dropout(0.2)(lstm_layer1)
lstm_layer2 = LSTM(120, activation='relu', return_sequences=True)(do_layer1)
do_layer2 = Dropout(0.2)(lstm_layer2)
lstm_layer3 = LSTM(120, activation='relu', return_sequences=True)(do_layer2)
do_layer3 = Dropout(0.2)(lstm_layer3)
# The last LSTM layer returns only its final output
lstm_layer4 = LSTM(120, activation='relu')(do_layer3)
do_layer4 = Dropout(0.2)(lstm_layer4)
# A single output neuron predicts the scaled opening price
op_layer = Dense(1)(do_layer4)
lstm_model = Model(ip_layer, op_layer)
lstm_model.compile(optimizer='adam', loss='mse')
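To get a feel for the size of this model, the sketch below counts its trainable parameters by hand. It assumes the standard Keras LSTM layout with biases enabled: four gates, each with an input kernel, a recurrent kernel, and a bias. The helper names `lstm_param_count` and `dense_param_count` are hypothetical, not Keras functions. Dropout layers have no trainable parameters.

```python
def lstm_param_count(input_dim, units):
    """Parameters of one LSTM layer: 4 gates x (input kernel + recurrent kernel + bias)."""
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    """Parameters of one dense layer: weights plus biases."""
    return input_dim * units + units

first_lstm = lstm_param_count(input_dim=1, units=120)    # takes 1 feature per timestep
other_lstm = lstm_param_count(input_dim=120, units=120)  # takes 120 outputs of the previous layer
output_layer = dense_param_count(input_dim=120, units=1)

total = first_lstm + 3 * other_lstm + output_layer
print(first_lstm, other_lstm, output_layer, total)  # 58560 115680 121 405721
```

Calling lstm_model.summary() on the model above is the usual way to check these numbers against what Keras actually builds.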
The following script trains the model on the training set.
history = lstm_model.fit(X_train, y_train, epochs=80, verbose=1, batch_size = 16)
The batch size is 16, which means that the model weights are updated after every 16 records. The number of epochs is 80, which is the number of times the model passes over the whole training set.
After 80 epochs, the mean squared error loss is reduced to 0.0020, as shown by the following output for the last five epochs:
Epoch 76/80
75/75 [==============================] - 10s 132ms/step - loss: 0.0020
Epoch 77/80
75/75 [==============================] - 10s 139ms/step - loss: 0.0021
Epoch 78/80
75/75 [==============================] - 10s 130ms/step - loss: 0.0021
Epoch 79/80
75/75 [==============================] - 9s 126ms/step - loss: 0.0020
Epoch 80/80
75/75 [==============================] - 10s 134ms/step - loss: 0.0020
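The 75/75 in the training log is the number of batches per epoch. The short calculation below shows where it comes from, using the 1197 training records and the batch size of 16 from above; the variable names are only illustrative.

```python
import math

n_records = 1197  # training records, from the shape of X_train
batch_size = 16
epochs = 80

# The last batch of each epoch is smaller when the records do not divide evenly
steps_per_epoch = math.ceil(n_records / batch_size)
last_batch_size = n_records - (steps_per_epoch - 1) * batch_size
total_weight_updates = steps_per_epoch * epochs

print(steps_per_epoch)       # 75
print(last_batch_size)       # 13
print(total_weight_updates)  # 6000
```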
Making Predictions on Test Data
We have trained the LSTM model and are now ready to make predictions on the test set. The test set consists of the opening stock prices of Toyota Motors for January 2020. The following script imports the test dataset and, as we did for the training set, filters the values from the Open column:
all_test_data = pd.read_csv("E:/Datasets/tm_test.csv")
all_test_data_open = all_test_data[['Open']].values
Since each prediction is based on the opening stock prices of the previous 60 days, we need a dataset that contains the opening prices for January 2020 together with those of the 60 trading days before it. The following script concatenates the Open columns of the training and test sets, then keeps the test records plus the 60 records that precede them:
all_train_test_data_open = pd.concat((all_train_data['Open'], all_test_data['Open']), axis=0)
final_test_data = all_train_test_data_open[len(all_train_test_data_open) - len(all_test_data_open) - 60:].values
print(final_test_data.shape)
Output:
(80,)
The output shows that we now have 80 records in the final dataset: the 20 test records plus the 60 records that precede them. The next step is to convert the dataset into a column vector and scale it with the scaler that was fitted on the training data:
final_test_data = final_test_data.reshape(-1, 1)
final_test_data_scaled = mm_scaler.transform(final_test_data)
print(final_test_data.shape)
Output:
(80, 1)
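The slicing used to build the test input can be checked on toy data. The sketch below uses a window of 3 instead of 60, with 10 made-up training values and 4 made-up test values, and uses NumPy arrays rather than pandas Series to keep it short. All names and numbers are assumptions for illustration.

```python
import numpy as np

window = 3
train_open = np.arange(100, 110)  # 10 toy training values: 100..109
test_open = np.arange(110, 114)   # 4 toy test values: 110..113

combined = np.concatenate([train_open, test_open])

# Keep the test values plus the `window` values that precede them
final = combined[len(combined) - len(test_open) - window:]
print(final)  # [107 108 109 110 111 112 113]

# One input window per test value, each ending just before the value it predicts
windows = np.array([final[i - window:i] for i in range(window, len(final))])
print(windows.shape)  # (4, 3)
print(windows[0])     # [107 108 109], used to predict 110
print(windows[-1])    # [110 111 112], used to predict 113
```

Note that the later windows contain actual test values, so each prediction is a one-step-ahead forecast from known prices rather than a forecast of the whole month at once.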
Next, as we did for the training set, we will create a feature set for our test data where each record consists of the opening stock prices of the previous 60 days. Run the following script to do so:
test_features = []
for i in range(60, len(final_test_data_scaled)):
    # Features: the 60 scaled opening prices before test day i
    test_features.append(final_test_data_scaled[i-60:i, 0])
We need to convert the feature set into a NumPy array:
X_test = np.array(test_features)
print(X_test.shape)
Output:
(20, 60)
The shape shows that for each of the 20 records in the test set, i.e. the opening stock prices for January 2020, we have a feature record consisting of the opening stock prices of the previous 60 days.
Before we make predictions on the test set, we need to convert the test feature set to the 3-dimensional format, as we did for the training set:
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
print(X_test.shape)
Output:
(20, 60, 1)
We are now ready to make predictions on the test set. To do so, you need to call the predict() method of your LSTM and pass the test set to it.
y_pred = lstm_model.predict(X_test)
Since both the training and test sets were scaled, the predictions made are also scaled. To convert the scaled predictions back into actual predictions, run the following script:
predictions = mm_scaler.inverse_transform(y_pred)
Comparing Actual and Predicted Opening Stock Values
Finally, you can plot both the actual opening stock price values and your predicted stock price values to see how good your LSTM model is for making predictions on the test set.
sns.set_style("darkgrid")
plt.figure(figsize=(8,6))
plt.plot(all_test_data_open, color='green', label='Actual stock prices')
plt.plot(predictions, color='blue', label='Predicted stock')
plt.title('Toyota Stock Prices')
plt.xlabel('Date')
plt.ylabel('Value of Stock Price')
plt.legend()
plt.show()
Output:
In the output above, the green line shows the actual stock prices, whereas the blue line shows the predicted prices. The predicted prices are very close to the actual prices, and the LSTM model has been able to pick up the peaks and troughs in the test data. This shows that the LSTM model can identify patterns in sequential data and is therefore able to solve sequence problems.
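A plot gives only a visual impression of accuracy. As a rough sketch of how the comparison could also be expressed as a number, the code below computes the mean absolute error and the root mean squared error. The two arrays are made-up stand-ins for all_test_data_open and predictions, so the values are not results from this article.

```python
import numpy as np

# Hypothetical stand-ins for all_test_data_open and predictions (column vectors)
actual = np.array([[140.0], [141.0], [139.0], [142.0]])
predicted = np.array([[139.0], [142.0], [140.0], [141.0]])

errors = predicted - actual

# Mean absolute error: average size of the miss, in price units
mae = np.mean(np.abs(errors))
# Root mean squared error: penalizes large misses more heavily
rmse = np.sqrt(np.mean(errors ** 2))

print(mae)   # 1.0
print(rmse)  # 1.0
```

Both metrics are in the same units as the stock price, which makes them easier to interpret than the scaled training loss.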