M3 Deep Learning & Artificial Intelligence
1. EDA
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2997 entries, 2010-01-04 to 2021-11-26
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 High 2997 non-null float64
1 Low 2997 non-null float64
2 Open 2997 non-null float64
3 Close 2997 non-null float64
4 Volume 2997 non-null float64
5 Adj Close 2997 non-null float64
dtypes: float64(6)
memory usage: 163.9 KB
In addition, it is relevant to see where during the period the biggest single-day loss and profit occurred. Here we can conclude that the biggest loss came when the COVID-19 lockdowns started.
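The notebook's EDA code isn't reproduced here, but the biggest single-day moves can be found from daily percentage returns; a minimal sketch (the `biggest_moves` helper and the example prices are illustrative, not from the notebook):

```python
import pandas as pd

def biggest_moves(close: pd.Series):
    """Return the dates of the largest one-day loss and gain, by % change."""
    returns = close.pct_change().dropna()
    return returns.idxmin(), returns.idxmax()

# Made-up prices for illustration only:
prices = pd.Series(
    [100.0, 110.0, 99.0, 105.0],
    index=pd.to_datetime(["2020-03-09", "2020-03-10", "2020-03-11", "2020-03-12"]),
)
worst, best = biggest_moves(prices)  # worst: 2020-03-11 (-10%), best: 2020-03-10 (+10%)
```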
2. LSTM
2.1 Preprocessing for LSTM
1. Here we split the data into a training set and a test set.
2. Normalize the data
Now we move on to normalizing the data, i.e. rescaling the features to a consistent range. This in turn allows the model to predict more accurately.
3. x_train & y_train, and 4. Reshape
Next we separate the normalized data into x_train and y_train, and in the same step reshape the data. This is done because the LSTM expects a 3-dimensional input, typically (number of samples, number of time steps, number of features).
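The steps above can be sketched as follows; `make_windows` is a hypothetical helper, assuming a MinMaxScaler and a 60-day lookback window as in the outputs below:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_windows(values, window=60):
    """Scale prices to [0, 1] and slice them into (samples, window, 1) inputs
    with the next scaled price as the target."""
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled = scaler.fit_transform(np.asarray(values, dtype=float).reshape(-1, 1))
    x, y = [], []
    for i in range(window, len(scaled)):
        x.append(scaled[i - window:i, 0])  # the previous `window` days
        y.append(scaled[i, 0])             # the next day
    x = np.array(x).reshape(-1, window, 1)  # (samples, time steps, features)
    return x, np.array(y), scaler
```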
[array([0.02480475, 0.02483599, 0.02424243, 0.02324274, 0.02389878,
0.02268041, 0.02205561, 0.02293034, 0.02483599, 0.02452359,
0.02527335, 0.0236801 , 0.02186817, 0.01858794, 0.01971259,
0.02027491, 0.020806 , 0.01921275, 0.0161512 , 0.01686973,
0.01702593, 0.01755701, 0.01508904, 0.01565136, 0.01471415,
0.01562012, 0.01555764, 0.01596376, 0.0153702 , 0.01668229,
0.01743205, 0.01861918, 0.01799438, 0.01786941, 0.01661981,
0.01755701, 0.01746329, 0.01768197, 0.01877538, 0.01702593,
0.01702593, 0.01755701, 0.01743205, 0.01755701, 0.0180881 ,
0.01861918, 0.01927523, 0.01955639, 0.01961887, 0.01986879,
0.02068103, 0.02061856, 0.02055608, 0.02058732, 0.02146204,
0.02074352, 0.02186817, 0.02077476, 0.02055608, 0.0211184 ])]
[0.019618872160330314]
[array([0.02480475, 0.02483599, 0.02424243, 0.02324274, 0.02389878,
0.02268041, 0.02205561, 0.02293034, 0.02483599, 0.02452359,
0.02527335, 0.0236801 , 0.02186817, 0.01858794, 0.01971259,
0.02027491, 0.020806 , 0.01921275, 0.0161512 , 0.01686973,
0.01702593, 0.01755701, 0.01508904, 0.01565136, 0.01471415,
0.01562012, 0.01555764, 0.01596376, 0.0153702 , 0.01668229,
0.01743205, 0.01861918, 0.01799438, 0.01786941, 0.01661981,
0.01755701, 0.01746329, 0.01768197, 0.01877538, 0.01702593,
0.01702593, 0.01755701, 0.01743205, 0.01755701, 0.0180881 ,
0.01861918, 0.01927523, 0.01955639, 0.01961887, 0.01986879,
0.02068103, 0.02061856, 0.02055608, 0.02058732, 0.02146204,
0.02074352, 0.02186817, 0.02077476, 0.02055608, 0.0211184 ]), array([0.02483599, 0.02424243, 0.02324274, 0.02389878, 0.02268041,
0.02205561, 0.02293034, 0.02483599, 0.02452359, 0.02527335,
0.0236801 , 0.02186817, 0.01858794, 0.01971259, 0.02027491,
0.020806 , 0.01921275, 0.0161512 , 0.01686973, 0.01702593,
0.01755701, 0.01508904, 0.01565136, 0.01471415, 0.01562012,
0.01555764, 0.01596376, 0.0153702 , 0.01668229, 0.01743205,
0.01861918, 0.01799438, 0.01786941, 0.01661981, 0.01755701,
0.01746329, 0.01768197, 0.01877538, 0.01702593, 0.01702593,
0.01755701, 0.01743205, 0.01755701, 0.0180881 , 0.01861918,
0.01927523, 0.01955639, 0.01961887, 0.01986879, 0.02068103,
0.02061856, 0.02055608, 0.02058732, 0.02146204, 0.02074352,
0.02186817, 0.02077476, 0.02055608, 0.0211184 , 0.01961887])]
[0.019618872160330314, 0.01921274571810193]
2.2 LSTM-model with Keras
1. Create model:
First we create and initialize the model, which is a sequential model, meaning it is a linear stack of layers that uses previous observations to predict the next value. We add two LSTM layers and finish with two Dense layers.
2. Compile model
The next step is to compile the model. optimizer: the optimizer handles updating the model's parameters for us; here Adam is chosen. loss: a number indicating how good or bad the model's predictions are; as it approaches 0, the error decreases.
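A sketch of the create-and-compile steps that would reproduce the summary below (the layer sizes 128/64/25/1 are read off the summary; the rest is an assumption):

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_lstm(window=60, n_features=1):
    """Sequential stack: two LSTM layers followed by two Dense layers."""
    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        layers.LSTM(128, return_sequences=True),  # pass the full sequence on
        layers.LSTM(64),
        layers.Dense(25),
        layers.Dense(1),                          # predicted (scaled) close price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error",
                  metrics=["mean_squared_error"])
    return model
```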
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 60, 128) 66560
_________________________________________________________________
lstm_3 (LSTM) (None, 64) 49408
_________________________________________________________________
dense_2 (Dense) (None, 25) 1625
_________________________________________________________________
dense_3 (Dense) (None, 1) 26
=================================================================
Total params: 117,619
Trainable params: 117,619
Non-trainable params: 0
_________________________________________________________________
3 & 4 Fit the model, and choose the number of epochs and the batch size
To choose a suitable number of epochs (one that minimizes the loss), we will visualize the loss for each epoch.
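One way to build that visualization, assuming the `History` object returned by Keras' `model.fit` (the `plot_loss` helper is illustrative, not the notebook's exact code):

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

def plot_loss(history, path="epochs.png"):
    """Plot training vs. validation loss per epoch, e.g. history = model.fit(...).history."""
    plt.figure()
    plt.plot(history["loss"], label="training loss")
    plt.plot(history["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("MSE loss")
    plt.legend()
    plt.savefig(path)
    plt.close()
```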
Epoch 1/40
59/59 [==============================] - 15s 182ms/step - loss: 6.6426e-04 - mean_squared_error: 6.6426e-04 - val_loss: 1.4895e-04 - val_mean_squared_error: 1.4895e-04
Epoch 2/40
59/59 [==============================] - 10s 167ms/step - loss: 1.8029e-05 - mean_squared_error: 1.8029e-05 - val_loss: 7.6690e-05 - val_mean_squared_error: 7.6690e-05
Epoch 3/40
59/59 [==============================] - 10s 171ms/step - loss: 1.4206e-05 - mean_squared_error: 1.4206e-05 - val_loss: 6.9390e-05 - val_mean_squared_error: 6.9390e-05
Epoch 4/40
59/59 [==============================] - 10s 172ms/step - loss: 1.4849e-05 - mean_squared_error: 1.4849e-05 - val_loss: 6.5488e-05 - val_mean_squared_error: 6.5488e-05
Epoch 5/40
59/59 [==============================] - 9s 161ms/step - loss: 1.4372e-05 - mean_squared_error: 1.4372e-05 - val_loss: 6.3518e-05 - val_mean_squared_error: 6.3518e-05
Epoch 6/40
59/59 [==============================] - 10s 164ms/step - loss: 1.4705e-05 - mean_squared_error: 1.4705e-05 - val_loss: 7.2617e-05 - val_mean_squared_error: 7.2617e-05
Epoch 7/40
59/59 [==============================] - 10s 169ms/step - loss: 1.3127e-05 - mean_squared_error: 1.3127e-05 - val_loss: 6.3813e-05 - val_mean_squared_error: 6.3813e-05
Epoch 8/40
59/59 [==============================] - 10s 164ms/step - loss: 1.1592e-05 - mean_squared_error: 1.1592e-05 - val_loss: 1.0009e-04 - val_mean_squared_error: 1.0009e-04
Epoch 9/40
59/59 [==============================] - 10s 163ms/step - loss: 1.2259e-05 - mean_squared_error: 1.2259e-05 - val_loss: 7.0288e-05 - val_mean_squared_error: 7.0288e-05
Epoch 10/40
59/59 [==============================] - 10s 170ms/step - loss: 1.2295e-05 - mean_squared_error: 1.2295e-05 - val_loss: 5.5713e-05 - val_mean_squared_error: 5.5713e-05
Epoch 11/40
59/59 [==============================] - 9s 160ms/step - loss: 1.2307e-05 - mean_squared_error: 1.2307e-05 - val_loss: 6.9523e-05 - val_mean_squared_error: 6.9523e-05
Epoch 12/40
59/59 [==============================] - 9s 161ms/step - loss: 1.2886e-05 - mean_squared_error: 1.2886e-05 - val_loss: 5.4222e-05 - val_mean_squared_error: 5.4222e-05
Epoch 13/40
59/59 [==============================] - 9s 160ms/step - loss: 1.0708e-05 - mean_squared_error: 1.0708e-05 - val_loss: 5.7849e-05 - val_mean_squared_error: 5.7849e-05
Epoch 14/40
59/59 [==============================] - 10s 164ms/step - loss: 1.2435e-05 - mean_squared_error: 1.2435e-05 - val_loss: 4.9721e-05 - val_mean_squared_error: 4.9721e-05
Epoch 15/40
59/59 [==============================] - 10s 164ms/step - loss: 1.0112e-05 - mean_squared_error: 1.0112e-05 - val_loss: 6.4510e-05 - val_mean_squared_error: 6.4510e-05
Epoch 16/40
59/59 [==============================] - 9s 159ms/step - loss: 1.8199e-05 - mean_squared_error: 1.8199e-05 - val_loss: 1.3301e-04 - val_mean_squared_error: 1.3301e-04
Epoch 17/40
59/59 [==============================] - 10s 166ms/step - loss: 1.1986e-05 - mean_squared_error: 1.1986e-05 - val_loss: 9.5432e-05 - val_mean_squared_error: 9.5432e-05
Epoch 18/40
59/59 [==============================] - 10s 163ms/step - loss: 1.0591e-05 - mean_squared_error: 1.0591e-05 - val_loss: 6.4225e-05 - val_mean_squared_error: 6.4225e-05
Epoch 19/40
59/59 [==============================] - 10s 169ms/step - loss: 1.5352e-05 - mean_squared_error: 1.5352e-05 - val_loss: 4.5252e-05 - val_mean_squared_error: 4.5252e-05
Epoch 20/40
59/59 [==============================] - 10s 170ms/step - loss: 1.0941e-05 - mean_squared_error: 1.0941e-05 - val_loss: 6.6911e-05 - val_mean_squared_error: 6.6911e-05
Epoch 21/40
59/59 [==============================] - 10s 164ms/step - loss: 9.8449e-06 - mean_squared_error: 9.8449e-06 - val_loss: 4.2988e-05 - val_mean_squared_error: 4.2988e-05
Epoch 22/40
59/59 [==============================] - 9s 158ms/step - loss: 9.9515e-06 - mean_squared_error: 9.9515e-06 - val_loss: 4.2112e-05 - val_mean_squared_error: 4.2112e-05
Epoch 23/40
59/59 [==============================] - 10s 164ms/step - loss: 9.7170e-06 - mean_squared_error: 9.7170e-06 - val_loss: 4.6341e-05 - val_mean_squared_error: 4.6341e-05
Epoch 24/40
59/59 [==============================] - 10s 169ms/step - loss: 8.6925e-06 - mean_squared_error: 8.6925e-06 - val_loss: 4.1794e-05 - val_mean_squared_error: 4.1794e-05
Epoch 25/40
59/59 [==============================] - 10s 167ms/step - loss: 7.4483e-06 - mean_squared_error: 7.4483e-06 - val_loss: 5.5988e-05 - val_mean_squared_error: 5.5988e-05
Epoch 26/40
59/59 [==============================] - 10s 165ms/step - loss: 9.5636e-06 - mean_squared_error: 9.5636e-06 - val_loss: 4.3872e-05 - val_mean_squared_error: 4.3872e-05
Epoch 27/40
59/59 [==============================] - 9s 160ms/step - loss: 1.0350e-05 - mean_squared_error: 1.0350e-05 - val_loss: 5.2152e-05 - val_mean_squared_error: 5.2152e-05
Epoch 28/40
59/59 [==============================] - 10s 163ms/step - loss: 1.0786e-05 - mean_squared_error: 1.0786e-05 - val_loss: 3.9814e-05 - val_mean_squared_error: 3.9814e-05
Epoch 29/40
59/59 [==============================] - 9s 159ms/step - loss: 9.1273e-06 - mean_squared_error: 9.1273e-06 - val_loss: 3.6004e-05 - val_mean_squared_error: 3.6004e-05
Epoch 30/40
59/59 [==============================] - 10s 167ms/step - loss: 7.8301e-06 - mean_squared_error: 7.8301e-06 - val_loss: 4.0934e-05 - val_mean_squared_error: 4.0934e-05
Epoch 31/40
59/59 [==============================] - 10s 164ms/step - loss: 1.1371e-05 - mean_squared_error: 1.1371e-05 - val_loss: 3.8991e-05 - val_mean_squared_error: 3.8991e-05
Epoch 32/40
59/59 [==============================] - 10s 172ms/step - loss: 6.4492e-06 - mean_squared_error: 6.4492e-06 - val_loss: 3.4300e-05 - val_mean_squared_error: 3.4300e-05
Epoch 33/40
59/59 [==============================] - 10s 163ms/step - loss: 9.5622e-06 - mean_squared_error: 9.5622e-06 - val_loss: 7.0635e-05 - val_mean_squared_error: 7.0635e-05
Epoch 34/40
59/59 [==============================] - 10s 163ms/step - loss: 7.6665e-06 - mean_squared_error: 7.6665e-06 - val_loss: 7.0363e-05 - val_mean_squared_error: 7.0363e-05
Epoch 35/40
59/59 [==============================] - 10s 162ms/step - loss: 6.9080e-06 - mean_squared_error: 6.9080e-06 - val_loss: 3.9376e-05 - val_mean_squared_error: 3.9376e-05
Epoch 36/40
59/59 [==============================] - 10s 166ms/step - loss: 8.9455e-06 - mean_squared_error: 8.9455e-06 - val_loss: 5.7117e-05 - val_mean_squared_error: 5.7117e-05
Epoch 37/40
59/59 [==============================] - 10s 165ms/step - loss: 6.7359e-06 - mean_squared_error: 6.7359e-06 - val_loss: 3.2775e-05 - val_mean_squared_error: 3.2775e-05
Epoch 38/40
59/59 [==============================] - 10s 168ms/step - loss: 8.1615e-06 - mean_squared_error: 8.1615e-06 - val_loss: 3.0425e-05 - val_mean_squared_error: 3.0425e-05
Epoch 39/40
59/59 [==============================] - 10s 162ms/step - loss: 5.9963e-06 - mean_squared_error: 5.9963e-06 - val_loss: 4.6342e-05 - val_mean_squared_error: 4.6342e-05
Epoch 40/40
59/59 [==============================] - 10s 175ms/step - loss: 7.8281e-06 - mean_squared_error: 7.8281e-06 - val_loss: 3.0829e-05 - val_mean_squared_error: 3.0829e-05
Here we fit the model that we just created to the training data (x and y):
2.3 Prediction using LSTM and model evaluation
Now we move on to evaluating how well our model predicts the stock price of Microsoft.
The root-mean-square error (RMSE), as seen above, is a frequently used measure of the differences between values predicted by a model and the values observed. RMSE depends on the scale of the data: it is the square root of the average squared difference between the predicted and the actual data points - in our case about $9.2, which is pretty ok! Now we plot the prediction on the 20% test set.
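For reference, RMSE as used here can be computed directly with NumPy (a sketch; the notebook may compute it differently):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error: the square root of the mean squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```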
As we see above, the predictions are pretty good! The predicted line follows the valid line, which is the real stock price.
Below is a dataframe where the real prices and the predicted prices can be seen:
On the plot below, we zoom in on the test set only, so we can better see how the valid line and the predicted line follow each other:
Above, we see how the model would have predicted on previous dates, where we can validate against the actual stock price.
2.4 Predicting one day ahead
Now we will try to predict tomorrow's price - so we can get rich, sort of.
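A sketch of the one-day-ahead step, assuming the trained model and the fitted scaler from the earlier sections are available (`predict_next_close` is a hypothetical helper, not the notebook's exact code):

```python
import numpy as np

def predict_next_close(model, scaler, recent_close, window=60):
    """Predict tomorrow's close from the last `window` closing prices.

    `model` and `scaler` are assumed to be the trained Keras model and the
    MinMaxScaler fitted earlier; all that is needed here is .predict,
    .transform and .inverse_transform."""
    last = scaler.transform(np.asarray(recent_close, dtype=float)[-window:].reshape(-1, 1))
    x = last.reshape(1, window, 1)               # one sample, `window` steps, 1 feature
    scaled_pred = model.predict(x)
    price = scaler.inverse_transform(np.asarray(scaled_pred).reshape(-1, 1))[0, 0]
    change_pct = 100.0 * (price - recent_close[-1]) / recent_close[-1]
    return price, change_pct
```

The printed values below come from the notebook's own run of this step.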
1/1 - 0s
what is the price today: [[329.67999268]]
What is the predicted price tomorrow: [[336.50162]]
predicted change in percentage from today's price to tomorrow's price: [[2.06916553]]
2.5 Multi-step prediction
Original Prices
Close
Date
2021-11-12 336.720001
2021-11-15 336.070007
2021-11-16 339.510010
2021-11-17 339.119995
2021-11-18 341.269989
2021-11-19 343.109985
2021-11-22 339.829987
2021-11-23 337.679993
2021-11-24 337.910004
2021-11-26 329.679993
###################
Scaled Prices
[0.98003754 0.97800694 0.98875359 0.98753518 0.99425181 1.
0.98975321 0.98303657 0.98375513 0.95804438]
#### Input Data shape ####
(2987, 10, 1)
#### Output Data shape ####
(2987, 1)
#### Training Data shape ####
(2982, 10, 1)
(2982, 1)
#### Testing Data shape ####
(5, 10, 1)
(5, 1)
[[0.02480475]
[0.02483599]
[0.02424243]
[0.02324274]
[0.02389878]
[0.02268041]
[0.02205561]
[0.02293034]
[0.02483599]
[0.02452359]]
====>
[0.02527335]
####################
[[0.02483599]
[0.02424243]
[0.02324274]
[0.02389878]
[0.02268041]
[0.02205561]
[0.02293034]
[0.02483599]
[0.02452359]
[0.02527335]]
====>
[0.0236801]
####################
Number of TimeSteps: 10
Number of Features: 1
Epoch 1/10
60/60 [==============================] - 5s 16ms/step - loss: 0.0784
Epoch 2/10
60/60 [==============================] - 1s 16ms/step - loss: 0.0257
Epoch 3/10
60/60 [==============================] - 1s 15ms/step - loss: 0.0049
Epoch 4/10
60/60 [==============================] - 1s 18ms/step - loss: 0.0015
Epoch 5/10
60/60 [==============================] - 1s 16ms/step - loss: 7.5390e-04
Epoch 6/10
60/60 [==============================] - 1s 16ms/step - loss: 5.3099e-04
Epoch 7/10
60/60 [==============================] - 1s 17ms/step - loss: 4.0139e-04
Epoch 8/10
60/60 [==============================] - 1s 16ms/step - loss: 3.2027e-04
Epoch 9/10
60/60 [==============================] - 1s 18ms/step - loss: 2.7000e-04
Epoch 10/10
60/60 [==============================] - 1s 16ms/step - loss: 2.2942e-04
############### Total Time Taken: 0 Minutes #############
Epoch 1/10
60/60 [==============================] - 6s 18ms/step - loss: 0.0680
Epoch 2/10
60/60 [==============================] - 1s 17ms/step - loss: 0.0254
Epoch 3/10
60/60 [==============================] - 1s 15ms/step - loss: 0.0080
Epoch 4/10
60/60 [==============================] - 1s 16ms/step - loss: 0.0038
Epoch 5/10
60/60 [==============================] - 1s 15ms/step - loss: 0.0018
Epoch 6/10
60/60 [==============================] - 1s 19ms/step - loss: 6.2719e-04
Epoch 7/10
60/60 [==============================] - 1s 17ms/step - loss: 3.9700e-04
Epoch 8/10
60/60 [==============================] - 1s 15ms/step - loss: 3.0840e-04
Epoch 9/10
60/60 [==============================] - 1s 14ms/step - loss: 2.7632e-04
Epoch 10/10
60/60 [==============================] - 1s 16ms/step - loss: 2.5282e-04
############### Total Time Taken: 0 Minutes #############
#### Predicted Prices ####
[[329.02576 335.41492 332.28094 341.896 312.14124]
[329.50726 335.89697 332.59634 342.2376 312.56744]
[330.00247 336.40353 332.94327 342.61914 312.98004]
[330.54956 337.00204 333.52512 343.22748 313.35602]
[331.075 337.66904 334.44345 344.25122 313.58356]]
#### Original Prices ####
[[343.10998535]
[339.82998657]
[337.67999268]
[337.91000366]
[329.67999268]]
The close price for Microsoft at 2021-11-29 13:27:17.147573 was [329.68]
The predicted close price is 313.0 (-5.33%)
3. RNN Model
In addition to the LSTM, we created an RNN model in order to compare the two and conclude which one is best suited for predicting the Microsoft stock price.
3.1 Preprocessing
We normalize the data, i.e. rescale the features to a consistent range. This in turn allows the model to predict more accurately.
We split the dataset into a training set and a test set; once that is done we convert x_train and y_train into NumPy arrays. Finally, we reshape the data.
[array([0.02480475, 0.02483599, 0.02424243, 0.02324274, 0.02389878,
0.02268041, 0.02205561, 0.02293034, 0.02483599, 0.02452359,
0.02527335, 0.0236801 , 0.02186817, 0.01858794, 0.01971259,
0.02027491, 0.020806 , 0.01921275, 0.0161512 , 0.01686973,
0.01702593, 0.01755701, 0.01508904, 0.01565136, 0.01471415,
0.01562012, 0.01555764, 0.01596376, 0.0153702 , 0.01668229,
0.01743205, 0.01861918, 0.01799438, 0.01786941, 0.01661981,
0.01755701, 0.01746329, 0.01768197, 0.01877538, 0.01702593,
0.01702593, 0.01755701, 0.01743205, 0.01755701, 0.0180881 ,
0.01861918, 0.01927523, 0.01955639, 0.01961887, 0.01986879,
0.02068103, 0.02061856, 0.02055608, 0.02058732, 0.02146204,
0.02074352, 0.02186817, 0.02077476, 0.02055608, 0.0211184 ])]
[0.019618872160330314]
[array([0.02480475, 0.02483599, 0.02424243, 0.02324274, 0.02389878,
0.02268041, 0.02205561, 0.02293034, 0.02483599, 0.02452359,
0.02527335, 0.0236801 , 0.02186817, 0.01858794, 0.01971259,
0.02027491, 0.020806 , 0.01921275, 0.0161512 , 0.01686973,
0.01702593, 0.01755701, 0.01508904, 0.01565136, 0.01471415,
0.01562012, 0.01555764, 0.01596376, 0.0153702 , 0.01668229,
0.01743205, 0.01861918, 0.01799438, 0.01786941, 0.01661981,
0.01755701, 0.01746329, 0.01768197, 0.01877538, 0.01702593,
0.01702593, 0.01755701, 0.01743205, 0.01755701, 0.0180881 ,
0.01861918, 0.01927523, 0.01955639, 0.01961887, 0.01986879,
0.02068103, 0.02061856, 0.02055608, 0.02058732, 0.02146204,
0.02074352, 0.02186817, 0.02077476, 0.02055608, 0.0211184 ]), array([0.02483599, 0.02424243, 0.02324274, 0.02389878, 0.02268041,
0.02205561, 0.02293034, 0.02483599, 0.02452359, 0.02527335,
0.0236801 , 0.02186817, 0.01858794, 0.01971259, 0.02027491,
0.020806 , 0.01921275, 0.0161512 , 0.01686973, 0.01702593,
0.01755701, 0.01508904, 0.01565136, 0.01471415, 0.01562012,
0.01555764, 0.01596376, 0.0153702 , 0.01668229, 0.01743205,
0.01861918, 0.01799438, 0.01786941, 0.01661981, 0.01755701,
0.01746329, 0.01768197, 0.01877538, 0.01702593, 0.01702593,
0.01755701, 0.01743205, 0.01755701, 0.0180881 , 0.01861918,
0.01927523, 0.01955639, 0.01961887, 0.01986879, 0.02068103,
0.02061856, 0.02055608, 0.02058732, 0.02146204, 0.02074352,
0.02186817, 0.02077476, 0.02055608, 0.0211184 , 0.01961887])]
[0.019618872160330314, 0.01921274571810193]
3.2 Build the RNN model with Keras
The next step is to build the model with Keras. Once the model is built, we compile it right away.
When the model is compiled, we train it. In this case we have simply used a batch size of 1 and a single epoch.
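A sketch of a SimpleRNN counterpart to the LSTM; the layer sizes here (two SimpleRNN layers of 50 units) are an assumption, since the notebook's exact architecture isn't shown:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_rnn(window=60, n_features=1):
    """A plain recurrent network, compiled the same way as the LSTM model."""
    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        layers.SimpleRNN(50, return_sequences=True),  # assumed layer size
        layers.SimpleRNN(50),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```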
3.3 Predicting using RNN and model evaluation
To predict and evaluate the model, we calculate the root-mean-square error (RMSE).
The RMSE, as seen above, is a frequently used measure of the differences between values predicted by a model and the values observed. RMSE depends on the scale of the data: it is the square root of the average squared difference between the predicted and the actual data points.
Now we plot the RNN model's predictions on the 20% test set, to give a visualization.
4. Multivariate LSTM (Microsoft)
FEATURE LIST
['High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close', 'Month', 'Year']
(2997, 8)
(2348, 50, 8) (2348,)
(599, 50, 8) (599,)
0.020681034800259693
4.2 Model creation
400 50 8
After running the model training the first time, we saved the result; here we suppress the training and just load the trained model.
We look at the loss for each epoch to determine how the training of the model went.
Calculate the error ratings of the model
Mean Squared Error (MSE): 35.99
Mean Absolute Error (MAE): 4.51
Mean Absolute Percentage Error (MAPE): 2.18 %
Median Absolute Percentage Error (MDAPE): 1.6 %
R Squared (R2): 0.99
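The error ratings above can be reproduced from the predictions with a helper like this (a sketch; the notebook's own metric code isn't shown):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def error_report(y_true, y_pred):
    """MSE, MAE, MAPE, MDAPE and R2 for predicted vs. actual prices."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ape = np.abs((y_true - y_pred) / y_true) * 100  # absolute % errors
    return {
        "MSE": mean_squared_error(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
        "MAPE": float(ape.mean()),        # mean absolute percentage error
        "MDAPE": float(np.median(ape)),   # median absolute percentage error
        "R2": r2_score(y_true, y_pred),
    }
```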
Now we visualise the model's predictions and compare them to the real data.
At the bottom we show how big the difference between the two is, and whether it is positive or negative.
Now we use the model to tell us what the closing price of the Microsoft stock will be tomorrow.
The close price for Microsoft at 2021-11-29 was 329.68
The predicted close price is 339.2900085449219 (+2.83%)
5. Multivariate LSTM (Multiple Stocks)
FEATURE LIST
['amazon_ac', 'facebook_ac', 'google_ac', 'apple_ac', 'microsoft_ac']
(2398, 5)
(1869, 50, 5) (1869,)
(479, 50, 5) (479,)
0.006912747660928073
5.2 Model creation
250 50 5
Epoch 1/15
117/117 [==============================] - 47s 369ms/step - loss: 0.0028 - val_loss: 0.0015
Epoch 2/15
117/117 [==============================] - 42s 359ms/step - loss: 8.4175e-05 - val_loss: 0.0049
Epoch 3/15
117/117 [==============================] - 44s 375ms/step - loss: 5.6011e-05 - val_loss: 0.0043
Epoch 4/15
117/117 [==============================] - 42s 362ms/step - loss: 5.2070e-05 - val_loss: 0.0013
Epoch 5/15
117/117 [==============================] - 46s 396ms/step - loss: 4.8506e-05 - val_loss: 0.0037
Epoch 6/15
117/117 [==============================] - 43s 366ms/step - loss: 3.2173e-05 - val_loss: 0.0022
Epoch 7/15
117/117 [==============================] - 44s 378ms/step - loss: 3.2767e-05 - val_loss: 0.0011
Epoch 8/15
117/117 [==============================] - 43s 364ms/step - loss: 7.3403e-05 - val_loss: 0.0045
Epoch 9/15
117/117 [==============================] - 44s 374ms/step - loss: 6.4894e-05 - val_loss: 0.0021
Epoch 10/15
117/117 [==============================] - 42s 362ms/step - loss: 3.3704e-05 - val_loss: 0.0013
Epoch 11/15
117/117 [==============================] - 42s 357ms/step - loss: 4.0931e-05 - val_loss: 0.0013
Epoch 12/15
117/117 [==============================] - 42s 356ms/step - loss: 2.9641e-05 - val_loss: 0.0016
Restoring model weights from the end of the best epoch.
Epoch 00012: early stopping
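The "early stopping" and "restoring model weights" messages come from a Keras EarlyStopping callback; a sketch (the patience value is an assumption, chosen to be consistent with stopping at epoch 12 after a best epoch around 7):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop once val_loss has not improved for `patience` epochs and roll back
# to the best weights, as in the training log above.
early_stop = EarlyStopping(monitor="val_loss", patience=5,
                           restore_best_weights=True, verbose=1)

# Usage (sketch): model.fit(x_train, y_train, validation_data=(x_test, y_test),
#                           epochs=15, callbacks=[early_stop])
```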
Model evaluation
Calculate the error ratings of the model
Mean Squared Error (MSE): 49.39
Root Mean Squared Error (RMSE): 0.22
Mean Absolute Error (MAE): 5.6
Mean Absolute Percentage Error (MAPE): 2.59 %
Median Absolute Percentage Error (MDAPE): 2.06 %
R Squared (R2): 0.98
Multivariate LSTM predictions
The close price for Microsoft at 2021-11-29 was 329.68
The predicted close price is 352.0799865722656 (+6.36%)