Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
x_train shape: (60000, 28, 28, 1) y_train shape: (60000, 10)
60000 train set
10000 test set
x_train shape: (55000, 28, 28, 1) y_train shape: (55000, 10)
55000 train set
5000 validation set
10000 test set
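The download and shape logs above are consistent with loading Fashion-MNIST through Keras and holding out the last 5,000 training images for validation. The original code cell is not shown, so the scaling, one-hot encoding, and split details below are assumptions reconstructed from the printed shapes:

```python
import numpy as np
from tensorflow import keras

# Load the dataset (the .gz files in the download log match Keras' Fashion-MNIST files).
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

# Scale pixels to [0, 1], add a channel axis, and one-hot encode the 10 labels.
x_train = x_train.astype("float32")[..., np.newaxis] / 255.0
x_test = x_test.astype("float32")[..., np.newaxis] / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Hold out the last 5,000 training images for validation (assumed split).
x_train, x_val = x_train[:55000], x_train[55000:]
y_train, y_val = y_train[:55000], y_train[55000:]

print("x_train shape:", x_train.shape, "y_train shape:", y_train.shape)
```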
ANN
We use a 3-layer ANN model first.
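A definition consistent with the summary below. The hidden width of 124 is read off the parameter counts (784 × 124 + 124 = 97,340); the ReLU/softmax activations and the Adam optimizer are assumptions, since the original cell is not shown:

```python
from tensorflow import keras

# 3-layer network: Flatten -> Dense(124) -> Dense(10), matching the
# 98,590 total parameters in the summary below.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(124, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```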
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 784) 0
_________________________________________________________________
dense (Dense) (None, 124) 97340
_________________________________________________________________
dense_1 (Dense) (None, 10) 1250
=================================================================
Total params: 98,590
Trainable params: 98,590
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1719/1719 - 4s - loss: 0.2025 - accuracy: 0.9239 - val_loss: 0.3316 - val_accuracy: 0.8874
Epoch 2/10
1719/1719 - 3s - loss: 0.1958 - accuracy: 0.9262 - val_loss: 0.3390 - val_accuracy: 0.8906
Epoch 3/10
1719/1719 - 3s - loss: 0.1930 - accuracy: 0.9269 - val_loss: 0.3379 - val_accuracy: 0.8896
Epoch 4/10
1719/1719 - 3s - loss: 0.1867 - accuracy: 0.9300 - val_loss: 0.3182 - val_accuracy: 0.8944
Epoch 5/10
1719/1719 - 3s - loss: 0.1825 - accuracy: 0.9313 - val_loss: 0.3402 - val_accuracy: 0.8908
Epoch 6/10
1719/1719 - 4s - loss: 0.1794 - accuracy: 0.9326 - val_loss: 0.3353 - val_accuracy: 0.8932
Epoch 7/10
1719/1719 - 4s - loss: 0.1727 - accuracy: 0.9356 - val_loss: 0.3208 - val_accuracy: 0.8968
Epoch 8/10
1719/1719 - 4s - loss: 0.1689 - accuracy: 0.9365 - val_loss: 0.3407 - val_accuracy: 0.8954
Epoch 9/10
1719/1719 - 4s - loss: 0.1682 - accuracy: 0.9374 - val_loss: 0.3348 - val_accuracy: 0.8972
Epoch 10/10
1719/1719 - 4s - loss: 0.1619 - accuracy: 0.9393 - val_loss: 0.3429 - val_accuracy: 0.8982
We trained for 10 epochs; the validation accuracy peaks at 89.82%.
313/313 [==============================] - 0s 1ms/step - loss: 0.3901 - accuracy: 0.8832
Model with 3 layers and 10 epochs -- Test loss: 39.008891582489014
Model with 3 layers and 10 epochs -- Test accuracy: 88.31999897956848
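The training and evaluation logs above come from calls along these lines. The batch size is assumed to be the Keras default of 32 (55,000 / 32 rounds up to the 1719 steps per epoch in the log). To keep this sketch self-contained it trains briefly on small random stand-in arrays; the notebook uses the Fashion-MNIST splits and 10 epochs:

```python
import numpy as np
from tensorflow import keras

# Random stand-in data with the same shapes as the Fashion-MNIST splits.
rng = np.random.default_rng(0)
x_train = rng.random((256, 28, 28, 1), dtype=np.float32)
y_train = keras.utils.to_categorical(rng.integers(0, 10, 256), 10)
x_val = rng.random((64, 28, 28, 1), dtype=np.float32)
y_val = keras.utils.to_categorical(rng.integers(0, 10, 64), 10)
x_test = rng.random((64, 28, 28, 1), dtype=np.float32)
y_test = keras.utils.to_categorical(rng.integers(0, 10, 64), 10)

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(124, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# The notebook trains for 10 epochs; 2 here only to keep the sketch fast.
model.fit(x_train, y_train, epochs=2,
          validation_data=(x_val, y_val), verbose=2)

# evaluate() returns [loss, accuracy]; the notebook multiplies both by 100,
# which is why "Test loss" prints as 39.0... rather than 0.39.
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0] * 100)
print("Test accuracy:", score[1] * 100)
```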
We can see that on the test set the accuracy is 88.32% and the loss is 39.01. We will try with just 5 epochs.
Epoch 1/5
1719/1719 - 4s - loss: 0.1585 - accuracy: 0.9407 - val_loss: 0.3660 - val_accuracy: 0.8896
Epoch 2/5
1719/1719 - 3s - loss: 0.1572 - accuracy: 0.9404 - val_loss: 0.3545 - val_accuracy: 0.8930
Epoch 3/5
1719/1719 - 3s - loss: 0.1531 - accuracy: 0.9429 - val_loss: 0.3822 - val_accuracy: 0.8928
Epoch 4/5
1719/1719 - 4s - loss: 0.1504 - accuracy: 0.9437 - val_loss: 0.3531 - val_accuracy: 0.8930
Epoch 5/5
1719/1719 - 4s - loss: 0.1440 - accuracy: 0.9466 - val_loss: 0.3725 - val_accuracy: 0.8938
313/313 [==============================] - 0s 1ms/step - loss: 0.4234 - accuracy: 0.8824
Model with 3 layers and 5 epochs -- Test loss: 42.337411642074585
Model with 3 layers and 5 epochs -- Test accuracy: 88.23999762535095
With 5 further epochs (this run continues from the 10-epoch model's weights rather than starting fresh, as the loss picks up where the previous run left off), validation accuracy tops out at 89.38%.
The test set accuracy is 88.24% and the loss is 42.34.
We will also try a 5-layer model.
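A definition matching the 5-layer summary below (three Dense(100) hidden layers; the activations and optimizer are again assumptions):

```python
from tensorflow import keras

# 5-layer network: Flatten, three Dense(100) hidden layers, and a softmax
# output, matching the 99,710 total parameters in the summary below.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```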
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_4 (Flatten) (None, 784) 0
_________________________________________________________________
dense_12 (Dense) (None, 100) 78500
_________________________________________________________________
dense_13 (Dense) (None, 100) 10100
_________________________________________________________________
dense_14 (Dense) (None, 100) 10100
_________________________________________________________________
dense_15 (Dense) (None, 10) 1010
=================================================================
Total params: 99,710
Trainable params: 99,710
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1719/1719 - 5s - loss: 0.5042 - accuracy: 0.8186 - val_loss: 0.3793 - val_accuracy: 0.8634
Epoch 2/10
1719/1719 - 4s - loss: 0.3756 - accuracy: 0.8627 - val_loss: 0.3467 - val_accuracy: 0.8728
Epoch 3/10
1719/1719 - 4s - loss: 0.3384 - accuracy: 0.8742 - val_loss: 0.3670 - val_accuracy: 0.8714
Epoch 4/10
1719/1719 - 4s - loss: 0.3162 - accuracy: 0.8829 - val_loss: 0.3218 - val_accuracy: 0.8796
Epoch 5/10
1719/1719 - 4s - loss: 0.2990 - accuracy: 0.8893 - val_loss: 0.3119 - val_accuracy: 0.8840
Epoch 6/10
1719/1719 - 5s - loss: 0.2868 - accuracy: 0.8923 - val_loss: 0.3159 - val_accuracy: 0.8872
Epoch 7/10
1719/1719 - 4s - loss: 0.2724 - accuracy: 0.8981 - val_loss: 0.3348 - val_accuracy: 0.8772
Epoch 8/10
1719/1719 - 4s - loss: 0.2628 - accuracy: 0.8996 - val_loss: 0.2903 - val_accuracy: 0.8984
Epoch 9/10
1719/1719 - 4s - loss: 0.2540 - accuracy: 0.9035 - val_loss: 0.3016 - val_accuracy: 0.8952
Epoch 10/10
1719/1719 - 4s - loss: 0.2429 - accuracy: 0.9081 - val_loss: 0.3176 - val_accuracy: 0.8860
With 10 epochs, validation accuracy peaks at 89.84%, practically the same as with the 3-layer model.
313/313 [==============================] - 1s 2ms/step - loss: 0.3618 - accuracy: 0.8786
Model with 5 layers and 10 epochs -- Test loss: 36.18115782737732
Model with 5 layers and 10 epochs -- Test accuracy: 87.8600001335144
Test set accuracy is 87.86% and loss is 36.18.
Epoch 1/5
1719/1719 - 5s - loss: 0.2382 - accuracy: 0.9106 - val_loss: 0.2909 - val_accuracy: 0.8962
Epoch 2/5
1719/1719 - 4s - loss: 0.2273 - accuracy: 0.9138 - val_loss: 0.3121 - val_accuracy: 0.8950
Epoch 3/5
1719/1719 - 5s - loss: 0.2201 - accuracy: 0.9156 - val_loss: 0.3077 - val_accuracy: 0.8978
Epoch 4/5
1719/1719 - 4s - loss: 0.2130 - accuracy: 0.9195 - val_loss: 0.3175 - val_accuracy: 0.8932
Epoch 5/5
1719/1719 - 4s - loss: 0.2068 - accuracy: 0.9208 - val_loss: 0.3120 - val_accuracy: 0.8934
With 5 further epochs (again continuing from the previous run's weights), validation accuracy peaks at 89.78%.
313/313 [==============================] - 1s 2ms/step - loss: 0.3606 - accuracy: 0.8856
Model with 5 layers and 5 epochs -- Test loss: 36.06124818325043
Model with 5 layers and 5 epochs -- Test accuracy: 88.55999708175659
Test set accuracy is 88.56% and loss is 36.06.
As we can see above, the 5-layer model with 5 epochs gives a slight edge in accuracy, and its loss is noticeably lower than the 3-layer models'. We therefore go with the 5-layer model and 5 epochs, retraining it from scratch below. Let's plot the results.
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_5 (Flatten) (None, 784) 0
_________________________________________________________________
dense_16 (Dense) (None, 100) 78500
_________________________________________________________________
dense_17 (Dense) (None, 100) 10100
_________________________________________________________________
dense_18 (Dense) (None, 100) 10100
_________________________________________________________________
dense_19 (Dense) (None, 10) 1010
=================================================================
Total params: 99,710
Trainable params: 99,710
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
1719/1719 - 5s - loss: 0.5082 - accuracy: 0.8166 - val_loss: 0.4240 - val_accuracy: 0.8456
Epoch 2/5
1719/1719 - 4s - loss: 0.3716 - accuracy: 0.8633 - val_loss: 0.3366 - val_accuracy: 0.8776
Epoch 3/5
1719/1719 - 4s - loss: 0.3392 - accuracy: 0.8747 - val_loss: 0.4062 - val_accuracy: 0.8540
Epoch 4/5
1719/1719 - 5s - loss: 0.3190 - accuracy: 0.8811 - val_loss: 0.3234 - val_accuracy: 0.8794
Epoch 5/5
1719/1719 - 4s - loss: 0.2983 - accuracy: 0.8900 - val_loss: 0.3148 - val_accuracy: 0.8892
313/313 [==============================] - 1s 2ms/step - loss: 0.3461 - accuracy: 0.8754
Model with 5 layers and 5 epochs -- Test loss: 34.608280658721924
Model with 5 layers and 5 epochs -- Test accuracy: 87.54000067710876
Trained from a fresh initialization, this configuration reaches 87.54% test accuracy with a loss of 34.61.
CNN
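A definition consistent with the CNN summary below. The 2×2 kernels and 'same' padding are inferred from the parameter counts and output shapes (e.g. 2 × 2 × 1 × 64 + 64 = 320); the dropout rates and activations are assumptions, and the 860 steps per epoch in the training log suggest a batch size of 64:

```python
from tensorflow import keras

# Two conv/pool/dropout stages followed by a small dense head, matching
# the 96,978 total parameters in the summary below.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(64, kernel_size=2, padding="same", activation="relu"),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Dropout(0.3),          # dropout rate assumed
    keras.layers.Conv2D(32, kernel_size=2, padding="same", activation="relu"),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Dropout(0.3),          # dropout rate assumed
    keras.layers.Flatten(),
    keras.layers.Dense(56, activation="relu"),
    keras.layers.Dropout(0.3),          # dropout rate assumed
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```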
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_2 (Conv2D) (None, 28, 28, 64) 320
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 14, 14, 64) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 14, 14, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 14, 14, 32) 8224
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 32) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 7, 7, 32) 0
_________________________________________________________________
flatten_6 (Flatten) (None, 1568) 0
_________________________________________________________________
dense_20 (Dense) (None, 56) 87864
_________________________________________________________________
dropout_5 (Dropout) (None, 56) 0
_________________________________________________________________
dense_21 (Dense) (None, 10) 570
=================================================================
Total params: 96,978
Trainable params: 96,978
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
860/860 [==============================] - 80s 92ms/step - loss: 1.0897 - accuracy: 0.5980 - val_loss: 0.4460 - val_accuracy: 0.8468
Epoch 2/5
860/860 [==============================] - 79s 92ms/step - loss: 0.5771 - accuracy: 0.7859 - val_loss: 0.3770 - val_accuracy: 0.8650
Epoch 3/5
860/860 [==============================] - 80s 93ms/step - loss: 0.5198 - accuracy: 0.8103 - val_loss: 0.3566 - val_accuracy: 0.8690
Epoch 4/5
860/860 [==============================] - 90s 105ms/step - loss: 0.4838 - accuracy: 0.8234 - val_loss: 0.3448 - val_accuracy: 0.8760
Epoch 5/5
860/860 [==============================] - 79s 92ms/step - loss: 0.4508 - accuracy: 0.8369 - val_loss: 0.3208 - val_accuracy: 0.8848
With 5 epochs, validation accuracy tops out at 88.48%.
313/313 [==============================] - 4s 12ms/step - loss: 0.3378 - accuracy: 0.8793
5 epochs -- Test loss: 33.77579748630524
5 epochs -- Test accuracy: 87.92999982833862
On the test set, accuracy is 87.93% and loss is 33.78.
Epoch 1/10
860/860 [==============================] - 79s 92ms/step - loss: 0.4360 - accuracy: 0.8409 - val_loss: 0.3023 - val_accuracy: 0.8898
Epoch 2/10
860/860 [==============================] - 78s 91ms/step - loss: 0.4205 - accuracy: 0.8480 - val_loss: 0.2929 - val_accuracy: 0.8918
Epoch 3/10
860/860 [==============================] - 77s 89ms/step - loss: 0.4098 - accuracy: 0.8521 - val_loss: 0.2824 - val_accuracy: 0.8964
Epoch 4/10
860/860 [==============================] - 77s 89ms/step - loss: 0.3954 - accuracy: 0.8566 - val_loss: 0.2753 - val_accuracy: 0.8986
Epoch 5/10
860/860 [==============================] - 77s 89ms/step - loss: 0.3861 - accuracy: 0.8600 - val_loss: 0.2696 - val_accuracy: 0.9008
Epoch 6/10
860/860 [==============================] - 77s 89ms/step - loss: 0.3813 - accuracy: 0.8619 - val_loss: 0.2664 - val_accuracy: 0.9002
Epoch 7/10
860/860 [==============================] - 76s 89ms/step - loss: 0.3734 - accuracy: 0.8642 - val_loss: 0.2584 - val_accuracy: 0.9074
Epoch 8/10
860/860 [==============================] - 77s 90ms/step - loss: 0.3661 - accuracy: 0.8671 - val_loss: 0.2519 - val_accuracy: 0.9072
Epoch 9/10
860/860 [==============================] - 77s 89ms/step - loss: 0.3605 - accuracy: 0.8687 - val_loss: 0.2563 - val_accuracy: 0.9054
Epoch 10/10
860/860 [==============================] - 77s 90ms/step - loss: 0.3585 - accuracy: 0.8710 - val_loss: 0.2470 - val_accuracy: 0.9094
With 10 further epochs (continuing from the 5-epoch run), validation accuracy tops out at 90.94%.
313/313 [==============================] - 4s 12ms/step - loss: 0.2731 - accuracy: 0.9045
10 epochs -- Test loss: 27.312034368515015
10 epochs -- Test accuracy: 90.45000076293945
On the test set, accuracy is 90.45% and loss is 27.31. Hence we will go with 10 epochs.
As we can see, the CNN model achieves higher accuracy than the ANN, and its losses are significantly lower. Most importantly, the training and validation accuracies stay close to each other; this consistency suggests that overfitting is no longer much of a worry.
Let's see the plot.
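A sketch of the learning-curve plot, assuming matplotlib is available and a history dict with Keras' default key names (`accuracy`, `val_accuracy`); the example call reuses the accuracies from the 10-epoch CNN log above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt

def plot_history(hist):
    """Plot train vs. validation accuracy per epoch from a History.history dict."""
    epochs = range(1, len(hist["accuracy"]) + 1)
    plt.plot(epochs, hist["accuracy"], label="train accuracy")
    plt.plot(epochs, hist["val_accuracy"], label="validation accuracy")
    plt.xlabel("epoch")
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()

# Example with the 10-epoch CNN accuracies copied from the log above.
plot_history({
    "accuracy":     [0.8409, 0.8480, 0.8521, 0.8566, 0.8600,
                     0.8619, 0.8642, 0.8671, 0.8687, 0.8710],
    "val_accuracy": [0.8898, 0.8918, 0.8964, 0.8986, 0.9008,
                     0.9002, 0.9074, 0.9072, 0.9054, 0.9094],
})
```

In the notebook, `hist` would be `history.history` from the `model.fit()` call.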
As a result, we conclude that the CNN model, although it takes longer to train, should be preferred over the ANN for image classification.