Predicting Unique Website Visitors With a TensorFlow Time Series Model
This project uses TensorFlow to build a time series model that forecasts daily unique website visitors. Data source:
https://www.kaggle.com/bobnau/daily-website-visitors
Let's start by making sure the data is clean and usable. We're working with dates, so the first step is to convert the date column to datetime format. The unique_visits numbers also contain commas as thousands separators, and these have to be removed before the column can be converted from object to int.
TensorFlow and Keras both work with arrays, so it makes sense to convert the data to NumPy arrays now. We will need them in that form later anyway to feed data into the network.
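As a minimal sketch of that cleaning step, the snippet below uses a tiny hand-made DataFrame in place of the Kaggle CSV. The column name `date`, the `%m/%d/%Y` date format, and the sample values are assumptions for illustration; only `unique_visits` is named in this write-up.

```python
import pandas as pd

# Hand-made stand-in for the Kaggle CSV (values are illustrative, not real data).
# The `date` column name and its month/day/year format are assumptions.
df = pd.DataFrame({
    "date": ["9/14/2014", "9/15/2014", "9/16/2014"],
    "unique_visits": ["1,582", "2,528", "977"],
})

# Parse the date strings into proper datetime values.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Strip the thousands separators, then convert from object (string) to int.
df["unique_visits"] = (
    df["unique_visits"].str.replace(",", "", regex=False).astype(int)
)

print(df.dtypes)
```

When reading the CSV directly, passing `thousands=","` to `pd.read_csv` is another way to handle the commas at load time.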
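One way the conversion might look is sketched below. The variable names `series` and `time`, the `float32` dtype, and the sample values are assumptions chosen for illustration rather than taken from the original notebook.

```python
import numpy as np
import pandas as pd

# Illustrative cleaned frame (made-up values), standing in for the real data.
df = pd.DataFrame({
    "date": pd.date_range("2014-09-14", periods=5, freq="D"),
    "unique_visits": [1582, 2528, 2630, 977, 1430],
})

# The values to forecast, as a float32 array (a common dtype for Keras inputs).
series = df["unique_visits"].to_numpy(dtype=np.float32)

# A simple integer time index: 0, 1, 2, ... one step per day.
time = np.arange(len(series))

print(series.dtype, series.shape, time)
```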
Training and Validation Splits, Finding Baseline
Now that the data is cleaned and in the right format for analysis, let's set up the training and validation splits. I chose to use data up to and including the 1st of Jan 2019 for training, and everything after that date for validation. This is how I landed on index value 1571 as the cut-off position.
Let's have a look at the naïve forecast, where each period is predicted using the value of the period before it. We'll use the resulting Mean Absolute Error (MAE) as the baseline error to beat.
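The cut-off index can be derived from the dates instead of being hard-coded. The sketch below uses synthetic data; the start date of 14 Sep 2014, the series length, and the random values are assumptions made so the example is self-contained.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real series. Start date and length are assumed.
dates = pd.date_range("2014-09-14", periods=2167, freq="D")
rng = np.random.default_rng(0)
series = rng.integers(1000, 5000, size=len(dates)).astype(np.float32)

# Count the days up to and including 1 Jan 2019; that count is the split index.
split_time = int((dates <= pd.Timestamp("2019-01-01")).sum())

# Everything before the index is training data, the rest is validation data.
x_train, x_valid = series[:split_time], series[split_time:]

print(split_time, len(x_train), len(x_valid))
```

With a daily series starting on 14 Sep 2014, this count comes out to 1571, which matches the cut-off used here.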
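A minimal sketch of the naïve baseline follows, using a toy series so the arithmetic is easy to check by hand. The variable names and values are illustrative assumptions.

```python
import numpy as np

# Toy series and split point (illustrative values only).
series = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 13.0], dtype=np.float32)
split_time = 4

x_valid = series[split_time:]

# Naive forecast: shift the series by one step, so each validation value
# is predicted by the value observed one period earlier.
naive_forecast = series[split_time - 1:-1]

# Mean Absolute Error between the validation data and the shifted series.
naive_mae = np.mean(np.abs(x_valid - naive_forecast))

print(naive_forecast, naive_mae)
```

Here the validation values are 14 and 13, the naïve predictions are 15 and 14, and the MAE is 1.0. Any trained model should score below the equivalent figure on the real data to be worth using.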
TensorFlow Model Design
Setting a checkpoint callback to save the best model weights (lowest MAE):
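Such a callback might be configured roughly as follows. The file name `best_model.weights.h5` is hypothetical, and monitoring `val_mae` assumes the model is compiled with `metrics=["mae"]` and trained with validation data; if MAE is the loss itself, `val_loss` would be the key to monitor instead.

```python
import tensorflow as tf

# Hypothetical output path; the ".weights.h5" suffix is used because
# recent Keras versions require it when saving weights only.
checkpoint_path = "best_model.weights.h5"

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    monitor="val_mae",        # assumes metrics=["mae"] and validation data
    mode="min",               # lower MAE is better
    save_best_only=True,      # overwrite only when the monitored value improves
    save_weights_only=True,   # store weights rather than the full model
    verbose=1,
)

# The callback is then passed to training, e.g.:
# model.fit(train_set, validation_data=valid_set, callbacks=[checkpoint])
```

After training, `model.load_weights(checkpoint_path)` restores the best weights seen during the run.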