1 a)
The curve generally follows the slight increase in the data at the beginning and the decrease that follows. However, the fit is fairly poor, mainly because the right half of the curve decreases exponentially while the time series shows the decrease slowing down.
The variance of the residuals is not constant: there are high spikes at a fairly constant frequency until around Time 100, after which the residual values increase at a roughly linear pace. This trend shows that the residuals are not white noise.
Since the majority of the values exceed the blue threshold line, the residuals cannot be considered white noise. There is a visible trend in the residuals on this graph.
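As a rough illustration of the threshold check described above, the sketch below computes a sample ACF and counts how many lags fall outside the usual approximate 95% white-noise band of ±1.96/sqrt(n). The series used here are synthetic stand-ins, not the homework residuals, and the helper names `sample_acf` and `fraction_outside_band` are made up for this example.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

def fraction_outside_band(x, max_lag=20):
    """Share of lags 1..max_lag whose ACF exceeds the +/-1.96/sqrt(n) band."""
    bound = 1.96 / np.sqrt(len(x))
    acf = sample_acf(x, max_lag)[1:]  # skip lag 0, which is always 1
    return float(np.mean(np.abs(acf) > bound))

# Synthetic stand-ins (assumption): pure noise vs. noise with a linear trend.
rng = np.random.default_rng(0)
noise = rng.normal(size=200)
trended = noise + 0.05 * np.arange(200)

frac_noise = fraction_outside_band(noise)
frac_trended = fraction_outside_band(trended)
print(frac_noise, frac_trended)
```

For white noise, only about 5% of the lags would be expected to cross the band; residuals that still contain a trend cross it at most lags.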
1 b)
Though the model follows the initial increase in the data, the isotonic model is inappropriate: it completely ignores the upward trend through Time 50 and cannot reflect the decrease afterwards.
The residual plot still shows variance over a wide range and visible seasonality, especially before Index 100. Although the mean appears to be positive for the first half of the plot, it later drops below zero.
The ACF plot shows severe autocorrelation. Therefore, the isotonic model is a poor choice overall.
2 a)
There is an upward trend and seasonality with fairly high frequency.
2 b)
2 c)
J is largest at index 16, which corresponds to a frequency of 16/189 = 0.0846... and a period of 189/16 = 11.8125. Given that the data is recorded from 2004 to 2019 and has 189 rows, a monthly series (a period of 12) would make sense. There may be some leakage, since there are a few positive values around the spike, although they are much smaller.
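The index-to-frequency arithmetic above can be checked with a short sketch. It also builds a synthetic period-12 cosine of length 189 (an assumption standing in for the real data) to show why the peak lands at index 16 and why neighbouring indices pick up some leakage: 189/12 = 15.75 is not an integer, so the true frequency falls between Fourier frequencies.

```python
import numpy as np

n = 189          # number of rows in the series
j = 16           # index of the largest periodogram value

freq = j / n     # cycles per observation
period = n / j   # observations per cycle
print(round(freq, 4), period)

# Synthetic check (assumption): a pure period-12 cosine of the same length.
t = np.arange(n)
x = np.cos(2 * np.pi * t / 12)
x = x - x.mean()

# Periodogram at the Fourier frequencies k/n for k = 0..n-1.
I = np.abs(np.fft.fft(x)) ** 2 / n

# Search only k = 1..n//2, since the periodogram is symmetric.
peak = 1 + int(np.argmax(I[1 : n // 2 + 1]))
print(peak)
```

Because 15.75 is closer to 16 than to 15, index 16 carries most of the power, while indices 15 and 17 hold smaller but nonzero values.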
2 d)
The fitted values seem to oscillate at the same frequency as the data. However, the model does not fit as well as the others because it still cannot capture the overall upward trend of the series or the sharp peak at each oscillation.
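A minimal sketch of this kind of fit, assuming the model is an intercept plus a cosine and sine at frequency 16/189 fitted by least squares. The data are synthetic (a linear trend plus an annual cycle plus noise), and the comparison with an added linear term is only an illustration of the missing trend, not part of the original answer.

```python
import numpy as np

n, j = 189, 16
t = np.arange(n)
f = j / n

# Hypothetical stand-in data: upward trend + period-12 cycle + noise.
rng = np.random.default_rng(1)
y = 0.1 * t + 5 * np.cos(2 * np.pi * t / 12) + rng.normal(size=n)

# Design matrix for the single-frequency sinusoid model.
X_sin = np.column_stack(
    [np.ones(n), np.cos(2 * np.pi * f * t), np.sin(2 * np.pi * f * t)]
)
# Same model with a linear time term added (assumption, for comparison).
X_trend = np.column_stack([X_sin, t])

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rss_sin = rss(X_sin, y)
rss_trend = rss(X_trend, y)
print(rss_sin, rss_trend)
```

On data like this, the sinusoid-only fit leaves the trend in the residuals, so its residual sum of squares is much larger than that of the fit with the time term.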
2 e)
There are still a few significant peaks, one of them at index 47. It has a frequency of 47/189, about 0.25 = 1/4, and adding this frequency may improve the model.
2 f)
Since the raw time series plot seems to have an oscillation frequency of one month, I chose the parameters above to average over each month and capture the overall trend. The filter places equal weights on the months and can be used for forecasting by taking the average value of the future months. This way, we are able to remove the seasonality from the graph.
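An equal-weight filter of this kind can be sketched as follows. The window length of 12 is an assumption (one weight per month of the year), and the series is synthetic: a linear trend plus a period-12 cycle. Averaging over a full cycle cancels the seasonal component and leaves the trend.

```python
import numpy as np

n = 189
t = np.arange(n)

# Hypothetical stand-in data: upward trend + period-12 seasonal cycle.
y = 0.1 * t + 5 * np.cos(2 * np.pi * t / 12)

# Equal weights over an assumed 12-month window.
window = 12
weights = np.full(window, 1.0 / window)

# mode="valid" keeps only positions where the full window fits,
# so the output has n - window + 1 points.
smoothed = np.convolve(y, weights, mode="valid")
print(len(smoothed), smoothed[:3])
```

Here each smoothed value equals the trend evaluated at the middle of its window, since the twelve seasonal values in any window sum to zero.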
The residual plot shows the variance widening over time. The frequency seems fairly constant throughout. Although it is hard to tell, the mean appears to oscillate around zero (or slightly above it).
2 g) i, ii, and iii
It is difficult to pick out the process that looks most like a stationary one, but seasonal.diff may be the best, as there seems to be more consistent seasonality in first_seasonal.diff. The worst of the three is first.diff, as there is visible seasonality in its graph.
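For reference, the three differenced series can be sketched as below, assuming a seasonal lag of 12 and a synthetic series with a linear trend and a period-12 cycle. The Python names mirror first.diff, seasonal.diff and first_seasonal.diff, and the `difference` helper is made up for this example.

```python
import numpy as np

def difference(x, lag=1):
    """Return x[t] - x[t - lag] for t = lag..len(x)-1."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

# Hypothetical stand-in data: upward trend + period-12 seasonal cycle.
t = np.arange(189)
y = 0.1 * t + 5 * np.cos(2 * np.pi * t / 12)

first_diff = difference(y, lag=1)                # removes the linear trend only
seasonal_diff = difference(y, lag=12)            # removes the period-12 cycle
first_seasonal_diff = difference(seasonal_diff)  # both differences applied

print(np.std(first_diff), np.std(seasonal_diff), np.std(first_seasonal_diff))
```

On this idealized series the first difference still oscillates with period 12, while the seasonal difference is constant. Real data will be noisier, so the comparison between the three plots is less clear-cut.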
2 h)
Couldn't figure out a way to write in LaTeX on Deepnote. Sorry! See the last page of Homework 2.