Requirement already satisfied: statsmodels==0.13.1 in /root/venv/lib/python3.7/site-packages (0.13.1)
Requirement already satisfied: scipy>=1.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from statsmodels==0.13.1) (1.7.3)
Requirement already satisfied: pandas>=0.25 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from statsmodels==0.13.1) (1.2.5)
Requirement already satisfied: numpy>=1.17 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from statsmodels==0.13.1) (1.19.5)
Requirement already satisfied: patsy>=0.5.2 in /root/venv/lib/python3.7/site-packages (from statsmodels==0.13.1) (0.5.2)
Requirement already satisfied: pytz>=2017.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas>=0.25->statsmodels==0.13.1) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from pandas>=0.25->statsmodels==0.13.1) (2.8.2)
Requirement already satisfied: six in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from patsy>=0.5.2->statsmodels==0.13.1) (1.16.0)
WARNING: You are using pip version 20.1.1; however, version 21.3.1 is available.
You should consider upgrading via the '/root/venv/bin/python -m pip install --upgrade pip' command.
Requirement already satisfied: pmdarima in /root/venv/lib/python3.7/site-packages (1.8.4)
Requirement already satisfied: urllib3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.26.7)
Requirement already satisfied: pandas>=0.19 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.2.5)
Requirement already satisfied: joblib>=0.11 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.1.0)
Requirement already satisfied: scikit-learn>=0.22 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.0.1)
Requirement already satisfied: Cython!=0.29.18,>=0.29 in /root/venv/lib/python3.7/site-packages (from pmdarima) (0.29.26)
Requirement already satisfied: scipy>=1.3.2 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.7.3)
Requirement already satisfied: statsmodels!=0.12.0,>=0.11 in /root/venv/lib/python3.7/site-packages (from pmdarima) (0.13.1)
Requirement already satisfied: setuptools!=50.0.0,>=38.6.0 in /root/venv/lib/python3.7/site-packages (from pmdarima) (47.1.0)
Requirement already satisfied: numpy>=1.19.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pmdarima) (1.19.5)
Requirement already satisfied: python-dateutil>=2.7.3 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from pandas>=0.19->pmdarima) (2.8.2)
Requirement already satisfied: pytz>=2017.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas>=0.19->pmdarima) (2021.3)
Requirement already satisfied: threadpoolctl>=2.0.0 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from scikit-learn>=0.22->pmdarima) (3.0.0)
Requirement already satisfied: patsy>=0.5.2 in /root/venv/lib/python3.7/site-packages (from statsmodels!=0.12.0,>=0.11->pmdarima) (0.5.2)
Requirement already satisfied: six>=1.5 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas>=0.19->pmdarima) (1.16.0)
Requirement already satisfied: XGBoost in /root/venv/lib/python3.7/site-packages (1.5.1)
Requirement already satisfied: numpy in /shared-libs/python3.7/py/lib/python3.7/site-packages (from XGBoost) (1.19.5)
Requirement already satisfied: scipy in /shared-libs/python3.7/py/lib/python3.7/site-packages (from XGBoost) (1.7.3)
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1519 entries, 2016-02-09 to 2021-02-08
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   Open       1247 non-null   float64
 1   High       1247 non-null   float64
 2   Low        1247 non-null   float64
 3   Close      1247 non-null   float64
 4   Adj Close  1247 non-null   float64
 5   Volume     1247 non-null   float64
dtypes: float64(6)
memory usage: 83.1 KB
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1519 entries, 2016-02-09 to 2021-02-08
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   Open       1519 non-null   float64
 1   High       1519 non-null   float64
 2   Low        1519 non-null   float64
 3   Close      1519 non-null   float64
 4   Adj Close  1519 non-null   float64
 5   Volume     1519 non-null   float64
dtypes: float64(6)
memory usage: 83.1 KB
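The two summaries above show the non-null count rising from 1247 to 1519 per column, so the gaps were filled between the two calls. The log does not show how. Rows printed later in this output (for example 436.625 on 2020-08-02, exactly midway between 442.5 and 430.75) are consistent with linear interpolation, so the sketch below assumes that. The series name `price` is illustrative.

```python
import numpy as np
import pandas as pd

# Three rows taken from a table printed later in this output; the middle
# value is blanked out to mimic a day with no quote.
idx = pd.to_datetime(["2020-07-31", "2020-08-02", "2020-08-03"])
prices = pd.Series([442.5, np.nan, 430.75], index=idx, name="price")

# method="linear" treats the rows as equally spaced and ignores the dates
# themselves (an assumption; method="time" would weight by the calendar gap).
filled = prices.interpolate(method="linear")

print(filled)
```

With equally spaced rows the filled value is the midpoint of its two neighbours, which is what the later tables show.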
Correlation between price deviation and binary target : 0.567698271340469
Shape of the dataset before removing NaN: (1519, 11)
Shape of the dataset after removing NaN: (1207, 11)
Average Cross Val Score : 0.7730569948186529
Out of sample accuracy : 0.768595041322314
--------------------------------------------------
Out of sample results
--------------------------------------------------
Precision score : 0.7763157894736842
Recall score : 0.8428571428571429
F1 score : 0.8082191780821917
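The three scores above are linked: F1 is the harmonic mean of precision and recall. As a cross-check, the sketch below recomputes them from confusion-matrix counts. The counts are not printed anywhere in this log; they are inferred as one set of counts on a 242-row test set that reproduces the reported accuracy, precision, recall and F1, so treat them as an assumption.

```python
import math

# Confusion-matrix counts inferred from the reported scores (assumption:
# the log itself never prints them).
tp, fp, fn, tn = 118, 34, 22, 68

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```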
/shared-libs/python3.7/py/lib/python3.7/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function plot_confusion_matrix is deprecated; Function `plot_confusion_matrix` is deprecated in 1.0 and will be removed in 1.2. Use one of the class methods: ConfusionMatrixDisplay.from_predictions or ConfusionMatrixDisplay.from_estimator.
warnings.warn(msg, category=FutureWarning)
/shared-libs/python3.7/py/lib/python3.7/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function plot_roc_curve is deprecated; Function `plot_roc_curve` is deprecated in 1.0 and will be removed in 1.2. Use one of the class methods: RocCurveDisplay.from_predictions or RocCurveDisplay.from_estimator.
warnings.warn(msg, category=FutureWarning)
Average Cross Val Score : 0.6808290155440414
Out of sample accuracy : 0.768595041322314
--------------------------------------------------
Out of sample results
--------------------------------------------------
Precision score : 0.7815126050420168
Recall score : 0.6642857142857143
F1 score : 0.7181467181467182
Average Cross Val Score : 0.7720207253886011
Out of sample accuracy : 0.7851239669421488
--------------------------------------------------
Out of sample results
--------------------------------------------------
Precision score : 0.8055555555555556
Recall score : 0.8285714285714286
F1 score : 0.8169014084507044
/root/venv/lib/python3.7/site-packages/xgboost/sklearn.py:1224: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].
warnings.warn(label_encoder_deprecation_msg, UserWarning)
[21:41:33] WARNING: ../src/learner.cc:1115: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
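The UserWarning above asks for two things: pass `use_label_encoder=False` when constructing `XGBClassifier`, and encode the labels as integers starting at 0. The second step can be sketched with NumPy alone. The raw label values below are hypothetical, since the notebook's target encoding is not shown in this log.

```python
import numpy as np

# Hypothetical raw labels; the notebook's actual target values are not shown.
y_raw = np.array([-1, 1, 1, -1, 1])

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the position of each original label in that sorted array: integers 0..K-1.
classes, y_encoded = np.unique(y_raw, return_inverse=True)

print(classes)    # original label values, sorted
print(y_encoded)  # integer codes to pass as y
```

The `learner.cc` warning is separate: it only reports that the default metric for `binary:logistic` is now `logloss`, and it goes away once `eval_metric` is set explicitly.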
Average Cross Val Score : 0.7336787564766839
Out of sample accuracy : 0.731404958677686
--------------------------------------------------
Out of sample results
--------------------------------------------------
Precision score : 0.7952755905511811
Recall score : 0.7214285714285714
F1 score : 0.7565543071161049
EXTRA -- ANOTHER APPROACH TO PREDICT FUTURE PRICES
Another approach to predicting future prices uses an ARIMAX model. The solution above is my main analysis; this is something I tried as well.
count  1247         1247
mean   461.8680834  467.3019246
std    49.93770939  51.22171008
min    368.25       374
25%    424.5        428
50%    452          458.25
75%    489.5        495.25
max    649.75       660

2016-02-09  445.5   446.5
2016-02-10  445.25  448
2016-02-11  443.75  451.5
2016-02-12  445     449
2016-02-14  nan     nan
2016-02-15  nan     nan
2016-02-16  444.75  453
2016-02-17  450     455
2016-02-18  453.75  455.25
2016-02-19  452.75  459.75
Requirement already satisfied: pymannkendall in /root/venv/lib/python3.7/site-packages (1.4.2)
Requirement already satisfied: scipy in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pymannkendall) (1.7.3)
Requirement already satisfied: numpy in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pymannkendall) (1.19.5)
/root/venv/lib/python3.7/site-packages/statsmodels/graphics/tsaplots.py:353: FutureWarning: The default method 'yw' can produce PACF values outside of the [-1,1] interval. After 0.13, the default will change tounadjusted Yule-Walker ('ywm'). You can use this method now by setting method='ywm'.
FutureWarning,
count  1207         1207
mean   454.047121   456.645187
std    40.91362907  23.44345725
min    367.75       423.4447115
25%    422.75       438.0401643
50%    445.75       446.9947917
75%    482.75       477.9676744
max    586          504.5278446
ADF Statistic: -3.2392974363232527
p-value: 0.01781973628912586
Critical Values:
1%: -3.4357884107845953
5%: -2.863941528023427
10%: -2.56804861503762
ADF Statistic: -4.031947072488856
p-value: 0.0012521623758362224
Critical Values:
1%: -3.4357884107845953
5%: -2.863941528023427
10%: -2.56804861503762
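To read the two ADF outputs above: the unit-root null is rejected at a given level when the statistic is more negative than that level's critical value. The helper below is a small illustration of that comparison; `rejected_levels` is a hypothetical name, not a statsmodels function.

```python
def rejected_levels(adf_stat, critical_values):
    """Return the significance levels at which the unit-root null is rejected.

    The null is rejected at a level when the statistic is more negative
    than that level's critical value.
    """
    return [level for level, crit in critical_values.items() if adf_stat < crit]


# Values copied from the two ADF outputs above.
critical = {
    "1%": -3.4357884107845953,
    "5%": -2.863941528023427,
    "10%": -2.56804861503762,
}

first = rejected_levels(-3.2392974363232527, critical)
second = rejected_levels(-4.031947072488856, critical)

print(first)   # levels rejected for the first series
print(second)  # levels rejected for the second series
```

The first series rejects the null at 5% and 10% but not at 1%, in line with its p-value of about 0.018. The second rejects at all three levels, in line with its p-value of about 0.0013.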

2020-07-24  449.5    494.1864984
2020-07-26  444.5    494.6141827
2020-07-27  439.5    495.0582933
2020-07-28  436.75   495.494391
2020-07-29  445.75   495.9206731
2020-07-30  440      496.3569712
2020-07-31  442.5    496.7920673
2020-08-02  436.625  497.2335737
2020-08-03  430.75   497.7201522
2020-08-04  422.25   498.2471955

2016-08-12  416.25  431.0508814
2016-08-14  414     431.0777244
2016-08-15  411.75  431.1117788
2016-08-16  411.25  431.1554487
2016-08-17  418.25  431.1888355
2016-08-18  421     431.201055
2016-08-19  418.75  431.1907051
2016-08-21  416.25  431.1754808
2016-08-22  413.75  431.1907051
2016-08-23  407.25  431.2155449

2020-04-20  495.25  459.853766
2020-04-21  496.75  460.3072917
2020-04-22  490     460.7630208
2020-04-23  485.5   461.1742788
2020-04-24  474.75  461.5528846
ADF Statistic: -2.934941415070892
p-value: 0.04143305876453666
Critical Values:
1%: -3.436516661762673
5%: -2.8642627831621756
10%: -2.5682197107442772
ADF Statistic: -4.435841674806721
p-value: 0.0002563540317887743
Critical Values:
1%: -3.436528314312484
5%: -2.86426792284943
10%: -2.568222448164332
count  90           90
mean   451.6611111  478.7210989
std    17.16192204  11.26294792
min    421.25       459.853766
25%    439.63125    469.1140325
50%    447.875      478.6134585
75%    463.625      488.0267428
max    496.75       498.2471955
Performing stepwise search to minimize aic
ARIMA(2,1,2)(0,0,0)[0] intercept : AIC=-6038.420, Time=4.09 sec
ARIMA(0,1,0)(0,0,0)[0] intercept : AIC=-329.464, Time=0.53 sec
ARIMA(1,1,0)(0,0,0)[0] intercept : AIC=inf, Time=0.61 sec
ARIMA(0,1,1)(0,0,0)[0] intercept : AIC=inf, Time=4.13 sec
ARIMA(0,1,0)(0,0,0)[0] : AIC=-328.350, Time=0.63 sec
ARIMA(1,1,2)(0,0,0)[0] intercept : AIC=-6060.362, Time=4.86 sec
ARIMA(0,1,2)(0,0,0)[0] intercept : AIC=-3058.612, Time=4.01 sec
ARIMA(1,1,1)(0,0,0)[0] intercept : AIC=-5809.809, Time=1.95 sec
ARIMA(1,1,3)(0,0,0)[0] intercept : AIC=-6028.106, Time=7.73 sec
ARIMA(0,1,3)(0,0,0)[0] intercept : AIC=-3808.996, Time=6.98 sec
ARIMA(2,1,1)(0,0,0)[0] intercept : AIC=-5766.267, Time=1.82 sec
ARIMA(2,1,3)(0,0,0)[0] intercept : AIC=-6042.300, Time=7.55 sec
ARIMA(1,1,2)(0,0,0)[0] : AIC=-5998.347, Time=3.46 sec
Best model: ARIMA(1,1,2)(0,0,0)[0] intercept
Total fit time: 48.389 seconds
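The stepwise search above reports the candidate with the lowest AIC as the best model. As a sanity check, the sketch below parses a few lines copied from the trace and picks the minimum itself. The regular expression is written for this log's line layout and is an assumption about its format, not part of pmdarima.

```python
import re

# A few lines copied from the stepwise trace above.
trace = """\
ARIMA(2,1,2)(0,0,0)[0] intercept : AIC=-6038.420, Time=4.09 sec
ARIMA(1,1,0)(0,0,0)[0] intercept : AIC=inf, Time=0.61 sec
ARIMA(1,1,2)(0,0,0)[0] intercept : AIC=-6060.362, Time=4.86 sec
ARIMA(2,1,3)(0,0,0)[0] intercept : AIC=-6042.300, Time=7.55 sec
ARIMA(1,1,2)(0,0,0)[0] : AIC=-5998.347, Time=3.46 sec
"""

# Capture the model label (with optional " intercept") and the AIC value.
pattern = re.compile(r"^(ARIMA\S+(?: intercept)?)\s*: AIC=(\S+),", re.MULTILINE)

# float("inf") handles the candidates whose fit failed (AIC=inf).
candidates = {m.group(1): float(m.group(2)) for m in pattern.finditer(trace)}
best = min(candidates, key=candidates.get)

print(best, candidates[best])
```

The minimum agrees with the "Best model" line in the log: ARIMA(1,1,2) with an intercept.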
/root/venv/lib/python3.7/site-packages/statsmodels/tsa/base/tsa_model.py:393: ValueWarning: No supported index is available. Prediction results will be given with an integer index beginning at `start`.
ValueWarning)
/shared-libs/python3.7/py-core/lib/python3.7/site-packages/ipykernel_launcher.py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
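The SettingWithCopyWarning above usually means a column was assigned on a slice of another DataFrame. One common fix, sketched here under that assumption, is to take an explicit copy of the slice and assign through `.loc`. The column names `price` and `forecast` and the forecast values are illustrative; the notebook's own line 6 is not shown in this log.

```python
import pandas as pd

# Toy frame; column names are illustrative, not taken from the notebook.
df = pd.DataFrame(
    {"price": [449.5, 444.5, 439.5, 436.75]},
    index=pd.to_datetime(["2020-07-24", "2020-07-26", "2020-07-27", "2020-07-28"]),
)

# An explicit copy makes it unambiguous that the slice is a new object,
# so assigning a column to it does not trigger SettingWithCopyWarning.
test = df.iloc[2:].copy()
test.loc[:, "forecast"] = [440.0, 437.0]  # hypothetical forecast values

print(test)
```

The original frame is left untouched, which is usually the intent when writing forecasts next to a held-out slice.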