!pip install yfinance
Collecting yfinance
  Downloading yfinance-0.1.64.tar.gz (26 kB)
Collecting multitasking>=0.0.7
  Downloading multitasking-0.0.9.tar.gz (8.1 kB)
Building wheels for collected packages: yfinance, multitasking
Successfully built yfinance multitasking
Successfully installed multitasking-0.0.9 yfinance-0.1.64
pip install openpyxl
Collecting openpyxl
  Downloading openpyxl-3.0.9-py2.py3-none-any.whl (242 kB)
Collecting et-xmlfile
  Downloading et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB)
Successfully installed et-xmlfile-1.1.0 openpyxl-3.0.9
Note: you may need to restart the kernel to use updated packages.
!pip install statsmodels==0.13.0
Collecting statsmodels==0.13.0
  Downloading statsmodels-0.13.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
Collecting patsy>=0.5.2
  Downloading patsy-0.5.2-py2.py3-none-any.whl (233 kB)
Successfully installed patsy-0.5.2 statsmodels-0.13.0
!pip install lazypredict
Collecting lazypredict
  Downloading lazypredict-0.2.9-py2.py3-none-any.whl (12 kB)
Collecting scipy==1.5.4
Collecting pandas==1.0.5
Collecting six==1.15.0
Collecting xgboost==1.1.1
Collecting PyYAML==5.3.1
Collecting lightgbm==2.3.1
Collecting numpy==1.19.1
Collecting joblib==1.0.0
Collecting pytest==5.4.3
Collecting tqdm==4.56.0
Collecting click==7.1.2
Collecting scikit-learn==0.23.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.4.1 requires numpy~=1.19.2, but you have numpy 1.19.1 which is incompatible.
tensorflow 2.4.1 requires typing-extensions~=3.7.4, but you have typing-extensions 3.10.0.2 which is incompatible.
Successfully installed PyYAML-5.3.1 click-7.1.2 joblib-1.0.0 lazypredict-0.2.9 lightgbm-2.3.1 more-itertools-8.10.0 numpy-1.19.1 pandas-1.0.5 pluggy-0.13.1 pytest-5.4.3 scikit-learn-0.23.1 scipy-1.5.4 six-1.15.0 tqdm-4.56.0 xgboost-1.1.1
import pandas as pd
import numpy as np
import yfinance as yf
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from statsmodels.formula.api import ols, logit
from lazypredict.Supervised import LazyClassifier, LazyRegressor
from sklearn.model_selection import train_test_split
# Use matplotlib's default plot style
plt.style.use('default')
/root/venv/lib/python3.7/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.utils.testing module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.utils. Anything that cannot be imported from sklearn.utils is now part of the private API.
warnings.warn(message, FutureWarning)
# Randomly selected companies from 4 industries
Manufacturing = ["TSM", "TXN", "AMAT", "IRBT", "LRCX", "PATK", "ACCO", "AIN", "EL", "ECL", "ZBRA", "RRX", "TSLA", "DOV", "KMB", "JBL", "FLEX", "ENTG", "ACH", "HELE", "MHK", "CAJ", "JJSF", "LULU", "CHD"]  # 25 companies
Medical = ["EW", "WST", "IDXX", "RMD", "VRTX", "AMGN", "ALGN", "PFE", "ZTS", "DHR", "TMO", "CERN", "UNH", "COO", "CNC", "QGEN", "CAH", "HCA", "MDT", "MRK", "JNJ", "REGN", "TFX", "LLY", "DVA", "ABC"]  # 26 companies
Logistic_transportation = ["KSU", "CHRW", "EXPD", "UNP", "FDX", "TIAIY", "ORAN", "EIX", "ASR", "DAL", "KNX", "SKYW", "HUBG", "HTLD", "JBHT", "OKE", "KEX", "NCLH", "ODFL", "PAC", "ZNH", "UAL", "TDW"]  # 23 companies; excl. "TIGO", "OSG"
Energy = ['NEE', 'ORA', 'ES', 'SRE', 'BWXT', "AEP", "DUK", "ATO", "PAGP", "DVN", "VLO", "PDCE", "HLX", "RRC", "KOS", "FRO", "TRGP", "LIN", "HE", "ENLAY", "CPE", "EC", "WLL", "CRK"]  # 24 companies in use; excl. 'PLUG', 'ENPH', 'BTU'
# Stock return
start=dt.datetime(2014,12,31)
end=dt.datetime(2021,9,30)
overall = []
overall.extend(Manufacturing)
overall.extend(Medical)
overall.extend(Logistic_transportation)
overall.extend(Energy)
data_O=yf.download(overall,start=start,end=end)
df_O_all=data_O['Adj Close']
df_O_all=df_O_all.ffill()
# We use five years of data (2015-2019); 2020 returns are excluded as too volatile due to the pandemic
df_O = df_O_all[:dt.datetime(2019,12,31)]
[*********************100%***********************] 98 of 98 completed
# Benchmark data (market indices & ESG ETFs)
benchmark_df = yf.download(['SPY'], start=start, end=end)
benchmark_SPY_all = benchmark_df['Adj Close']
benchmark_SPY = benchmark_SPY_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['^IXIC'], start=start, end=end)
benchmark_IXIC_all = benchmark_df['Adj Close']
benchmark_IXIC = benchmark_IXIC_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['DJI'], start=start, end=end)
benchmark_DJI_all = benchmark_df['Adj Close']
benchmark_DJI = benchmark_DJI_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['ESGD'], start=start, end=end)
benchmark_ESGD_all = benchmark_df['Adj Close']
benchmark_ESG = benchmark_ESGD_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['ESGV'], start=start, end=end)
benchmark_ESGV_all = benchmark_df['Adj Close']
benchmark_ESG2 = benchmark_ESGV_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['ICLN'], start=start, end=end)
benchmark_ICLN_all = benchmark_df['Adj Close']
benchmark_ESG3 = benchmark_ICLN_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['SUSA'], start=start, end=end)
benchmark_SUSA_all = benchmark_df['Adj Close']
benchmark_ESG4 = benchmark_SUSA_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['ESGHX'], start=start, end=end)
benchmark_ESGHX_all = benchmark_df['Adj Close']
benchmark_ESG5 = benchmark_ESGHX_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['NEXTX'], start=start, end=end)
benchmark_NEXTX_all = benchmark_df['Adj Close']
benchmark_ESG6 = benchmark_NEXTX_all[:dt.datetime(2019,12,31)]
benchmark_df = yf.download(['ATEYX'], start=start, end=end)
benchmark_ATEYX_all = benchmark_df['Adj Close']
benchmark_ESG7 = benchmark_ATEYX_all[:dt.datetime(2019,12,31)]
[*********************100%***********************] 1 of 1 completed
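The ten near-identical download cells above could be collapsed into a single loop. A minimal sketch, assuming the same tickers and the same 2019-12-31 cutoff; the benchmarks dict is an illustrative name, not a variable the rest of this notebook uses:
cutoff = dt.datetime(2019, 12, 31)
tickers = ['SPY', '^IXIC', 'DJI', 'ESGD', 'ESGV', 'ICLN', 'SUSA', 'ESGHX', 'NEXTX', 'ATEYX']
benchmarks = {}  # hypothetical container; the notebook keeps one variable per benchmark
for t in tickers:
    prices = yf.download([t], start=start, end=end)['Adj Close']
    benchmarks[t] = prices[:cutoff]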
# ESG benchmarks
_5benchmark = ['ICLN', 'SUSA', 'ESGHX', 'NEXTX', 'ATEYX']
# Market index benchmarks
_3benchmark = ['SPY', 'IXIC', 'DJI']
pip install xlrd
Collecting xlrd
  Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB)
Successfully installed xlrd-2.0.1
Note: you may need to restart the kernel to use updated packages.
# Read and clean the Fama-French 3-factor data
path_1 = "FF3.CSV"
FF3=pd.read_csv(path_1, skiprows=4)
#FF3 = FF3.sort_index(ascending = False)
#FF3 = FF3.reset_index()
FF3 = FF3.rename(columns={'Unnamed: 0': 'Year'})
FF3 = FF3[:-1]
FF3["Year"] = pd.to_datetime(FF3['Year'],format='%Y%m%d')
FF3 = FF3[FF3['Year'] >= dt.datetime(2014,12,31)]
df = pd.DataFrame()
for i in overall:
    path = "Archive/ESG Table for " + i + ".xlsx"
    temp_n = pd.read_excel(path, engine='openpyxl')
    temp_n = temp_n[1:14]
    temp_n = temp_n.drop(temp_n.index[1:7])
    # Transpose so that years become rows
    temp_n = temp_n.T
    temp_n.columns = temp_n.iloc[0]
    temp_n = temp_n.drop(temp_n.index[0])
    # Set the first column as the index
    temp_n.set_index(temp_n.columns[0], inplace=True)
    temp_n = temp_n[1:6]
    temp_n = temp_n.reset_index()
    # Rename the columns
    temp_n = temp_n.rename(columns={np.nan: 'Year'})
    temp_n.columns = ['Year', 'ESG_C_S', 'ESG_S', 'Env_S', 'Soc_S', 'Gov_S', 'ESG_Controversial']
    # Map letter grades (D- through A+) to integers 0-11
    temp_n = temp_n.replace(["D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
    temp_n['Year'] = pd.to_datetime(temp_n['Year'])
    temp_n = temp_n.set_index("Year")
    new_date1 = pd.DataFrame(temp_n[-1:].values, index=[pd.to_datetime('2014-12-31')], columns=temp_n.columns)
    new_date2 = pd.DataFrame(temp_n[-1:].values, index=[pd.to_datetime('2019-12-31')], columns=temp_n.columns)
    temp_n = temp_n.append(new_date1)
    temp_n = temp_n.append(new_date2)
    temp_n = temp_n.sort_index()
    temp_n = temp_n.resample('D').asfreq()
    temp_n = temp_n.fillna(method='ffill')
    # Adjust column types
    #temp_n["Year"] = pd.to_numeric(temp_n["Year"])
    temp_n = temp_n.apply(pd.to_numeric)
    # Merge the ESG scores with the FF3 factors
    temp_n = temp_n.merge(FF3, how="inner", left_on=temp_n.index, right_on=FF3['Year'])
    temp_n = temp_n.set_index("Year")
    temp_n = temp_n.drop(columns='key_0')
    # Compute daily returns
    temp_return = df_O[i].ffill().pct_change()
    temp_return = temp_return.to_frame()
    temp_return.columns = [i]
    temp_return = temp_return[1:]
    temp_return = temp_return.round(3)
    temp_return = temp_return.reset_index()
    temp_return = temp_return.rename(columns={'Date': 'Year', i: "Ret"})
    temp_return['Year'] = pd.DatetimeIndex(temp_return['Year'])
    temp_n = temp_n.merge(temp_return, how="outer", on="Year")
    # Set the company name as the index and tag the industry
    temp_n["Company"] = i
    if i in Manufacturing:
        temp_n['Industry'] = 'Manufacturing'
    elif i in Medical:
        temp_n['Industry'] = 'Medical'
    elif i in Logistic_transportation:
        temp_n['Industry'] = 'Logistic'
    else:
        temp_n['Industry'] = 'Energy'
    temp_n = temp_n.set_index("Company")
    df = pd.concat([df, temp_n], axis=0)
df_I = df
# get dummy variable for Industry
df = pd.get_dummies(df, prefix=['I'], columns=['Industry'],drop_first=True)
df = df.dropna()
df
df.isna().sum()
# Regression plot to check for any trend in the data
plt.figure(figsize=(16, 8))
sns.regplot(x='ESG_C_S', y="Ret", data=df, logistic=True, scatter_kws={"color": "black"}, y_jitter=.05, line_kws={"color": "red"})
plt.ylabel("return")
plt.xlabel("ESG index")
def CI_test(df, i):
    bs_replicates = np.empty(10000)
    for j in range(10000):
        bs_replicates[j] = np.mean(np.random.choice(df[i], len(df[i])))
    # Bootstrapped confidence interval
    conf_int = np.percentile(bs_replicates, [2.5, 97.5])
    print('95% bootstrapped confidence interval =', conf_int, '%', i)
    # Theoretical (normal-approximation) confidence interval
    conf_int_actual_upper = df[i].mean() + df[i].std()/np.sqrt(df[i].count())*1.96
    conf_int_actual_lower = df[i].mean() - df[i].std()/np.sqrt(df[i].count())*1.96
    conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
    print('-'*120)
    print('95% theoretical confidence interval =', conf_int_actual, '%', i)
CI_test(df, 'Ret')
95% bootstrapped confidence interval = [0.00042024 0.00066063] % Ret
------------------------------------------------------------------------------------------------------------------------
95% theoretical confidence interval = [0.0004200968927749914, 0.0006629309129418897] % Ret
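As an aside, the 10,000-draw Python loop in CI_test can be sped up by drawing bootstrap indices in vectorized batches. A minimal sketch of the same calculation; the batch size and seed are arbitrary choices, not from the original:
vals = df['Ret'].to_numpy()
n = len(vals)
rng = np.random.default_rng(42)
boot_means = np.empty(10000)
for k in range(0, 10000, 200):  # batches keep the index matrix small in memory
    idx = rng.integers(0, n, size=(200, n))
    boot_means[k:k + 200] = vals[idx].mean(axis=1)
print('95% bootstrapped confidence interval =', np.percentile(boot_means, [2.5, 97.5]))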
# Drop Year, as it introduces multicollinearity
df = df.drop(columns='Year')
df.columns
# Correlation heatmap --> identify potential correlated factors
corr = df.corr()
plt.figure(figsize = (16,8))
sns.heatmap(corr,annot=True)
plt.show()
# dependent/target/outcome variable
y = df['Ret']
# independent/predictor/explanatory variable
X = df[['ESG_C_S','ESG_S','Env_S', 'Soc_S','Gov_S','ESG_Controversial', 'Mkt-RF','SMB', 'RF' ,'HML','I_Logistic', 'I_Manufacturing', 'I_Medical']]
# OLS Regression
X = sm.add_constant(X)
ols_model = sm.OLS(y,X.astype(float))
result = ols_model.fit()
print(result.summary2())
# The condition number should be small; values above roughly 20-30 suggest multicollinearity
                 Results: Ordinary least squares
=====================================================================
Model:               OLS              Adj. R-squared:     0.174
Dependent Variable:  Ret              AIC:                -617543.8501
Date:                2021-11-07 13:05 BIC:                -617407.7386
No. Observations:    123284           Log-Likelihood:     3.0879e+05
Df Model:            13               F-statistic:        1996.
Df Residuals:        123270           Prob (F-statistic): 0.00
R-squared:           0.174            Scale:              0.00039088
---------------------------------------------------------------------
                     Coef.   Std.Err.    t     P>|t|   [0.025  0.975]
---------------------------------------------------------------------
const                0.0003   0.0005   0.4983 0.6183  -0.0008  0.0013
ESG_C_S              0.0002   0.0001   1.5063 0.1320  -0.0001  0.0005
ESG_S               -0.0004   0.0002  -1.8547 0.0636  -0.0008  0.0000
Env_S                0.0000   0.0001   0.8746 0.3818  -0.0001  0.0001
Soc_S                0.0001   0.0001   1.2424 0.2141  -0.0000  0.0002
Gov_S                0.0000   0.0000   0.9397 0.3474  -0.0000  0.0001
ESG_Controversial   -0.0000   0.0000  -0.9301 0.3523  -0.0001  0.0000
Mkt-RF               0.0102   0.0001 154.9059 0.0000   0.0101  0.0103
SMB                  0.0026   0.0001  22.4436 0.0000   0.0023  0.0028
RF                   0.0208   0.0164   1.2663 0.2054  -0.0114  0.0531
HML                  0.0015   0.0001  14.9223 0.0000   0.0013  0.0017
I_Logistic           0.0000   0.0002   0.0598 0.9523  -0.0003  0.0003
I_Manufacturing      0.0003   0.0002   2.0870 0.0369   0.0000  0.0007
I_Medical            0.0003   0.0002   2.0013 0.0454   0.0000  0.0007
---------------------------------------------------------------------
Omnibus:             60722.824        Durbin-Watson:      2.030
Prob(Omnibus):       0.000            Jarque-Bera (JB):   18148921.520
Skew:                1.159            Prob(JB):           0.000
Kurtosis:            62.395           Condition No.:      4862
=====================================================================
* The condition number is large (5e+03). This might indicate
strong multicollinearity or other numerical problems.
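One way to chase down the large condition number is to compute variance inflation factors for the same design matrix X used above. A minimal sketch using statsmodels; the ~10 cutoff is a common rule of thumb, not from the original:
from statsmodels.stats.outliers_influence import variance_inflation_factor

vif = pd.Series([variance_inflation_factor(X.astype(float).values, j) for j in range(X.shape[1])],
                index=X.columns)
print(vif.sort_values(ascending=False))  # VIFs above ~10 flag problematic collinearity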
df_O.index
df_O = df_O.drop(df_O.index[0] ,axis = 0)
df_1 = df_O.fillna(method = 'ffill')
df_1 = df_1.pct_change()
df_1 = df_1.drop(df_1.index[0] ,axis = 0)
df_1
# Stationarity test
from statsmodels.tsa.stattools import adfuller
def test_stationarity(timeseries, i):
    # Determine rolling statistics
    rolmean = timeseries.rolling(14).mean()
    rolstd = timeseries.rolling(14).std()
    # Plot rolling statistics:
    #orig = plt.plot(timeseries, color='blue', label='Original')
    #mean = plt.plot(rolmean, color='red', label='Rolling Mean')
    #std = plt.plot(rolstd, color='black', label='Rolling Std')
    #plt.legend(loc='best')
    #plt.title('Rolling Mean & Standard Deviation')
    #plt.show(block=False)
    # Perform the augmented Dickey-Fuller test:
    #print('Results of Dickey-Fuller Test:' + i)
    timeseries = timeseries.iloc[:].values
    dftest = adfuller(timeseries, autolag='AIC')
    #dfoutput = pd.Series(dftest[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
    #for key, value in dftest[4].items():
    #    dfoutput['Critical Value (%s)' % key] = value
    #print(dfoutput)
    # Print the ticker when the series looks non-stationary (p-value > 0.05)
    if dftest[1] > 0.05:
        print(i)
for i in overall:
    test_stationarity(df_1[i], i)
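For any ticker printed above (ADF p-value > 0.05), a common next step is to difference the series once and re-run the test. A minimal sketch; the ticker is illustrative, not a flagged one from the original run:
series = df_1['TSM'].dropna().values  # 'TSM' is only an example ticker
diffed = np.diff(series)              # first difference
print('ADF p-value after differencing:', adfuller(diffed, autolag='AIC')[1])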
# Train/test split (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=42)
#LazyPredict on regression models
# fit all models
reg = LazyRegressor(predictions=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
models
predictions
!pip install PyPortfolioOpt==1.2.1
Collecting PyPortfolioOpt==1.2.1
  Downloading PyPortfolioOpt-1.2.1-py3-none-any.whl (49 kB)
Collecting pandas<0.26.0,>=0.25.3
Collecting cvxpy<2.0.0,>=1.0.28
Collecting ecos>=2
Collecting scs>=1.1.6
Collecting osqp>=0.4.1
Attempting uninstall: pandas
  Found existing installation: pandas 1.0.5
  Successfully uninstalled pandas-1.0.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lazypredict 0.2.9 requires pandas==1.0.5, but you have pandas 0.25.3 which is incompatible.
lazypredict 0.2.9 requires six==1.15.0, but you have six 1.16.0 which is incompatible.
Successfully installed PyPortfolioOpt-1.2.1 cvxpy-1.1.17 ecos-2.0.7.post1 osqp-0.6.2.post0 pandas-0.25.3 qdldl-0.1.5.post0 scs-2.1.4
pip install --upgrade scipy
Requirement already satisfied: scipy in /root/venv/lib/python3.7/site-packages (1.5.4)
Collecting scipy
  Downloading scipy-1.7.2-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (38.2 MB)
Attempting uninstall: scipy
  Found existing installation: scipy 1.5.4
  Successfully uninstalled scipy-1.5.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lazypredict 0.2.9 requires pandas==1.0.5, but you have pandas 0.25.3 which is incompatible.
lazypredict 0.2.9 requires scipy==1.5.4, but you have scipy 1.7.2 which is incompatible.
lazypredict 0.2.9 requires six==1.15.0, but you have six 1.16.0 which is incompatible.
Successfully installed scipy-1.7.2
Note: you may need to restart the kernel to use updated packages.
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns
from pypfopt import cla
from pypfopt.plotting import Plotting
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices
from pypfopt import objective_functions
import scipy.stats as st
# Compute the efficient frontier & portfolio statistics
class EF:
    def __init__(self, asset, asset_list):
        self.asset = asset
        self.asset_list = asset_list
        self.computation(asset, asset_list)

    def computation(self, asset, asset_list):
        # Check NaN values in the data
        nullin_df = pd.DataFrame(self.asset, columns=self.asset_list)
        # Calculate the portfolio mean return
        mu = expected_returns.mean_historical_return(self.asset)
        # Calculate the portfolio return covariance
        sigma = risk_models.sample_cov(self.asset)
        # Negative weight bounds would allow shorting; (0, 1) keeps the portfolio long-only
        ef = EfficientFrontier(mu, sigma, weight_bounds=(0, 1))
        ef.add_objective(objective_functions.L2_reg, gamma=2)
        # Optional constraints are possible; see the PyPortfolioOpt documentation.
        sharpe_portfolio = ef.max_sharpe(risk_free_rate=0.00078)  # US bond
        sharpe_portfolio_wt = ef.clean_weights()
        latest_prices = get_latest_prices(self.asset)
        # Allocate the portfolio value in $ to show the number of shares to buy;
        # bounds on shorting would also affect the allocation.
        # Maximum Sharpe portfolio allocation: $1,000,000
        da = DiscreteAllocation(sharpe_portfolio_wt, latest_prices, total_portfolio_value=1000000)
        allocation, leftover = da.greedy_portfolio()
        max_sharpe_cla = cla.CLA(mu, sigma)
        max_sharpe_cla.max_sharpe()
        sharpe_portfolio_wt_list = list(sharpe_portfolio_wt.values())
        ret_data1 = self.asset.pct_change()[1:]
        weighted_returns1 = (sharpe_portfolio_wt_list * ret_data1)
        portfolio_ret1 = pd.DataFrame(weighted_returns1.sum(axis=1))
        ret_data1 = ret_data1.merge(portfolio_ret1, on="Date", how="left")
        ret_data1 = ret_data1.rename(columns={0: "portfolio_ret"})
        ret_data1['cumulative_portfolio_ret'] = (ret_data1['portfolio_ret'] + 1).cumprod()
        self.sharpe_portfolio_wt = sharpe_portfolio_wt
        self.portfolio_ret1 = portfolio_ret1
        self.allocation = allocation
        self.max_sharpe_cla = max_sharpe_cla
        self.ret_data1 = ret_data1
        self.leftover = leftover

    def SP_weight(self):
        print(self.sharpe_portfolio_wt)
        print(self.allocation)
        print("Leftover Fund value for the maximum Sharpe portfolio is ${:.2f}".format(self.leftover))
        P_weight = Plotting.plot_weights(self.sharpe_portfolio_wt)
        return P_weight

    def Efficient_frontier(self):
        EF_plot = Plotting.plot_efficient_frontier(self.max_sharpe_cla, show_assets=True)
        return EF_plot

    def cumulative_return(self):
        Cum_ret = sns.scatterplot('Date', 'cumulative_portfolio_ret', data=self.ret_data1, color='green')
        print(self.ret_data1['cumulative_portfolio_ret'][-1])
        return Cum_ret

    def plot_hist(self):
        hist_ret = plt.hist(self.ret_data1['portfolio_ret'], bins=10)
        print(st.norm.interval(alpha=0.95, loc=np.mean(self.ret_data1['portfolio_ret']), scale=st.sem(self.ret_data1['portfolio_ret'])))
        return hist_ret

    def port_stat(self):
        geometric_port_return = np.prod(self.portfolio_ret1 + 1) ** (252/self.portfolio_ret1.shape[0]) - 1
        annual_std = np.std(self.portfolio_ret1) * np.sqrt(252)
        port_sharpe_ratio = geometric_port_return / annual_std
        print("Annual Return", geometric_port_return)
        print("Stdev", annual_std)
        print("Sharpe Ratio", port_sharpe_ratio)

    def CI_test(self):
        bs_replicates = np.empty(10000)
        for i in range(10000):
            bs_replicates[i] = np.mean(np.random.choice(self.ret_data1['portfolio_ret'], len(self.ret_data1['portfolio_ret'])))
        # Bootstrapped confidence interval
        conf_int = np.percentile(bs_replicates, [2.5, 97.5])
        print('95% bootstrapped confidence interval =', conf_int, '%')
        # Theoretical (normal-approximation) confidence interval
        conf_int_actual_upper = self.ret_data1['portfolio_ret'].mean() + self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual_lower = self.ret_data1['portfolio_ret'].mean() - self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
        print('-'*120)
        print('95% theoretical confidence interval =', conf_int_actual, '%')
# Compute the efficient frontier & portfolio statistics against five benchmarks
class EF_benchmark5:
    def __init__(self, asset, asset_list, benchmark1, benchmark2, benchmark3, benchmark4, benchmark5, benchmark_name=None):
        self.asset = asset
        self.asset_list = asset_list
        self.benchmark1 = benchmark1
        self.benchmark2 = benchmark2
        self.benchmark3 = benchmark3
        self.benchmark4 = benchmark4
        self.benchmark5 = benchmark5
        self.benchmark_name = benchmark_name
        self.computation(asset, asset_list, benchmark1, benchmark2, benchmark3, benchmark4, benchmark5)

    def computation(self, asset, asset_list, benchmark1, benchmark2, benchmark3, benchmark4, benchmark5):
        # Check NaN values in the data
        nullin_df = pd.DataFrame(self.asset, columns=self.asset_list)
        # Calculate the portfolio mean return
        mu = expected_returns.mean_historical_return(self.asset)
        # Calculate the portfolio return covariance
        sigma = risk_models.sample_cov(self.asset)
        # Negative weight bounds would allow shorting; (0, 1) keeps the portfolio long-only
        ef = EfficientFrontier(mu, sigma, weight_bounds=(0, 1))
        ef.add_objective(objective_functions.L2_reg, gamma=2)
        # Optional constraints are possible; see the PyPortfolioOpt documentation.
        sharpe_portfolio = ef.max_sharpe(risk_free_rate=0.00078)  # US bond
        sharpe_portfolio_wt = ef.clean_weights()
        latest_prices = get_latest_prices(self.asset)
        # Allocate the portfolio value in $ to show the number of shares to buy;
        # bounds on shorting would also affect the allocation.
        # Maximum Sharpe portfolio allocation: $1,000,000
        da = DiscreteAllocation(sharpe_portfolio_wt, latest_prices, total_portfolio_value=1000000)
        allocation, leftover = da.greedy_portfolio()
        max_sharpe_cla = cla.CLA(mu, sigma)
        max_sharpe_cla.max_sharpe()
        sharpe_portfolio_wt_list = list(sharpe_portfolio_wt.values())
        ret_data1 = self.asset.pct_change()[1:]
        weighted_returns1 = (sharpe_portfolio_wt_list * ret_data1)
        portfolio_ret1 = pd.DataFrame(weighted_returns1.sum(axis=1))
        ret_data1 = ret_data1.merge(portfolio_ret1, on="Date", how="left")
        ret_data1 = ret_data1.rename(columns={0: "portfolio_ret"})
        ret_data1['cumulative_portfolio_ret'] = (ret_data1['portfolio_ret'] + 1).cumprod()
        ret_benchmark1 = self.benchmark1.pct_change()[1:]
        ret_benchmark1['cumulative_benchmark_ret'] = (ret_benchmark1 + 1).cumprod()
        ret_benchmark2 = self.benchmark2.pct_change()[1:]
        ret_benchmark2['cumulative_benchmark_ret'] = (ret_benchmark2 + 1).cumprod()
        ret_benchmark3 = self.benchmark3.pct_change()[1:]
        ret_benchmark3['cumulative_benchmark_ret'] = (ret_benchmark3 + 1).cumprod()
        ret_benchmark4 = self.benchmark4.pct_change()[1:]
        ret_benchmark4['cumulative_benchmark_ret'] = (ret_benchmark4 + 1).cumprod()
        ret_benchmark5 = self.benchmark5.pct_change()[1:]
        ret_benchmark5['cumulative_benchmark_ret'] = (ret_benchmark5 + 1).cumprod()
        self.sharpe_portfolio_wt = sharpe_portfolio_wt
        self.portfolio_ret1 = portfolio_ret1
        self.allocation = allocation
        self.max_sharpe_cla = max_sharpe_cla
        self.ret_data1 = ret_data1
        self.ret_benchmark1 = ret_benchmark1
        self.ret_benchmark2 = ret_benchmark2
        self.ret_benchmark3 = ret_benchmark3
        self.ret_benchmark4 = ret_benchmark4
        self.ret_benchmark5 = ret_benchmark5
        self.leftover = leftover

    def SP_weight(self):
        print(self.sharpe_portfolio_wt)
        print(self.allocation)
        print("Leftover Fund value for the maximum Sharpe portfolio is ${:.2f}".format(self.leftover))
        P_weight = Plotting.plot_weights(self.sharpe_portfolio_wt)
        return P_weight

    def Efficient_frontier(self):
        EF_plot = Plotting.plot_efficient_frontier(self.max_sharpe_cla, show_assets=True)
        return EF_plot

    def cumulative_return(self):
        Cum_ret = sns.scatterplot('Date', 'cumulative_portfolio_ret', data=self.ret_data1, color='yellowgreen', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark1, color='olivedrab', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark2, color='darkslategray', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark3, color='darkseagreen', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark4, color='darkgreen', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark5, color='teal', size=1)
        if self.benchmark_name is not None:
            plt.legend(labels=["ESG", *self.benchmark_name])
        else:
            plt.legend(labels=["ESG", "Benchmark"])
        print(self.ret_data1['cumulative_portfolio_ret'][-1])
        return Cum_ret

    def plot_hist(self):
        hist_ret = plt.hist(self.ret_data1['portfolio_ret'], bins=10)
        print(st.norm.interval(alpha=0.95, loc=np.mean(self.ret_data1['portfolio_ret']), scale=st.sem(self.ret_data1['portfolio_ret'])))
        return hist_ret

    def port_stat(self):
        geometric_port_return = np.prod(self.portfolio_ret1 + 1) ** (252/self.portfolio_ret1.shape[0]) - 1
        annual_std = np.std(self.portfolio_ret1) * np.sqrt(252)
        port_sharpe_ratio = geometric_port_return / annual_std
        print("Annual Return", geometric_port_return)
        print("Stdev", annual_std)
        print("Sharpe Ratio", port_sharpe_ratio)

    def hypothesis_testing(self):
        plt1 = self.ret_data1['portfolio_ret'].hist(bins=100, color='r', alpha=0.5, figsize=(16, 8))
        plt1 = self.benchmark2.pct_change()[1:].hist(bins=100, color='olivedrab', alpha=0.5)
        print("Difference in mean return (" + self.benchmark_name[1] + ", ESG): ")
        print((self.ret_data1['portfolio_ret'].mean() - self.benchmark2.pct_change()[1:].mean())*100)
        # Welch's t-test for a difference in mean daily return
        stat, p = st.ttest_ind(self.ret_data1['portfolio_ret'], self.benchmark2.pct_change()[1:], equal_var=False)  #, alternative='greater')
        alpha = 0.05
        print("p value is " + str(p))
        if p <= alpha:
            print('The difference in mean returns is statistically significant (reject the null hypothesis)')
        else:
            print('The difference in mean returns is not statistically significant (fail to reject the null hypothesis)')

    def CI_test(self):
        bs_replicates = np.empty(10000)
        for i in range(10000):
            bs_replicates[i] = np.mean(np.random.choice(self.ret_data1['portfolio_ret'], len(self.ret_data1['portfolio_ret'])))
        # Bootstrapped confidence interval
        conf_int = np.percentile(bs_replicates, [2.5, 97.5])
        print('95% bootstrapped confidence interval =', conf_int, '%')
        # Theoretical (normal-approximation) confidence interval
        conf_int_actual_upper = self.ret_data1['portfolio_ret'].mean() + self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual_lower = self.ret_data1['portfolio_ret'].mean() - self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
        print('-'*120)
        print('95% theoretical confidence interval =', conf_int_actual, '%')

    def CI_test_benchmark(self):
        bs_replicates = np.empty(10000)
        for i in range(10000):
            bs_replicates[i] = np.mean(np.random.choice(self.benchmark2.pct_change()[1:], len(self.benchmark2.pct_change()[1:])))
        # Bootstrapped confidence interval
        conf_int = np.percentile(bs_replicates, [2.5, 97.5])
        print('95% bootstrapped confidence interval =', conf_int, '%')
        # Theoretical (normal-approximation) confidence interval
        conf_int_actual_upper = self.benchmark2.pct_change()[1:].mean() + self.benchmark2.pct_change()[1:].std()/np.sqrt(self.benchmark2.pct_change()[1:].count())*1.96
        conf_int_actual_lower = self.benchmark2.pct_change()[1:].mean() - self.benchmark2.pct_change()[1:].std()/np.sqrt(self.benchmark2.pct_change()[1:].count())*1.96
        conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
        print('-'*120)
        print('95% theoretical confidence interval =', conf_int_actual, '%')
# Compute the efficient frontier & portfolio statistics against three benchmarks
class EF_benchmark3:
    def __init__(self, asset, asset_list, benchmark1, benchmark2, benchmark3, benchmark_name=None):
        self.asset = asset
        self.asset_list = asset_list
        self.benchmark1 = benchmark1
        self.benchmark2 = benchmark2
        self.benchmark3 = benchmark3
        self.benchmark_name = benchmark_name
        self.computation(asset, asset_list, benchmark1, benchmark2, benchmark3)

    def computation(self, asset, asset_list, benchmark1, benchmark2, benchmark3):
        # Check NaN values in the data
        nullin_df = pd.DataFrame(self.asset, columns=self.asset_list)
        # Calculate the portfolio mean return
        mu = expected_returns.mean_historical_return(self.asset)
        # Calculate the portfolio return covariance
        sigma = risk_models.sample_cov(self.asset)
        # Negative weight bounds would allow shorting; (0, 1) keeps the portfolio long-only
        ef = EfficientFrontier(mu, sigma, weight_bounds=(0, 1))
        ef.add_objective(objective_functions.L2_reg, gamma=2)
        # Optional constraints are possible; see the PyPortfolioOpt documentation.
        sharpe_portfolio = ef.max_sharpe(risk_free_rate=0.00078)  # US bond
        sharpe_portfolio_wt = ef.clean_weights()
        latest_prices = get_latest_prices(self.asset)
        # Allocate the portfolio value in $ to show the number of shares to buy;
        # bounds on shorting would also affect the allocation.
        # Maximum Sharpe portfolio allocation: $1,000,000
        da = DiscreteAllocation(sharpe_portfolio_wt, latest_prices, total_portfolio_value=1000000)
        allocation, leftover = da.greedy_portfolio()
        max_sharpe_cla = cla.CLA(mu, sigma)
        max_sharpe_cla.max_sharpe()
        sharpe_portfolio_wt_list = list(sharpe_portfolio_wt.values())
        ret_data1 = self.asset.pct_change()[1:]
        weighted_returns1 = (sharpe_portfolio_wt_list * ret_data1)
        portfolio_ret1 = pd.DataFrame(weighted_returns1.sum(axis=1))
        ret_data1 = ret_data1.merge(portfolio_ret1, on="Date", how="left")
        ret_data1 = ret_data1.rename(columns={0: "portfolio_ret"})
        ret_data1['cumulative_portfolio_ret'] = (ret_data1['portfolio_ret'] + 1).cumprod()
        ret_benchmark1 = self.benchmark1.pct_change()[1:]
        ret_benchmark1['cumulative_benchmark_ret'] = (ret_benchmark1 + 1).cumprod()
        ret_benchmark2 = self.benchmark2.pct_change()[1:]
        ret_benchmark2['cumulative_benchmark_ret'] = (ret_benchmark2 + 1).cumprod()
        ret_benchmark3 = self.benchmark3.pct_change()[1:]
        ret_benchmark3['cumulative_benchmark_ret'] = (ret_benchmark3 + 1).cumprod()
        self.sharpe_portfolio_wt = sharpe_portfolio_wt
        self.portfolio_ret1 = portfolio_ret1
        self.allocation = allocation
        self.max_sharpe_cla = max_sharpe_cla
        self.ret_data1 = ret_data1
        self.ret_benchmark1 = ret_benchmark1
        self.ret_benchmark2 = ret_benchmark2
        self.ret_benchmark3 = ret_benchmark3
        self.leftover = leftover

    def SP_weight(self):
        print(self.sharpe_portfolio_wt)
        print(self.allocation)
        print("Leftover Fund value for the maximum Sharpe portfolio is ${:.2f}".format(self.leftover))
        P_weight = Plotting.plot_weights(self.sharpe_portfolio_wt)
        return P_weight

    def Efficient_frontier(self):
        EF_plot = Plotting.plot_efficient_frontier(self.max_sharpe_cla, show_assets=True)
        return EF_plot

    def cumulative_return(self):
        Cum_ret = sns.scatterplot('Date', 'cumulative_portfolio_ret', data=self.ret_data1, color='yellowgreen', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark1, color='teal', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark2, color='olivedrab', size=1)
        Cum_ret = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=self.ret_benchmark3, color='darkgreen', size=1)
        if self.benchmark_name is not None:
            plt.legend(labels=["ESG", *self.benchmark_name])
        else:
            plt.legend(labels=["ESG", "Benchmark"])
        print(self.ret_data1['cumulative_portfolio_ret'][-1])
        return Cum_ret

    def plot_hist(self):
        hist_ret = plt.hist(self.ret_data1['portfolio_ret'], bins=10)
        print(st.norm.interval(alpha=0.95, loc=np.mean(self.ret_data1['portfolio_ret']), scale=st.sem(self.ret_data1['portfolio_ret'])))
        return hist_ret

    def port_stat(self):
        geometric_port_return = np.prod(self.portfolio_ret1 + 1) ** (252/self.portfolio_ret1.shape[0]) - 1
        annual_std = np.std(self.portfolio_ret1) * np.sqrt(252)
        port_sharpe_ratio = geometric_port_return / annual_std
        print("Annual Return", geometric_port_return)
        print("Stdev", annual_std)
        print("Sharpe Ratio", port_sharpe_ratio)

    def hypothesis_testing(self):
        plt1 = self.ret_data1['portfolio_ret'].hist(bins=100, color='r', alpha=0.5, figsize=(16, 8))
        plt1 = self.benchmark1.pct_change()[1:].hist(bins=100, color='olivedrab', alpha=0.5)
        print("Difference in mean return (" + self.benchmark_name[0] + ", ESG): ")
        print((self.ret_data1['portfolio_ret'].mean() - self.benchmark1.pct_change()[1:].mean())*100)
        # Welch's t-test for a difference in mean daily return
        stat, p = st.ttest_ind(self.ret_data1['portfolio_ret'], self.benchmark1.pct_change()[1:], equal_var=False)  #, alternative='greater')
        alpha = 0.05
        print("p value is " + str(p))
        if p <= alpha:
            print('The difference in mean returns is statistically significant (reject the null hypothesis)')
        else:
            print('The difference in mean returns is not statistically significant (fail to reject the null hypothesis)')

    def CI_test(self):
        bs_replicates = np.empty(10000)
        for i in range(10000):
            bs_replicates[i] = np.mean(np.random.choice(self.ret_data1['portfolio_ret'], len(self.ret_data1['portfolio_ret'])))
        # Bootstrapped confidence interval
        conf_int = np.percentile(bs_replicates, [2.5, 97.5])
        print('95% bootstrapped confidence interval =', conf_int, '%')
        # Theoretical (normal-approximation) confidence interval
        conf_int_actual_upper = self.ret_data1['portfolio_ret'].mean() + self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual_lower = self.ret_data1['portfolio_ret'].mean() - self.ret_data1['portfolio_ret'].std()/np.sqrt(self.ret_data1['portfolio_ret'].count())*1.96
        conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
        print('-'*120)
        print('95% theoretical confidence interval =', conf_int_actual, '%')

    def CI_test_benchmark(self):
        bs_replicates = np.empty(10000)
        for i in range(10000):
            bs_replicates[i] = np.mean(np.random.choice(self.benchmark1.pct_change()[1:], len(self.benchmark1.pct_change()[1:])))
        # Bootstrapped confidence interval
        conf_int = np.percentile(bs_replicates, [2.5, 97.5])
        print('95% bootstrapped confidence interval =', conf_int, '%')
        # Theoretical (normal-approximation) confidence interval
        conf_int_actual_upper = self.benchmark1.pct_change()[1:].mean() + self.benchmark1.pct_change()[1:].std()/np.sqrt(self.benchmark1.pct_change()[1:].count())*1.96
        conf_int_actual_lower = self.benchmark1.pct_change()[1:].mean() - self.benchmark1.pct_change()[1:].std()/np.sqrt(self.benchmark1.pct_change()[1:].count())*1.96
        conf_int_actual = [conf_int_actual_lower, conf_int_actual_upper]
        print('-'*120)
        print('95% theoretical confidence interval =', conf_int_actual, '%')
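EF_benchmark5 and EF_benchmark3 differ only in how many benchmark series they carry, so a single subclass of EF taking a list of benchmarks would remove most of the duplication. A minimal sketch of the idea; the names are illustrative, and the original classes are what the rest of the notebook uses:
class EF_benchmarkN(EF):
    # Hypothetical generalization: accepts any number of benchmark price Series.
    def __init__(self, asset, asset_list, benchmarks, benchmark_name=None):
        self.benchmark_name = benchmark_name
        self.ret_benchmarks = []
        for b in benchmarks:
            ret_b = b.pct_change()[1:].to_frame(name='benchmark_ret')
            ret_b['cumulative_benchmark_ret'] = (ret_b['benchmark_ret'] + 1).cumprod()
            self.ret_benchmarks.append(ret_b)
        super().__init__(asset, asset_list)  # runs EF.computation for the portfolio itself

    def cumulative_return(self):
        ax = sns.scatterplot('Date', 'cumulative_portfolio_ret', data=self.ret_data1, size=1)
        for ret_b in self.ret_benchmarks:
            ax = sns.scatterplot('Date', 'cumulative_benchmark_ret', data=ret_b.reset_index(), size=1)
        plt.legend(labels=["ESG", *(self.benchmark_name or ["Benchmark"])])
        return ax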
# Rank companies within an industry into 3 ESG segments (terciles)
def gmb(df, subset="Energy"):
    test = df[df["Industry"] == subset]
    test = test['ESG_C_S']
    test = test.groupby(test.index).mean()
    category = pd.qcut(test, 3, labels=["bad", "medium", "good"])
    category = pd.DataFrame(category)
    bad = list(category[category['ESG_C_S'] == "bad"].index)
    medium = list(category[category['ESG_C_S'] == "medium"].index)
    good = list(category[category['ESG_C_S'] == "good"].index)
    return bad, medium, good
E_bad, E_medium, E_good = gmb(df_I,"Energy")
M1_bad, M1_medium, M1_good = gmb(df_I,"Manufacturing")
M2_bad, M2_medium, M2_good = gmb(df_I,"Medical")
L_bad, L_medium, L_good = gmb(df_I,"Logistic")
bad = E_bad + M1_bad + M2_bad + L_bad
medium = E_medium + M1_medium + M2_medium + L_medium
good = E_good + M1_good + M2_good + L_good
sns.violinplot(x="Industry", y="ESG_C_S", data=df_I)
g_ef = EF(df_O[good], good)
b_ef = EF(df_O[bad], bad)
m_ef = EF(df_O[medium], medium)
g_ef.cumulative_return()
2.3780501351868075
# good ESG stocks in each industry
E_good,M1_good,M2_good,L_good
pip install yahoo_fin
Collecting yahoo_fin
  Downloading yahoo_fin-0.8.9.1-py3-none-any.whl (10 kB)
Collecting requests-html
Collecting feedparser
Collecting pyppeteer>=0.0.14
Collecting bs4
Collecting fake-useragent
Collecting parse
Collecting pyquery
Successfully installed appdirs-1.4.4 beautifulsoup4-4.10.0 bs4-0.0.1 fake-useragent-0.1.11 feedparser-6.0.8 parse-1.19.0 pyee-8.2.2 pyppeteer-0.2.6 pyquery-1.4.3 requests-html-0.10.0 sgmllib3k-1.0.0 soupsieve-2.3 websockets-9.1 yahoo-fin-0.8.9.1
Note: you may need to restart the kernel to use updated packages.
pip install requests_html
Requirement already satisfied: requests_html in /usr/local/lib/python3.7/site-packages (0.10.0)
Note: you may need to restart the kernel to use updated packages.
# Extract company data
import yahoo_fin.stock_info as si
test = si.get_stats_valuation('AEP')
test['Unnamed: 0']
items = ['Market Cap (intraday) 5', 'Enterprise Value 3', 'Trailing P/E', 'Forward P/E 1', 'PEG Ratio (5 yr expected) 1', 'Price/Sales (ttm)', 'Price/Book (mrq)', 'Enterprise Value/Revenue 3', 'Enterprise Value/EBITDA 7']
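# Note: the trailing digits in these labels (e.g. "Market Cap (intraday) 5") are
# footnote markers scraped from the Yahoo Finance statistics page; Yahoo can change
# them whenever the page layout is reworked, so matching on the exact strings is brittle.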
def stats1(asset_list, item):
    # Pull the valuation-stats table for each ticker and keep the row for one attribute
    dict_temp = {}
    for i in asset_list:
        temp = si.get_stats_valuation(i)
        temp = temp.iloc[:, :2]
        temp.columns = ["Attribute", "Recent"]
        dict_temp[i] = temp[temp["Attribute"] == item]
    combined_stats = pd.concat(dict_temp)
    combined_stats = combined_stats.reset_index()
    # Market cap and enterprise value arrive as strings like "48.3B"; strip the suffix
    if item in ('Market Cap (intraday) 5', 'Enterprise Value 3'):
        combined_stats.Recent = combined_stats.Recent.apply(
            lambda x: float(x[:-1]) if 'B' in x else x)
    else:
        combined_stats.Recent = combined_stats.Recent.apply(float)
    combined_stats = combined_stats[["level_0", "Recent"]]
    combined_stats.columns = ['Firm', item]
    combined_stats = combined_stats.sort_values(by=[item], ascending=False)
    return combined_stats
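# stats1 above only converts values quoted in billions ('B'); Yahoo also reports
# 'M' and 'T' suffixes, which would pass through as raw strings. A minimal sketch
# of a more tolerant parser, normalizing everything to billions (parse_suffixed_value
# is a hypothetical helper, not part of yahoo_fin):
def parse_suffixed_value(x):
    # Normalize Yahoo-style strings like "123.4M", "1.2B", "2.3T" to billions
    if not isinstance(x, str):
        return x
    suffix_scale = {'M': 1e-3, 'B': 1.0, 'T': 1e3}  # scale relative to billions
    if x and x[-1] in suffix_scale:
        return float(x[:-1]) * suffix_scale[x[-1]]
    try:
        return float(x)
    except ValueError:
        return float('nan')  # e.g. 'N/A' entries
# Usage would be: combined_stats.Recent = combined_stats.Recent.apply(parse_suffixed_value)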
def stats2(asset_list, *item):
    # Build one table with a column per requested attribute, outer-joined on Firm
    output = pd.DataFrame(columns=['Firm'])
    for i in item:
        temp = stats1(asset_list, i)
        output = output.merge(temp, how='outer', on="Firm")
    return output
# Select stocks based on their financial ratios
output_E = stats2(E_good, items[0],items[3],items[5])
output_M1 = stats2(M1_good, items[0],items[3],items[5])
output_M2 = stats2(M2_good, items[0],items[3],items[5])
output_L = stats2(L_good, items[0],items[3],items[5])
output_E
output_M1
output_M2
output_L
# Select stocks based on the above criteria
Selected_Stock = ["NEE", "LIN", "ENLAY", "ES", "SRE",'UNH', 'TMO', 'WST','RMD','LLY', "TSM", "TXN", "ECL", "AMAT", "KMB", "UNP", "FDX", "ASR", "EIX", "ZNH"]
def alpha_beta_calculation(asset, assetlist, benchmark, start_date="2014-12-31", end_date="2019-12-31"):
    nullin_df = pd.DataFrame(asset.loc[start_date:end_date], columns=assetlist)
    # Portfolio mean return and return covariance over the window
    mu = expected_returns.mean_historical_return(asset.loc[start_date:end_date])
    sigma = risk_models.sample_cov(asset.loc[start_date:end_date])
    # Negative weight bounds would allow shorting; (0, 1) keeps the portfolio long-only
    ef = EfficientFrontier(mu, sigma, weight_bounds=(0, 1))
    ef.add_objective(objective_functions.L2_reg, gamma=2)
    # Further constraints are possible; see the PyPortfolioOpt documentation
    sharpe_portfolio = ef.max_sharpe(risk_free_rate=0.00078)  # US Treasury rate
    sharpe_portfolio_wt = ef.clean_weights()
    latest_prices = get_latest_prices(asset.loc[start_date:end_date])
    # Allocate a $1,000,000 portfolio into whole shares at the latest prices;
    # shorting bounds, if enabled, would also affect this allocation
    da = DiscreteAllocation(sharpe_portfolio_wt, latest_prices, total_portfolio_value=1000000)
    allocation, leftover = da.greedy_portfolio()
    max_sharpe_cla = cla.CLA(mu, sigma)
    max_sharpe_cla.max_sharpe()
    # Daily portfolio returns under the max-Sharpe weights
    sharpe_portfolio_wt_list = list(sharpe_portfolio_wt.values())
    ret_data1 = asset.loc[start_date:end_date].pct_change()[1:]
    weighted_returns1 = sharpe_portfolio_wt_list * ret_data1
    portfolio_ret1 = weighted_returns1.sum(axis=1)
    benchmark1 = benchmark.loc[start_date:end_date].pct_change()[1:]
    # Regress portfolio returns on the benchmark: slope = Beta, intercept = Alpha
    (Beta, Alpha) = st.linregress(benchmark1.values, portfolio_ret1.values)[0:2]
    year = int(end_date[:4]) - int(start_date[:4])
    print(f"{year}-year Beta: {Beta:.6f}", f"{year}-year Alpha: {Alpha:.6f}")
# A positive Alpha means idiosyncratic return works in the portfolio's favor
# (it shows up as the regression intercept/residual) --> good ESG & value stocks
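# linregress gives point estimates only; an equivalent CAPM regression via
# statsmodels (installed earlier) also reports standard errors and t-stats.
# A sketch, assuming portfolio_ret1 and benchmark1 are the daily return series
# built inside alpha_beta_calculation:
import statsmodels.api as sm

def capm_regression(portfolio_ret, benchmark_ret):
    # OLS of portfolio returns on benchmark returns: intercept = Alpha, slope = Beta
    X = sm.add_constant(benchmark_ret.values)
    res = sm.OLS(portfolio_ret.values, X).fit()
    alpha, beta = res.params
    print(f"Alpha: {alpha:.6f} (t = {res.tvalues[0]:.2f}), "
          f"Beta: {beta:.6f} (t = {res.tvalues[1]:.2f})")
    return res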
alpha_beta_calculation(df_O[Selected_Stock], Selected_Stock,benchmark_SPY)
5-year Beta: 0.909452 5-year Alpha: 0.000366
alpha_beta_calculation(df_O[Selected_Stock], Selected_Stock,benchmark_SPY,"2018-12-31","2019-12-31")
1-year Beta: 0.872020 1-year Alpha: 0.000593
alpha_beta_calculation(df_O[Selected_Stock], Selected_Stock,benchmark_SPY,"2016-12-31","2019-12-31")
3-year Beta: 0.863542 3-year Alpha: 0.000473
alpha_beta_calculation(df_O_all[Selected_Stock], Selected_Stock,benchmark_SPY_all,"2016-09-30","2021-09-30")
5-year Beta: 0.974696 5-year Alpha: 0.000423
alpha_beta_calculation(df_O_all[Selected_Stock], Selected_Stock,benchmark_SPY_all,"2020-09-30","2021-09-30")
1-year Beta: 1.086714 1-year Alpha: 0.000647
alpha_beta_calculation(df_O_all[Selected_Stock], Selected_Stock,benchmark_SPY_all,"2018-09-30","2021-09-30")
3-year Beta: 0.970280 3-year Alpha: 0.000667
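# The Alphas printed above are per trading day (the regressions use daily returns).
# Compounded over ~252 sessions, e.g. 0.000473/day is roughly 12.7% a year. A quick sketch:
def annualize_daily_alpha(daily_alpha, periods=252):
    # Compound a per-day alpha over one trading year (~252 sessions)
    return (1 + daily_alpha) ** periods - 1

annualize_daily_alpha(0.000473)  # ~0.127, i.e. ~12.7% per year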
selected_EF_3 = EF_benchmark3(df_O[Selected_Stock], Selected_Stock,benchmark_SPY, benchmark_IXIC,benchmark_DJI, _3benchmark)
M_EF = selected_EF_3.cumulative_return()
2.591151580420214
selected_EF_3.Efficient_frontier()
selected_EF_3.SP_weight()
{'NEE': 0.06524, 'LIN': 0.03959, 'ENLAY': 0.05311, 'ES': 0.04243, 'SRE': 0.03127, 'UNH': 0.07738, 'TMO': 0.06321, 'WST': 0.07265, 'RMD': 0.07697, 'LLY': 0.0532, 'TSM': 0.07595, 'TXN': 0.06657, 'ECL': 0.0429, 'AMAT': 0.06887, 'KMB': 0.02282, 'UNP': 0.03592, 'FDX': 0.0, 'ASR': 0.03652, 'EIX': 0.0254, 'ZNH': 0.05001}
{'UNH': 271, 'RMD': 504, 'TSM': 1357, 'WST': 485, 'AMAT': 1149, 'TXN': 546, 'NEE': 1116, 'TMO': 196, 'LLY': 418, 'ENLAY': 7367, 'ZNH': 1488, 'ECL': 226, 'ES': 523, 'LIN': 191, 'ASR': 199, 'UNP': 206, 'SRE': 219, 'EIX': 365, 'KMB': 175}
Leftover Fund value for the maximum Sharpe portfolio is $49.37
g_ef.Efficient_frontier()
selected_EF_3.port_stat()
Annual Return    0.21
Stdev            0.14
Sharpe Ratio     1.53
selected_EF_3.CI_test()
95% bootstrapped confidence interval = [0.00031276 0.00127704] %
------------------------------------------------------------------------------------------------------------------------
95% theoretical confidence interval = [0.00031673570025221077, 0.0012723064923422641] %
selected_EF_3.CI_test_benchmark()
95% bootstrapped confidence interval = [-5.77980139e-06 9.38373429e-04] %
------------------------------------------------------------------------------------------------------------------------
95% theoretical confidence interval = [3.457662116241071e-06, 0.0009379367291780653] %
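# CI_test / CI_test_benchmark are defined earlier in the notebook; for reference,
# a bootstrapped confidence interval for a mean daily return can be sketched as
# follows (bootstrap_mean_ci is a hypothetical stand-in, assuming ret is a 1-D
# array of daily returns):
import numpy as np

def bootstrap_mean_ci(ret, n_boot=10000, ci=95, seed=0):
    # Resample the daily returns with replacement and take the
    # empirical percentile interval of the resampled means
    rng = np.random.default_rng(seed)
    means = np.array([rng.choice(ret, size=len(ret), replace=True).mean()
                      for _ in range(n_boot)])
    half = (100 - ci) / 2
    return np.percentile(means, [half, 100 - half])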
selected_EF_5 = EF_benchmark5(df_O[Selected_Stock], Selected_Stock,benchmark_ESG3, benchmark_ESG4,benchmark_ESG5,benchmark_ESG6,benchmark_ESG7, _5benchmark)
S_EF = selected_EF_5.cumulative_return()
2.591151580420214
selected_EF_3.hypothesis_testing()
Difference in mean return (SPY, ESG):
0.032382390065008426
p value is 0.34233023152277464
The difference in mean return is not significantly different (failed to reject null hypothesis)
selected_EF_5.hypothesis_testing()
Difference in mean return (SUSA, ESG):
0.034535902085077994
p value is 0.31186998049045117
The difference in mean return is not significantly different (failed to reject null hypothesis)
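# hypothesis_testing is defined earlier in the notebook; a comparable two-sample
# test on mean daily returns can be sketched with scipy (Welch's variant, which
# does not assume equal variances):
from scipy import stats

def mean_return_ttest(ret_a, ret_b):
    # Two-sided Welch t-test for a difference in mean daily returns
    t_stat, p_value = stats.ttest_ind(ret_a, ret_b, equal_var=False)
    print("Difference in mean return:", ret_a.mean() - ret_b.mean())
    print("p value is", p_value)
    return t_stat, p_value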
selected_EF_5.port_stat()
Annual Return    0.21
Stdev            0.14
Sharpe Ratio     1.53
a = {'NEE': 0.0652, 'LIN': 0.03926, 'ENLAY': 0.05335, 'ES': 0.04204, 'SRE': 0.03119, 'UNH': 0.07758, 'TMO': 0.06344, 'WST': 0.07308, 'RMD': 0.07704, 'LLY': 0.05329, 'TSM': 0.07588, 'TXN': 0.06643, 'ECL': 0.04255, 'AMAT': 0.06896, 'KMB': 0.02258, 'UNP': 0.0358, 'FDX': 0.0, 'ASR': 0.03637, 'EIX': 0.02486, 'ZNH': 0.05111}
a = pd.DataFrame.from_dict(a,orient='index')
a = a.rename(columns={0: 'Weight'})
a = a.sort_values('Weight', ascending = False)
a.head(10)
b = df_I.loc[Selected_Stock]
test = b[['ESG_C_S','Env_S','Soc_S','Gov_S','ESG_Controversial','Industry']]
test1 = round(test.groupby('Industry').mean(),2)
test1
test1_2 = test1.mean()
test1_2 # portfolio
c = df_I
test = c[['ESG_C_S','Env_S','Soc_S','Gov_S','ESG_Controversial','Industry']]
test2 = round(test.groupby('Industry').mean(),2)
test2
test2_2 = test.mean()
test2_2 # overall (98) companies
features = ['Combined \n Score', 'Environment', 'Social', 'Governance', 'Controversial', 'Combined \n Score']  # first label repeated so the closing point of the radar loop shares it
def plot_radar(value1, value2, features, value1_name='Portfolio', value2_name='Overall', radar_size=10):
    # Use the ggplot plotting style
    plt.style.use('ggplot')
    # Build the data
    values = value1.values.tolist()
    values2 = value2.values.tolist()
    feature = features
    N = len(values)
    # Angles that split the circle evenly, one spoke per feature
    angles = np.linspace(0, 2 * np.pi, N, endpoint=False)
    # Repeat the first point so the radar outline closes into a loop
    values = np.concatenate((values, [values[0]]))
    values2 = np.concatenate((values2, [values2[0]]))
    angles = np.concatenate((angles, [angles[0]]))
    # Draw on a polar axis
    fig = plt.figure()
    ax = fig.add_subplot(111, polar=True)
    # First series: line plus shaded fill
    ax.plot(angles, values, 'o-', linewidth=2, label=value1_name, color='limegreen')
    ax.fill(angles, values, alpha=0.25, color='limegreen')
    # Second series
    ax.plot(angles, values2, 'o-', linewidth=2, label=value2_name, color='darkgreen')
    ax.fill(angles, values2, alpha=0.25, color='darkgreen')
    # Label each spoke with its feature name
    ax.set_thetagrids(angles * 180 / np.pi, feature)
    # Set the radial range
    ax.set_ylim(0, radar_size)
    # Title, grid, and legend
    plt.title('ESG Radar')
    ax.grid(True)
    plt.legend(loc=2, prop={'size': 7.4})
    # Show the figure
    plt.show()
plot_radar(test1_2, test2_2, features, 'Portfolio', 'Overall', 10)
plot_radar(test1.loc['Energy'], test2.loc['Energy'], features, 'Portfolio', 'Overall', 12)
plot_radar(test1.loc['Medical'], test2.loc['Medical'], features, 'Portfolio', 'Overall', 12)
plot_radar(test1.loc['Manufacturing'], test2.loc['Manufacturing'], features, 'Portfolio', 'Overall', 12)
plot_radar(test1.loc['Logistic'], test2.loc['Logistic'], features, 'Portfolio', 'Overall', 12)
# Portfolio weight grouped by industry
I = pd.DataFrame(df_I.loc[Selected_Stock][['Year','Industry']])
I_1 = I[I['Year']=='2014-12-31 00:00:00']
I_2 = a.merge(I_1["Industry"], how = 'inner', left_on = a.index, right_on = I_1.index)
I_2 = I_2.groupby("Industry").sum()
# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = I_2.index
sizes = round(I_2['Weight'],2)
fig1, ax1 = plt.subplots()
ax1.pie(sizes, labels=labels, autopct='%1.1f%%',
        shadow=False, startangle=90,
        colors=['darkseagreen', 'olivedrab', 'green', 'mediumseagreen'])
ax1.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle.
plt.title("Portfolio Weighting by Industry")
plt.show()
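# Sanity check (sketch): the industry weights feeding the pie should sum to ~1.0
# so the slices cover the whole portfolio; small drift comes from the 2-dp
# rounding applied to `sizes` above.
print(I_2['Weight'].sum())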