The Impact of Education on Economic Growth in Asia
1. Introduction
2. Methodology & Data
2.1 Methodology
2.2 Data
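# Import libraries
import pandas as pd
import plotly.express as px
import statsmodels.formula.api as smf
from stargazer.stargazer import Stargazer
# Import data
df = pd.read_csv("https://raw.githubusercontent.com/quarcs-lab/mendez2020-convergence-clubs-code-data/master/assets/dat.csv")
# Import data from the World Bank DataBank (World Development Indicators database)
df_wb = pd.read_csv("/work/world-bank.csv")
# Merge the two datasets on the columns they share
df1 = pd.merge(df, df_wb)
# Keep the variables of interest for countries in Asia
df2 = df1[['country', 'year', 'region', 'log_GDPpc', 's', 'lpr', 'lp', 'hi1990']].query("region == 'Asia'")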
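The merge above relies on pandas joining on whatever column names the two files share. A minimal sketch of a more explicit version follows, assuming both tables are keyed by `country` and `year` (an assumption; the actual columns of `world-bank.csv` are not shown here). The toy frames and their values are made up purely for illustration.

```python
import pandas as pd

# Toy stand-ins for the two source tables (values are illustrative, not real data)
left = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": [1990, 1991, 1990],
    "s": [5.0, 5.1, 7.2],
})
right = pd.DataFrame({
    "country": ["A", "A", "B", "C"],
    "year": [1990, 1991, 1990, 1990],
    "lpr": [60.0, 61.0, 55.0, 70.0],
})

# Join on explicit keys; validate raises MergeError if a key pair is duplicated
merged = pd.merge(left, right, on=["country", "year"], how="inner",
                  validate="one_to_one")

# Outer join with an indicator column to list rows that found no partner
check = pd.merge(left, right, on=["country", "year"], how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]

print(merged)
print(unmatched[["country", "year", "_merge"]])
```

Naming the keys makes the join independent of any other column names the two files might happen to share, and the indicator column shows which country-year pairs an inner merge would drop.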
2.3 Descriptive statistics
df2.describe().round(2)
3. Exploratory data analysis
3.1 Data Visualization
px.line(df2,
        x='year',
        y='s',
        color='country',
        hover_name='country',
        facet_col='hi1990',
        template="simple_white",
        labels=dict(s='Years of schooling',
                    hi1990='Is a high-income country?'),
        title="The evolution of years of schooling in Asia between 1990 and 2014")
px.scatter(df2.query("year == 2014"),
           x='s',
           y='log_GDPpc',
           color='country',
           hover_name='country',
           trendline='ols',
           trendline_scope='overall',
           labels=dict(s='Years of schooling',
                       log_GDPpc='Log GDP per capita'),
           template="simple_white",
           title="Do years of schooling affect GDP per capita? (2014)"
           )
px.scatter(df2.query("year == 2014"),
           x='lpr',
           y='log_GDPpc',
           color='country',
           hover_name='country',
           trendline='ols',
           trendline_scope='overall',
           labels=dict(lpr='Labor force participation rate',
                       log_GDPpc='Log GDP per capita'),
           template="simple_white",
           title="Does the labor force participation rate affect GDP per capita? (2014)"
           )
px.scatter(df2.query("year == 2014"),
           x='lp',
           y='log_GDPpc',
           color='country',
           hover_name='country',
           trendline='ols',
           trendline_scope='overall',
           labels=dict(log_GDPpc='Log GDP per capita',
                       lp='Labor productivity'),
           template="simple_white",
           title="Does labor productivity affect GDP per capita? (2014)"
           )
4. Regression analysis
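The three nested specifications estimated in this section can be written as follows. The variables are those used in the model formulas; the subscripts (country i, year t), the coefficient symbols, and the error term are notation assumed here for readability, and the coefficients are not meant to be equal across models.

```latex
\begin{align}
\text{Model 1:}\quad \mathrm{log\_GDPpc}_{it} &= \beta_0 + \beta_1 s_{it} + \varepsilon_{it} \\
\text{Model 2:}\quad \mathrm{log\_GDPpc}_{it} &= \beta_0 + \beta_1 s_{it} + \beta_2 \mathrm{lpr}_{it} + \varepsilon_{it} \\
\text{Model 3:}\quad \mathrm{log\_GDPpc}_{it} &= \beta_0 + \beta_1 s_{it} + \beta_2 \mathrm{lpr}_{it} + \beta_3 \mathrm{lp}_{it} + \varepsilon_{it}
\end{align}
```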
# Model 1: years of schooling only
mod1 = smf.ols(formula='log_GDPpc ~ s', data=df2).fit()
Stargazer([mod1])
# Model 2: add the labor force participation rate
mod2 = smf.ols(formula='log_GDPpc ~ s + lpr', data=df2).fit()
Stargazer([mod1, mod2])
# Model 3: add labor productivity (fitted on the Asia sample df2, like Models 1 and 2)
mod3 = smf.ols(formula='log_GDPpc ~ s + lpr + lp', data=df2).fit()
Stargazer([mod1, mod2, mod3])
# Full regression summary for Model 3
mod3.summary()
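Because the dependent variable is in logs, a coefficient on `s` reads as an approximate proportional change in GDP per capita for one additional year of schooling. A small sketch of the conversion follows, assuming `log_GDPpc` is the natural logarithm. The function name and the coefficient value 0.10 are hypothetical and not taken from the fitted models.

```python
import math

def pct_change_from_log_coef(beta, delta=1.0):
    """Exact percentage change in y implied by a change `delta` in x
    when the model is ln(y) = a + beta * x."""
    return (math.exp(beta * delta) - 1.0) * 100.0

beta_s = 0.10  # hypothetical coefficient, for illustration only
print(round(pct_change_from_log_coef(beta_s), 2))
```

With a fitted model, the coefficient could be read from the results object instead, for example `mod3.params['s']`. For small coefficients the exact figure stays close to the usual shortcut of multiplying the coefficient by 100.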