Assignment 1 done by:
Emil Holmsten - Time spent on the assignment: 15h
Frida Bilén - Time spent on the assignment: 15h
a. We assume that the dataset we obtained from ourworldindata.org is correct. Both life expectancy and GDP per capita came from the same dataset, so we did not combine two different datasets ourselves. The data was obtained from: Our World In Data - Life Expectancy vs GDP per capita. The dataset contains data from the years 1800 to 2016, and for the discussion of the results, the most recent data (2016) is considered. However, three different years (2010, 2015 and 2016) were compared to confirm that 2016 was representative of the last few years in the dataset.
b. We assume that the combined graph of multiple years from recent times can be used to draw general conclusions about the relation between GDP per capita and Life Expectancy in the modern world.
From the data we hypothesise that higher GDP per capita is related to:
- Higher quality of life
- Access to food and clean water
- Safe housing and shelter
- A functioning government and society
- Access to healthcare
- Reduced child mortality (source: Child mortality vs GDP per capita, Our World in Data)
All of these factors and probably others could be related to better health and longer life.
GDP per capita and life expectancy seem to be roughly logarithmically related, as can be seen in the graph below: life expectancy increases as GDP per capita increases, and the increase is steepest below a GDP per capita of about 30000.
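One way to check a roughly logarithmic relation is to fit a straight line to life expectancy against the logarithm of GDP per capita. The sketch below does this with ordinary least squares using only the standard library. The data points and the function name `fit_log_model` are invented for illustration and are not taken from the dataset.

```python
import math

# Invented (GDP per capita, life expectancy) pairs, for illustration only.
points = [(1_000, 55.0), (3_000, 62.0), (10_000, 70.0), (30_000, 77.0), (60_000, 81.0)]


def fit_log_model(pairs):
    """Least-squares fit of life_expectancy = a + b * log10(gdp_per_capita)."""
    xs = [math.log10(gdp) for gdp, _ in pairs]
    ys = [life for _, life in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of (x, y) divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_log_model(points)
print(f"life expectancy ~ {a:.1f} + {b:.1f} * log10(GDP per capita)")
```

In this model, a positive `b` means that every tenfold increase in GDP per capita adds about `b` years of life expectancy, which is consistent with large gains at low incomes and small gains at high incomes.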
Our hypotheses about why a higher GDP per capita could increase life expectancy seem reasonable.
c. We selected and plotted data for three years from modern times and excluded the rest of the data, which goes back to the 1800s. A plot including all of the data is similar to our plot of three years, indicating that the data is valid for drawing more general conclusions. We cleaned the data so that only countries were included, thereby removing rows for continents and other groups of countries. We also removed some columns we had no need for and renamed two columns whose very long names made the data less readable.
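The cleaning steps described above can be sketched as follows. This is a minimal illustration, not the code we ran: the CSV content, the column names, and the rule that aggregate rows have an empty code or a code starting with "OWID_" are all assumptions made for the example.

```python
import csv
import io

# Tiny stand-in for the downloaded CSV. All rows and column names are invented.
RAW_CSV = """Entity,Code,Year,Life expectancy (years),GDP per capita (long name),Unused
Sweden,SWE,2016,82.3,45000,x
Europe,,2016,78.0,30000,x
World,OWID_WRL,2016,72.0,15000,x
Sweden,SWE,1900,52.0,4000,x
"""

# Hypothetical mapping from long column names to short ones.
RENAME = {
    "Life expectancy (years)": "life_exp",
    "GDP per capita (long name)": "gdp_per_capita",
}
KEEP_YEARS = {2010, 2015, 2016}


def clean(rows):
    """Keep only country rows for the selected years, with short column names."""
    cleaned = []
    for row in rows:
        code = row["Code"]
        # Assumption: continents and other groups have no code or an "OWID_" code.
        if not code or code.startswith("OWID_"):
            continue
        if int(row["Year"]) not in KEEP_YEARS:
            continue
        # Skip rows where a value we need is missing.
        if any(row[old] == "" for old in RENAME):
            continue
        record = {"country": row["Entity"], "year": int(row["Year"])}
        for old, new in RENAME.items():
            record[new] = float(row[old])
        # Columns not listed in RENAME (such as "Unused") are dropped here.
        cleaned.append(record)
    return cleaned


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
print(clean(rows))
```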
g. When comparing the previous graph (life expectancy vs GDP per capita) with the one below (life expectancy vs GDP), we can see that GDP does not correlate nearly as well with life expectancy as GDP per capita does. So even if GDP might be a better measure of the economic impact a country has in the world, GDP per capita better describes how the economy affects its citizens, and it therefore correlates better with life expectancy.
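The visual comparison could be backed by a number such as the Pearson correlation coefficient. The sketch below computes it for both measures on invented rows. Taking the logarithm of GDP before correlating is our choice for this example, motivated by the roughly logarithmic relation noted in part b, and is an assumption rather than something derived from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented rows: (country, total GDP, population, life expectancy in years).
rows = [
    ("A", 2.0e12, 1.0e9, 68.0),
    ("B", 5.0e11, 1.0e7, 82.0),
    ("C", 3.0e10, 2.0e7, 62.0),
    ("D", 4.0e11, 8.0e6, 81.0),
    ("E", 1.5e12, 2.0e8, 74.0),
]

life = [r[3] for r in rows]
log_gdp = [math.log10(r[1]) for r in rows]
log_gdp_per_capita = [math.log10(r[1] / r[2]) for r in rows]

r_gdp = pearson(log_gdp, life)
r_per_capita = pearson(log_gdp_per_capita, life)
print(f"GDP: r = {r_gdp:.2f}, GDP per capita: r = {r_per_capita:.2f}")
```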
d. The countries with a life expectancy higher than one standard deviation above the mean can be seen in the list output from the code below:
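The notebook's own code is not reproduced in this text. A minimal sketch of such a filter, using invented values and only the standard library, could look like this (the country names and the function name `above_one_std` are hypothetical):

```python
from statistics import mean, stdev

# Invented life expectancies in years, for illustration only.
life_expectancy = {
    "Country A": 84.0,
    "Country B": 83.0,
    "Country C": 72.0,
    "Country D": 65.0,
    "Country E": 60.0,
    "Country F": 70.0,
    "Country G": 71.0,
    "Country H": 73.0,
}


def above_one_std(values):
    """Names whose value exceeds the mean by more than one sample standard deviation."""
    data = list(values.values())
    threshold = mean(data) + stdev(data)
    return sorted(name for name, value in values.items() if value > threshold)


print(above_one_std(life_expectancy))
```

Note that `statistics.stdev` is the sample standard deviation; using the population version (`statistics.pstdev`) lowers the threshold slightly and can change which countries are listed.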
e. The countries with a life expectancy higher than one standard deviation above the mean and a GDP lower than the mean can be seen in the list output from the code below:
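Adding the GDP condition amounts to a second comparison on the same rows. The following is again a sketch with invented records rather than the original code; `long_lived_small_economies` is a hypothetical name.

```python
from statistics import mean, stdev

# Invented records: (country, life expectancy in years, total GDP).
records = [
    ("A", 84.0, 4.0e11),
    ("B", 83.0, 9.0e12),
    ("C", 72.0, 1.0e12),
    ("D", 65.0, 2.0e11),
    ("E", 60.0, 5.0e10),
    ("F", 73.0, 3.0e12),
]


def long_lived_small_economies(rows):
    """Countries with life expectancy above mean + 1 std and GDP below the mean."""
    life = [r[1] for r in rows]
    gdp = [r[2] for r in rows]
    life_cutoff = mean(life) + stdev(life)
    gdp_cutoff = mean(gdp)
    return [name for name, l, g in rows if l > life_cutoff and g < gdp_cutoff]


print(long_lived_small_economies(records))
```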
g.2. When comparing the plots above and below, we can see that the countries with high life expectancy and a GDP below the mean all have a very high GDP per capita. They seem to be mostly small and wealthy countries such as Sweden.
f. As we can see below there are countries with strong economies (indicated by a high GDP) and low life expectancy.
There are also countries with high GDP per capita but low life expectancy. It seems to us that inequalities in the distribution of wealth are the best explanation. We discuss this further in Part 2 and for question g.
a/b. Below you can see a visualization of the national average of self-reported life satisfaction versus GDP per capita. The countries with the lowest life satisfaction also have a quite low GDP per capita, which would indicate that the two can be related. After a certain level, however, GDP per capita does not seem to increase the happiness of the population, and many of the countries with a GDP per capita below $20k still have a life satisfaction score above 6.
A dataset with economic inequality versus GDP per capita can also be seen below. For this dataset, data for the year 2010 is used since it is relatively recent but has more available data than later years. We assume the data for 2010 is still recent enough to draw conclusions that apply today.
These two datasets together with the dataset used in part 1 will be used to answer the questions below:
Is a good economy correlated with life satisfaction?
A good economy (here measured by GDP per capita) seems to be strongly correlated with higher life satisfaction, which makes sense on an intuitive level. To gain deeper knowledge, a more extensive analysis would be required.
Is life expectancy correlated with life satisfaction?
Life expectancy seems to be related to life satisfaction in the same way as a good economy is. It might not be the case, though, that one of them causes the other. We believe it is more likely that a good economy causes both life expectancy and life satisfaction to rise independently, meaning that the relation between life expectancy and life satisfaction is probably a correlation and not causal.
Does economic inequality have a negative impact on life expectancy or life satisfaction?
Our hypothesis was that economic inequality would be correlated with life expectancy and life satisfaction, and that this inequality could explain why some countries with a high GDP per capita still have a relatively low life expectancy or life satisfaction. From visualizing economic inequality vs GDP per capita, it can be seen that economic inequality generally decreases as GDP per capita increases. The countries with a high GDP per capita but low life satisfaction or life expectancy could not be explained by high economic inequality (more than one standard deviation above the mean). Therefore something else must play a role in this.
As can be seen below, life expectancy seems to be correlated with life satisfaction.
Our assumption made in Part 1 that countries with low life expectancy and high GDP per capita have high economic inequality (a high Gini coefficient) seems to be incorrect, or at least inconclusive, since the data is not available for all of the countries with high GDP and low life expectancy.
We want to compare the economic inequality (Gini coefficients) with the life satisfaction of the countries. Therefore, the life satisfaction data of 2010 will be used so that it is the same year as the data of economic inequality.
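Matching the two datasets on the same year comes down to joining them on country after filtering both to 2010. A minimal sketch with invented rows is shown below; the tuple layout and the function name `join_on_country` are assumptions made for the example.

```python
# Invented rows: (country, year, value).
gini_rows = [("A", 2010, 25.0), ("B", 2010, 48.0), ("C", 2009, 40.0)]
satisfaction_rows = [
    ("A", 2010, 7.4),
    ("B", 2010, 5.1),
    ("B", 2016, 5.3),
    ("D", 2010, 6.0),
]


def join_on_country(gini, satisfaction, year=2010):
    """Pair Gini and life satisfaction values for countries that have both in `year`."""
    gini_by_country = {c: v for c, y, v in gini if y == year}
    satisfaction_by_country = {c: v for c, y, v in satisfaction if y == year}
    # Only countries present in both datasets for that year can be compared.
    shared = sorted(gini_by_country.keys() & satisfaction_by_country.keys())
    return [(c, gini_by_country[c], satisfaction_by_country[c]) for c in shared]


print(join_on_country(gini_rows, satisfaction_rows))
```

Countries missing from either dataset for that year drop out of the comparison, which is why the choice of year affects how many countries can be analysed.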