In this project, we'll be analyzing statistics about all of the countries on Earth. The main aim is just to practice some SQL, so we'll keep the exploration limited in scope.
Also note that you won't be able to run this project on Deepnote. I completed it locally and manually uploaded it to Deepnote to publish it. Silly, I know; I'm building a Jupyter publishing service so that I no longer need to do this.
We'll be working with SQL data from the CIA World Factbook.
The Factbook contains demographic information like the following:
population: The population of the country.
population_growth: The annual population growth rate, as a percentage.
area: The total land and water area, in sq km.
Let's preview some of the information:
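As a rough sketch of what such a preview could look like, the snippet below builds a tiny in-memory stand-in and selects the first rows. The table name facts, the name column, and every row value are assumptions made up for illustration; they are not real Factbook data, and the real project would connect to the actual database file instead.

```python
import sqlite3

# Stand-in for the real database: an in-memory table with made-up rows.
# The table name `facts` and the `name` column are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts "
    "(name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 100),
        ("Country C", 300, 2.1, 900),
    ],
)

# Preview a couple of rows to get a feel for the columns.
preview = conn.execute("SELECT * FROM facts LIMIT 2").fetchall()
for row in preview:
    print(row)
```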
Let's run some SQL queries to explore general population and population growth information.
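One way to get this kind of summary is a single query with MIN and MAX over both columns. This is only a sketch: the table name facts, the name column, and the toy rows (including the placeholder World and Antarctica values) are assumptions for illustration, not real Factbook figures.

```python
import sqlite3

# Made-up rows in an in-memory table; `facts` and `name` are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts "
    "(name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 100),
        ("Country C", 300, 2.1, 900),
        ("Antarctica", 0, None, None),
        ("World", 6300, 1.0, None),
    ],
)

# Smallest and largest population and growth rate across all rows.
# MIN and MAX skip NULL values.
summary = conn.execute("""
    SELECT MIN(population), MAX(population),
           MIN(population_growth), MAX(population_growth)
    FROM facts
""").fetchone()
print(summary)
```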
From these queries, we can see a few interesting things, including a population of 0 and a population of 7.2 billion.
Let's see who these guilty countries are.
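A subquery is a handy way to track down the rows behind the extreme values. The sketch below assumes a table called facts with a name column, and uses invented toy rows rather than real Factbook numbers.

```python
import sqlite3

# Made-up rows in an in-memory table; `facts` and `name` are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts "
    "(name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 100),
        ("Antarctica", 0, None, None),
        ("World", 6000, 1.0, None),
    ],
)

# Return the rows whose population equals the overall minimum or maximum.
extremes = conn.execute("""
    SELECT name, population
    FROM facts
    WHERE population = (SELECT MIN(population) FROM facts)
       OR population = (SELECT MAX(population) FROM facts)
    ORDER BY population
""").fetchall()
print(extremes)
```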
It looks like the database table contains a row for the World, which explains the 7.2 billion. Similarly, having a row for Antarctica explains the population value of 0.
Let's find the average population and area for a country, excluding World, which would bias the results.
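Here is a minimal sketch of that calculation, filtering out the World row before averaging. The table name facts, the name column, and the toy rows are assumptions for illustration only.

```python
import sqlite3

# Made-up rows in an in-memory table; `facts` and `name` are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts "
    "(name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 100),
        ("Country C", 300, 2.1, 900),
        ("World", 6300, 1.0, 1200),
    ],
)

# Average population and area over every row except the World aggregate.
avg_population, avg_area = conn.execute("""
    SELECT AVG(population), AVG(area)
    FROM facts
    WHERE name != 'World'
""").fetchone()
print(avg_population, avg_area)
```

With these toy rows, leaving World in would pull the average population up from 2100 to 3150, which is the kind of bias we want to avoid.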
This highlights that the average country population is 32,242,666 (about 32 million), while the average area is 555,093 sq km.
Given that we know the average population and area, let's find densely populated countries. We'll consider a country densely populated if it has:
a population greater than the average
an area less than the average
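The two conditions above can be sketched as a query that compares each row against the averages via subqueries. As before, the table name facts, the name column, and the toy rows are assumptions, and excluding World from the outer query as well is my own choice rather than something the data requires.

```python
import sqlite3

# Made-up rows in an in-memory table; `facts` and `name` are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts "
    "(name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 100),
        ("Country C", 300, 2.1, 900),
        ("World", 6300, 1.0, 1200),
    ],
)

# Keep rows with above-average population and below-average area,
# where both averages are computed without the World row.
dense = conn.execute("""
    SELECT name, population, area
    FROM facts
    WHERE name != 'World'
      AND population > (SELECT AVG(population) FROM facts WHERE name != 'World')
      AND area < (SELECT AVG(area) FROM facts WHERE name != 'World')
    ORDER BY population DESC
""").fetchall()
print(dense)
```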