In this project, we'll be analyzing statistics about all of the countries on Earth. The main aim is just to practice some SQL, so we'll be limiting our exploration to:
Also note that you won't be able to run this project on Deepnote. I completed it locally and manually uploaded it to Deepnote to publish it. Silly, I know; I'm building a Jupyter publishing service so that I no longer need to do this.
We'll be working with SQL data from the CIA World Factbook.
The Factbook contains demographic information like the following:
- `population`: The global population.
- `population_growth`: The annual population growth rate, as a percentage.
- `area`: The total land and water area.

Let's preview some of the information:
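A preview like this might look roughly as follows. This is only a sketch: the table name `facts`, the `name` column, and every row value are assumptions made up for illustration (the real database is not reproduced here), and Python's built-in `sqlite3` module stands in for whatever SQL client the notebook used.

```python
import sqlite3

# Toy stand-in for the Factbook database. The table name `facts`, the `name`
# column, and all values below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 50_000_000, 1.5, 100_000),
        ("Country B", 2_000_000, 0.0, 900_000),
        ("Country C", 80_000_000, 4.0, 2_000_000),
        ("Country D", 10_000_000, 2.0, 300_000),
    ],
)

# Preview the first few rows of the table.
preview = conn.execute("SELECT * FROM facts LIMIT 3").fetchall()
for row in preview:
    print(row)
```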
Let's run some SQL queries to explore general population and population growth information:
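One way to get such a summary is to ask for the minimum and maximum of each column in a single query. The sketch below assumes a table called `facts` with a `name` column; apart from the two population figures discussed in this write-up (0 and 7256490011), every value is a made-up placeholder.

```python
import sqlite3

# Toy stand-in for the Factbook database (assumed table/column names).
# Only the populations 0 and 7256490011 come from the write-up; the
# "Country A/B/C" rows are invented placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("World", 7256490011, None, None),
        ("Antarctica", 0, None, None),
        ("Country A", 50_000_000, 1.5, 100_000),
        ("Country B", 2_000_000, 0.0, 900_000),
        ("Country C", 80_000_000, 4.0, 2_000_000),
    ],
)

# MIN and MAX skip NULL values, so rows without a growth rate are ignored.
summary = conn.execute(
    """
    SELECT MIN(population), MAX(population),
           MIN(population_growth), MAX(population_growth)
    FROM facts
    """
).fetchone()
print(summary)  # (0, 7256490011, 0.0, 4.0) for this toy data
```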
From these queries, we can see a few interesting things:
- At least one row has a population of `0`.
- At least one row has a population of `7256490011` (7.2 billion).

Let's see which countries are responsible:
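A subquery is a convenient way to look up the rows that hold the extreme values. This is a sketch under assumptions: the table name `facts` and the `name` column are guesses, and the "Country A/B/C" rows are invented placeholders.

```python
import sqlite3

# Toy stand-in for the Factbook database (assumed table/column names;
# the "Country A/B/C" rows are invented placeholders).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("World", 7256490011, None, None),
        ("Antarctica", 0, None, None),
        ("Country A", 50_000_000, 1.5, 100_000),
        ("Country B", 2_000_000, 0.0, 900_000),
        ("Country C", 80_000_000, 4.0, 2_000_000),
    ],
)

# Return the rows whose population equals the table-wide minimum or maximum.
extremes = conn.execute(
    """
    SELECT name, population
    FROM facts
    WHERE population = (SELECT MIN(population) FROM facts)
       OR population = (SELECT MAX(population) FROM facts)
    ORDER BY population
    """
).fetchall()
print(extremes)  # [('Antarctica', 0), ('World', 7256490011)]
```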
It looks like the database table contains a row for `World`, which explains the 7.2 billion. Similarly, having a row for `Antarctica` explains the population value of 0.
Let's find the average population and area for a country, excluding `World`, which would bias the results:
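Filtering out the aggregate row before averaging could be sketched like this. The table name `facts`, the `name` column, and the "Country A/B/C" rows are assumptions for illustration, so the numbers printed here are not the real Factbook averages.

```python
import sqlite3

# Toy stand-in for the Factbook database (assumed table/column names;
# the "Country A/B/C" rows are invented placeholders).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("World", 7256490011, None, None),
        ("Antarctica", 0, None, None),
        ("Country A", 50_000_000, 1.5, 100_000),
        ("Country B", 2_000_000, 0.0, 900_000),
        ("Country C", 80_000_000, 4.0, 2_000_000),
    ],
)

# Exclude the World row so the global total does not inflate the average.
# AVG skips NULLs, so rows without an area do not count toward AVG(area).
avg_population, avg_area = conn.execute(
    "SELECT AVG(population), AVG(area) FROM facts WHERE name != 'World'"
).fetchone()
print(avg_population, avg_area)  # 33000000.0 1000000.0 for this toy data
```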
This shows that the average country population is 32,242,666 (about 32 million), while the average area is 555,093 sq km.
Given that we know the average population and area, let's find densely populated countries. We'll consider a country densely populated if it has:
- a `population` greater than the average
- an `area` less than the average
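Both conditions can be expressed with subqueries that recompute the averages. The following is a sketch rather than the original query: the table name `facts` and the `name` column are assumed, the "Country A/B/C" rows are invented, and excluding `World` inside the subqueries is a choice made here for consistency with the averages.

```python
import sqlite3

# Toy stand-in for the Factbook database (assumed table/column names;
# the "Country A/B/C" rows are invented placeholders).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("World", 7256490011, None, None),
        ("Antarctica", 0, None, None),
        ("Country A", 50_000_000, 1.5, 100_000),
        ("Country B", 2_000_000, 0.0, 900_000),
        ("Country C", 80_000_000, 4.0, 2_000_000),
    ],
)

# Keep rows with above-average population and below-average area.
# A NULL area makes the comparison NULL, so such rows drop out.
dense = conn.execute(
    """
    SELECT name, population, area
    FROM facts
    WHERE population > (SELECT AVG(population) FROM facts WHERE name != 'World')
      AND area < (SELECT AVG(area) FROM facts WHERE name != 'World')
    """
).fetchall()
print(dense)  # [('Country A', 50000000, 100000)] for this toy data
```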