Analyzing Country Statistics Using SQL
In this project, we'll be analyzing statistics about all of the countries on Earth. The main aim is just to practice some SQL, so we'll be limiting our exploration to:
- Population and population growth
- Average population and area
- Densely-populated countries
Also note that you won't be able to run this project on Deepnote. I completed it locally and manually uploaded it to Deepnote to publish it. Silly, I know; I'm building a Jupyter publishing service so that I no longer need to do this.
We'll be working with SQL data from the CIA World Factbook.
The Factbook contains demographic information like the following:
- population: The country's total population.
- population_growth: The annual population growth rate, as a percentage.
- area: The total land and water area, in sq km.
Let's preview some of the information:
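The preview step can be sketched as follows. This is a minimal illustration, not the notebook's original cell: the table name `facts` and the `name` column are assumptions, and the rows are made-up placeholders standing in for the real Factbook data.

```python
import sqlite3

# In-memory stand-in for the Factbook database.
# Assumptions: table name `facts`, a `name` column, and placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 50),
        ("Country C", 3000, 2.1, 650),
    ],
)

# Preview a couple of rows to see which columns are available.
preview = conn.execute("SELECT * FROM facts LIMIT 2").fetchall()
for row in preview:
    print(row)
```

Against the real database, only the connection line would change to point at the local database file.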
Population and Population Growth
Let's run some SQL queries to explore general population and population growth information.
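One way to get this overview is a single aggregate query for the minimum and maximum of each column. The sketch below assumes a table named `facts` (a hypothetical name) and uses made-up placeholder rows, so the printed numbers are not Factbook values.

```python
import sqlite3

# In-memory stand-in table; `facts` and the rows below are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 0, None, 500),
        ("Country B", 1000, 1.5, 200),
        ("Country C", 5000, 0.3, 50),
        ("Country D", 6000, 1.1, None),
    ],
)

# MIN and MAX skip NULL values, so missing growth rates do not affect the result.
summary = conn.execute(
    """
    SELECT MIN(population),
           MAX(population),
           MIN(population_growth),
           MAX(population_growth)
    FROM facts
    """
).fetchone()
print(summary)
```

Extreme values in a summary like this are often the first hint that a table contains rows that are not ordinary countries.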
From these queries, we can see a few interesting things:
- There's a country with a population of 0.
- There's a country with a population greater than 7.2 billion.
Let's see which countries these are.
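A subquery is a convenient way to look up the rows behind the extremes. The following is a sketch under assumptions: the table name `facts` and the `name` column are hypothetical, and the populations are made-up placeholders rather than real figures.

```python
import sqlite3

# In-memory stand-in table; names and numbers are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 50),
        ("Antarctica", 0, None, 500),
        ("World", 6000, 1.1, None),
    ],
)

# Return the rows whose population equals the table-wide minimum or maximum.
extremes = conn.execute(
    """
    SELECT name, population
    FROM facts
    WHERE population = (SELECT MIN(population) FROM facts)
       OR population = (SELECT MAX(population) FROM facts)
    ORDER BY population
    """
).fetchall()
print(extremes)
```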
It looks like the database table contains a row for World, which explains the 7.2 billion. Similarly, having a row for Antarctica explains the population value of 0.
Average Population and Area
Let's find the average population and area for a country, excluding World, which would bias the results.
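Excluding the aggregate row comes down to a `WHERE` filter in front of `AVG`. Here is a minimal sketch, assuming a table named `facts` with a `name` column (both hypothetical) and made-up placeholder rows.

```python
import sqlite3

# In-memory stand-in table; names and numbers are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 50),
        ("Country C", 3000, 2.1, 650),
        ("World", 9000, 1.1, 1000),
    ],
)

# Average over individual countries only; the World row is filtered out first.
averages = conn.execute(
    """
    SELECT AVG(population), AVG(area)
    FROM facts
    WHERE name <> 'World'
    """
).fetchone()
print(averages)
```

With the World row included, the placeholder population average would rise from 3000 to 4500, which is the kind of bias the filter avoids.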
This shows that the average country population is 32,242,666 (about 32.2 million), while the average area is 555,093 sq km.
Given that we know the average population and area, let's find densely-populated countries. We'll consider them densely populated if they:
- have a population greater than the average
- have an area less than the average
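Both conditions can be expressed with subqueries that recompute the averages inside the `WHERE` clause. The sketch below assumes a table named `facts` with a `name` column (both hypothetical) and uses made-up placeholder rows, so the result is illustrative only.

```python
import sqlite3

# In-memory stand-in table; names and numbers are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (name TEXT, population INTEGER, "
    "population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.3, 50),
        ("Country C", 3000, 2.1, 650),
        ("World", 9000, 1.1, 1000),
    ],
)

# Keep countries with above-average population and below-average area.
# The World row is excluded from both the averages and the result.
dense = conn.execute(
    """
    SELECT name, population, area
    FROM facts
    WHERE name <> 'World'
      AND population > (SELECT AVG(population) FROM facts WHERE name <> 'World')
      AND area < (SELECT AVG(area) FROM facts WHERE name <> 'World')
    ORDER BY name
    """
).fetchall()
print(dense)
```

Computing the averages in subqueries keeps the thresholds in sync with the data, rather than hard-coding numbers copied from an earlier query.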