Exploratory analysis of Pokemon data!
We have 801 rows of Pokemon across 7 generations of games. There are 40 columns for each Pokemon, the vast majority of which record how much damage that Pokemon takes from each attacking type.
First, let's visualize the number of Pokemon added in each generation. We're interested in seeing how the makeup of Pokemon changes from one generation to the next.
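A minimal sketch of the per-generation count, assuming the data lives in a pandas DataFrame called pokemon_data with a `generation` column (the column name is an assumption, and the rows below are a toy stand-in rather than the real data):

```python
import pandas as pd

# Toy stand-in for pokemon_data; names and values are placeholders.
pokemon_data = pd.DataFrame({
    "name": ["mon_a", "mon_b", "mon_c", "mon_d", "mon_e"],
    "generation": [1, 1, 2, 3, 3],  # assumed column name
})

# Count rows per generation, ordered by generation number rather than by count.
per_generation = pokemon_data["generation"].value_counts().sort_index()
print(per_generation)

# per_generation.plot(kind="bar") would draw the bar chart (requires matplotlib).
```

Sorting by index keeps the generations in release order, which makes the chart easier to read than the default ordering by count.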
How do a Pokemon's height and weight correlate with its base stats?
Pokemon have 6 stat values: ['hp', 'attack', 'sp_attack', 'defense', 'sp_defense', 'speed']. We would like to know whether these stats are correlated with the height or weight of the Pokemon.
This is a good opportunity to produce a heatmap visualization to clearly see how those variables correlate.
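One way this could look, as a sketch: compute the pairwise Pearson correlations with pandas and hand the matrix to seaborn for plotting. The helper names `size_stat_correlations` and `plot_heatmap` are hypothetical, and the sample values are made up for illustration.

```python
import pandas as pd

SIZE_COLS = ["height_m", "weight_kg"]
STAT_COLS = ["hp", "attack", "sp_attack", "defense", "sp_defense", "speed"]


def size_stat_correlations(df):
    """Pearson correlations among size and stat columns.

    Rows with a missing height or weight are dropped first.
    """
    complete = df.dropna(subset=SIZE_COLS)
    return complete[SIZE_COLS + STAT_COLS].corr()


def plot_heatmap(corr, title):
    """Draw an annotated heatmap of a correlation matrix."""
    # Imported here so the correlation helper works without a plotting stack.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_title(title)
    plt.show()


# Made-up sample rows; the last one has no recorded size and gets dropped.
sample = pd.DataFrame({
    "height_m": [0.7, 1.7, 0.4, 6.5, None],
    "weight_kg": [6.9, 90.5, 6.0, 220.0, None],
    "hp": [45, 78, 35, 95, 60],
    "attack": [49, 84, 55, 125, 50],
    "sp_attack": [65, 109, 50, 60, 40],
    "defense": [49, 78, 40, 79, 50],
    "sp_defense": [65, 85, 50, 100, 45],
    "speed": [45, 100, 90, 81, 30],
})

corr = size_stat_correlations(sample)
print(corr.loc[SIZE_COLS, STAT_COLS].round(2))
```

Calling `plot_heatmap(corr, "Size vs. base stats")` would then render the figure. Fixing the color scale to the range -1 to 1 keeps heatmaps comparable across different subsets of the data.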
Results
The heatmap shows a clear positive correlation between the height and weight of a Pokemon. Logically, this makes sense: we would expect a taller Pokemon to be heavier.
There are also weak to moderately strong correlations between height and weight and the various combat stats. This makes sense because bigger Pokemon tend to be the more evolved and stronger versions of smaller Pokemon.
We had to drop 20 rows to produce the above heatmap because of NaN values in the height and weight columns. We take a look at the dropped Pokemon here. As you can see, the vast majority of them are from Generation 1.
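The dropped rows could be inspected along these lines. This is a sketch on placeholder data, assuming the size columns are named height_m and weight_kg and that a `generation` column exists:

```python
import pandas as pd

# Placeholder rows; None marks a missing height or weight.
pokemon_data = pd.DataFrame({
    "name": ["mon_a", "mon_b", "mon_c", "mon_d", "mon_e"],
    "generation": [1, 1, 1, 2, 7],
    "height_m": [None, 0.4, None, 0.9, None],
    "weight_kg": [None, 6.0, None, 6.4, None],
})

# Keep only the rows where at least one size column is NaN.
size_is_missing = pokemon_data[["height_m", "weight_kg"]].isna().any(axis=1)
missing_size = pokemon_data[size_is_missing]

# Tally the dropped rows by generation.
dropped_per_generation = missing_size["generation"].value_counts().sort_index()
print(missing_size[["name", "generation"]])
print(dropped_per_generation)
```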
Legendary Pokemon may skew the results.
Legendary Pokemon may skew the results of our correlation. This is because their total base stats are much higher than those of regular Pokemon, while their size is not necessarily very different. (Some are incredibly huge, such as Primal Kyogre, but some are also incredibly small, such as Jirachi.)
Therefore, we will try to calculate coefficients again, but this time splitting Legendary and non-Legendary Pokemon into different groups. Hopefully this will lead to stronger correlation values in either of the two groups.
There are 70 Legendary Pokemon in our dataset of 801 Pokemon. If we calculate the correlation coefficients for the two groups separately, the resulting coefficients may be stronger.
First let's try producing the correlation heatmap for non-legendary Pokemon.
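A sketch of the split, assuming a 0/1 column named `is_legendary` marks Legendary Pokemon (the column name, the helper name `correlations_by_legendary`, and the sample values are all assumptions for illustration):

```python
import pandas as pd


def correlations_by_legendary(df, cols):
    """Return one correlation matrix per group, keyed by the Legendary flag."""
    return {
        bool(flag): group[cols].corr()
        for flag, group in df.groupby("is_legendary")
    }


# Made-up rows: four non-Legendary, four Legendary.
sample = pd.DataFrame({
    "is_legendary": [0, 0, 0, 0, 1, 1, 1, 1],
    "height_m": [0.5, 1.0, 1.5, 2.0, 0.3, 1.5, 3.5, 7.0],
    "weight_kg": [5.0, 20.0, 60.0, 100.0, 1.1, 50.0, 200.0, 400.0],
    "hp": [40, 60, 70, 90, 100, 100, 90, 105],
})

by_group = correlations_by_legendary(sample, ["height_m", "weight_kg", "hp"])
non_legendary_corr = by_group[False]
legendary_corr = by_group[True]
print(non_legendary_corr.round(2))
```

Each matrix can then be drawn as its own heatmap, so the two groups can be compared side by side.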
Results
Subsetting by Legendary Pokemon shows that Legendary Pokemon have much weaker correlations between their size and their actual power. Notably, the correlation coefficients between height_m and defense and speed are very close to 0, as well as the correlation coefficients between weight_kg and attack, sp_attack, and sp_defense.
The only exceptions are between height_m and hp, indicating a moderately-strong correlation that taller Legendary Pokemon tend to have more health points, and between weight_kg and speed, showing a weak-to-moderately-strong correlation that heavier Pokemon tend to be slower.
Most interestingly, separating out the Legendary Pokemon did not appear to improve the correlation between size and power in non-Legendary Pokemon. In many cases, these correlations got weaker instead.
Dashboard
We want to create a dashboard that shows you the Pokemon that are closest to your inputted height and weight, including an image of each of those Pokemon.
We will need to make requests to PokeAPI to retrieve these images. We will append the links to these images to our existing pokemon_data DataFrame so that we do not have to make a request to the API every time a user enters their height and weight.
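"Closest" needs a definition. The sketch below assumes Euclidean distance on height and weight after dividing each by its standard deviation, so that kilograms do not dominate meters. That metric, the helper name `closest_pokemon`, and the sample rows are all assumptions rather than the notebook's actual method.

```python
import pandas as pd


def closest_pokemon(df, height_m, weight_kg, n=3):
    """Return the n rows whose scaled height and weight are nearest the input."""
    sized = df.dropna(subset=["height_m", "weight_kg"])
    # Scale each axis by its spread so both contribute comparably.
    height_gap = (sized["height_m"] - height_m) / sized["height_m"].std()
    weight_gap = (sized["weight_kg"] - weight_kg) / sized["weight_kg"].std()
    distance = (height_gap ** 2 + weight_gap ** 2) ** 0.5
    return sized.assign(distance=distance).nsmallest(n, "distance")


# Placeholder rows standing in for pokemon_data.
pokemon_data = pd.DataFrame({
    "name": ["mon_a", "mon_b", "mon_c", "mon_d"],
    "height_m": [0.4, 1.7, 1.8, 6.5],
    "weight_kg": [6.0, 70.0, 80.0, 220.0],
})

matches = closest_pokemon(pokemon_data, height_m=1.75, weight_kg=72.0, n=2)
print(matches[["name", "distance"]])
```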
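A sketch of the caching step, under stated assumptions: it uses the PokeAPI `pokemon/{id}` endpoint and reads the `front_default` entry under `sprites` in the response; the `pokedex_number` column, the `sprite_url` column, the User-Agent string, and both helper names are assumptions. The fetch function is passed in as a parameter so the caching logic can be tried without touching the network.

```python
import json
import urllib.request

import pandas as pd

POKEAPI_URL = "https://pokeapi.co/api/v2/pokemon/{id}"


def fetch_sprite_url(pokedex_number):
    """Request one Pokemon from PokeAPI and return its default sprite URL."""
    request = urllib.request.Request(
        POKEAPI_URL.format(id=int(pokedex_number)),
        headers={"User-Agent": "pokemon-eda-notebook"},  # arbitrary label
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # May be None when PokeAPI has no default front sprite for this entry.
    return payload["sprites"]["front_default"]


def add_sprite_urls(df, fetch=fetch_sprite_url):
    """Return a copy of df with a sprite_url column, one fetch per row."""
    urls = [fetch(number) for number in df["pokedex_number"]]
    return df.assign(sprite_url=urls)


# One request per Pokemon, made once up front:
# pokemon_data = add_sprite_urls(pokemon_data)
# pokemon_data.to_csv("pokemon_with_sprites.csv", index=False)
```

Saving the augmented DataFrame to disk (the file name above is only an example) means the dashboard can look up image links locally instead of calling the API on every user input.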