PokeAPI Data Analysis
This project is meant to serve as a requests and pandas warmup for similar projects under the HackForLA repository. This notebook showcases what we can do with the Pokemon REST API. In this project, we go over:
- Accessing data via an API GET request
- Transforming the output data from JSON into a pandas DataFrame
- Querying and aggregating the data with pandas to answer some sample questions
We use the pokemon endpoint of the PokeAPI (https://pokeapi.co/api/v2/pokemon). Documentation on this and the other endpoints is available on the PokeAPI site.

JSON -> DataFrame Setup
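A minimal sketch of what this setup step might look like, assuming the `requests` library and the `pokemon` endpoint. The helper names `fetch_listing`, `detail_urls`, and `fetch_all` are hypothetical, and the offline `sample_listing` is hand-written to mirror the shape of the real listing so the example runs without network access.

```python
import requests

BASE_URL = "https://pokeapi.co/api/v2/pokemon"


def fetch_listing(limit=151, offset=0):
    """GET the outer listing: a dict whose 'results' holds name/url pairs."""
    response = requests.get(
        BASE_URL, params={"limit": limit, "offset": offset}, timeout=30
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def detail_urls(listing):
    """Pull each Pokemon's detail URL out of the outer listing."""
    return [entry["url"] for entry in listing["results"]]


def fetch_all(limit=151):
    """Fetch the inner JSON for every Pokemon in the listing (one request each)."""
    listing = fetch_listing(limit=limit)
    return [requests.get(url, timeout=30).json() for url in detail_urls(listing)]


# Offline illustration with a hand-written listing of the same shape.
sample_listing = {
    "count": 2,
    "results": [
        {"name": "bulbasaur", "url": "https://pokeapi.co/api/v2/pokemon/1/"},
        {"name": "ivysaur", "url": "https://pokeapi.co/api/v2/pokemon/2/"},
    ],
}
print(detail_urls(sample_listing))
```

Calling `fetch_all()` would issue one request for the listing plus one per Pokemon, so the result is worth keeping in a variable and reusing across questions.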
What does the JSON actually look like?
The JSONs are too large to output efficiently in a notebook, but the links below let you explore their nested structure (tick 'Pretty-print' at the top left for a more organized view).
Outer JSON aka 'name': https://pokeapi.co/api/v2/pokemon?limit=151&offset=0
Example of Inner (each Pokemon's personal) JSON aka 'url': https://pokeapi.co/api/v2/pokemon/1/
Normalize JSON into a pandas df
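As a rough illustration of full normalization, the sketch below calls `pd.json_normalize` with no extra arguments on a single hand-written record. The record is an assumption: it keeps only a few fields of the real per-Pokemon JSON, and the sprite values are placeholder strings. Nested dicts become dotted column names, while lists stay packed inside a single cell.

```python
import pandas as pd

# Hand-written record mimicking a tiny slice of one Pokemon's JSON.
# Sprite values are placeholders, not real URLs.
pokemon = [
    {
        "id": 1,
        "name": "bulbasaur",
        "sprites": {
            "front_default": "front.png",
            "other": {"home": {"front_default": "home.png"}},
        },
        "types": [
            {"slot": 1, "type": {"name": "grass"}},
            {"slot": 2, "type": {"name": "poison"}},
        ],
    }
]

# Nested dicts are flattened into dotted columns; lists are left as-is.
full_df = pd.json_normalize(pokemon)
print(sorted(full_df.columns))
print(type(full_df.loc[0, "types"]))
```

On the real data every nested sprite and version key becomes its own column, which is why the flattened frame gets so wide.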
This fully normalized df isn't particularly useful as there's a sea of information about attributes that are not useful yet. Therefore, we'll filter how we normalize the JSON based on the attributes we need to answer each specific question.
1. Which pokemon that is a grass type has the largest hp stat?
Here, the relevant (sub)fields are 'name' (the Pokemon), 'types.type.name' (its type), and the 'stats' entry whose 'stat.name' is 'hp' (the hp value itself is stored in 'base_stat').
First, the output needs to be converted to a list of dictionaries, so that each Pokemon becomes one record when we normalize.
Since the JSON is nested, we'll want to use json_normalize's record_path and meta parameters to specify how to flatten the JSON.
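One way this could look, sketched on two hand-written records that keep only the fields needed here (an assumption; the real JSON has many more). `record_path` names the nested list to explode into rows, and `meta` names the top-level fields to repeat on each row. The frame names `types_df` and `hp_df` and the renamed `pokemon_id` column are illustrative choices.

```python
import pandas as pd

# Trimmed-down records: only id, name, types, and two stats each.
pokemon = [
    {
        "id": 1,
        "name": "bulbasaur",
        "types": [{"slot": 1, "type": {"name": "grass"}},
                  {"slot": 2, "type": {"name": "poison"}}],
        "stats": [{"base_stat": 45, "stat": {"name": "hp"}},
                  {"base_stat": 49, "stat": {"name": "attack"}}],
    },
    {
        "id": 103,
        "name": "exeggutor",
        "types": [{"slot": 1, "type": {"name": "grass"}},
                  {"slot": 2, "type": {"name": "psychic"}}],
        "stats": [{"base_stat": 95, "stat": {"name": "hp"}},
                  {"base_stat": 95, "stat": {"name": "attack"}}],
    },
]

# One row per (Pokemon, type); id and name are carried along as metadata.
types_df = pd.json_normalize(pokemon, record_path="types", meta=["id", "name"])
types_df = types_df.rename(columns={"id": "pokemon_id", "type.name": "type"})
types_df = types_df[["pokemon_id", "name", "type"]]

# One row per (Pokemon, stat); keep only the hp rows.
stats_df = pd.json_normalize(pokemon, record_path="stats", meta=["id"])
hp_df = stats_df[stats_df["stat.name"] == "hp"]
hp_df = hp_df.rename(columns={"id": "pokemon_id", "base_stat": "hp"})
hp_df = hp_df[["pokemon_id", "hp"]]

print(types_df)
print(hp_df)
```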
Now that we have two DataFrames that have been normalized and filtered to the desired attributes (type and hp), we can merge them, joining on pokemon_id, to produce a table that can answer the question.
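The merge step can be sketched as follows, assuming the two frames already exist with the column names used here (`pokemon_id`, `name`, `type`, `hp`). The three Pokemon are a hand-picked sample, not the full 151.

```python
import pandas as pd

# Small stand-ins for the normalized type and hp frames.
types_df = pd.DataFrame(
    {
        "pokemon_id": [1, 1, 103, 103, 132],
        "name": ["bulbasaur", "bulbasaur", "exeggutor", "exeggutor", "ditto"],
        "type": ["grass", "poison", "grass", "psychic", "normal"],
    }
)
hp_df = pd.DataFrame({"pokemon_id": [1, 103, 132], "hp": [45, 95, 48]})

# Join on pokemon_id so every (Pokemon, type) row carries its hp.
merged = types_df.merge(hp_df, on="pokemon_id", how="inner")

# Filter to grass types and put the highest hp first.
grass_by_hp = merged[merged["type"] == "grass"].sort_values("hp", ascending=False)
top_grass = grass_by_hp.iloc[0]["name"]
print(top_grass)
```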
Specifically, the answer to "Which Grass-type Pokemon has the highest hp?" is given by the first record of the merged table once it is filtered to Grass types and sorted by hp in descending order.
Answer: Exeggutor
2. How many pokemon have poison as one of their types?
Since we already found a way to normalize a df of all Pokemon types, we can simply filter to only those with type 'poison'.
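A short sketch of that filter, assuming a type frame with `pokemon_id`, `name`, and `type` columns. The rows are a hand-picked sample, so the count printed here is for the sample only.

```python
import pandas as pd

# Sample of the normalized type frame: one row per (Pokemon, type).
types_df = pd.DataFrame(
    {
        "pokemon_id": [1, 1, 23, 132],
        "name": ["bulbasaur", "bulbasaur", "ekans", "ditto"],
        "type": ["grass", "poison", "poison", "normal"],
    }
)

# Keep rows whose type is poison, then count distinct Pokemon.
poison_df = types_df[types_df["type"] == "poison"]
n_poison = poison_df["pokemon_id"].nunique()
print(n_poison)
```

Counting distinct ids rather than rows guards against a Pokemon being counted twice if the frame ever contains duplicate rows.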
Answer: 33
3. Which pokemon has the fewest available moves?
Let's create another df to normalize the 'moves' dict
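A possible version of this step, again using `record_path` and `meta`. The records below are hand-written and the move list for caterpie is truncated for brevity (an assumption), so only the shape of the computation should be read from it.

```python
import pandas as pd

# Trimmed records: only name and a (truncated) moves list.
pokemon = [
    {"name": "ditto", "moves": [{"move": {"name": "transform"}}]},
    {
        "name": "caterpie",
        "moves": [
            {"move": {"name": "tackle"}},
            {"move": {"name": "string-shot"}},
            {"move": {"name": "bug-bite"}},
        ],
    },
]

# One row per (Pokemon, move).
moves_df = pd.json_normalize(pokemon, record_path="moves", meta=["name"])

# Count moves per Pokemon and sort so the fewest come first.
move_counts = moves_df.groupby("name").size().sort_values()
fewest = move_counts.index[0]
print(fewest, move_counts.iloc[0])
```

Note that a Pokemon with an empty moves list would produce no rows at all under this approach and would need to be handled separately.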
Answer: ditto has the lowest number of moves (1).
4. Which pokemon type has the fewest members?
Again, we can reuse the normalized type df
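For instance, `value_counts` on the type column gives the membership of each type, and `idxmin` picks the smallest. The frame below is a small hand-picked sample with the assumed columns `name` and `type`.

```python
import pandas as pd

# Sample of the normalized type frame: one row per (Pokemon, type).
types_df = pd.DataFrame(
    {
        "name": ["bulbasaur", "bulbasaur", "ekans", "oddish", "oddish",
                 "pikachu", "magnemite", "magnemite"],
        "type": ["grass", "poison", "poison", "grass", "poison",
                 "electric", "electric", "steel"],
    }
)

# Number of Pokemon per type, then the type with the smallest count.
type_counts = types_df["type"].value_counts()
rarest_type = type_counts.idxmin()
print(type_counts)
print(rarest_type)
```

A type with no members never appears in the frame, so it cannot show up in these counts; and if several types tie for the minimum, `idxmin` returns only one of them.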
Answer: Steel type
5. How many pokemon are in all 8 generations (yes there are 9 generations but only 8 in this API)?
We should be able to use the sprites.versions sub-dictionary to check whether a given Pokemon has entries for generation-i through generation-viii. However, unlike the previous questions, we cannot use normalization directly: the 'versions' field contains nested dicts instead of lists, so record_path won't work.
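Plain dictionary access is one way around this. The sketch below assumes each record has a `sprites.versions` dict keyed by `generation-i` through `generation-viii`; the helper `in_all_generations` and the record named `examplemon` are hypothetical, and the per-generation values are left as empty placeholder dicts.

```python
# The eight generation keys expected under sprites.versions.
GENERATIONS = [
    f"generation-{numeral}"
    for numeral in ("i", "ii", "iii", "iv", "v", "vi", "vii", "viii")
]


def in_all_generations(record):
    """Return True if the record has a versions key for every generation."""
    versions = record.get("sprites", {}).get("versions", {})
    return all(generation in versions for generation in GENERATIONS)


# Hand-written records; 'examplemon' is made up to show a failing case.
pokemon = [
    {"name": "bulbasaur",
     "sprites": {"versions": {generation: {} for generation in GENERATIONS}}},
    {"name": "examplemon",
     "sprites": {"versions": {"generation-i": {}, "generation-ii": {}}}},
]

n_in_all = sum(in_all_generations(record) for record in pokemon)
print(n_in_all)
```

This only checks that the keys exist; it does not verify that the sprite entries under each generation are non-empty.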
Answer: Upon further inspection, it seems each Pokemon in the Gen 1 dataset has a folder for each of the other generations, implying that all 151 are in all 8 generations. However, it's possible that some Pokemon's generation folders are empty or that the question was misinterpreted.
Bonus Question: What's the distribution of types across pokemon with the 50 highest HPs?
To answer this, we can normalize the stats and types dictionaries again.
Despite the left join, the result has more rows than the 50 we started with, because Pokemon with more than one type appear once per type. Still, we can check the value counts to see which type has the most members of the top-50 hp club.
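The whole bonus computation might be sketched like this, assuming hp and type frames with the column names used below. The sample has only four Pokemon, so `top_n` is set to 3 here instead of 50.

```python
import pandas as pd

# Small stand-ins for the normalized hp and type frames.
hp_df = pd.DataFrame(
    {
        "pokemon_id": [1, 40, 113, 143],
        "name": ["bulbasaur", "wigglytuff", "chansey", "snorlax"],
        "hp": [45, 140, 250, 160],
    }
)
types_df = pd.DataFrame(
    {
        "pokemon_id": [1, 1, 40, 40, 113, 143],
        "type": ["grass", "poison", "normal", "fairy", "normal", "normal"],
    }
)

top_n = 3  # the notebook's question uses 50

# Take the top_n Pokemon by hp, then attach their types.
top_hp = hp_df.nlargest(top_n, "hp")
top_types = top_hp.merge(types_df, on="pokemon_id", how="left")

# Dual-typed Pokemon contribute one row per type, so rows can exceed top_n.
type_counts = top_types["type"].value_counts()
print(len(top_types))
print(type_counts)
```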
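Answer: the value counts of the type column give the distribution; the type with the largest count has the most Pokemon among the top 50 by hp.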