Web Scraping Tables with Python
How to scrape tables on a webpage and create a dataset.
The Internet provides an enormous amount of information that can be used for many different purposes. Many disciplines, such as marketing, medical research, predictive analytics, and investigative reporting, can benefit enormously from collecting and analyzing data from websites.
In this article, we will present a Python program that scrapes the Wikipedia NBA Finals webpage and extracts the tables needed for a dataset. The webpage is available at https://en.wikipedia.org/wiki/NBA_Finals. We will extract data from two tables on the webpage (shown below).
Roadmap
The following steps will be performed using Python.
1. Import the Required Libraries.
2. Find and Select the Tables on the Webpage.
3. Create and Display the Data Frames.
The Program
Objective: Find and extract tables on a webpage and store the data in a dataset.
Import the Required Libraries
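The import cell is not present in this copy of the article. A minimal sketch, assuming the program relies on pandas (the text refers to data frames later on, but the library is not named here):

```python
# Assumption: pandas is the only third-party library this walkthrough needs.
# Its read_html() function parses HTML <table> elements into DataFrames and
# relies on an HTML parser such as lxml (or beautifulsoup4 with html5lib)
# being installed alongside it.
import pandas as pd

print(pd.__version__)  # quick check that the import worked
```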
Find and Select the Tables on the Webpage
We will read the webpage from its URL and store the results in a variable (NBA_tables), keeping only the tables that contain the words “Finals appearances” in the header of the table. The len() function then returns the length of NBA_tables, which is the number of tables that matched.
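One way this step could look is sketched below, assuming pandas' read_html() with its match argument is used. The find_tables helper and the SAMPLE_HTML page are hypothetical: the sample uses placeholder values so the sketch runs without network access, and the comment shows the call for the live page.

```python
from io import StringIO

import pandas as pd

URL = "https://en.wikipedia.org/wiki/NBA_Finals"


def find_tables(source, match="Finals appearances"):
    """Return a list of DataFrames, one per table whose text matches `match`."""
    return pd.read_html(source, match=match)


# Stand-in page with placeholder values (not real NBA data).
# On the live page the call would be: NBA_tables = find_tables(URL)
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Finals appearances</th></tr>
  <tr><td>Team A</td><td>3</td></tr>
  <tr><td>Team B</td><td>1</td></tr>
</table>
<table>
  <tr><th>Team</th><th>Finals appearances</th></tr>
  <tr><td>Team C</td><td>2</td></tr>
  <tr><td>Team D</td><td>1</td></tr>
</table>
<table>
  <tr><th>Season</th><th>Venue</th></tr>
  <tr><td>Season 1</td><td>Arena 1</td></tr>
</table>
"""

# Wrapping the text in StringIO lets read_html treat it like a file.
NBA_tables = find_tables(StringIO(SAMPLE_HTML))
print(len(NBA_tables))  # number of tables that matched
```

Note that match is tested against the text found anywhere inside a table, not only its header row, so a phrase that also appears in a table cell or caption would count as a match too.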
Output: 2
In this case, two tables on the webpage contained the words “Finals appearances” in their headers.
Create and Display the Data Frames
We will now create a data frame for a selected table in the NBA_tables variable. Since two tables were found, we can specify either [0] for the first table or [1] for the second table. We will start with the first table found on the webpage.
Let's look at the first NBA table from the webpage in a data frame.
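A minimal sketch of selecting and displaying the first table. NBA_tables here is a hand-built stand-in with placeholder values so the example runs on its own; in the actual program it would be the list produced by the step that reads the webpage. The name df_first is an assumption.

```python
import pandas as pd

# Hand-built stand-in for the list of matching tables (placeholder values).
NBA_tables = [
    pd.DataFrame({"Team": ["Team A", "Team B"], "Finals appearances": [3, 1]}),
    pd.DataFrame({"Team": ["Team C", "Team D"], "Finals appearances": [2, 1]}),
]

# Lists are zero-indexed, so [0] is the first table that matched.
df_first = NBA_tables[0]

print(df_first.shape)   # (rows, columns)
print(df_first.head())  # first rows of the data frame
```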
Now let's select and show the second NBA table from the webpage in a data frame.
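The second table is selected the same way, with index [1]. The sketch below wraps the lookup in a small hypothetical helper, pick_table, which is not part of the original program; it reports a clearer error if the page changes and fewer tables match than expected. NBA_tables is again a hand-built stand-in with placeholder values.

```python
import pandas as pd


def pick_table(tables, position):
    """Return the table at `position`, or raise a clear error if too few matched."""
    if not 0 <= position < len(tables):
        raise IndexError(
            f"asked for table {position}, but only {len(tables)} matched"
        )
    return tables[position]


# Hand-built stand-in for the list of matching tables (placeholder values).
NBA_tables = [
    pd.DataFrame({"Team": ["Team A", "Team B"], "Finals appearances": [3, 1]}),
    pd.DataFrame({"Team": ["Team C", "Team D"], "Finals appearances": [2, 1]}),
]

# Index 1 is the second table that matched.
df_second = pick_table(NBA_tables, 1)
print(df_second.head())
```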
This article shows how you can easily scrape tables on a web page and create a structured dataset.
Thanks so much for reading my article! If you have any comments or feedback please let me know.