E-Learning | 🧽 Data Cleaning & Transformation
Business Problem
In this project, We will analyze the data of "LearnData", a company in the e-learning sector that specializes in selling online data analysis courses. Their primary objectives are:
Data-Set | Diagram
"LearnData" uses the following tools to manage its business:
The data is originally in CSV format, and it is imported into MySQL to create the database named "learndata_crudo."
Analysis:
Preliminary Analysis:
Data Sources: The company utilizes WordPress with a WooCommerce plugin as the platform for selling its online courses. Additionally, they use Stripe as a payment gateway, handling credit card payments.
Data Format: The raw data is downloaded in CSV format directly from the sources.
Types of Data: The available data includes information about products (courses), customers, orders, and payments received through Stripe.
Data Model: The data is structured into several tables, including an Orders table linked to the Customers and Products tables through SKU_product and id_cliente. Another table is for Stripe payments, which is related to the Orders table using the order number.
Steps for executing the project:
1. Creating a new database named "learndata" along with the tables: "dim_clientes", "dim_producto", "fac_pedidos", and "fac_pagos_stripe".
Run to view results
2. Create the Products table from raw data: Check the data format, change field names if necessary, and insert the fields into the new table.
Run to view results
3. Create the Customers table from raw data: Check the data format, change columns names, convert the "date_created" field from string to date format, extract necessary customer dimensions from the billing field (parse JSON), and insert the fields into the new table.
Run to view results
4. Create the Orders table from raw data: Check the data format, change columns names, replace product names with product IDs, normalize the payment method column, convert the "date_pedido" column to a date format, round decimals in the cost_articulo column to integers, and insert the orders into the table.
Run to view results
Run to view results
Run to view results
Run to view results
Run to view results
Run to view results
5. Create the Stripe Payments table from raw data: Check the data format, change columns names, obtain the order number using the RIGHT function, remove the order number from the description (which will be used for table joins), convert the "created" field to a timestamp, replace commas with periods, convert numbers to decimals with two decimal places, and insert the table into the new database.
Run to view results