Evergreen Learning Path Dynamic Prediction
Author: Sandra Nguyen
This notebook walks through a machine learning workflow from start to finish on mock data, in preparation for real-world game data from the Unity embed. Extra effort goes into feature engineering to ensure logical, accurate labels and a well-prepared dataset for modeling. Random Forest is chosen as a robust classification algorithm, and hyperparameter tuning and K-fold cross-validation are used to guard against overfitting. The model reaches 99.7% accuracy on the aggregate data across all 6 domains. Most of this code is then imported into the project; see the GitHub repository for the full app.
Import Libraries
Import JSON files
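A minimal sketch of the JSON import step, using only the standard library. The payload and its field names (`player_id`, `domain`, `score`) and the domain names are hypothetical stand-ins, not the project's actual schema.

```python
import json

# Hypothetical mock payload; field and domain names are assumptions.
raw = """
[
  {"player_id": 1, "domain": "math", "score": 0.82},
  {"player_id": 2, "domain": "reading", "score": null}
]
"""

records = json.loads(raw)  # JSON null becomes Python None

# Group records by domain so each domain can be preprocessed separately.
by_domain = {}
for rec in records:
    by_domain.setdefault(rec["domain"], []).append(rec)
```

Missing values arrive as `None` here, which is why a fill step is needed before modeling.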
Mock data
Generated with ChatGPT through careful prompt engineering.
Preprocessing & Feature Engineering
Initial
Secondary
Seasoned
Domain concatenation & Restructure df
Fill NaN
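One common way to fill missing values is mean imputation. The outline does not say which fill strategy the notebook uses, so the mean here is an assumption, and `fill_missing_with_mean` is a hypothetical helper written without external libraries.

```python
import math
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None/NaN entries with the mean of the observed entries."""
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if is_missing(v) else v for v in values]


scores = [0.5, None, 1.0, float("nan")]
filled = fill_missing_with_mean(scores)
```

With a pandas DataFrame, the same idea is usually expressed through `fillna`.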
Label Encode
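Label encoding maps each category to an integer. The sketch below is library-free and uses the three stage names from this outline as example labels; whether these are the actual target classes is an assumption. `fit_label_encoder` and `transform` are hypothetical helpers.

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer, assigned in sorted order."""
    classes = sorted(set(labels))
    return {label: idx for idx, label in enumerate(classes)}


def transform(labels, mapping):
    """Encode labels with a previously fitted mapping."""
    return [mapping[label] for label in labels]


stages = ["Initial", "Seasoned", "Secondary", "Initial"]
mapping = fit_label_encoder(stages)
encoded = transform(stages, mapping)
```

Sorted order mirrors how scikit-learn's `LabelEncoder` assigns codes. Note that alphabetical order need not match the natural progression of the stages, which is harmless for a classification target but matters if the codes are used as an ordinal feature.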
X, y
Normalize
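The outline does not say which normalization is applied; min-max scaling to the range [0, 1] is assumed here for illustration. `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


scaled = min_max_scale([10, 20, 40])
```

Tree-based models such as Random Forest are largely insensitive to feature scale, so this step mainly helps if other algorithms are compared on the same data.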
Classification Modeling
X, y objects
K-fold cross-validation
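K-fold cross-validation splits the data into k folds and uses each fold once as the test set while training on the rest. The sketch below shows only the index splitting, without external libraries; `k_fold_indices` is a hypothetical helper, and the notebook itself presumably relies on scikit-learn for this.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
```

A model would be fit on `train_idx` and scored on `test_idx` for each pair, and the k scores averaged. Any preprocessing that learns from the data, such as normalization, should be fit on the training indices only.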
Evaluation
Per domain
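Per-domain evaluation can be sketched as accuracy grouped by a domain label, alongside the overall accuracy. The domain names and predictions below are made-up illustrations, not the notebook's results, and `accuracy_by_domain` is a hypothetical helper.

```python
from collections import defaultdict


def accuracy_by_domain(domains, y_true, y_pred):
    """Return {domain: fraction of correct predictions in that domain}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for d, t, p in zip(domains, y_true, y_pred):
        total[d] += 1
        correct[d] += int(t == p)
    return {d: correct[d] / total[d] for d in total}


# Illustrative values only.
domains = ["math", "math", "reading", "reading"]
y_true = [0, 1, 2, 1]
y_pred = [0, 1, 2, 0]

per_domain = accuracy_by_domain(domains, y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Comparing the two views shows whether a high aggregate score hides a weaker domain.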
Aggregate (all domains)
Prediction
To be implemented in the project