Predicting heart disease using machine learning
Approach: (1) Problem Definition, (2) Data, (3) Evaluation, (4) Features, (5) Modelling, (6) Experimentation
1. Problem Definition
Given clinical parameters about a patient, can we predict whether or not they have heart disease?
The original data came from the Cleveland database in the UCI Machine Learning Repository.
If we can reach 95% accuracy at predicting whether or not a patient has heart disease during the proof of concept, we'll pursue the project.
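As a rough sketch of how that go/no-go check might be expressed in code: accuracy is the fraction of patients whose predicted label matches the true label, compared against the 95% target. The helper names (`accuracy`, `meets_target`) and the label lists are hypothetical and made up for illustration; they are not results from the dataset.

```python
TARGET_ACCURACY = 0.95  # proof-of-concept threshold stated in the evaluation goal


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def meets_target(y_true, y_pred, target=TARGET_ACCURACY):
    """Return True if accuracy reaches the target threshold."""
    return accuracy(y_true, y_pred) >= target


# Made-up labels for illustration only (1 = heart disease, 0 = no heart disease).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.2f}, meets target: {meets_target(y_true, y_pred)}")
```

In practice the labels would come from a held-out test split rather than hand-written lists, and accuracy alone may hide class imbalance, so it is worth checking alongside other metrics.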
Create a data dictionary
- What question(s) are we trying to answer?
- What kind of data do we have and how do we treat different types?
- What's missing from the data and how do we deal with it?
- Where are the outliers and why should we care about them?
- How can we add, change, or remove features to get more out of our data?
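The questions above can be explored with a few lines of pandas. The sketch below is only one possible way to do it: the DataFrame is a made-up toy table (the column names `age`, `chol`, `sex`, `target` and every value are assumptions for illustration, not rows from the real data), the 1.5 × IQR rule is one common outlier heuristic among several, and median imputation is just one option for handling missing values.

```python
import pandas as pd

# Made-up toy rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 29],
    "chol": [233.0, 250.0, None, 236.0, 354.0, 204.0],
    "sex": [1, 1, 0, 1, 0, 1],
    "target": [1, 1, 1, 0, 0, 1],
})

# What kind of data do we have? Inspect the type of each column.
print(df.dtypes)

# What's missing from the data? Count missing values per column.
missing = df.isna().sum()
print(missing)


def iqr_outliers(series):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return series[mask]


# Where are the outliers? Apply the IQR rule to the non-missing values.
chol_outliers = iqr_outliers(df["chol"].dropna())
print(chol_outliers)

# How can we change features? Here: fill missing values with the median
# in a new column, leaving the original column untouched.
df["chol_filled"] = df["chol"].fillna(df["chol"].median())
print(df)
```

On a table this small the IQR rule flags values that may be perfectly plausible, which is a reminder that outliers should be reviewed with domain knowledge before anything is removed.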