Lung Cancer
Import Libraries and Dataset
This dataset contains information on patients with lung cancer, including their age, gender, air pollution exposure, alcohol use, dust allergy, occupational hazards, genetic risk, chronic lung disease, balanced diet, obesity, smoking status, passive smoker status, chest pain, coughing of blood, fatigue levels, weight loss, shortness of breath, wheezing, swallowing difficulty, clubbing of finger nails, frequent colds, dry coughs, and snoring. By analyzing this data we can gain insight into what causes lung cancer and how best to treat it.
Age: The age of the patient. (Numeric)
Gender: The gender of the patient. (Categorical)
Air Pollution: The level of air pollution exposure of the patient. (Categorical)
Alcohol use: The level of alcohol use of the patient. (Categorical)
Dust Allergy: The level of dust allergy of the patient. (Categorical)
OccuPational Hazards: The level of occupational hazards of the patient. (Categorical)
Genetic Risk: The level of genetic risk of the patient. (Categorical)
chronic Lung Disease: The level of chronic lung disease of the patient. (Categorical)
Balanced Diet: The level of balanced diet of the patient. (Categorical)
Obesity: The level of obesity of the patient. (Categorical)
Smoking: The level of smoking of the patient. (Categorical)
Passive Smoker: The level of passive smoker of the patient. (Categorical)
Chest Pain: The level of chest pain of the patient. (Categorical)
Coughing of Blood: The level of coughing of blood of the patient. (Categorical)
Fatigue: The level of fatigue of the patient. (Categorical)
Weight Loss: The level of weight loss of the patient. (Categorical)
Shortness of Breath: The level of shortness of breath of the patient. (Categorical)
Wheezing: The level of wheezing of the patient. (Categorical)
Swallowing Difficulty: The level of swallowing difficulty of the patient. (Categorical)
Clubbing of Finger Nails: The level of clubbing of finger nails of the patient. (Categorical)
Descriptive Statistics
Data Visualization
Correlation
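A minimal sketch of how a correlation matrix for the level-coded features might be computed, assuming pandas is used. The `demo` frame and its values are made up for illustration and are not rows from the dataset; only the column names come from the description above.

```python
import pandas as pd

# Illustrative values only; these are NOT rows from the real dataset.
demo = pd.DataFrame({
    "Smoking": [1, 2, 3, 4, 5, 6, 7, 8],
    "Chest Pain": [1, 3, 2, 5, 4, 6, 8, 7],
    "Balanced Diet": [7, 8, 5, 6, 3, 4, 1, 2],
})

# Pairwise Pearson correlation between all numeric columns (the pandas default).
corr = demo.corr()

# Rank the other features by the absolute strength of their correlation with "Smoking".
strongest = corr["Smoking"].drop("Smoking").abs().sort_values(ascending=False)
print(corr.round(2))
print(strongest.round(2))
```

Because the features are ordinal levels rather than continuous measurements, passing `method="spearman"` to `DataFrame.corr` may be a reasonable alternative to the Pearson default.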
Removing Outliers
Removing observations with Z-scores beyond a certain threshold
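One way this step could look is sketched below, assuming pandas and NumPy. The helper name `remove_outliers_zscore`, the threshold of 3.0, and the `demo` values are all assumptions made for illustration; the notebook does not state which threshold or columns it uses.

```python
import numpy as np
import pandas as pd

def remove_outliers_zscore(df, columns, threshold=3.0):
    """Hypothetical helper: drop rows where any listed column has |z-score| > threshold."""
    values = df[columns].astype(float)
    # Population standard deviation (ddof=0); a zero std becomes NaN to avoid dividing by zero.
    std = values.std(ddof=0).replace(0, np.nan)
    z = (values - values.mean()) / std
    # NaN z-scores (constant columns) are treated as non-outliers.
    keep = (z.abs().fillna(0) <= threshold).all(axis=1)
    return df[keep]

# Illustrative values only; these are NOT rows from the real dataset.
demo = pd.DataFrame({"Age": [30, 32, 35, 33, 31, 34, 36, 29, 33, 32, 31, 95]})

cleaned = remove_outliers_zscore(demo, ["Age"], threshold=3.0)
print(len(demo), "rows before,", len(cleaned), "rows after")
```

Note that a single extreme value inflates the mean and standard deviation it is measured against, so on small samples a z-score filter can miss outliers that a larger sample would flag. The threshold is worth checking against the actual column distributions.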
Visualize the results