Klib + Deepnote
klib is a Python library that makes common data cleaning and exploratory analysis tasks easy.
Here I'm going to walk through a few klib functions to show how quickly these tasks can be done.
Importing the Data
Checking for missing values
All black lines correspond to missing values.
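As a rough sketch of this step: the share of missing values per column can be computed with plain pandas, and klib.missingval_plot draws the plot described above. The toy DataFrame and its values are hypothetical stand-ins for the notebook's data, and the klib import is guarded so the pandas part still runs where klib is not installed.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the notebook works on its own imported files.
df = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "cons_12m": [1200.0, np.nan, 560.0, 80.0],
    "pow_max": [13.8, 15.0, np.nan, np.nan],
})

# Fraction of missing values per column, highest first.
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share)

try:
    import klib  # optional in this sketch

    # Visual overview of where the missing values sit in the frame.
    klib.missingval_plot(df)
except ImportError:
    pass
```

The numeric summary is handy next to the plot when you need exact ratios, for example to decide which columns to drop.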
Klib Data Cleaning
klib.data_cleaning() drops duplicates and empty rows/columns, adjusts dtypes, and more, all in a single command.
Sometimes, some additional cleaning is still required.
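A minimal sketch of the cleaning step followed by one manual fix. The toy data, the date_activ column and the choice of date parsing as the "additional cleaning" are assumptions made for illustration; the notebook's actual follow-up steps are not shown here. The pandas fallback only approximates the steps named above and is not klib's implementation.

```python
import pandas as pd

# Hypothetical raw data with a duplicate row, an empty column and a bad date.
raw = pd.DataFrame({
    "id": ["a", "b", "b", "c"],
    "date_activ": ["2012-01-05", "2013-07-21", "2013-07-21", "not available"],
    "empty_col": [None, None, None, None],
})

try:
    import klib  # optional in this sketch

    # One call: drops duplicates and empty rows/columns, adjusts dtypes.
    cleaned = klib.data_cleaning(raw)
except ImportError:
    # Rough pandas stand-in so the sketch still runs without klib.
    cleaned = raw.dropna(axis=1, how="all").drop_duplicates()

# Additional manual cleaning (assumed example): parse dates and
# turn unparseable entries into NaT instead of raising an error.
cleaned["date_activ"] = pd.to_datetime(cleaned["date_activ"], errors="coerce")
print(cleaned.dtypes)
```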
Merging Datasets with Pandas
To analyze correlations between the features and the target, we first need to merge all the tables into a single one, in this case joining them on 'id'.
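The merge can be sketched with pandas as follows. The two small tables and their values are hypothetical; only the 'id' key and the cons_12m and churn column names come from the text. The inner join and the validate check are choices made for this sketch, not necessarily what the notebook used.

```python
import pandas as pd

# Hypothetical feature table and target table sharing the 'id' key.
customers = pd.DataFrame({"id": ["a", "b", "c"], "cons_12m": [1200, 340, 560]})
churn = pd.DataFrame({"id": ["a", "b", "d"], "churn": [0, 1, 0]})

# Inner join keeps only ids present in both tables;
# validate raises if 'id' is not unique on either side.
merged = customers.merge(churn, on="id", how="inner", validate="one_to_one")
print(merged)
```

An inner join silently drops ids that appear in only one table, so comparing len(merged) with the sizes of the inputs is a cheap sanity check.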
Klib Correlation Analysis
The last useful klib feature that I used is the Pearson correlation plot.
From the correlation plot we can see that churn shows some correlation with cons_12m, cons_gas_12m, cons_last_month, pow_max, margin_gross_pow_ele, and margin_net_pow_ele.
However, correlation alone is not conclusive: some modelling and a feature importance analysis are needed before drawing firm conclusions about which features drive churn.
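To illustrate the correlation step, here is a small sketch that ranks features by their Pearson correlation with the target using pandas, then draws the klib plot restricted to that target. The numbers are made up for the example and do not reflect the notebook's results; the klib import is guarded so the pandas part runs on its own.

```python
import pandas as pd

# Hypothetical numeric data with a binary churn target.
df = pd.DataFrame({
    "cons_12m": [100, 200, 300, 400, 500, 600],
    "pow_max": [10, 15, 11, 14, 12, 13],
    "churn": [0, 0, 0, 1, 1, 1],
})

# Pearson correlation of every feature with the target,
# ordered by absolute strength.
target_corr = (
    df.corr(method="pearson")["churn"]
    .drop("churn")
    .sort_values(key=abs, ascending=False)
)
print(target_corr)

try:
    import klib  # optional in this sketch

    # Heatmap of feature correlations with the churn column only.
    klib.corr_plot(df, target="churn")
except ImportError:
    pass
```

Pearson correlation only captures linear relationships, which is one reason a model-based feature importance is a sensible next step.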
There are other Klib features to test, especially the ones oriented to pipelines, feature selection and modelling.
Hope you discovered something useful here!