Evaluating Data Drift and ML Model Performance with Evidently and Deepnote
How to quickly build beautiful interactive reports on your model
Why ML monitoring matters
The work of a data scientist does not stop once a machine learning model is developed and deployed. Once the model is up, you need to make sure it keeps performing as expected.
A lot of things can go wrong here: for example, the input data can drift away from what the model saw in training, and the quality of the predictions can silently degrade.
To detect and resolve such issues in time, you need visibility into model performance. You can get it by running regular performance checks on top of the model application logs.
Here is how you can do this in Deepnote, using the Evidently open-source library.
Data Drift: early monitoring of model performance
1. Let’s open the Deepnote notebook and install Evidently:
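A minimal sketch of the install cell (the original post does not pin a specific Evidently version):

```python
# Install the Evidently library into the notebook environment
!pip install evidently
```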
2. Next, we import a few libraries. We want to evaluate Data Drift, so we choose the corresponding Tab:
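A sketch of the imports, assuming the Tab-based Dashboard API of the early (0.1.x) Evidently releases; newer versions replace Dashboard and Tabs with Report and metric presets:

```python
import pandas as pd
from sklearn import datasets

# Evidently dashboards are assembled from Tabs; we only need the Data Drift one
from evidently.dashboard import Dashboard
from evidently.tabs import DataDriftTab
```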
3. As a demonstration, we’ll use the Breast Cancer dataset from sklearn. To analyze your own model, you can swap this for the actual model prediction logs:
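Continuing in the same notebook, the data can be loaded into a pandas DataFrame like this:

```python
# Load the Breast Cancer dataset and wrap it into a DataFrame
bcancer = datasets.load_breast_cancer()
bcancer_frame = pd.DataFrame(bcancer.data, columns=bcancer.feature_names)
```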
4. Now, let us calculate the drift report! All we need to do is specify what we compare to what. In practice, you might want to compare the current production data to the data used in training, or compare data from two different periods, for example, the latest month to the previous one. In our example, we will treat the first 200 rows as the reference data:
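A sketch under the same assumptions (the Dashboard and calculate calls follow the 0.1.x Evidently interface; the variable names are ours):

```python
# Use the first 200 rows as the reference dataset
# and the remaining rows as the "current" (production-like) data
reference_data = bcancer_frame[:200]
current_data = bcancer_frame[200:]

# Build a dashboard with the Data Drift tab and run the comparison
drift_dashboard = Dashboard(tabs=[DataDriftTab()])
drift_dashboard.calculate(reference_data, current_data)
```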
5. Let’s have a look! We can display the results of the analysis directly in Deepnote:
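With the Dashboard object from the previous step, rendering the report inline is a single call:

```python
# Render the interactive Data Drift report inside the notebook
drift_dashboard.show()
```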
Collaborative reporting
Detecting the change is one part of the puzzle.
To investigate and resolve the root cause of model decay, or to interpret a data shift, you often need to share the model performance with other stakeholders. Since Deepnote makes it easy to share interactive notebooks, you can use it as a tool for collaborative debugging of model performance, or to present and share the current model results.
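If you want to hand off a standalone artifact rather than the live notebook, the same dashboard object (in the 0.1.x API assumed above) can also be exported to an HTML file:

```python
# Save the report as a self-contained HTML file to share with stakeholders
drift_dashboard.save("data_drift_report.html")
```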