
Announcements

by Simon on February 16, 2021

Reactive notebooks in Deepnote

Introducing reactivity and how it solves the main challenges of data science notebooks

Traditional computational notebooks like Jupyter receive a lot of criticism. It’s
too easy to create a non-reproducible notebook with out-of-order
execution or hidden state. We believe that reactive notebooks address
some of these issues head-on. This article outlines how reactivity works
in Deepnote.

Reactive notebooks are like Excel spreadsheets — you just input the data and
code, and the whole document updates itself accordingly. No need to
click to manually run cells. This makes reactive notebooks easier to
reason about and more reproducible.

If you want to try out a reactive notebook for yourself, just open the demo project.

Key challenges of notebooks today

Out-of-order execution

Traditional notebooks are executed cell by cell, which allows cells to be executed
out of order or even repeatedly. While this is very powerful, it also
has the potential to create hidden state that is extremely difficult to
reason about.

Let’s take a look at an example:

[Image: a notebook whose cells were executed out of order]

Here, the cells were executed in the order: 1st, 2nd, 3rd, 2nd. This created
an out-of-order computational narrative that is very unintuitive.
Someone who only reads the notebook (e.g. a supervisor or a colleague)
usually reads it from top to bottom and assumes that the cells were
executed in that order, too.
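
Since the screenshot is not reproduced here, the following is a minimal Python sketch of the same idea. The three cell bodies and the `run` helper are hypothetical stand-ins, not the cells from the original example. Each cell is executed with `exec` against one shared namespace, which roughly mimics how a notebook kernel keeps state between cell runs.

```python
# Hypothetical cells; the dict passed to exec stands in for the kernel's state.
cells = {
    1: "x = 1",
    2: "x = x + 1",
    3: "result = x * 10",
}


def run(order):
    """Execute the cells in the given order and return the final namespace."""
    namespace = {}
    for number in order:
        exec(cells[number], namespace)
    return namespace


top_to_bottom = run([1, 2, 3])
out_of_order = run([1, 2, 3, 2])  # the order used in the example above

print(top_to_bottom["x"], top_to_bottom["result"])  # 2 20
print(out_of_order["x"], out_of_order["result"])    # 3 20
```

After the out-of-order run, `x` is 3 while `result` still reflects the older value of `x`. A reader who goes through the cells from top to bottom has no way to see that from the code alone.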

More hidden state

Every time a notebook is edited but not executed, it becomes stale, as
executing it again would create different outputs than the ones already
present. To build on the above example, let’s say I delete the last
cell, and the notebook now becomes this:

[Image: the same notebook after the last cell was deleted]

This notebook obviously isn’t reproducible anymore, even if we account for out-of-order execution.

Iterating on notebooks in this manner is extremely common in exploratory programming, which is what notebooks are intended for, so these kinds of issues come up often.
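
The deleted-cell situation described above can be sketched in a few lines of Python. The cell contents here are made up for illustration. The `live` dict plays the role of a long-running kernel session, and `fresh` is what a reader would get by re-running only the cells that are still visible.

```python
# Everything that was ever executed in the session, including a cell
# that has since been deleted from the notebook.
live = {}
for source in ["x = 1", "x = x + 1", "result = x * 10", "x = x + 1"]:
    exec(source, live)

# What the notebook shows after the deletion (hypothetical contents).
remaining_cells = ["x = 1", "x = x + 1"]

# A fresh top-to-bottom run of the visible cells only.
fresh = {}
for source in remaining_cells:
    exec(source, fresh)

print(live["x"], fresh["x"])                # 3 2
print("result" in live, "result" in fresh)  # True False
```

The session still holds a `result` variable that no visible cell defines, and `x` has a value that the visible cells cannot produce on their own.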

What is reproducibility?

A notebook is reproducible when someone else can take all of its code,
run it on a different computer than the author’s, and reliably get the
same results — in this case, the same cell outputs. Reproducibility is
very important in science, where it’s a part of a standard peer review
process. Since data science is also a science (as the name suggests),
it’s very important for any data science work to also be reproducible.

A recent paper found that only 25% of Jupyter notebooks on GitHub were reproducible.

It is not easy to create a notebook that is reproducible — it’s not just
about executing your notebook from top to bottom. A reproducible
notebook also includes a reproducible environment, and code that is
deterministic. These are all the issues we’re solving at Deepnote.
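
As one small illustration of what deterministic code means, consider random number generation, a common source of results that change between runs. The sketch below is a general Python pattern and not a description of how Deepnote handles the problem. The `sample_mean` function and its parameters are hypothetical.

```python
import random


def sample_mean(seed=None, n=1000):
    """Average of n pseudo-random numbers drawn from a dedicated generator."""
    rng = random.Random(seed)  # seed=None falls back to a non-reproducible seed
    return sum(rng.random() for _ in range(n)) / n


# With a fixed seed, every run produces exactly the same number.
print(sample_mean(seed=42) == sample_mean(seed=42))  # True
```

Without a seed, two calls will almost certainly return different values, so a notebook that relies on them cannot reproduce its own outputs.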

Reactive notebooks

A reactive notebook is a notebook that is always kept up-to-date.
Whenever its code is changed or a cell is deleted or moved, the
notebook’s outputs are automatically updated as if the notebook was
executed fresh, from top to bottom.

This is what reactive notebooks look like in Deepnote:

At the same time, reactive notebooks also aid exploratory programming by
making iteration loops tighter. When building charts, for example,
updating the code automatically updates the visualizations, so there’s an
instant feedback loop on changes in the code and data. This truly makes
reactive notebooks a powerful tool for exploration.
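
The update rule described in this section (any edit, deletion, or move refreshes the outputs as if the notebook had been run fresh from top to bottom) can be modeled with a toy class. This is a sketch under simplifying assumptions and not Deepnote's implementation: the `ReactiveNotebook` class and its methods are hypothetical, and it naively re-runs every cell on each change, whereas a real system might re-run only the cells affected by the change.

```python
class ReactiveNotebook:
    """Toy model: every change re-runs all cells in a fresh namespace."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.outputs = []
        self._rerun()

    def _rerun(self):
        namespace = {}  # fresh state, so nothing hidden survives a change
        self.outputs = []
        for source in self.cells:
            exec(source, namespace)
            # Record the user-visible variables after each cell.
            snapshot = {k: v for k, v in namespace.items()
                        if not k.startswith("__")}
            self.outputs.append(snapshot)

    def edit(self, index, source):
        self.cells[index] = source
        self._rerun()

    def delete(self, index):
        del self.cells[index]
        self._rerun()

    def move(self, old_index, new_index):
        self.cells.insert(new_index, self.cells.pop(old_index))
        self._rerun()


nb = ReactiveNotebook(["x = 1", "y = x + 1", "z = y * 10"])
print(nb.outputs[-1])  # {'x': 1, 'y': 2, 'z': 20}

nb.edit(0, "x = 5")
print(nb.outputs[-1])  # {'x': 5, 'y': 6, 'z': 60}

nb.delete(2)
print(nb.outputs[-1])  # {'x': 5, 'y': 6}
```

Because each change starts from an empty namespace, the recorded outputs always match what a top-to-bottom run of the current cells would produce. The cost is that every edit pays for a full re-run.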

The ceiling of reactivity

Reactivity has its limitations. While a reactive environment is a perfect model
for small and fast scripts, there are situations in which a reactive
notebook doesn’t shine: e.g. when training a model on a large dataset.
Training a non-trivial model can often take a few minutes or even hours,
so a reactive notebook wouldn’t have enough time to update itself
between edits.

Once your use case becomes complex and your bottleneck is not the notebook
but the actual execution of code, you can simply opt out of reactivity
and go back to a conventional notebook.

At Deepnote, we start all new projects with reactive execution turned on
to speed up the iteration cycles and feedback loops. As projects grow
beyond what reactivity handles well, users can easily switch to the
traditional non-reactive environment and execute each cell manually.

Reactive notebooks are ✨

Everyone who works with a computer is familiar with spreadsheet tools like
Excel. That familiarity makes the spreadsheet the most popular
computational document. It contains cells with data, cells with code, and
it’s a reactive document: everything is kept up-to-date after edits. Data
science notebooks deserve the same level of convenience and ease-of-use.

You can give Deepnote reactive notebooks a spin today! We’ve recently opened up our public beta, so sign up and try it for free.

This article was originally published on Deepnote.

Made with 💙 by the Deepnote team. Deepnote is a new kind of data science
notebook — Jupyter-compatible with real-time collaboration and running
in your browser. Try it out for free.


Simon Sotak
