By Megan on October 17, 2023
IPYNB to PDF: The easiest way to export your Jupyter notebooks
Why did the data analyst convert .ipynb to PDF? Because they wanted to file it away for later.
Painful joke? Absolutely. As painful as going back and forth swapping file types during your exploratory data analysis? Debatable.
Data professionals spend a large chunk of their days transforming analyses into more business-friendly formats (i.e., wasting valuable time). Let’s look at why that is and how to fix it.
Why we convert .ipynbs to PDFs in the first place
Why are we converting Jupyter notebook files into PDFs to begin with?
Because data work usually ends up as a deliverable. And the people waiting on it aren’t always data people.
Non-data folks (aka your average business stakeholder) don’t have Jupyter installed on their computers. So when it comes time to wow them with a new insight, we’re forced to put our analysis into a format they can work with. Whether it’s a chart, a dashboard, or a full-on manifesto about why it’s time to change an existing business strategy, these deliverables are where data exploration turns into data collaboration.
So the real question becomes: Is a static PDF the best way for teams to collaborate?
When easy isn’t so easy
Converting a .ipynb file into a PDF seems reasonable (at the very least, it feels more professional than sending screenshots over Slack).
PDFs can be opened on nearly every device, so file compatibility issues go away. But so does all interactivity. Notebooks allow you to manipulate code, visualize data, and use text to walk others through your analysis. They’re built to be interactive. But PDFs are static. Code can’t be executed, charts can’t be updated, and context can’t be added.
Sure, it’s rare that a business stakeholder will want to play around with your code. But it’s also rare for exploratory analysis to hit the bullseye on the first try.
Data collaboration is a process — one that happens early and often for data teams. Multiple stakeholders are generally involved, each with opinions on what they’re trying to achieve and how to get there.
Just ask the data team at audience interaction platform Slido, which struggled to deliver metrics to business teams in an efficient and user-friendly way. Data professionals would bring business users into the fold at the end, only to discover that their work didn’t cut it.
“Since metrics require a lot of input from subject matter experts, data consumers, and business stakeholders to define and align on definitions, we needed a collaborative layer where we could get immediate feedback,” said Slido’s Head of Analytics Engineering Michal Koláček.
Without that collaborative layer, the normal workflow looks something like this: You convert your notebook into a PDF and share it. Questions, concerns, and suggestions start flooding in across email and Slack. You go back to the drawing board to query, code, visualize, and contextualize. You rerun the notebook, re-export it, resend it, and hope it hits the mark. And maybe it does! Or maybe there’s a speed bump: a time series is off, a metric has been defined incorrectly, a different type of chart is required.
This creates a frustrating feedback loop. Iteration isn’t fast and smooth — it’s slow and complicated. And after everyone finally agrees on the final product, the data is stale. Time to start the whole process over again in order to refresh it.
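To make that loop concrete, here is a minimal sketch of the rerun-and-re-export step, assuming Jupyter and nbconvert are installed locally (plus a LaTeX toolchain for the `pdf` target). The function names, the `exports` folder, and `analysis.ipynb` are hypothetical placeholders, not anything from a specific team's setup.

```python
import shutil
import subprocess
from pathlib import Path


def build_export_command(notebook, output_dir="exports", hide_code=True):
    """Build the nbconvert command that reruns a notebook and exports it to PDF."""
    cmd = [
        "jupyter", "nbconvert",
        "--to", "pdf",                    # needs LaTeX; "webpdf" is an alternative target
        "--execute",                      # rerun every cell so the export uses fresh data
        "--output-dir", str(output_dir),  # where the PDF gets written
    ]
    if hide_code:
        cmd.append("--no-input")          # leave code cells out for non-technical readers
    cmd.append(str(notebook))
    return cmd


def export_notebook(notebook, output_dir="exports", hide_code=True):
    """Run the export and return the expected path of the resulting PDF."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter is not on PATH; install Jupyter and nbconvert first")
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_export_command(notebook, output_dir, hide_code), check=True)
    return Path(output_dir) / (Path(notebook).stem + ".pdf")


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Jupyter installed.
    print(" ".join(build_export_command("analysis.ipynb")))
```

Even scripted, this only automates the export. The feedback still arrives over email and Slack, and every change means another rerun, another file, and another send.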
Choose a solution over a stopgap
The breakdown in data collaboration is why many companies turn to business intelligence platforms. But BI tools come with their own problems: they’re built for productionized projects, not rapid, iterative exploration. They’re no help for your typical ad hoc request.
Then there are the technologies that attempt to bridge the gap between notebook files and PDFs. GitHub lets you host notebooks for others to view, but again, they’re static (and not too many business stakeholders have a GitHub account). You could use nbviewer to render notebooks as static web pages or Binder to launch a temporary live copy, but true collaboration (e.g., pair programming or commenting) isn’t possible. For most stakeholders, you’re just sharing a PDF by a different name.
The goal is to remove as many barriers as possible between data exploration, data collaboration, and data sharing. And that’s where modern, cloud-based data notebooks come in.
Cloud-based notebooks aren’t stuck on a single machine — they’re shared across your team. You don’t have to switch back and forth between different formats to view, comment, code, or share. The medium you use to do your work is the same one you use to collaborate and deliver the final product.
Workspaces give teams a single place to organize notebooks. Permissions give teams a way to maintain role-based access to them. And the notebooks themselves can be shared as links or published as articles, dashboards, and applications. All the normal back-and-forth tweaking can be done in real time in a shared environment (or asynchronously via comments).
A fully accessible and interactive medium is miles better than a static document (not to mention much faster). Sharing work in an accessible format is good, but being able to do it when it matters is great.
The ability to bring everyone into a single space during the exploratory phase of data analysis is what drives companies — including no-code website development platform Webflow — to use modern notebooks.
“Most other tools are made for the final version of a product — they skip past intermediate steps where feedback and buy-in are critical,” said Webflow’s Senior Manager of Data Science & Analytics Allie Russell. “To be able to bring people along with the data work, especially remotely, is hugely valuable.”
And it’s not just business stakeholders who benefit. Shared environments open the door to much faster and easier collaboration between technical team members as well (no more downloading .ipynb files, firing them up on your machine, and immediately running into environment configuration roadblocks).
“Currently, I have a team member on leave,” Russell said. “Deepnote allows me to look into her work without having to understand her environment or running into a bunch of errors, and that’s pretty powerful.”
When your goal is uncovering insights that will power the entire company, collaboration is not just a nice-to-have — it’s a must-have. Tools and processes that slow you down, create unnecessary complexity, or force you into suboptimal workflows shouldn’t be part of it.
Remove the monkey wrench and rely on tools that are built from the ground up for collaboration: modern, cloud-based data notebooks.
Get off the IPYNB to PDF merry-go-round with Deepnote
Get started for free to see how you can hop off the PDF crazy train and achieve true data collaboration.
Data Advocate @ Deepnote