By Jakub on January 23, 2023
Exploratory programming: what it is, why it matters & what it requires
Technical teams are very familiar with software engineering methodologies. After all, many have adopted them as their own.
But data teams are different. The tools and tactics that help us build software aren’t designed for exploring data and sharing insights. Exploratory programming is.
Let’s look at why exploratory programming is tailor-made for data teams — and how we can unlock its value.
The story of exploratory programming
In 1956, during the first wave of artificial intelligence research, people began to realize that the tools they used for writing software weren’t a great fit for work that was, by its very nature, highly exploratory.
Two paradigms have always existed in computer science: one for building and one for exploring. For a long time, there was no need to put a name to them. Then came Beau Sheil.
Sheil was a manager working on Xerox’s AI Systems, and he was running into a problem. He was using tools and methodologies that relied on a linear roadmap, one where each step led toward an expected outcome. But Sheil didn’t know what the outcome was. He didn’t even know what the steps were. Like many data teams today, Sheil wasn’t building. He was exploring.
In 1983, he wrote a paper called “Power Tools for Programmers” and described his work in a new way: exploratory programming.
It may not seem earth-shattering, but you can draw a line straight from Sheil to the rise of data science, the era of the spreadsheet, and today’s business-critical data analyst.
Yet Sheil’s definition of exploratory programming, “the conscious intertwining of system design and implementation,” was just the beginning. A minimum viable product.
In the 2017 paper “Exploring Exploratory Programming,” Mary Beth Kery and Brad A. Myers, both of Carnegie Mellon University’s Human–Computer Interaction Institute, built on it.
Kery and Myers proposed that exploratory programming has two essential features. Simply put, those are:
- Writing code to prototype or experiment
- Allowing the end goal to evolve throughout the process
Sounds familiar, right? When data teams are tasked with an assignment, they don’t always know what they’re going to find, or how to find it. To paraphrase Sheil, sometimes they just have to try and see what works.
There are five characteristics common in exploratory programming:
1. Needs for exploration
Some projects demand exploration. Maybe you’re analyzing data or building a machine learning model. Either way, you’re navigating uncharted territory. And as you dive deeper and iterate, your destination changes.
2. Code quality tradeoffs
The goal of data exploration isn’t optimizing for code quality — it’s optimizing for time to insight. You may go back and polish up your work after the fact, but your aim is to extract knowledge quickly.
3. Ease or difficulty of exploration
The languages, libraries, and tools on hand dictate how much time and effort is required. When you’re exploring data, you use the tools that will help you cut out unnecessary detours and work faster (or at least try to).
4. Exploration process
Data exploration doesn’t follow a straight line — it usually means backtracking to tweak the work in progress. You continually update variables and parameters, run different versions of code, and refer to code history to inform your next steps.
5. Group or individual exploration
Data projects are a team sport. You may start solo, but you’ll eventually need to coordinate your experiments and findings with others. And odds are you’ll face obstacles when doing so. Exploration is messy, and it’s hard to follow its progress as a team.
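To make the exploration process (the fourth characteristic) a little more concrete, here is a minimal Python sketch of tweaking a parameter, rerunning the analysis, and keeping a history to look back on. Everything in it is an assumption made for illustration: the `ExplorationLog` helper, the `share_above` function, and the sample numbers are hypothetical and do not come from the papers discussed here or from any particular tool.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ExplorationLog:
    """Hypothetical helper: keeps every experiment so earlier attempts can be revisited."""
    runs: list = field(default_factory=list)

    def record(self, params: dict, result: float) -> None:
        # Copy the params so later tweaks do not rewrite history.
        self.runs.append({"params": dict(params), "result": result})

    def closest_to(self, target: float) -> dict:
        # Look back through the history for the run whose result is nearest the target.
        return min(self.runs, key=lambda run: abs(run["result"] - target))


def share_above(values, threshold):
    """Toy analysis step: fraction of values strictly above the threshold."""
    return mean(1.0 if v > threshold else 0.0 for v in values)


# Made-up sample data standing in for a real data set.
order_values = [12, 48, 7, 95, 33, 61, 20, 88]

log = ExplorationLog()
for threshold in (10, 30, 50, 90):
    # Each pass is one "version" of the experiment with a different parameter.
    log.record({"threshold": threshold}, share_above(order_values, threshold))

for run in log.runs:
    print(run["params"], run["result"])

# Refer to the history to decide where to look next.
print(log.closest_to(0.4))
```

The point is not the helper itself but the habit it stands for: each attempt stays on record, so backtracking becomes a lookup instead of guesswork.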
If you work in data, you probably recognize all these characteristics. They may even describe your average workday. But getting there — actually doing the work — is easier said than done.
Exploratory programming requires the right tools
In their paper, Kery and Myers call out what’s standing in the way of successful exploratory programming:
“Although exploratory programming is prevalent across many applications today, there is currently a lack of tool support for experimentation, including a lack of support for recording and sensemaking of exploration history, and a lack of support for exploration by groups of people.”
Put another way: Data teams don’t have the right tools for the job.
Data professionals have been forced to adopt the software engineer’s toolset. And in the process, they’ve adopted the same frameworks and processes.
Agile software development, team sprints, Git versioning, continuous integration, deployment models — there’s no shortage of best practices for software engineers. They’re great for building software, but not for exploring data.
Software engineers work to ship a product. Data teams work to uncover insights. Sometimes that results in a dashboard or an app. But sometimes projects end as soon as knowledge is uncovered (take a look at the abandoned project graveyard on your average data professional’s computer to see what I mean).
Maybe that knowledge is shared in a slide deck. Or maybe an insight ends up as a screenshot that’s sent through Slack. Or maybe there’s no output at all — just an improved understanding of the business to be stored away and used for a different project.
At the end of the day, software engineers are builders. But data teams are made up of explorers.
The 2019 paper “Supporting Data Workers To Perform Exploratory Programming” makes the case that the tools many data teams use (i.e., IDEs) aren’t up to snuff. Using tools built for software development, data teams end up hoarding and cloning code, as well as struggling with constant context switching.
The result is sluggish progress and limited insights — not to mention the inability to collaborate with teammates.
Data notebooks are built for exploratory programming
Most of the time, exploratory programming is trial and error. It’s SQL query after SQL query, visualization after visualization. The tools and tactics of software engineering are too rigid for this kind of work. Data teams can’t operate in a straight line toward a solution because they often don’t know what the solution is.
That’s where a data notebook — a tool built from the ground up for exploratory programming — comes in.
As opposed to traditional code editors, data notebooks allow users to run queries, write code, visualize data sets, and document thought processes as prose, all in a single place.
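As a rough illustration of that query-after-query rhythm, the sketch below runs three successive queries the way a sequence of notebook cells might, each one prompted by the result before it. It uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table and its rows are made up for this example, and a real notebook would typically point at a warehouse and chart the results instead of printing them.

```python
import sqlite3

# In-memory database with made-up rows, standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("north", 120.0),
        ("north", 80.0),
        ("south", 200.0),
        ("south", 40.0),
        ("west", 60.0),
    ],
)

# First attempt: total revenue per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)

# Second attempt, prompted by the totals: do averages tell a different story?
averages = conn.execute(
    "SELECT region, AVG(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(averages)

# Third attempt: narrow the question to large orders only.
large_orders = conn.execute(
    "SELECT region, COUNT(*) FROM orders WHERE amount >= ? "
    "GROUP BY region ORDER BY region",
    (100,),
).fetchall()
print(large_orders)

conn.close()
```

None of the three queries was planned up front. Each one exists because of what the previous result showed, which is the pattern a notebook is meant to make cheap.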
If data professionals are explorers, notebooks are their compasses, spyglasses, and journals — all in one.
And they significantly lower the barrier to entry, allowing even beginners to start querying and charting in a medium that scales to their needs as they write code and introduce more complex models.
There was a time when data notebooks were siloed, solitary tools, but the future of notebooks is collaborative — user-friendly workspaces that run in the cloud and make real-time collaboration as easy as sharing a link.
Software engineering workflows aren’t designed for the day-to-day work of data teams. They’ve left data professionals trying to fit a square peg into a round hole.
It’s time to embrace exploratory programming. And to do that, data teams need tools that are designed for their unique workflows and will help them get where they need to go — quickly, easily, and collaboratively — no matter their destination.
See how Deepnote supports exploratory programming
Get started for free to see how Deepnote helps you explore, collaborate on, and share data with ease.