Bridging the exploratory analytics gap with Deepnote for Snowflake
Snowflake offers a cloud data warehousing platform with out-of-the-box features including scalable compute, data sharing, and data cloning. Products within Snowflake’s ecosystem, such as Snowsight, allow analysts to explore and visualize their data for ad-hoc use cases.
But what if you’d like to turn your ad-hoc queries into compelling narratives and easily shareable assets? With Deepnote — now a Snowflake Select Technology Partner — your team can do just that.
In this article, we'll walk you through the key pillars that make the Deepnote and Snowflake experience delightful and show how you can bring both into your toolkit to make your querying and exploration more straightforward, more secure, and more collaborative.
👉 For a quick start, we built out a template project to show you the ropes. Use our playground warehouse or connect to your own Snowflake instance using the
1. Turning your notebook into a SQL editor
Whether you're a seasoned analyst or new to the whole notebooks thing, you’ve likely encountered
SQL Alchemy API, which enables connection to data sources from within your notebook environment. This is a very common way of extracting data and preparing it for manipulation within a DataFrame. However, while generally accepted, it also comes with a few pain points:
Multi-line SQL queries tend to require
\ characters to be appended at the end of each line to help the notebook environment recognize it's all one string to be queried.
Sometimes it's easier to write an overarching query that extracts all records from one table and then break this down in a filtered manner as you proceed with your analysis. This is commonly done to reduce the amount of SQL that needs to be written due to the less-than-ideal experience of writing SQL in a non-SQL-friendly IDE.
Extracting large volumes of data from a Snowflake warehouse and combining it with a standard notebook environment for data manipulation can quickly result in memory constraint errors due to the volume of information required or transformation requests that can strain your notebook environment. This will naturally degrade your notebook experience and likely lead to searching for better options.
Here’s what we did to alleviate these concerns. In Deepnote, there's no need to directly use a connector to extract data from your Snowflake warehouse. Through the power of SQL blocks, in Deepnote you can run queries directly on top of your Snowflake warehouse using your notebook. All you have to do is add your Snowflake credentials and voilà — your notebook is now an interface for exploring your Snowflake data (and in a collaborative way should you choose to do so!).
In addition to a first-class SQL experience, Deepnote helps you eliminate memory constraints. You can simply switch over to a more powerful virtual machine or spin down dynamically based on your compute needs.
2. Supercharging your exploration speed
Combining access to a Snowflake warehouse with the exploratory powers of a notebook means you can write queries, save the results into pandas DataFrames, and visualize them all in one go. You can even inject Python variables into your SQL queries with jinjasql to build out more complex queries and move seamlessly between SQL and Python.
With Deepnote's built-in DataFrame viewer, you can examine the data set for missing values, the most common categorical values, distributions of numeric columns, and more. The built-in filters and sorting make it easy to gain a deeper understanding of the data and relationships that might impact a future model's predictions. Lastly, no-code charts allow you to examine your target variable as a function of other features without having to write any additional code. We can seamlessly switch from code to visualizations and then go right back into querying as needed.
3. Keeping your Snowflake data secure
As data teams move to hosted notebooks, it can present security risks with respect to metadata and authentication credentials. Ongoing Git commits and pushes to shared repositories increase the chance of leaked credentials and potential security breaches.
At Deepnote, we set out to build a notebook experience that lets you collaborate effectively and build better data products faster — all in a secure way. All files in Deepnote are encrypted at rest, whether they're files you create or upload. For sensitive information (such as database integrations or environment variables), we apply a layer of AES-256-CBC encryption before storing them in our database, where the decryption keys are stored separately.
Better yet, you can configure the connection at the workspace level and decide whether to make it available to all members and shared projects or just a specific project. As soon as the Snowflake connection is set up, your data is secure and you don’t have to worry about reconfiguring things. Say goodbye to your multiple JSON files for sharing credentials and simply share access to your project.
4. Unlocking cross-team collaboration
At Deepnote, we believe collaboration is key to the notebook experience. At the end of the day, data work is a team sport. We put a lot of thought into improving in-team collaboration for data teams, but equally, we want to make sure their work is accessible to others: subject matter experts, leaders, and data consumers. The emphasis on collaboration corresponds to different steps in your exploratory journey:
Configure your Snowflake integration once and never worry about it again. The ability to set up an environment for your workspace with granular role-based permissions means that everyone has access to the right things.
The whole exploratory analytics process can be made collaborative. You can share your project in one click, with all the logic, data, and dependencies in one place. You can bring others to share feedback right in the notebook via comments or troubleshoot things together in real time.
While you can easily share your notebook with one click at any point, you can go a step further and make your production-ready work accessible and interactive for anyone. You can turn your notebook into a simple yet powerful data app by parameterizing your notebook to introduce interactive user inputs and scheduling the notebook so that you’re always pulling in fresh data from Snowflake. Once that’s done, all you need to do is just hit publish.
Share this post