Bringing ClickHouse performance to the comfort of data science notebooks
ClickHouse is an open-source, column-oriented OLAP database that its developers describe as 100-1000x faster than traditional databases for analytical workloads, able to serve thousands of sub-second queries per second on petabyte-scale datasets. When we first took ClickHouse for a spin using Deepnote, we were blown away by the performance. After experiencing lightning-fast queries in a notebook environment, we knew we needed to build an integration to bring the power of ClickHouse to Deepnote.
That integration is now here! With Deepnote and ClickHouse, data teams can efficiently query very large datasets, extract relevant data, and start analyzing and modeling data—all within the comfort of their notebook environment.
In this article, we'll walk through what makes this experience so unique and show you how to leverage Deepnote's native ClickHouse integration to perform analytics at scale.
If you're eager to jump straight into the deep end, here is a playground Deepnote notebook with a ClickHouse integration set up for you!
Query with SQL and analyze with Python in one place
Deepnote provides first-class support for SQL. This means you get to query your ClickHouse database right from your notebook. Transitions between Python and SQL are seamless as there’s no need for a Python connector. With Deepnote, you get all the bells and whistles of a SQL editor right in your notebook, including formatting, autocomplete, and linting.
Using ClickHouse SQL blocks, you can write queries, save the results into Pandas DataFrames, and visualize them, all in one go. And if you're feeling bold, you can even inject Python variables into your SQL queries with jinjasql. This allows you to build out more complex queries with conditionals, for loops, and more. If you'd like to venture beyond SQL, you can also take the automatically stored Pandas DataFrame and run more advanced analytics on it in Python.
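To make the idea of conditional, loop-driven queries a little more concrete, here is a minimal plain-Python sketch of what such templating boils down to: the query text changes with your notebook variables, while the values themselves travel separately as bind parameters. This is not Deepnote's or jinjasql's API. The function build_query, the events table, its column names, and the %s placeholder style are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def build_query(
    columns: Sequence[str],
    table: str,
    min_date: Optional[str] = None,
    countries: Sequence[str] = (),
) -> Tuple[str, List[str]]:
    """Return (sql, params). Hypothetical helper, for illustration only.

    Column and table names are pasted into the SQL text, so they should
    come from trusted code. Filter values are kept out of the text and
    returned as bind parameters instead.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    clauses: List[str] = []
    params: List[str] = []

    # Conditional: only filter by date when a lower bound is given.
    if min_date is not None:
        clauses.append("event_date >= %s")
        params.append(min_date)

    # Loop: emit one placeholder per country in the IN list.
    if countries:
        placeholders = ", ".join("%s" for _ in countries)
        clauses.append(f"country IN ({placeholders})")
        params.extend(countries)

    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Example with made-up table and column names.
sql, params = build_query(
    ["user_id", "country", "event_date"],
    "events",
    min_date="2023-01-01",
    countries=["DE", "FR"],
)
print(sql)
print(params)
```

Keeping values as parameters rather than formatting them into the string is the same design choice a templating layer like jinjasql makes, and it is what keeps variable injection from turning into SQL injection.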
Explore ClickHouse data with no code and a dash of ✨
Thanks to some built-in magic, a great deal of the exploratory work in Deepnote happens without writing code. This allows for rapid exploration and prototyping on top of your ClickHouse data. With Deepnote's built-in DataFrame viewer, we can quickly examine the dataset for missing values, the most common categorical values, distributions of numeric columns, and more. The built-in filters and sorting make it easy to gain a deeper understanding of the data and relationships that might impact our future model's predictions.
Built-in no-code charts allow us to examine our target variable as a function of other features without having to write any additional code. We can seamlessly switch from code to visualizations, and go right back into querying as needed.
Present your insights
Next, if you want to turn your queries, code, and charts into impactful data products, ClickHouse and Deepnote have you covered. While you can easily share your notebook with one click, you can go a step further and make your work accessible and interactive for anyone, including people who don't write code.
The goal is to turn your notebook into a simple yet powerful data app. You can do so by parameterizing your notebook with interactive user inputs and scheduling it to run so that you're always pulling in fresh data from ClickHouse. Once that's done, all you need to do is publish your work.
Keep your ClickHouse data secure
With Deepnote's native ClickHouse integration, your connection is always secure, and you can easily query data without exposing confidential data. You can configure the connection at the Workspace level and decide whether to make it available to all members and shared projects or just a specific project. Once the connection is set up, your ClickHouse data is secure, and you don't have to worry about reconfiguring anything.
Connect to your own ClickHouse instance using the Integrations menu within your Deepnote workspace, or just use the playground credentials in our template.
In our workflow, we used a Deepnote notebook as a control plane for our ClickHouse instance. If we wanted to create a fully closed loop, we could have pulled the data into Deepnote to train a machine learning model. From there, we'd deploy our model back into ClickHouse while using Deepnote's data apps to visualize and share the results.