dbt Semantic Layer for data notebooks
Today, we’re announcing our native integration with the dbt Semantic Layer in Deepnote (now available in public beta) and what it means for data producers and data consumers.
In this post, we’ll look under the hood of the dbt Semantic Layer, explain how it’s useful in the context of data exploration and notebooks, and share use cases showing how our community and customers are getting value from it.
What is the Semantic Layer?
TL;DR: with the Semantic Layer, you can be confident that your metrics come from a single source of truth, giving you a unified view of data across the organization and within the tools used by every team.
As data practitioners, our goal is to interpret data in a way that represents reality and helps our organizations make better decisions. Metrics are an essential piece of the puzzle: they help us separate the signal from the noise and rally everyone around what’s important.
Metrics, however, are not static. They evolve as the organization matures. As teams learn through data and grow in size, our definitions of core business metrics and how we consume them quickly change. And as the underlying source data becomes more complex, the chain of logic we apply to convert this data to insights also changes.
An example from within
We can draw upon our own experience at Deepnote to illustrate an example. As we went from pre-seed to Series A (from 15 to 55 people), and from a relatively flat organization to cross-functional teams, what used to live in a handful of Amplitude dashboards and Deepnote notebooks is now developed and consumed in a variety of ways.
Our growing go-to-market team introduced three new tools (Chartmogul, Endgame, and HubSpot) to capture and predict revenue. We’ve grown from one to four product teams that report on metrics using Amplitude. And engineers run queries directly on top of the production databases in Deepnote.
How we define customer lifetime value (LTV) is a good example of how our core business metrics have changed. This once-simple metric now 1) reflects a more complex pricing model, 2) predicts expansion as we’ve accumulated more historical data, and 3) bakes in customer acquisition cost from the channels we’re investing in.
As our metrics evolve, their definitions need to stay consistent across the tools our teams use for reporting and decision-making. If our ARR from Chartmogul doesn’t match the ARR in Stripe, what good is it? Which tool should we trust to make decisions? How do we reconcile the two?
This is where the dbt Semantic Layer offers a solution — giving us a consistent, reliable way of defining and consuming metrics. Where before metrics were defined and redefined directly in data science, BI, or data loading tools, they now live centrally in dbt. Metrics defined in dbt are the source of truth and are vetted by our data team. Any changes to the way metrics are defined propagate to downstream tools, and the logic remains consistent across the stack. You no longer need to manually search and update every downstream dependency anytime the metric changes.
With metrics defined in dbt, teams get consistent results across different warehouse-connected data applications.
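For illustration, here is what a centrally defined metric can look like in a dbt schema file, using the `metrics:` spec from the dbt v1.3-era metrics package. The model name, expression, and dimensions below are hypothetical examples, not definitions from our project:

```yaml
# Hypothetical metric definition in a dbt schema file (e.g. models/marts/metrics.yml).
# Model, expression, and dimension names are illustrative only.
metrics:
  - name: arr
    label: Annual Recurring Revenue
    model: ref('fct_subscriptions')
    description: "Sum of annualized subscription revenue."

    calculation_method: sum
    expression: arr_amount

    timestamp: subscription_date
    time_grains: [day, week, month]

    dimensions:
      - region
      - plan_tier
```

Because this definition is versioned alongside the rest of the dbt project, any change to it is reviewed once and then flows to every downstream tool that queries the metric.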
What does this look like in practice?
When we set out to shape our integration with the dbt Semantic Layer together with our community, we spent a lot of time on discovery to answer: What are the potential core use cases? How does this look in practice? What value might the dbt Semantic Layer combined with Deepnote provide?
And over the past couple of months, we’ve built and iterated on the integration as our community and users leverage it for several use cases. Here are three common use cases that stand out:
- Ad hoc reporting
- Mixing historical reporting & predictive analytics
- Dashboard prototyping
Use case 1: Ad hoc reporting
Let’s say you have a set of defined metrics, living happily in vetted dashboards and data apps. But there are always those times when you have to “pull some numbers real quick” for the board or executives. Doing so is a headache if your metrics are pre-aggregated in models: a report that should take minutes ends up taking hours and hundreds of lines of code just to reproduce the metrics from your models.
Our friends at Slido had a similar situation on their hands. They have over 200 metrics aggregated, grouped by week, and split by region, stored in an OLAP cube. As the definitions of those metrics lived in pre-aggregated models, getting a different cut in the BI tool was impossible for the consumer, shifting the weight onto the data team’s shoulders. As the company grew, the number of metrics requested by different teams grew, resulting in a lengthy backlog of ad hoc queries for the analytics engineers to battle.
Switching to defining metrics in dbt allowed the Slido team to anchor definitions in one place. Every metric is versioned, and changes propagate to all different downstream tools, including Deepnote.
On the Deepnote end, the team can run SQL queries directly on top of the source tables in the Snowflake warehouse using Deepnote’s SQL blocks, and create visualizations rapidly with native charting blocks.
SQL blocks in Deepnote now support “dbt-SQL”, allowing you to query your metrics and models defined in dbt.
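As a sketch, a dbt-SQL query against a metric can use the `metrics.calculate` macro from the dbt v1.x metrics package; the metric name, grain, and dimension below are hypothetical:

```sql
-- Illustrative dbt-SQL in a Deepnote SQL block.
-- The metric name, grain, and dimension are examples, not real definitions.
select *
from {{ metrics.calculate(
    metric('arr'),
    grain='week',
    dimensions=['region']
) }}
```

The consumer asks for a different grain or dimension by changing one argument, rather than re-deriving the metric’s logic from the underlying tables.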
Adopting the Semantic Layer with Deepnote notebooks helped the Slido data team drive efficiency, reduce time-to-insight, and free up the time of analytics engineers for more strategic work.
As Benn put it in his talk at Coalesce last year, this paradigm shift in the metrics layer is all about pairing up the data team’s architectural roadmap with an experiential one.
Use case 2: Mixing historical & predictive reporting
BI tools and dashboards offer an ideal medium to get a quick grasp of the current state. But they top out quickly if you want to dig in or predict what’s ahead.
Because Deepnote allows you to combine SQL and Python in one interface, wrangling data and applying logic together, in a linear flow, is really easy.
At Deepnote, we internally use this capability to report and model the usage of different hardware by our users. Being able to capture the machine usage as-is, understand the usage patterns, and project those to the future helps us forecast our costs and revenue with more confidence.
Our Product Manager, Robert, built a notebook using the dbt Semantic Layer that maps out the logic and shows you how to do just that.
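As a simplified sketch of that pattern: pull historical usage into a dataframe with a SQL block, fit a trend in Python, and project it forward. The column names and numbers below are made up for illustration; in a real notebook, `df` would be the result of a SQL block querying your metrics.

```python
# Sketch: forecasting machine usage from weekly historical totals.
# In Deepnote, `df` would come from a SQL block; here we fabricate
# illustrative data with a linear trend plus small fluctuations.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "week_index": range(12),
    "machine_hours": [100 + 8 * i + (-1) ** i * 3 for i in range(12)],
})

# Fit a simple linear trend to the historical usage.
slope, intercept = np.polyfit(df["week_index"], df["machine_hours"], deg=1)

# Project the next four weeks from the fitted trend.
future_weeks = np.arange(12, 16)
forecast = slope * future_weeks + intercept
```

A real forecast would likely use a proper time-series model, but even this two-line trend fit shows how SQL and Python compose in one linear notebook flow.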
Use case 3: Dashboard prototyping
Let’s say your team reports dozens of metrics, all of which various stakeholders feel are equally important. And you’re tasked to build a dashboard to help these stakeholders self-serve and digest insights faster. How do you go about designing a starting point that’s intuitive, useful, and that mitigates user error?
This is a use case where notebooks shine. The ability to rapidly query and visualize data, coupled with direct access to your metrics hub, gives you a canvas to quickly build a proof of concept for teams to interact with. Your team can think about the minimum viable version of the desired data product, push it out to test adoption and stakeholder response, and only then move it to production in something like Looker or Power BI.
Launching this integration in public beta today is the first step towards designing a delightful experience around utilizing metrics in your exploratory and prototyping workflows. Right now, our dbt integration is available in public beta to all Deepnote users on the Team plan, on Snowflake instances.
As announced at Coalesce, future iterations of the dbt Semantic Layer will include expansion to other warehouses (BigQuery and Redshift are in the works), as well as support for full semantic relationships, like joins between models, virtualized dimensions, and more. If you have any questions about metrics in general, feel free to post in the #dbt-metrics-and-server channel on dbt Slack.