Chart blocks let you create charts from pandas DataFrames without writing code. They are well suited to fast exploratory analysis, and to anyone who is not familiar with making charts in Python.
To get started, you can watch our quick demo on chart block configuration.
At Type, you can choose from four different ‘marks’ for your chart (bar, line, area and point charts). These types define the basic geometrical representation of your data. They act as starting points from which you can build the most common exploratory chart types. For instance, from a bar chart, you can create histograms, stacked bar charts, grouped bar charts and even heatmaps by using the various settings.
At the X and Y axis, you can select the specific columns you wish to visualize from your dataset.
As an alternative to manual setup, you can also kickstart your chart building flow with chart recommendations.
When you create a chart block, you can click on the Explore button to see a selection of chart recommendations based on your DataFrame.
The suggestions gallery includes the most commonly used chart types for data summarization. Once you click on a chart, you can pick it as your current one (Replace chart) or create a new chart block from it (Add new chart).
Chart recommendations can greatly accelerate your exploratory workflow. Instead of starting from scratch, you can swiftly browse through a quick overview of data summaries, then pick your starting point and build your chart from there.
On bar, line and area charts, SUM() aggregation is applied automatically when you select a quantitative column.
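For intuition, this default aggregation corresponds roughly to a pandas group-by followed by a sum. The sketch below uses a hypothetical `sales` DataFrame with made-up `region` and `revenue` columns. It illustrates the concept only and is not the code the chart block runs internally.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South"],
    "revenue": [100, 250, 50, 75, 25],
})

# One row per category, with the quantitative column summed within each category.
summed = sales.groupby("region", as_index=False)["revenue"].sum()
print(summed)
```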
You can override the default bin size by typing a value in the text box.
The columns you select for a chart are automatically assigned one of three data types: categorical, quantitative or time. The current data type is shown as an icon on the left side of the column name.
Time units make it easy to adjust the granularity of time series analysis on the fly. When you add a temporal column to an axis, you can change the displayed time unit. Pick from multiple time intervals/formats in the dropdown menu and the temporal axis will be grouped by the selected unit.
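If you want to reproduce this kind of grouping in code, a minimal pandas sketch could look like the following. The `orders` DataFrame and its `order_date` and `amount` columns are hypothetical, and the sum is an assumed aggregation chosen for the example.

```python
import pandas as pd

# Hypothetical data with a temporal column; names are assumptions.
orders = pd.DataFrame({
    "order_date": pd.to_datetime([
        "2023-01-05", "2023-01-20", "2023-02-03", "2023-02-17", "2023-03-01",
    ]),
    "amount": [10, 20, 30, 40, 50],
})

# Group the temporal column by calendar month, similar to picking a
# month-level time unit, and sum the quantitative column per month.
monthly = orders.groupby(orders["order_date"].dt.to_period("M"))["amount"].sum()
print(monthly)
```

Swapping `"M"` for another pandas period alias such as `"Q"` or `"Y"` changes the granularity in the same way.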
Ascending and descending will sort the categories in alphabetical order. Ascending and descending by [column] will sort the categories according to the values of your applied quantitative column.
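The difference between the two kinds of sorting can be sketched in pandas as follows. The `fruit` data and column names are hypothetical, and the example assumes the quantitative column is summed per category.

```python
import pandas as pd

# Hypothetical data; column names are assumptions for illustration.
fruit = pd.DataFrame({
    "name": ["banana", "apple", "cherry", "banana"],
    "quantity": [3, 10, 7, 9],
})

totals = fruit.groupby("name")["quantity"].sum()

# Ascending: categories in alphabetical order.
alphabetical = totals.sort_index()

# Descending by [column]: categories ordered by their summed values.
by_value_desc = totals.sort_values(ascending=False)

print(list(alphabetical.index))
print(list(by_value_desc.index))
```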
Stacked creates a stacked bar chart. Stacked %-normalized will create a stacked bar chart with the values normalized to a percentage scale.
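To see what the percentage normalization does to the numbers, here is a small pandas sketch. The `counts` DataFrame and its columns are hypothetical. Each bar (here, each quarter) is rescaled so that its segments add up to 100.

```python
import pandas as pd

# Hypothetical data: units sold per quarter and product.
counts = pd.DataFrame({
    "quarter": ["Q1", "Q1", "Q2", "Q2"],
    "product": ["A", "B", "A", "B"],
    "units": [30, 70, 20, 60],
})

# One row per bar (quarter), one column per stacked segment (product).
wide = counts.pivot_table(
    index="quarter", columns="product", values="units", aggfunc="sum"
)

# Divide each row by its total so every bar sums to 100 percent.
shares = wide.div(wide.sum(axis=1), axis=0) * 100
print(shares)
```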
Besides group by, the Color selector can also be used to apply a color scale to your chart based on the values in a quantitative column. You can pick a different color scale by using the color picker.
Size provides an alternative option for visual encoding by mapping the values in a quantitative column to:
- width of bars (bar chart)
- width of lines (line chart)
- size of points (point chart)
In certain cases, you may wish to visualize multiple quantitative columns from your dataset within a single chart (e.g. on a secondary axis). You can do that by adding multiple layers to your chart.
You can either duplicate an existing layer or create a new one from scratch.
You can create a color legend for multi-layered charts by selecting Measure Name at Color. This option will automatically display the name of the selected quantitative column in the legend and will assign a unique color to it. You can also customize the name of the legend item by typing in the text field at the Measure Name option.
You can access various formatting options for the given chart layer by clicking on the Format tab. The settings listed here are based on the selected columns (X and Y axis) on the Data tab.
You can modify the axis titles with a custom label or optionally hide them. For quantitative axes, you can also choose a different scale (e.g. logarithmic) and apply a custom range to the visualized values.
You can also toggle Value labels to display the numerical values of individual data points.
You can add a custom title to the chart. You can select the placement of the chart legend or hide it optionally. You can also turn off the tooltips (displayed when you hover over a data point) and the grid lines.
You can slice and dice your data right from the charts with interactive filtering. Select data points to filter in/out in one of two ways:
- highlight a range of data points with mouse-select;
- click on individual series in the Color legend.
Once you have selected some data, press the Filter button to include or exclude the selected data points on your chart. You can also combine multiple filtering steps to drill down even further.
If you wish to quickly investigate the underlying data behind a selected part of your chart, you can create a new DataFrame from the filtered data. Simply click the DF button and a new code block will be added below your chart. This block contains the pandas code for producing a new filtered DataFrame.
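The exact code that gets generated depends on your selection. As a rough idea of what filtering code of this kind can look like, the sketch below combines a range selection with a legend selection. The DataFrame, its column names and the variable name `filtered_df` are all hypothetical and not taken from the product.

```python
import pandas as pd

# Hypothetical source data; names are assumptions for illustration.
df = pd.DataFrame({
    "category": ["A", "B", "C", "A"],
    "value": [5, 15, 25, 35],
})

# Keep rows whose value falls inside the selected range (inclusive)
# and whose category is among the selected legend entries.
in_range = df["value"].between(10, 30)
in_legend = df["category"].isin(["B", "C"])
filtered_df = df[in_range & in_legend]
print(filtered_df)
```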
If you need more flexibility to customize your charts, you can duplicate your chart block into code by selecting the option in the block actions menu.
This adds a new Python code block to your notebook, containing the configuration of your chart in the Vega-Lite specification format. Vega-Lite is powerful and fairly easy to learn, so it is a good option if you need a finely customized or more advanced visualization.
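If you have not seen the format before, the sketch below shows the general shape of a minimal Vega-Lite specification written as a Python dictionary. The data values and field names are hypothetical, and the code your chart block generates will differ and may wrap the specification in additional rendering code that is not shown here.

```python
import json

# A minimal Vega-Lite bar chart specification (illustrative only).
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"region": "North", "revenue": 150},
            {"region": "South", "revenue": 275},
        ]
    },
    "mark": "bar",
    "encoding": {
        # Categorical column on the X axis.
        "x": {"field": "region", "type": "nominal"},
        # Quantitative column on the Y axis, aggregated with a sum.
        "y": {"field": "revenue", "type": "quantitative", "aggregate": "sum"},
    },
}

# Serialize to JSON, the form in which Vega-Lite specifications are usually shared.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Editing the `mark` and `encoding` entries is where most customization happens.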
Chart blocks can display a maximum of 10 thousand rows of data. If you plot datasets larger than this, only the first 10k rows will be visualized. You can aggregate or filter the data prior to visualization to get under this limit.
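One way to stay under the limit is to aggregate in a code block before passing the result to the chart. The sketch below assumes a hypothetical `big` DataFrame with `group` and `value` columns, and uses the mean as an example aggregation.

```python
import pandas as pd

MAX_ROWS = 10_000  # row limit described above

# Hypothetical dataset that exceeds the limit.
big = pd.DataFrame({
    "group": [i % 50 for i in range(25_000)],
    "value": range(25_000),
})

if len(big) > MAX_ROWS:
    # Aggregate to one row per group so the chart sees the whole dataset
    # in summarized form rather than only its first rows.
    plot_df = big.groupby("group", as_index=False)["value"].mean()
else:
    plot_df = big

print(len(plot_df))
```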