
How to explore netCDF datasets using xarray

By Deepnote team

Updated on November 23, 2023

This tutorial introduces xarray for working with netCDF datasets in Python: opening a dataset, inspecting its structure and metadata, and subsetting variable data.

The xarray package simplifies working with multidimensional datasets in Python. It builds on the pandas approach to labeled data and uses netCDF4 behind the scenes to read data from files or from services like ERDDAP. If you're transitioning from netCDF4 to xarray, you'll find that xarray provides a higher-level, more Pythonic interface.

Overview

  • xarray basics: Understanding the xarray package.
  • Reading netCDF datasets: How to read netCDF datasets into xarray data structures.
  • Exploring netCDF data: Discovering dataset dimensions, variables, and attributes.
  • NumPy arrays and xarray: Manipulating netCDF variable data in NumPy array format.

Prerequisites

Before using xarray, ensure you have it installed. If not, you can use the following command to install it along with its dependencies:

$ conda install xarray netCDF4 bottleneck

On Python versions earlier than 3.5, installing cyordereddict may also improve performance. It is not needed on Python 3.5 and later.

Getting started with xarray

To use xarray, let's import it alongside NumPy:

import numpy as np
import xarray as xr

Reading datasets with xarray

Use xr.open_dataset() to load a netCDF dataset from a local file or a URL. For example:

ds = xr.open_dataset('https://salishsea.eos.ubc.ca/erddap/griddap/ubcSSnBathymetry2V1')

Or if you have the dataset locally:

lds = xr.open_dataset('../../NEMO-forcing/grid/bathy_meter_SalishSea2.nc')

Exploring the dataset structure

An xarray Dataset is analogous to a dict of DataArray objects with aligned dimensions. It's like an in-memory representation of a netCDF file. Printing the dataset shows a summary of its contents:

print(ds)

The printed summary lists the dataset's dimensions (dims), data variables (data_vars), and coordinates (coords), along with the metadata (attributes) of the dataset:

  • dims: The names and lengths of dataset dimensions.
  • data_vars: Variables held in the dataset, accessible as DataArrays.
  • coords: Labels for points in data variables, also DataArrays.

For example, to see the dimensions (ds.sizes gives the same mapping of dimension names to lengths):

ds.dims

To check the variables:

ds.data_vars

And to examine the coordinates:

ds.coords
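
The same inspection can be tried offline on a small in-memory dataset. The sketch below builds a hypothetical stand-in for the bathymetry dataset. The variable names, the gridY and gridX dimension names, and all of the values are assumptions made for illustration, not the contents of the real file.

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 4 grid; names and values are illustrative only.
ny, nx = 3, 4
lon2d, lat2d = np.meshgrid(
    np.linspace(-125.0, -123.5, nx),  # varies along gridX
    np.linspace(48.0, 49.0, ny),      # varies along gridY
)

demo = xr.Dataset(
    data_vars={
        "bathymetry": (("gridY", "gridX"), np.full((ny, nx), 100.0)),
        "latitude": (("gridY", "gridX"), lat2d),
        "longitude": (("gridY", "gridX"), lon2d),
    },
    coords={"gridY": np.arange(ny), "gridX": np.arange(nx)},
)

print(dict(demo.sizes))      # dimension names mapped to their lengths
print(list(demo.data_vars))  # names of the data variables
print(list(demo.coords))     # names of the coordinates
```

Each entry in data_vars and coords is itself a DataArray, so demo["bathymetry"] or demo.bathymetry can be inspected in the same way as the whole dataset.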

Attributes: Metadata about your data

Both the dataset and its variables carry attributes: metadata stored alongside the data that describes it. Let's look at the dataset's attributes:

ds.attrs

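Displaying the longitude DataArray shows its attributes along with its dimensions and values (use ds.longitude.attrs to get the attributes alone):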

ds.longitude
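
As a minimal sketch of how attributes behave, the example below attaches metadata to a small hypothetical dataset and reads it back. The attribute names and values are assumptions chosen for illustration. In both cases attrs is a dict-like mapping.

```python
import numpy as np
import xarray as xr

# Hypothetical longitude variable with variable-level attributes.
lon = xr.DataArray(
    np.array([[-125.0, -124.5], [-125.0, -124.5]]),
    dims=("gridY", "gridX"),
    attrs={"units": "degrees_east", "long_name": "Longitude"},
)

# Dataset-level attributes are passed separately from the variables.
demo = xr.Dataset({"longitude": lon}, attrs={"title": "Hypothetical demo grid"})

print(demo.attrs)                     # dataset-level metadata
print(demo.longitude.attrs)           # variable-level metadata only
print(demo.longitude.attrs["units"])  # individual attributes by key
```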

Data variables and NumPy arrays

The values of a data variable are available as a NumPy array through its .values attribute. This means you can use NumPy's indexing and slicing to work with them.

For example, to access the latitudes and longitudes at the corners of the domain:

# Shape of the latitude variable
ds.latitude.shape

# Latitudes and longitudes at domain corners
print('Latitudes and longitudes of domain corners:')
print('  0, 0:        ', ds.latitude.values[0, 0], ds.longitude.values[0, 0])
print('  0, x-max:    ', ds.latitude.values[0, -1], ds.longitude.values[0, -1])
print('  y-max, 0:    ', ds.latitude.values[-1, 0], ds.longitude.values[-1, 0])
print('  y-max, x-max:', ds.latitude.values[-1, -1], ds.longitude.values[-1, -1])
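
The corner lookup above relies on ordinary NumPy indexing, where -1 refers to the last element along an axis. A small sketch with a hypothetical 2 x 3 latitude array (the values and dimension names are made up for illustration) shows the same pattern without needing the remote dataset:

```python
import numpy as np
import xarray as xr

# Hypothetical latitude grid: 2 rows (gridY) by 3 columns (gridX).
lat = xr.DataArray(
    np.array([[48.0, 48.1, 48.2], [49.0, 49.1, 49.2]]),
    dims=("gridY", "gridX"),
)

values = lat.values  # plain numpy.ndarray with the same shape
print(type(values).__name__, values.shape)

# Index 0 is the first element along an axis, -1 is the last.
corners = {
    "0, 0": values[0, 0],
    "0, x-max": values[0, -1],
    "y-max, 0": values[-1, 0],
    "y-max, x-max": values[-1, -1],
}
for label, value in corners.items():
    print(label, value)
```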

Slicing for subsets of data

You can use slicing to pull out specific subsets of data. For example:

# First two values in both dimensions
ds.longitude.values[:2, :2]

# Last two values in both dimensions
ds.latitude.values[-2:, -2:]
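
Slicing .values returns a bare NumPy array, which drops the dimension labels. If the labels are worth keeping, xarray's isel() method selects by integer position along named dimensions and returns a DataArray. The sketch below compares the two on a hypothetical 3 x 4 array. The gridY and gridX dimension names are an assumption and should be replaced with whatever ds.dims reports for your dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 4 variable with values 0.0 through 11.0.
lon = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("gridY", "gridX"),
    name="longitude",
)

# NumPy-style slicing on the raw values: the result is an ndarray.
by_position = lon.values[:2, :2]

# Same selection by dimension name: the result is still a DataArray.
by_name = lon.isel(gridY=slice(0, 2), gridX=slice(0, 2))

# Last two values in both dimensions.
last_two = lon.isel(gridY=slice(-2, None), gridX=slice(-2, None))

print(by_position)
print(by_name.dims, by_name.shape)
print(last_two.values)
```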

Conclusion

Xarray provides a powerful and convenient way to read, explore, and manipulate netCDF datasets. It uses an intuitive, pandas-like approach to work with labeled multidimensional data. With xarray, you can handle complex datasets in a more Pythonic, efficient manner.

Whether you're working on climate modeling, oceanographic data, or any other field that uses multidimensional datasets, learning xarray can significantly streamline your data analysis workflow. Happy data wrangling!
