API reference#

DataTree#

Creating a DataTree#

Methods of creating a datatree.

DataTree([data, parent, children, name])

A tree-like hierarchical collection of xarray objects.

DataTree.from_dict(d[, name])

Create a datatree from a dictionary of data objects, organised by paths into the tree.
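
To make "organised by paths" concrete, the following is a minimal standard-library sketch of how a path-keyed dictionary maps onto a nested tree. The `build_tree` helper and the string payloads are hypothetical stand-ins: in real use the values would be xarray objects passed to `DataTree.from_dict`, and the library's own implementation may differ.

```python
from pathlib import PurePosixPath


def build_tree(d):
    """Nest a path-keyed dict: each node is {"data": ..., "children": {...}}."""
    root = {"data": None, "children": {}}
    for path, data in d.items():
        node = root
        # Walk (and create if needed) one node per path component below the root.
        for part in PurePosixPath(path).parts:
            if part == "/":
                continue
            node = node["children"].setdefault(part, {"data": None, "children": {}})
        node["data"] = data
    return root


# Hypothetical payloads; "/c" is never given data, so it becomes an empty node.
tree = build_tree({"/": "root data", "/a": "data a", "/a/b": "data b", "/c/d": "data d"})
```

In this sketch, intermediate nodes that appear only as part of a longer path (here `/c`) are created without data.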

Tree Attributes#

Attributes relating to the recursive tree-like structure of a DataTree.

DataTree.parent

Parent of this node.

DataTree.children

Child nodes of this node, stored under a mapping via their names.

DataTree.name

The name of this node.

DataTree.path

Return the file-like path from the root to this node.

DataTree.root

Root node of the tree.

DataTree.is_root

Whether this node is the tree root.

DataTree.is_leaf

Whether this node is a leaf node.

DataTree.leaves

All leaf nodes.

DataTree.level

Level of this node.

DataTree.depth

Maximum level of this tree.

DataTree.width

Number of nodes at this level in the tree.

DataTree.subtree

An iterator over all nodes in this tree, including both self and all descendants.

DataTree.descendants

Child nodes and all their child nodes.

DataTree.siblings

Nodes with the same parent as this node.

DataTree.lineage

All parent nodes and their parent nodes, starting with the closest.

DataTree.parents

All parent nodes and their parent nodes, starting with the closest.

DataTree.ancestors

All parent nodes and their parent nodes, starting with the most distant.

DataTree.groups

Return all netCDF4 groups in the tree, given as a tuple of path-like strings.
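
The structural attributes above can be illustrated with a small stand-alone sketch. The `Node` class below is hypothetical and is not the library's implementation. It assumes that `level` counts the parents between a node and the root, that `depth` is the largest level found anywhere in the tree, and that `width` counts every node in the tree that shares this node's level.

```python
class Node:
    """Hypothetical minimal tree node, used only to illustrate the definitions."""

    def __init__(self, name=None, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    @property
    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    @property
    def subtree(self):
        # Yield this node first, then all of its descendants.
        yield self
        for child in self.children.values():
            yield from child.subtree

    @property
    def level(self):
        # Number of parents between this node and the root.
        return 0 if self.parent is None else self.parent.level + 1

    @property
    def depth(self):
        # Largest level of any node in the whole tree.
        return max(node.level for node in self.root.subtree)

    @property
    def width(self):
        # Number of nodes in the whole tree that share this node's level.
        return sum(node.level == self.level for node in self.root.subtree)

    @property
    def path(self):
        if self.parent is None:
            return "/"
        return self.parent.path.rstrip("/") + "/" + self.name


root = Node()
a = Node("a", parent=root)
b = Node("b", parent=root)
c = Node("c", parent=a)
paths = [node.path for node in root.subtree]
```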

Data Contents#

Interface to the data objects (optionally) stored inside a single DataTree node. This interface echoes that of xarray.Dataset.

DataTree.dims

Mapping from dimension names to lengths.

DataTree.sizes

Mapping from dimension names to lengths.

DataTree.data_vars

Dictionary of DataArray objects corresponding to data variables.

DataTree.coords

Dictionary of xarray.DataArray objects corresponding to coordinate variables.

DataTree.attrs

Dictionary of global attributes on this node object.

DataTree.encoding

Dictionary of global encoding attributes on this node object.

DataTree.indexes

Mapping of pandas.Index objects used for label based indexing.

DataTree.nbytes

DataTree.ds

An immutable Dataset-like view onto the data in this node.

DataTree.to_dataset()

Return the data in this node as a new xarray.Dataset object.

DataTree.has_data

Whether or not there are any data variables in this node.

DataTree.has_attrs

Whether or not there are any metadata attributes in this node.

DataTree.is_empty

False if node contains any data or attrs.

DataTree.is_hollow

True if only leaf nodes contain data.

Dictionary Interface#

DataTree objects also have a dict-like interface mapping keys to either xarray.DataArray objects or child DataTree nodes.

DataTree.__getitem__(key)

Access child nodes, variables, or coordinates stored anywhere in this tree.

DataTree.__setitem__(key, value)

Add either a child node or an array to the tree, at any position.

DataTree.__delitem__(key)

Remove a child node from this tree object.

DataTree.update(other)

Update this node's children and / or variables.

DataTree.get(key[, default])

Access child nodes, variables, or coordinates stored in this node.

DataTree.items()

DataTree.keys()

DataTree.values()

Tree Manipulation#

For manipulating, traversing, navigating, or mapping over the tree structure.

DataTree.orphan()

Detach this node from its parent.

DataTree.same_tree(other)

True if other node is in the same tree as this node.

DataTree.relative_to(other)

Compute the relative path from this node to node other.

DataTree.iter_lineage()

Iterate up the tree, starting from the current node.

DataTree.find_common_ancestor(other)

Find the first common ancestor of two nodes in the same tree.
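
As a rough illustration of what a first common ancestor is, the same idea can be shown on plain path strings with the standard library. This is only an analogy on strings, assuming nodes are identified by their absolute paths; it does not call the DataTree method itself.

```python
import posixpath

# Two nodes in the same (hypothetical) tree, identified by absolute path.
common = posixpath.commonpath(["/a/b/c", "/a/d"])

# Nodes in different top-level branches only share the root.
only_root = posixpath.commonpath(["/a", "/b"])
```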

DataTree.map_over_subtree(func, *args, **kwargs)

Apply a function to every dataset in this subtree, returning a new tree which stores the results.

map_over_subtree(func)

Decorator which turns a function which acts on (and returns) Datasets into one which acts on and returns DataTrees.
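
The decorator pattern described here can be sketched without xarray. The `map_over_paths` decorator below is hypothetical: it lifts a function that acts on one node's data into one that acts on a whole path-keyed dictionary. Passing empty nodes (`None`) through untouched is an assumption of this sketch, not a statement about the library's behaviour.

```python
import functools


def map_over_paths(func):
    """Hypothetical decorator: lift a per-node function to a path-keyed dict."""

    @functools.wraps(func)
    def wrapper(tree, *args, **kwargs):
        # Build a new mapping; empty nodes (None) are passed through untouched.
        return {
            path: None if data is None else func(data, *args, **kwargs)
            for path, data in tree.items()
        }

    return wrapper


@map_over_paths
def scale(values, factor):
    # Acts on the data of a single node.
    return [v * factor for v in values]


tree = {"/": None, "/a": [1, 2], "/a/b": [3]}
result = scale(tree, 10)
```

The original `tree` is left unchanged, mirroring the "returning a new tree" wording above.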

DataTree.pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

DataTree.match(pattern)

Return nodes with paths matching pattern.
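
A minimal sketch of path matching, assuming the pattern follows the glob-style rules of `pathlib.PurePath.match` (an assumption not confirmed by this page). The node paths are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical node paths in a two-level tree.
paths = ["/a/A", "/a/B", "/b/A", "/b/B"]

# Keep only the paths whose last two components match "*/B".
matched = [p for p in paths if PurePosixPath(p).match("*/B")]
```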

DataTree.filter(filterfunc)

Filter nodes according to a specified condition.

Pathlib-like Interface#

DataTree objects deliberately echo some of the API of pathlib.PurePath.

DataTree.name

The name of this node.

DataTree.parent

Parent of this node.

DataTree.parents

All parent nodes and their parent nodes, starting with the closest.

DataTree.relative_to(other)

Compute the relative path from this node to node other.

Missing: DataTree.glob, DataTree.joinpath, DataTree.with_name, DataTree.walk, DataTree.rename, DataTree.replace
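
For reference, these are the `pathlib.PurePath` counterparts being echoed, shown on a plain POSIX path. The path `/a/b/c` is a hypothetical example, and the DataTree versions return nodes or node paths rather than `PurePosixPath` objects.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/a/b/c")

name = p.name                             # final path component
parent = p.parent                         # path one level up
parents = [str(x) for x in p.parents]     # closest ancestor first
relative = p.relative_to("/a")            # path of p as seen from /a
```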

DataTree Contents#

Manipulate the contents of all nodes in a tree simultaneously.

DataTree.copy([deep])

Returns a copy of this subtree.

DataTree.assign_coords([coords])

Assign new coordinates to this object.

DataTree.merge(datatree)

Merge all the leaves of a second DataTree into this one.

DataTree.rename([name_dict])

Returns a new object with renamed variables, coordinates and dimensions.

DataTree.rename_vars([name_dict])

Returns a new object with renamed variables, including coordinates.

DataTree.rename_dims([dims_dict])

Returns a new object with renamed dimensions only.

DataTree.swap_dims([dims_dict])

Returns a new object with swapped dimensions.

DataTree.expand_dims([dim, axis])

Return a new object with an additional axis (or axes) inserted at the corresponding position in the array shape.

DataTree.drop_vars(names, *[, errors])

Drop variables from this dataset.

DataTree.drop_dims(drop_dims, *[, errors])

Drop dimensions and associated variables from this dataset.

DataTree.set_coords(names)

Given names of one or more variables, set them as coordinates.

DataTree.reset_coords([names, drop])

Given names of coordinates, reset them to become variables.

DataTree Node Contents#

Manipulate the contents of a single DataTree node.

DataTree.assign([items])

Assign new data variables or child nodes to a DataTree, returning a new object with all the original items in addition to the new ones.

DataTree.drop_nodes(names, *[, errors])

Drop child nodes from this node.

Comparisons#

Compare one DataTree object to another.

DataTree.isomorphic(other[, from_root, ...])

Two DataTrees are considered isomorphic if every node has the same number of children.
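
The definition above can be sketched on plain nested dictionaries, where each dict maps child names to child subtrees. The `is_isomorphic` helper is hypothetical. It ignores node names, and comparing children pairwise in insertion order is an assumption of this sketch.

```python
def is_isomorphic(a, b):
    """Return True if two nested-dict trees have the same shape (names ignored)."""
    if len(a) != len(b):
        return False
    # Compare children pairwise, in insertion order.
    return all(is_isomorphic(x, y) for x, y in zip(a.values(), b.values()))


tree1 = {"a": {"x": {}}, "b": {}}
tree2 = {"p": {"q": {}}, "r": {}}   # same shape, different names
tree3 = {"a": {}, "b": {}}          # "a" has no children here

same_shape = is_isomorphic(tree1, tree2)
different_shape = is_isomorphic(tree1, tree3)
```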

DataTree.equals(other[, from_root])

Two DataTrees are equal if they have isomorphic node structures, with matching node names, and if they have matching variables and coordinates, all of which are equal.

DataTree.identical(other[, from_root])

Like equals, but will also check all dataset attributes and the attributes on all variables and coordinates.

Indexing#

Index into all nodes in the subtree simultaneously.

DataTree.isel([indexers, drop, missing_dims])

Returns a new dataset with each array indexed along the specified dimension(s).

DataTree.sel([indexers, method, tolerance, drop])

Returns a new dataset with each array indexed by tick labels along the specified dimension(s).

DataTree.drop_sel([labels, errors])

Drop index labels from this dataset.

DataTree.drop_isel([indexers])

Drop index positions from this Dataset.

DataTree.head([indexers])

Returns a new dataset with the first n values of each array for the specified dimension(s).

DataTree.tail([indexers])

Returns a new dataset with the last n values of each array for the specified dimension(s).

DataTree.thin([indexers])

Returns a new dataset with each array indexed along every n-th value for the specified dimension(s).

DataTree.squeeze([dim, drop, axis])

Return a new object with squeezed data.

DataTree.interp([coords, method, ...])

Interpolate a Dataset onto new coordinates.

DataTree.interp_like(other[, method, ...])

Interpolate this object onto the coordinates of another object, filling the out of range values with NaN.

DataTree.reindex([indexers, method, ...])

Conform this object onto a new set of indexes, filling in missing values with fill_value.

DataTree.reindex_like(other[, method, ...])

Conform this object onto the indexes of another object, for indexes which the objects share.

DataTree.set_index([indexes, append])

Set Dataset (multi-)indexes using one or more existing coordinates or variables.

DataTree.reset_index(dims_or_levels, *[, drop])

Reset the specified index(es) or multi-index level(s).

DataTree.reorder_levels([dim_order])

Rearrange index levels using input order.

DataTree.query([queries, parser, engine, ...])

Return a new dataset with each array indexed along the specified dimension(s), where the indexers are given as strings containing Python expressions to be evaluated against the data variables in the dataset.

Missing: DataTree.loc

Missing Value Handling#

DataTree.isnull([keep_attrs])

Test each value in the array for whether it is a missing value.

DataTree.notnull([keep_attrs])

Test each value in the array for whether it is not a missing value.

DataTree.combine_first(other)

Combine two Datasets, default to data_vars of self.

DataTree.dropna(dim, *[, how, thresh, subset])

Returns a new dataset with dropped labels for missing values along the provided dimension.

DataTree.fillna(value)

Fill missing values in this object.

DataTree.ffill(dim[, limit])

Fill NaN values by propagating values forward.

DataTree.bfill(dim[, limit])

Fill NaN values by propagating values backward.

DataTree.interpolate_na([dim, method, ...])

Fill in NaNs by interpolating according to different methods.

DataTree.where(cond[, other, drop])

Filter elements from this object according to a condition.

DataTree.isin(test_elements)

Tests each value in the array for whether it is in test elements.

Computation#

Apply a computation to the data in all nodes in the subtree simultaneously.

DataTree.map(func[, keep_attrs, args])

Apply a function to each data variable in this dataset.

DataTree.reduce(func[, dim, keep_attrs, ...])

Reduce this dataset by applying func along some dimension(s).

DataTree.diff(dim[, n, label])

Calculate the n-th order discrete difference along given axis.

DataTree.quantile(q[, dim, method, ...])

Compute the qth quantile of the data along the specified dimension.

DataTree.differentiate(coord[, edge_order, ...])

Differentiate with the second order accurate central differences.

DataTree.integrate(coord[, datetime_unit])

Integrate along the given coordinate using the trapezoidal rule.

DataTree.map_blocks(func[, args, kwargs, ...])

Apply a function to each block of this Dataset.

DataTree.polyfit(dim, deg[, skipna, rcond, ...])

Least squares polynomial fit.

DataTree.curvefit(coords, func[, ...])

Curve fitting optimization for arbitrary functions.

Aggregation#

Aggregate data in all nodes in the subtree simultaneously.

DataTree.all([dim, keep_attrs])

Reduce this Dataset's data by applying all along some dimension(s).

DataTree.any([dim, keep_attrs])

Reduce this Dataset's data by applying any along some dimension(s).

DataTree.argmax([dim])

Indices of the maxima of the member variables.

DataTree.argmin([dim])

Indices of the minima of the member variables.

DataTree.idxmax([dim, skipna, fill_value, ...])

Return the coordinate label of the maximum value along a dimension.

DataTree.idxmin([dim, skipna, fill_value, ...])

Return the coordinate label of the minimum value along a dimension.

DataTree.max([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying max along some dimension(s).

DataTree.min([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying min along some dimension(s).

DataTree.mean([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying mean along some dimension(s).

DataTree.median([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying median along some dimension(s).

DataTree.prod([dim, skipna, min_count, ...])

Reduce this Dataset's data by applying prod along some dimension(s).

DataTree.sum([dim, skipna, min_count, ...])

Reduce this Dataset's data by applying sum along some dimension(s).

DataTree.std([dim, skipna, ddof, keep_attrs])

Reduce this Dataset's data by applying std along some dimension(s).

DataTree.var([dim, skipna, ddof, keep_attrs])

Reduce this Dataset's data by applying var along some dimension(s).

DataTree.cumsum([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying cumsum along some dimension(s).

DataTree.cumprod([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying cumprod along some dimension(s).

ndarray methods#

Methods copied from numpy.ndarray objects, here applying to the data in all nodes in the subtree.

DataTree.argsort([axis, kind, order])

Returns the indices that would sort this array.

DataTree.astype(dtype, *[, order, casting, ...])

Copy of the xarray object, with data cast to a specified type.

DataTree.clip([min, max, keep_attrs])

Return an array whose values are limited to [min, max].

DataTree.conj()

Complex-conjugate all elements.

DataTree.conjugate()

Return the complex conjugate, element-wise.

DataTree.round(*args, **kwargs)

Round an array to the given number of decimals.

DataTree.rank(dim, *[, pct, keep_attrs])

Ranks the data.

Reshaping and reorganising#

Reshape or reorganise the data in all nodes in the subtree.

DataTree.transpose(*dims[, missing_dims])

Return a new Dataset object with all array dimensions transposed.

DataTree.stack([dimensions, create_index, ...])

Stack any number of existing dimensions into a single new dimension.

DataTree.unstack([dim, fill_value, sparse])

Unstack existing dimensions corresponding to MultiIndexes into multiple new dimensions.

DataTree.shift([shifts, fill_value])

Shift this dataset by an offset along one or more dimensions.

DataTree.roll([shifts, roll_coords])

Roll this dataset by an offset along one or more dimensions.

DataTree.pad([pad_width, mode, stat_length, ...])

Pad this dataset along one or more dimensions.

DataTree.sortby(variables[, ascending])

Sort object by labels or values (along an axis).

DataTree.broadcast_like(other[, exclude])

Broadcast this DataArray against another Dataset or DataArray.

Plotting#

I/O#

Open a datatree from an on-disk store or serialize the tree.

open_datatree(filename_or_obj[, engine])

Open and decode a dataset from a file or file-like object, creating one tree node for each group in the file.

DataTree.to_dict()

Create a dictionary mapping of absolute node paths to the data contained in those nodes.
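
A minimal sketch of flattening a nested tree into a mapping keyed by absolute node paths. The `to_path_dict` helper and the `{"data": ..., "children": {...}}` node layout are hypothetical stand-ins for illustration, not the library's internals.

```python
def to_path_dict(node, path="/"):
    """Flatten a nested {"data", "children"} tree into {absolute_path: data}."""
    out = {path: node["data"]}
    for name, child in node["children"].items():
        # Join with a single "/" whether or not the parent is the root.
        out.update(to_path_dict(child, path.rstrip("/") + "/" + name))
    return out


tree = {
    "data": "root",
    "children": {
        "a": {"data": "A", "children": {"b": {"data": "B", "children": {}}}},
    },
}
flat = to_path_dict(tree)
```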

DataTree.to_netcdf(filepath[, mode, ...])

Write datatree contents to a netCDF file.

DataTree.to_zarr(store[, mode, encoding, ...])

Write datatree contents to a Zarr store.

Missing: open_mfdatatree

Tutorial#

Testing#

Test that two DataTree objects are similar.

testing.assert_isomorphic(a, b[, from_root])

Two DataTrees are considered isomorphic if every node has the same number of children.

testing.assert_equal(a, b[, from_root])

Two DataTrees are equal if they have isomorphic node structures, with matching node names, and if they have matching variables and coordinates, all of which are equal.

testing.assert_identical(a, b[, from_root])

Like assert_equal, but will also check all dataset attributes and the attributes on all variables and coordinates.

Exceptions#

Exceptions raised when manipulating trees.

TreeIsomorphismError

Error raised if two tree objects do not share the same node structure.

InvalidTreeError

Raised when user attempts to create an invalid tree in some way.

NotFoundInTreeError

Raised when an operation can't be completed because one node is not part of the expected tree.

Advanced API#

Relatively advanced API for users or developers looking to understand the internals, or extend functionality.

DataTree.variables

Low level interface to node contents as dict of Variable objects.

register_datatree_accessor(name)

Register a custom accessor on DataTree objects.

Missing: DataTree.set_close