datatree.DataTree

class datatree.DataTree(data: Optional[Union[xarray.core.dataset.Dataset, xarray.core.dataarray.DataArray]] = None, parent: Optional[datatree.datatree.DataTree] = None, children: Optional[Mapping[str, datatree.datatree.DataTree]] = None, name: Optional[str] = None)[source]

A tree-like hierarchical collection of xarray objects.

Attempts to present an API like that of xarray.Dataset, but methods are wrapped so that operations also map over all of the tree's child nodes.

__init__(data: Optional[Union[xarray.core.dataset.Dataset, xarray.core.dataarray.DataArray]] = None, parent: Optional[datatree.datatree.DataTree] = None, children: Optional[Mapping[str, datatree.datatree.DataTree]] = None, name: Optional[str] = None)[source]

Create a single node of a DataTree.

The node may optionally contain data, stored as data variables and coordinate variables in the same way data is stored in an xarray.Dataset.

Parameters
  • data (Dataset, DataArray, or None, optional) – Data to store under the .ds attribute of this node. DataArrays will be promoted to Datasets. Default is None.

  • parent (DataTree, optional) – Parent node to this node. Default is None.

  • children (Mapping[str, DataTree], optional) – Any child nodes of this node. Default is None.

  • name (str, optional) – Name for this node of the tree. Default is None.

Returns

DataTree

Methods

__init__([data, parent, children, name])

Create a single node of a DataTree.

all([dim, keep_attrs])

Reduce this Dataset's data by applying all along some dimension(s).

any([dim, keep_attrs])

Reduce this Dataset's data by applying any along some dimension(s).

argmax([dim])

Indices of the maxima of the member variables.

argmin([dim])

Indices of the minima of the member variables.

argsort([axis, kind, order])

Returns the indices that would sort this array.

as_array()

as_numpy()

Coerces wrapped data and coordinates into numpy arrays, returning a Dataset.

assign([items])

Assign new data variables or child nodes to a DataTree, returning a new object with all the original items in addition to the new ones.

assign_coords([coords])

Assign new coordinates to this object.

astype(dtype, *[, order, casting, subok, ...])

Copy of the xarray object, with data cast to a specified type.

bfill(dim[, limit])

Fill NaN values by propagating values backward.

broadcast_like(other[, exclude])

Broadcast this Dataset against another Dataset or DataArray.

chunk([chunks, name_prefix, token, lock, ...])

Coerce all arrays in this dataset into dask arrays with the given chunks.

clip([min, max, keep_attrs])

Return an array whose values are limited to [min, max].

close()

Release any resources linked to this object.

combine_first(other)

Combine two Datasets, defaulting to the data_vars of self.

compute(**kwargs)

Manually trigger loading and/or computation of this dataset's data from disk or a remote source into memory and return a new dataset.

conj()

Complex-conjugate all elements.

conjugate()

Return the complex conjugate, element-wise.

copy([deep])

Returns a copy of this subtree.

cumprod([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying cumprod along some dimension(s).

cumsum([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying cumsum along some dimension(s).

cumulative_integrate(coord[, datetime_unit])

Integrate along the given coordinate using the trapezoidal rule.

curvefit(coords, func[, reduce_dims, ...])

Curve fitting optimization for arbitrary functions.

diff(dim[, n, label])

Calculate the n-th order discrete difference along given axis.

differentiate(coord[, edge_order, datetime_unit])

Differentiate with second-order accurate central differences.

drop_dims(drop_dims, *[, errors])

Drop dimensions and associated variables from this dataset.

drop_isel([indexers])

Drop index positions from this Dataset.

drop_nodes(names, *[, errors])

Drop child nodes from this node.

drop_sel([labels, errors])

Drop index labels from this dataset.

drop_vars(names, *[, errors])

Drop variables from this dataset.

dropna(dim, *[, how, thresh, subset])

Returns a new dataset with dropped labels for missing values along the provided dimension.

equals(other[, from_root])

Two DataTrees are equal if they have isomorphic node structures, with matching node names, and if they have matching variables and coordinates, all of which are equal.

expand_dims([dim, axis])

Return a new object with an additional axis (or axes) inserted at the corresponding position in the array shape.

ffill(dim[, limit])

Fill NaN values by propagating values forward.

fillna(value)

Fill missing values in this object.

filter(filterfunc)

Filter nodes according to a specified condition.

filter_by_attrs(**kwargs)

Returns a Dataset with variables that match specific conditions.

find_common_ancestor(other)

Find the first common ancestor of two nodes in the same tree.

from_dict(d[, name])

Create a datatree from a dictionary of data objects, organised by paths into the tree.

get(key[, default])

Access child nodes, variables, or coordinates stored in this node.

head([indexers])

Returns a new dataset with the first n values of each array for the specified dimension(s).

identical(other[, from_root])

Like equals, but will also check all dataset attributes and the attributes on all variables and coordinates.

idxmax([dim, skipna, fill_value, keep_attrs])

Return the coordinate label of the maximum value along a dimension.

idxmin([dim, skipna, fill_value, keep_attrs])

Return the coordinate label of the minimum value along a dimension.

info([buf])

Concise summary of a Dataset's variables and attributes.

integrate(coord[, datetime_unit])

Integrate along the given coordinate using the trapezoidal rule.

interp([coords, method, assume_sorted, ...])

Interpolate a Dataset onto new coordinates.

interp_like(other[, method, assume_sorted, ...])

Interpolate this object onto the coordinates of another object, filling out-of-range values with NaN.

interpolate_na([dim, method, limit, ...])

Fill in NaNs by interpolating according to different methods.

isel([indexers, drop, missing_dims])

Returns a new dataset with each array indexed along the specified dimension(s).

isin(test_elements)

Tests each value in the array for whether it is in test_elements.

isnull([keep_attrs])

Test each value in the array for whether it is a missing value.

isomorphic(other[, from_root, strict_names])

Two DataTrees are considered isomorphic if every node has the same number of children.

items()

iter_lineage()

Iterate up the tree, starting from the current node.

keys()

load(**kwargs)

Manually trigger loading and/or computation of this dataset's data from disk or a remote source into memory and return this dataset.

map(func[, keep_attrs, args])

Apply a function to each data variable in this dataset.

map_blocks(func[, args, kwargs, template])

Apply a function to each block of this Dataset.

map_over_subtree(func, *args, **kwargs)

Apply a function to every dataset in this subtree, returning a new tree which stores the results.

map_over_subtree_inplace(func, *args, **kwargs)

Apply a function to every dataset in this subtree, updating data in place.

match(pattern)

Return nodes with paths matching pattern.

max([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying max along some dimension(s).

mean([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying mean along some dimension(s).

median([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying median along some dimension(s).

merge(datatree)

Merge all the leaves of a second DataTree into this one.

merge_child_nodes(*paths, new_path)

Merge a set of child nodes into a single new node.

min([dim, skipna, keep_attrs])

Reduce this Dataset's data by applying min along some dimension(s).

notnull([keep_attrs])

Test each value in the array for whether it is not a missing value.

orphan()

Detach this node from its parent.

pad([pad_width, mode, stat_length, ...])

Pad this dataset along one or more dimensions.

persist(**kwargs)

Trigger computation, keeping data as dask arrays.

pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

plot()

polyfit(dim, deg[, skipna, rcond, w, full, cov])

Least squares polynomial fit.

prod([dim, skipna, min_count, keep_attrs])

Reduce this Dataset's data by applying prod along some dimension(s).

quantile(q[, dim, method, numeric_only, ...])

Compute the qth quantile of the data along the specified dimension.

query([queries, parser, engine, missing_dims])

Return a new dataset with each array indexed along the specified dimension(s), where the indexers are given as strings containing Python expressions to be evaluated against the data variables in the dataset.

rank(dim, *[, pct, keep_attrs])

Ranks the data.

reduce(func[, dim, keep_attrs, keepdims, ...])

Reduce this dataset by applying func along some dimension(s).

reindex([indexers, method, tolerance, copy, ...])

Conform this object onto a new set of indexes, filling in missing values with fill_value.

reindex_like(other[, method, tolerance, ...])

Conform this object onto the indexes of another object, for indexes which the objects share.

relative_to(other)

Compute the relative path from this node to node other.

rename([name_dict])

Returns a new object with renamed variables, coordinates and dimensions.

rename_dims([dims_dict])

Returns a new object with renamed dimensions only.

rename_vars([name_dict])

Returns a new object with renamed variables, including coordinates.

render()

Print tree structure, including any data stored at each node.

reorder_levels([dim_order])

Rearrange index levels using input order.

reset_coords([names, drop])

Given names of coordinates, reset them to become variables.

reset_index(dims_or_levels, *[, drop])

Reset the specified index(es) or multi-index level(s).

roll([shifts, roll_coords])

Roll this dataset by an offset along one or more dimensions.

round(*args, **kwargs)

Round an array to the given number of decimals.

same_tree(other)

True if other node is in the same tree as this node.

sel([indexers, method, tolerance, drop])

Returns a new dataset with each array indexed by tick labels along the specified dimension(s).

set_coords(names)

Given names of one or more variables, set them as coordinates.

set_index([indexes, append])

Set Dataset (multi-)indexes using one or more existing coordinates or variables.

shift([shifts, fill_value])

Shift this dataset by an offset along one or more dimensions.

sortby(variables[, ascending])

Sort object by labels or values (along an axis).

squeeze([dim, drop, axis])

Return a new object with squeezed data.

stack([dimensions, create_index, index_cls])

Stack any number of existing dimensions into a single new dimension.

std([dim, skipna, ddof, keep_attrs])

Reduce this Dataset's data by applying std along some dimension(s).

sum([dim, skipna, min_count, keep_attrs])

Reduce this Dataset's data by applying sum along some dimension(s).

swap_dims([dims_dict])

Returns a new object with swapped dimensions.

tail([indexers])

Returns a new dataset with the last n values of each array for the specified dimension(s).

thin([indexers])

Returns a new dataset with each array indexed along every n-th value for the specified dimension(s).

to_dataset()

Return the data in this node as a new xarray.Dataset object.

to_dict()

Create a dictionary mapping of absolute node paths to the data contained in those nodes.

to_netcdf(filepath[, mode, encoding, ...])

Write datatree contents to a netCDF file.

to_zarr(store[, mode, encoding, consolidated])

Write datatree contents to a Zarr store.

transpose(*dims[, missing_dims])

Return a new Dataset object with all array dimensions transposed.

unify_chunks()

Unify chunk size along all chunked dimensions of this Dataset.

unstack([dim, fill_value, sparse])

Unstack existing dimensions corresponding to MultiIndexes into multiple new dimensions.

update(other)

Update this node's children and / or variables.

values()

var([dim, skipna, ddof, keep_attrs])

Reduce this Dataset's data by applying var along some dimension(s).

where(cond[, other, drop])

Filter elements from this object according to a condition.

Attributes

ancestors

All parent nodes and their parent nodes, starting with the most distant.

attrs

Dictionary of global attributes on this node object.

children

Child nodes of this node, stored under a mapping via their names.

coords

Dictionary of xarray.DataArray objects corresponding to coordinate variables.

data_vars

Dictionary of DataArray objects corresponding to data variables.

depth

Maximum level of this tree.

descendants

Child nodes and all their child nodes.

dims

Mapping from dimension names to lengths.

ds

An immutable Dataset-like view onto the data in this node.

encoding

Dictionary of global encoding attributes on this node object.

groups

Return all netCDF4 groups in the tree, given as a tuple of path-like strings.

has_attrs

Whether or not there are any metadata attributes in this node.

has_data

Whether or not there are any data variables in this node.

indexes

Mapping of pandas.Index objects used for label based indexing.

is_empty

True if this node contains no data and no attributes; False otherwise.

is_hollow

True if only leaf nodes contain data.

is_leaf

Whether this node is a leaf node.

is_root

Whether this node is the tree root.

leaves

All leaf nodes.

level

Level of this node.

lineage

All parent nodes and their parent nodes, starting with the closest.

name

The name of this node.

nbytes

parent

Parent of this node.

parents

All parent nodes and their parent nodes, starting with the closest.

path

Return the file-like path from the root to this node.

root

Root node of the tree.

siblings

Nodes with the same parent as this node.

sizes

Mapping from dimension names to lengths.

subtree

An iterator over all nodes in this tree, including both self and all descendants.

variables

Low level interface to node contents as dict of Variable objects.

width

Number of nodes at this level in the tree.

xindexes

Mapping of xarray Index objects used for label based indexing.