treedata.TreeData#

class treedata.TreeData(X=None, obs=None, var=None, uns=None, *, obsm=None, obst=None, varm=None, vart=None, layers=None, raw=None, dtype=None, shape=None, filename=None, filemode=None, asview=False, label='tree', alignment='leaves', allow_overlap=True, obsp=None, varp=None, oidx=None, vidx=None)#

AnnData with trees.

TreeData is a lightweight wrapper around AnnData that adds two attributes, obst and vart, to store trees for observations and variables. A TreeData object can be used just like an AnnData object: it stores a data matrix X together with annotations of observations obs (obsm, obsp, obst), variables var (varm, varp, vart), and unstructured annotations uns.

Parameters:

X (ndarray | spmatrix | sparray | DataFrame | None (default: None)) – A #observations × #variables data matrix. A view of the data is used if the data type matches, otherwise, a copy is made.
obs (DataFrame | Mapping[str, Iterable[Any]] | None (default: None)) – Key-indexed one-dimensional observations annotation of length #observations.
var (DataFrame | Mapping[str, Iterable[Any]] | None (default: None)) – Key-indexed one-dimensional variables annotation of length #variables.
uns (Mapping[str, Any] | None (default: None)) – Key-indexed unstructured annotation.
obsm (ndarray | Mapping[str, Sequence[Any]] | None (default: None)) – Key-indexed multi-dimensional observations annotation of length #observations. If passing a ndarray, it needs to have a structured datatype.
obst (Mapping[str, DiGraph] | None (default: None)) – Key-indexed DiGraph trees with leaf nodes in the observations axis.
varm (ndarray | Mapping[str, Sequence[Any]] | None (default: None)) – Key-indexed multi-dimensional variables annotation of length #variables. If passing a ndarray, it needs to have a structured datatype.
vart (Mapping[str, DiGraph] | None (default: None)) – Key-indexed DiGraph trees with leaf nodes in the variables axis.
layers (Mapping[str, ndarray | spmatrix | sparray] | None (default: None)) – Key-indexed multi-dimensional arrays aligned to dimensions of X.
dtype (dtype | type | str | None (default: None)) – Deprecated: the dtype argument is deprecated and will be removed in a future version.
shape (tuple[int, int] | None (default: None)) – Shape tuple (#observations, #variables). Can only be provided if X is None.
filename (PathLike | None (default: None)) – Name of backing file. See h5py.File.
filemode (Literal['r', 'r+'] | None (default: None)) – Open mode of backing file. See h5py.File.
asview (bool (default: False)) – Initialize as view. X has to be a TreeData object.
label (str | None (default: 'tree')) – Column in .obs and .var in which to place the tree key. Default is “tree”. If None, no column is added.
alignment (Literal['leaves', 'nodes', 'subset'] (default: 'leaves')) –
Alignment between trees and observations/variables. One of the following:
- leaves: All leaf names are present in the observation/variable names.
- nodes: All leaf and internal node names are present in the observation/variable names.
- subset: A subset of leaf and internal node names are present in the observation/variable names.
allow_overlap (bool (default: True)) – Whether trees containing overlapping sets of leaves or nodes are allowed. Default is True.
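
The three alignment modes can be read as set relations between tree node names and the observation/variable names. The following is a minimal sketch of that reading, not treedata's implementation: trees are plain dicts mapping a parent to its children, `tree_nodes` and `check_alignment` are hypothetical helpers, and treating `subset` as imposing no requirement is an assumption.

```python
def tree_nodes(children):
    """Return (all_nodes, leaves) for a tree given as {parent: [child, ...]}."""
    nodes = set(children)
    for kids in children.values():
        nodes.update(kids)
    # A leaf is any node without children.
    leaves = {n for n in nodes if not children.get(n)}
    return nodes, leaves


def check_alignment(children, names, alignment="leaves"):
    """Hypothetical check mirroring the documented alignment modes."""
    nodes, leaves = tree_nodes(children)
    names = set(names)
    if alignment == "leaves":
        # Every leaf name must be an observation/variable name.
        return leaves <= names
    if alignment == "nodes":
        # Every leaf and internal node name must be present.
        return nodes <= names
    if alignment == "subset":
        # Assumption: only some node names need to match, so nothing is required.
        return True
    raise ValueError(f"unknown alignment: {alignment!r}")


tree = {"root": ["A", "B"], "B": ["C", "D"]}
obs_names = ["A", "C", "D"]
print(check_alignment(tree, obs_names, "leaves"))  # True
print(check_alignment(tree, obs_names, "nodes"))   # False
```

Here the leaves A, C and D are all observation names, so `leaves` holds, while `nodes` fails because the internal nodes root and B are not.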

Attributes table#

`T`	Transpose whole object
`X`	Data matrix of shape `n_obs` × `n_vars`.
`alignment`	Mapping between trees and observations/variables.
`allow_overlap`	Whether overlapping trees are allowed.
`filename`	Change to backing mode by setting the filename of a `.h5ad` file.
`has_overlap`	Flag indicating whether stored trees contain overlapping nodes.
`is_view`	`True` if object is view of another TreeData object, `False` otherwise.
`isbacked`	`True` if object is backed on disk, `False` otherwise.
`isview`	Whether or not this object is a view.
`label`	Column in `.obs` and `.var` with tree keys
`layers`	A `property` that creates an ephemeral AlignedMapping.
`n_obs`	Number of observations.
`n_vars`	Number of variables/features.
`obs`	One-dimensional annotation of observations (`pd.DataFrame`).
`obs_names`	Names of observations (alias for `.obs.index`).
`obsm`	A `property` that creates an ephemeral AlignedMapping.
`obsp`	A `property` that creates an ephemeral AlignedMapping.
`obst`	Tree annotation of observations
`raw`	Store raw version of `X` and `var` as `.raw.X` and `.raw.var`.
`shape`	Shape of data matrix (`n_obs`, `n_vars`).
`uns`	Unstructured annotation (ordered dictionary).
`var`	One-dimensional annotation of variables/features (`pd.DataFrame`).
`var_names`	Names of variables (alias for `.var.index`).
`varm`	A `property` that creates an ephemeral AlignedMapping.
`varp`	A `property` that creates an ephemeral AlignedMapping.
`vart`	Tree annotation of variables

Methods table#

`chunk_X`([select, replace])	Return a chunk of the data matrix `X` with random or specified indices.
`chunked_X`([chunk_size])	Return an iterator over the rows of the data matrix `X`.
`concatenate`()	Concatenate deprecated, use `treedata.concat` instead.
`copy`([filename])	Full copy, optionally on disk.
`obs_keys`()	List keys of observation annotation `obs`.
`obs_names_make_unique`([join])	Makes the index unique by appending a number string to each duplicate index element: '1', '2', etc.
`obs_vector`(k, /, *[, layer])	Convenience function for returning a 1 dimensional ndarray of values from `X`, `layers[k]`, or `obs`.
`obsm_keys`()	List keys of observation annotation `obsm`.
`obst_keys`()	List keys of observation annotation `obst`.
`rename_categories`(key, categories)	Rename categories of annotation `key` in `obs`, `var`, and `uns`.
`strings_to_categoricals`([df])	Transform string annotations to categoricals.
`to_adata`()	Convert this TreeData object to an AnnData object.
`to_df`([layer])	Generate shallow `DataFrame`.
`to_memory`([copy])	Return a new AnnData object with all backed arrays loaded into memory.
`transpose`()	Transpose whole object
`uns_keys`()	List keys of unstructured annotation.
`unwriteable`(*[, store_type])	Whether or not an `AnnData` object can be written to disk for a given store type.
`var_keys`()	List keys of variable annotation `var`.
`var_names_make_unique`([join])	Makes the index unique by appending a number string to each duplicate index element: '1', '2', etc.
`var_vector`(k, /, *[, layer])	Convenience function for returning a 1 dimensional ndarray of values from `X`, `layers[k]`, or `var`.
`varm_keys`()	List keys of variable annotation `varm`.
`vart_keys`()	List keys of variable annotation `vart`.
`write`([filename, compression, compression_opts])	Write `.h5td`-formatted hdf5 file.
`write_csvs`(dirname, *[, skip_data, sep])	Write annotation to `.csv` files.
`write_h5ad`([filename, ...])	Write `.h5ad`-formatted hdf5 file.
`write_h5td`([filename, compression, ...])	Write `.h5td`-formatted hdf5 file.
`write_loom`(filename, *[, write_obsm_varm])	Write `.loom`-formatted hdf5 file.
`write_zarr`(store[, chunks])	Write a hierarchical Zarr array store.

Attributes#

property T: TreeData#

Transpose whole object

Data matrix is transposed, observations and variables are interchanged. Ignores .raw.

property X: InMemoryArray | Dataset | Array | ZappyArray | CSRDataset | CSCDataset | None#: Data matrix of shape n_obs × n_vars.

property alignment: Literal['leaves', 'nodes', 'subset']#: Mapping between trees and observations/variables.

property allow_overlap: bool#: Whether overlapping trees are allowed.

property filename: Path | None#

Change to backing mode by setting the filename of a .h5ad file.

Setting the filename writes the stored data to disk.
Setting the filename when the filename was previously another name moves the backing file from the previous file to the new file. If you want to copy the previous file, use copy(filename='new_filename').

property has_overlap: bool#

Flag indicating whether stored trees contain overlapping nodes.

Returns:: bool - True when any stored trees share nodes, False otherwise.
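
As a rough illustration of what "overlapping" means here, the sketch below flags a collection of trees as overlapping when any two of them share a node name. It is not the library's implementation: `trees_overlap` is a hypothetical helper and each tree is reduced to a plain list of node names.

```python
from itertools import combinations


def trees_overlap(trees):
    """Return True when any two trees share at least one node name."""
    node_sets = [set(nodes) for nodes in trees.values()]
    # A non-empty intersection between any pair counts as overlap.
    return any(a & b for a, b in combinations(node_sets, 2))


trees = {
    "clone1": ["root1", "A", "B"],
    "clone2": ["root2", "C", "D"],
}
print(trees_overlap(trees))  # False

trees["clone3"] = ["root3", "A", "E"]  # shares leaf "A" with clone1
print(trees_overlap(trees))  # True
```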

property is_view: bool#: True if object is view of another TreeData object, False otherwise.

property isbacked: bool#: True if object is backed on disk, False otherwise.

property isview: bool#: Whether or not this object is a view.

Deprecated since version 0.7.2: Use is_view instead of isview.

property label: str | None#: Column in .obs and .var with tree keys

property layers: Layers | LayersView#

A property that creates an ephemeral AlignedMapping.

The actual data is stored as f'_{self.name}' in the parent object.

property n_obs: int#: Number of observations.

property n_vars: int#: Number of variables/features.

property obs: DataFrame | Dataset2D#: One-dimensional annotation of observations (pd.DataFrame).

property obs_names: Index#: Names of observations (alias for .obs.index).

property obsm: AxisArrays | AxisArraysView#

A property that creates an ephemeral AlignedMapping.

The actual data is stored as f'_{self.name}' in the parent object.

property obsp: PairwiseArrays | PairwiseArraysView#

A property that creates an ephemeral AlignedMapping.

The actual data is stored as f'_{self.name}' in the parent object.

property obst: AxisTrees | AxisTreesView#

Tree annotation of observations

Stores for each key a DiGraph with leaf nodes in obs_names. It is subset and pruned together with the data but otherwise behaves like a mapping.

property raw: Raw#

Store raw version of X and var as .raw.X and .raw.var.

The raw attribute is initialized with the current content of an object by setting:

adata.raw = adata.copy()

Its content can be deleted:

adata.raw = None
# or
del adata.raw

Upon slicing an AnnData object along the obs (row) axis, raw is also sliced. Slicing an AnnData object along the vars (columns) axis leaves raw unaffected. Note that you can call:

adata.raw[:, 'orig_variable_name'].X

to retrieve the data associated with a variable that might have been filtered out or “compressed away” in X.

property shape: tuple[int, int]#: Shape of data matrix (n_obs, n_vars).

property uns: MutableMapping#: Unstructured annotation (ordered dictionary).

property var: DataFrame | Dataset2D#: One-dimensional annotation of variables/features (pd.DataFrame).

property var_names: Index#: Names of variables (alias for .var.index).

property varm: AxisArrays | AxisArraysView#

A property that creates an ephemeral AlignedMapping.

The actual data is stored as f'_{self.name}' in the parent object.

property varp: PairwiseArrays | PairwiseArraysView#

A property that creates an ephemeral AlignedMapping.

The actual data is stored as f'_{self.name}' in the parent object.

property vart: AxisTrees | AxisTreesView#

Tree annotation of variables

Stores for each key a DiGraph with leaf nodes in var_names. It is subset and pruned together with the data but otherwise behaves like a mapping.

Methods#

chunk_X(select=1000, *, replace=True)#

Return a chunk of the data matrix X with random or specified indices.

Parameters:

select (int | Sequence[int] | ndarray (default: 1000)) –
Depending on the type:

int
A random chunk with select rows will be returned.

sequence (e.g. a list, tuple or numpy array) of int
A chunk with these indices will be returned.
replace (bool (default: True)) – If select is an integer then True means random sampling of indices with replacement, False without replacement.
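
The two forms of select can be sketched in plain Python. This is an illustration of the described behavior under stated assumptions, not the library's code: `choose_rows` is a hypothetical helper that only picks row indices, and the `rng` argument is added here so the sampling can be seeded.

```python
import random


def choose_rows(n_obs, select=1000, *, replace=True, rng=None):
    """Pick row indices the way the select/replace parameters are described."""
    rng = rng or random.Random()
    if isinstance(select, int):
        if replace:
            # Sampling with replacement: an index may appear more than once.
            return [rng.randrange(n_obs) for _ in range(select)]
        # Sampling without replacement: requires select <= n_obs.
        return rng.sample(range(n_obs), select)
    # A sequence of indices is used as given.
    return list(select)


X = [[i, i * 10] for i in range(5)]

rows = choose_rows(len(X), 3, replace=False, rng=random.Random(0))
chunk = [X[i] for i in rows]
print(len(chunk))                   # 3 distinct rows
print(choose_rows(len(X), [4, 0]))  # [4, 0]
```

With replace=False the requested number of rows cannot exceed the number of observations; in this sketch `random.sample` raises a ValueError in that case.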

chunked_X(chunk_size=None)#

Return an iterator over the rows of the data matrix X.

Parameters:: chunk_size (int | None (default: None)) – Row size of a single chunk.
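
Chunked iteration over rows can be sketched as a generator. This is a minimal stand-in rather than the library's implementation: `iter_row_chunks` is a hypothetical helper, yielding the chunk together with its start and end row is an assumption, and so is treating chunk_size=None as "all rows in one chunk" (the library's default may differ).

```python
def iter_row_chunks(rows, chunk_size=None):
    """Yield (chunk, start, end) for consecutive blocks of rows."""
    n = len(rows)
    if chunk_size is None:
        # Assumption of this sketch: no chunk size means a single chunk.
        chunk_size = max(n, 1)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        yield rows[start:end], start, end


X = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
for chunk, start, end in iter_row_chunks(X, chunk_size=2):
    print(start, end, len(chunk))
# 0 2 2
# 2 4 2
# 4 5 1
```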

concatenate()#

Concatenate deprecated, use treedata.concat instead.

Return type:: None

copy(filename=None)#

Full copy, optionally on disk.

Return type:: TreeData

obs_keys()#

List keys of observation annotation obs.

Deprecated since version 0.12.3: Use obs instead of obs_keys. (e.g. k in adata.obs or str(adata.obs.columns.tolist()))

Return type:: list[str]

obs_names_make_unique(join='-')#

Makes the index unique by appending a number string to each duplicate index element: ‘1’, ‘2’, etc.

If a tentative name created by the algorithm already exists in the index, it tries the next integer in the sequence.

The first occurrence of a non-unique value is ignored.

Parameters:: join (str (default: '-')) – The connecting string between name and integer.
Return type:: None

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from anndata import AnnData
>>> adata = AnnData(np.ones((2, 3)), var=pd.DataFrame(index=["a", "a", "b"]))
>>> adata.var_names.astype("string")
Index(['a', 'a', 'b'], dtype='string')
>>> adata.var_names_make_unique()
>>> adata.var_names.astype("string")
Index(['a', 'a-1', 'b'], dtype='string')
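
The renaming rule described above can be sketched on a plain list of names. This is an illustrative reimplementation, not the library's code, and `make_names_unique` is a hypothetical helper: the first occurrence of a name is kept, and each later duplicate gets the next integer suffix that does not collide with a name already in use.

```python
from collections import Counter


def make_names_unique(names, join="-"):
    """Append join + integer to every duplicate after its first occurrence."""
    taken = set(names)   # every name that is already in use
    seen = set()         # names whose first occurrence has been emitted
    counter = Counter()  # last suffix tried per duplicated name
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
            continue
        while True:
            counter[name] += 1
            candidate = f"{name}{join}{counter[name]}"
            if candidate not in taken:
                taken.add(candidate)
                result.append(candidate)
                break
    return result


print(make_names_unique(["a", "a", "b"]))    # ['a', 'a-1', 'b']
print(make_names_unique(["a", "a", "a-1"]))  # ['a', 'a-2', 'a-1']
```

The second call shows the collision case: 'a-1' already exists in the index, so the duplicate 'a' moves on to 'a-2'.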

obs_vector(k, /, *, layer=None)#

Convenience function for returning a 1 dimensional ndarray of values from X, layers[k], or obs.

Deprecated since version 0.13: Use anndata.acc.A instead of obs_vector. E.g. vec = adata[A.obs['foo']] or vec = adata[A.layers['l']['bar', :]]

Made for convenience, not performance. Intentionally permissive about arguments, for easy iterative use.

Parameters:

k (str) – Key to use. Should be in var_names or obs.columns.
layer (str | None (default: None)) – What layer values should be returned from. If None, X is used.

Return type:

ndarray

Returns:

A one dimensional ndarray, with values for each obs in the same order as obs_names.

obsm_keys()#

List keys of observation annotation obsm.

Deprecated since version 0.12.3: Use obsm instead of obsm_keys. (e.g. k in adata.obsm or adata.obsm.keys() | {'u'})

Return type:: list[str]

obst_keys()#

List keys of observation annotation obst.

Return type:: list[str]

rename_categories(key, categories)#

Rename categories of annotation key in obs, var, and uns.

Only supports passing a list/array-like categories argument.

Besides calling self.obs[key].cat.categories = categories – similar for var - this also renames categories in unstructured annotation that uses the categorical annotation key.

Parameters:

key (str) – Key for observations or variables annotation.
categories (Sequence[Any]) – New categories, the same number as the old categories.

strings_to_categoricals(df=None)#

Transform string annotations to categoricals.

Only affects string annotations that lead to fewer categories than the total number of observations.

Parameters:: df (DataFrame | None (default: None)) – If df is None, modifies both obs and var, otherwise modifies df inplace.

Notes

Turns the view of an AnnData into an actual AnnData.
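
The conversion rule stated above (fewer categories than observations) can be written as a simple predicate. This is a sketch of the rule only, assuming a column is given as a plain list of values; `should_convert` is a hypothetical helper, not part of the library.

```python
def should_convert(values):
    """True for an all-string column with fewer distinct values than entries."""
    is_string_column = all(isinstance(v, str) for v in values)
    return is_string_column and len(set(values)) < len(values)


print(should_convert(["T", "B", "T"]))     # True: 2 categories, 3 observations
print(should_convert(["c1", "c2", "c3"]))  # False: every value is unique
print(should_convert([1, 1, 2]))           # False: not a string annotation
```

A column of unique identifiers therefore stays as strings, since turning it into a categorical would create one category per observation.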

to_adata()#

Convert this TreeData object to an AnnData object.

Return type:: AnnData

to_df(layer=None)#

Generate shallow DataFrame.

The data matrix X is returned as DataFrame, where obs_names initializes the index, and var_names the columns.

No annotations are maintained in the returned object.
The data matrix is densified in case it is sparse.

Parameters:: layer (str | None (default: None)) – Key for .layers.
Return type:: DataFrame
Returns:: Pandas DataFrame of specified data matrix.

to_memory(copy=False)#

Return a new AnnData object with all backed arrays loaded into memory.

Parameters:: copy (bool (default: False)) – Whether the arrays that are already in-memory should be copied.
Return type:: TreeData

transpose()#

Transpose whole object

Data matrix is transposed, observations and variables are interchanged. Ignores .raw.

Return type:: TreeData

uns_keys()#

List keys of unstructured annotation.

Deprecated since version 0.13: Use uns instead of uns_keys. (e.g. k in adata.uns or sorted(adata.uns))

Return type:: list[str]

unwriteable(*, store_type=None)#

Whether or not an AnnData object can be written to disk for a given store type.

Parameters:: store_type (Literal['h5', 'zarr'] | None (default: None)) – Which backing store - None indicates that it can be writeable to either.
Return type:: bool
Returns:: Whether or not this object is writeable. While the return type may change to include richer output about which elements cannot be written, this new type’s evaluation as a boolean will not change from the current behavior i.e., bool(adata.unwriteable()) will always evaluate the same.

var_keys()#

List keys of variable annotation var.

Deprecated since version 0.12.3: Use var instead of var_keys. (e.g. k in adata.var or str(adata.var.columns.tolist()))

Return type:: list[str]

var_names_make_unique(join='-')#

Makes the index unique by appending a number string to each duplicate index element: ‘1’, ‘2’, etc.

If a tentative name created by the algorithm already exists in the index, it tries the next integer in the sequence.

The first occurrence of a non-unique value is ignored.

Parameters:: join (str (default: '-')) – The connecting string between name and integer.
Return type:: None

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from anndata import AnnData
>>> adata = AnnData(np.ones((2, 3)), var=pd.DataFrame(index=["a", "a", "b"]))
>>> adata.var_names.astype("string")
Index(['a', 'a', 'b'], dtype='string')
>>> adata.var_names_make_unique()
>>> adata.var_names.astype("string")
Index(['a', 'a-1', 'b'], dtype='string')

var_vector(k, /, *, layer=None)#

Convenience function for returning a 1 dimensional ndarray of values from X, layers[k], or var.

Deprecated since version 0.13: Use anndata.acc.A instead of var_vector. E.g. vec = adata[A.var['foo']] or vec = adata[A.layers['l'][:, 'bar']]

Made for convenience, not performance. Intentionally permissive about arguments, for easy iterative use.

Parameters:

k (str) – Key to use. Should be in obs_names or var.columns.
layer (str | None (default: None)) – What layer values should be returned from. If None, X is used.

Return type:

ndarray

Returns:

A one dimensional ndarray, with values for each var in the same order as var_names.

varm_keys()#

List keys of variable annotation varm.

Deprecated since version 0.12.3: Use varm instead of varm_keys. (e.g. k in adata.varm or adata.varm.keys() | {'u'})

Return type:: list[str]

vart_keys()#

List keys of variable annotation vart.

Return type:: list[str]

write(filename=None, compression=None, compression_opts=None, **kwargs)#

Write .h5td-formatted hdf5 file.

Parameters:

filename (PathLike | None (default: None)) – Filename of data file. Defaults to backing file.
compression (Literal['gzip', 'lzf'] | None (default: None)) – [lzf, gzip], see the h5py Filter pipeline.
compression_opts (int | Any (default: None)) – [lzf, gzip], see the h5py Filter pipeline.

write_csvs(dirname, *, skip_data=True, sep=',')#

Write annotation to .csv files.

It is not possible to recover the full AnnData from these files. Use write() for this.

Parameters:

dirname (PathLike[str] | str) – Name of directory to which to export.
skip_data (bool (default: True)) – Skip the data matrix X.
sep (str (default: ',')) – Separator for the data.

write_h5ad(filename=None, *, convert_strings_to_categoricals=True, compression=None, compression_opts=None, as_dense=())#

Write .h5ad-formatted hdf5 file.

Note

Setting compression to 'gzip' can save disk space but will slow down writing and subsequent reading. Prior to v0.6.16, this was the default for parameter compression.

Generally, if you have sparse data that are stored as a dense matrix, you can dramatically improve performance and reduce disk space by converting to a csr_matrix:

from scipy.sparse import csr_matrix
adata.X = csr_matrix(adata.X)

Parameters:

filename (PathLike[str] | str | None (default: None)) – Filename of data file. Defaults to backing file.
convert_strings_to_categoricals (bool (default: True)) – Convert string columns to categorical.
compression (Literal['gzip', 'lzf'] | None (default: None)) –
For [lzf, gzip], see the h5py Filter pipeline.

Alternative compression filters such as zstd can be passed from the hdf5plugin library. Experimental.

Usage example:
```
import hdf5plugin
adata.write_h5ad(
    filename,
    compression=hdf5plugin.FILTERS["zstd"]
)
```
Note

Datasets written with hdf5plugin-provided compressors cannot be opened without first loading the hdf5plugin library using import hdf5plugin. When using alternative compression filters such as zstd, consider writing to zarr format instead of h5ad, as the zarr library provides a more transparent compression pipeline.
compression_opts (int | Any (default: None)) –
For [lzf, gzip], see the h5py Filter pipeline.

Alternative compression filters such as zstd can be configured using helpers from the hdf5plugin library. Experimental.

Usage example (setting zstd compression level to 5):
```
import hdf5plugin
adata.write_h5ad(
    filename,
    compression=hdf5plugin.FILTERS["zstd"],
    compression_opts=hdf5plugin.Zstd(clevel=5).filter_options
)
```
as_dense (Sequence[str] (default: ())) – Sparse arrays in AnnData object to write as dense. Currently only supports X and raw/X.

write_h5td(filename=None, compression=None, compression_opts=None, **kwargs)#

Write .h5td-formatted hdf5 file.

Parameters:

filename (PathLike | None (default: None)) – Filename of data file. Defaults to backing file.
compression (Literal['gzip', 'lzf'] | None (default: None)) – [lzf, gzip], see the h5py Filter pipeline.
compression_opts (int | Any (default: None)) – [lzf, gzip], see the h5py Filter pipeline.

write_loom(filename, *, write_obsm_varm=False)#

Write .loom-formatted hdf5 file.

Deprecated since version 0.13: Deprecated in favor of other formats, e.g. write_h5ad. Loom isn’t well-maintained and supports only a subset of anndata features.

Parameters:: filename (PathLike[str] | str) – The filename.

write_zarr(store, chunks=None, **kwargs)#

Write a hierarchical Zarr array store.

Parameters:

store (MutableMapping | PathLike) – The filename, a MutableMapping, or a Zarr storage class.
chunks (tuple[int, ...] | None (default: None)) – Chunk shape.
