Amin1 API

This page provides documentation for functions in the malariagen_data Python package for accessing Anopheles minimus data.

Amin1()

malariagen_data.Amin1(**kwargs)

sample_metadata()

Amin1.sample_metadata()

Access sample metadata.

Returns
df: pandas.DataFrame
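
A minimal usage sketch, based only on the signatures listed on this page. It assumes the malariagen_data package is installed, that the underlying data store is reachable, and that `Amin1()` works with its default arguments. The helper name `load_sample_metadata` is hypothetical.

```python
def load_sample_metadata():
    """Return the Amin1 sample metadata as a pandas DataFrame."""
    # Imported inside the function so that merely defining it does not
    # require the package to be installed or the data to be reachable.
    import malariagen_data

    # Assumption: the default constructor arguments are sufficient.
    amin1 = malariagen_data.Amin1()
    return amin1.sample_metadata()
```

Calling `load_sample_metadata()` would then trigger the actual data access.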

genome_sequence()

Amin1.genome_sequence(region, inline_array=True, chunks='native')

Access the reference genome sequence.

Parameters
region: str or list of str or Region

Contig (e.g., “KB663610”), gene name (e.g., “AMIN002150”), genomic region defined with coordinates (e.g., “KB663610:1-100000”) or a named tuple with genomic location Region(contig, start, end). Multiple values can be provided as a list, in which case data will be concatenated.

inline_array: bool, optional

Passed through to dask.array.from_array().

chunks: str, optional

If ‘auto’, let dask decide the chunk size. If ‘native’, use the native zarr chunks. Can also be a target size, e.g., ‘200 MiB’.

Returns
d: dask.array.Array
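
To make the accepted `region` string forms concrete, here is a rough sketch of how a contig name or a coordinate string could be turned into a `Region(contig, start, end)` named tuple. It is not the package's own parser: `Region` is re-declared here as a stand-in, and `parse_region` and `parse_regions` are hypothetical helpers. Gene names such as “AMIN002150” are not handled, since resolving them requires the genome annotations; a bare name is simply treated as a contig.

```python
from collections import namedtuple

# Hypothetical stand-in for the Region named tuple described above.
Region = namedtuple("Region", ["contig", "start", "end"])


def parse_region(text):
    """Parse "contig" or "contig:start-end" into a Region."""
    contig, sep, coords = text.partition(":")
    if not sep:
        # A bare contig name: no coordinate bounds.
        return Region(contig, None, None)
    start, dash, end = coords.partition("-")
    if not dash:
        raise ValueError(f"expected 'contig:start-end', got {text!r}")
    return Region(contig, int(start), int(end))


def parse_regions(region):
    """Accept a single string or a list of strings; always return a list."""
    if isinstance(region, str):
        return [parse_region(region)]
    return [parse_region(r) for r in region]
```

For example, `parse_region("KB663610:1-100000")` gives `Region(contig='KB663610', start=1, end=100000)`, and a list of strings yields one `Region` per entry, mirroring the "multiple values" case.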

geneset()

Amin1.geneset(attributes=('ID', 'Parent', 'Name', 'description'))

Access genome feature annotations.

Parameters
attributes: list of str, optional

Attribute keys to unpack into columns. Provide “*” to unpack all attributes.

Returns
df: pandas.DataFrame
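
The idea of "unpacking" attribute keys into columns can be illustrated with a small sketch. It assumes the annotations carry a GFF3-style attribute string of the form `key=value;key=value`, which this page does not state explicitly. The helper `unpack_attributes` and the example values are hypothetical, and this is not the package's implementation.

```python
def unpack_attributes(attr_string, attributes=("ID", "Parent", "Name", "description")):
    """Unpack a 'key=value;key=value' string into a dict of selected keys."""
    parsed = {}
    for field in attr_string.split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.partition("=")
        parsed[key] = value
    if attributes == "*" or "*" in attributes:
        # "*" means: keep every attribute that was found.
        return parsed
    # Otherwise keep only the requested keys; missing ones become None.
    return {key: parsed.get(key) for key in attributes}
```

Each returned key would correspond to one column in the resulting table, with `None` where a feature lacks that attribute.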

snp_calls()

Amin1.snp_calls(region, site_mask=False, inline_array=True, chunks='native')

Access SNP sites, site filters and genotype calls.

Parameters
region: str or list of str or Region

Contig (e.g., “KB663610”), gene name (e.g., “AMIN002150”), genomic region defined with coordinates (e.g., “KB663610:1-100000”) or a named tuple with genomic location Region(contig, start, end). Multiple values can be provided as a list, in which case data will be concatenated.

site_mask: bool

Apply site filters.

inline_array: bool, optional

Passed through to dask.array.from_array().

chunks: str, optional

If ‘auto’, let dask decide the chunk size. If ‘native’, use the native zarr chunks. Can also be a target size, e.g., ‘200 MiB’.

Returns
ds: xarray.Dataset
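
As a conceptual illustration of what applying site filters means, the sketch below keeps only the sites whose filter flag is set. It works on plain Python lists with made-up values and assumes that `True` marks a site that passes the filter. The helper `apply_site_mask` is hypothetical and does not reflect how the package stores or filters its data.

```python
from itertools import compress


def apply_site_mask(positions, genotypes, site_pass):
    """Keep only the sites whose filter flag is True."""
    if not (len(positions) == len(genotypes) == len(site_pass)):
        raise ValueError("all inputs must have one entry per site")
    # compress() yields the items whose corresponding flag is truthy.
    kept_positions = list(compress(positions, site_pass))
    kept_genotypes = list(compress(genotypes, site_pass))
    return kept_positions, kept_genotypes
```

With three sites and flags `[True, False, True]`, the second site and its genotype calls are dropped while the first and third are kept in order.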