Ag3 API

This page lists selected functions and properties of the malariagen_data API for working with Anopheles gambiae data.

Basic data access

client_location

config

lookup_release(sample_set)

Find which release a sample set was included in.

open_file(path)

read_files(paths[, on_error])

releases

results_cache_get(*, name, params)

results_cache_set(*, name, params, results)

sample_sets([release])

Access a dataframe of sample sets.
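
As a rough illustration of how `lookup_release` might be used, the sketch below groups sample set identifiers by the release they were included in. It assumes `ag3` is a client created with `malariagen_data.Ag3()` and that `lookup_release` returns the release identifier as a string. The helper name `group_by_release` is hypothetical and not part of the API.

```python
from collections import defaultdict


def group_by_release(ag3, sample_sets):
    """Group sample set identifiers by release (hypothetical helper).

    `ag3` is assumed to be a client such as malariagen_data.Ag3().
    """
    grouped = defaultdict(list)
    for sample_set in sample_sets:
        # lookup_release(sample_set) reports which release a sample set was included in.
        grouped[ag3.lookup_release(sample_set)].append(sample_set)
    return dict(grouped)
```

With a real client this could be called as `group_by_release(ag3, ["AG1000G-AO", "AG1000G-BF-A"])`, where the sample set identifiers are only examples.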

Sample metadata access

add_extra_metadata(data[, on])

Add extra sample metadata, e.g., including additional columns which you would like to use to query and group samples.

aim_metadata([sample_sets])

Access ancestry-informative marker (AIM) metadata for one or more sample sets.

clear_extra_metadata()

Clear any extra metadata previously added.

cohorts_metadata([sample_sets])

Access cohort membership metadata for one or more sample sets.

count_samples([sample_sets, sample_query, ...])

Create a pivot table showing numbers of samples available by space, time and taxon.

general_metadata([sample_sets])

Read general sample metadata for one or more sample sets into a pandas DataFrame.

lookup_sample(sample[, sample_set])

Get the metadata for a specific sample and sample set.

plot_samples_bar(x[, color, sort, ...])

Plot a bar chart showing the number of samples available, grouped by some variable such as country or year.

plot_samples_interactive_map([sample_sets, ...])

Plot an interactive map showing sampling locations using ipyleaflet.

sample_metadata([sample_sets, sample_query, ...])

Access sample metadata for one or more sample sets.

wgs_data_catalog(sample_set)

Load a data catalog providing URLs for downloading BAM, VCF and Zarr files for samples in a given sample set.
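
Several of the functions above accept a `sample_query` argument. The sketch below shows one possible way to build such a query from keyword filters and pass it to `sample_metadata`. It assumes that `sample_query` is a pandas-style query expression and that the metadata has columns such as `country` and `year`. Both helper names are hypothetical, and `ag3` is assumed to be a client created with `malariagen_data.Ag3()`.

```python
def build_sample_query(**filters):
    """Join `column == value` conditions into one query string (hypothetical helper)."""
    # repr() quotes strings and leaves numbers bare,
    # e.g. country == 'Ghana' and year == 2012.
    return " and ".join(
        f"{column} == {value!r}" for column, value in filters.items()
    )


def load_sample_metadata(ag3, sample_sets=None, **filters):
    """Call sample_metadata() with a query built from keyword filters.

    `ag3` is assumed to be a client such as malariagen_data.Ag3().
    """
    # An empty query string is replaced by None, meaning "no filtering".
    query = build_sample_query(**filters) or None
    return ag3.sample_metadata(sample_sets=sample_sets, sample_query=query)
```

For example, `load_sample_metadata(ag3, country="Ghana", year=2012)` would request samples matching both conditions, provided those column names exist in the metadata.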

SNP data access

is_accessible(region[, site_mask, ...])

Compute genome accessibility array.

open_site_annotations()

Open site annotations zarr.

open_site_filters(mask)

Open site filters zarr.

open_snp_genotypes(sample_set)

Open SNP genotypes zarr for a given sample set.

open_snp_sites()

Open SNP sites zarr.

plot_snps(region[, sample_sets, ...])

Plot SNPs in a given genome region.

plot_snps_track(region[, sample_sets, ...])

Plot SNPs in a given genome region.

site_annotations(region[, site_mask, ...])

Load site annotations.

site_filters(region, mask[, field, ...])

Access SNP site filters.

site_mask_ids

Identifiers for the different site masks that are available.

snp_allele_counts(region[, sample_sets, ...])

Compute SNP allele counts.

snp_calls(region[, sample_sets, ...])

Access SNP sites, site filters and genotype calls.

snp_dataset(*args, **kwargs)

Deprecated; this method has been renamed to snp_calls().

snp_genotypes(region[, sample_sets, ...])

Access SNP genotypes and associated data.

snp_sites(region, field[, site_mask, ...])

Access SNP site data (positions or alleles).

snp_variants(region[, site_mask, ...])

Access SNP sites and site filters.
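
Many of the SNP functions take an optional `site_mask`, and `site_mask_ids` lists the masks that are available. A minimal sketch of checking a requested mask before calling `snp_calls` is given below. The helper name `snp_calls_with_mask` is hypothetical, `ag3` is assumed to be a client created with `malariagen_data.Ag3()`, and the keyword names `sample_sets` and `site_mask` for `snp_calls` are assumed from the abbreviated signatures on this page.

```python
def snp_calls_with_mask(ag3, region, site_mask, sample_sets=None):
    """Fetch SNP calls after checking the requested site mask (hypothetical helper).

    `ag3` is assumed to be a client such as malariagen_data.Ag3().
    """
    available = tuple(ag3.site_mask_ids)
    if site_mask not in available:
        # Fail early with a message that lists the valid choices.
        raise ValueError(
            f"Unknown site mask {site_mask!r}; expected one of {available}"
        )
    return ag3.snp_calls(region=region, sample_sets=sample_sets, site_mask=site_mask)
```

Checking the identifier up front keeps a typo in the mask name from surfacing later as a less obvious error during data loading.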

Haplotype data access

haplotypes(region[, analysis, sample_sets, ...])

Access haplotype data.

open_haplotypes(sample_set[, analysis])

Open haplotypes zarr.

open_haplotype_sites([analysis])

Open haplotype sites zarr.

phasing_analysis_ids

Identifiers for the different phasing analyses that are available.
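
Because `haplotypes` accepts an `analysis` argument and `phasing_analysis_ids` enumerates the available analyses, the two can be combined. The sketch below requests haplotype data for one region under every phasing analysis. The helper name `haplotypes_by_analysis` is hypothetical, and `ag3` is assumed to be a client created with `malariagen_data.Ag3()`. How the real method behaves when a sample set was not phased in a given analysis is not covered on this page, so extra error handling may be needed.

```python
def haplotypes_by_analysis(ag3, region, sample_sets=None):
    """Request haplotypes for `region` under each phasing analysis (hypothetical helper).

    Returns a dict mapping analysis identifier to whatever haplotypes() returns.
    `ag3` is assumed to be a client such as malariagen_data.Ag3().
    """
    return {
        analysis: ag3.haplotypes(
            region=region, analysis=analysis, sample_sets=sample_sets
        )
        for analysis in ag3.phasing_analysis_ids
    }
```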

CNV data access

coverage_calls_analysis_ids

Identifiers for the different coverage calls analyses that are available.

cnv_coverage_calls(region, sample_set, analysis)

Access CNV HMM data from genome-wide CNV discovery and filtering.

cnv_discordant_read_calls(contig[, ...])

Access CNV discordant read calls data.

cnv_hmm(region[, sample_sets, sample_query, ...])

Access CNV HMM data from CNV calling.

open_cnv_coverage_calls(sample_set, analysis)

Open CNV coverage calls zarr.

open_cnv_discordant_read_calls(sample_set)

Open CNV discordant read calls zarr.

open_cnv_hmm(sample_set)

Open CNV HMM zarr.

plot_cnv_hmm_coverage(sample, region[, ...])

Plot CNV HMM data for a single sample, together with a genes track, using bokeh.

plot_cnv_hmm_coverage_track(sample, region)

Plot CNV HMM data for a single sample, using bokeh.

plot_cnv_hmm_heatmap(region[, sample_sets, ...])

Plot CNV HMM data for multiple samples as a heatmap, with a genes track, using bokeh.

plot_cnv_hmm_heatmap_track(region[, ...])

Plot CNV HMM data for multiple samples as a heatmap, using bokeh.