malariagen_data.ag3.Ag3.aim_calls

Ag3.aim_calls(aims: str, sample_sets: Sequence[str] | str | None = None, sample_query: str | None = None, sample_query_options: dict | None = None) → Dataset

Access ancestry informative marker SNP sites, alleles and genotype calls.
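A minimal sketch of how a call might be wrapped, assuming an already constructed `Ag3` object is passed in. The helper name `load_aim_calls` is hypothetical, and the identifier and sample set shown in the trailing comment are illustrative values only; check `aim_ids` and the available sample sets before relying on them.

```python
def load_aim_calls(ag3, aims, sample_sets=None, sample_query=None):
    """Fetch AIM calls from an Ag3-like object (hypothetical helper)."""
    # Validate the marker-set identifier against the aim_ids property
    # before requesting any data.
    if aims not in ag3.aim_ids:
        raise ValueError(
            f"unknown AIM set {aims!r}; expected one of {tuple(ag3.aim_ids)}"
        )
    # Delegate to the documented method, passing parameters by keyword.
    return ag3.aim_calls(
        aims=aims,
        sample_sets=sample_sets,
        sample_query=sample_query,
    )


# Typical use (needs the malariagen_data package and access to the data;
# the argument values below are illustrative assumptions):
#   import malariagen_data
#   ag3 = malariagen_data.Ag3()
#   ds = load_aim_calls(ag3, aims="gambcolu_vs_arab", sample_sets="AG1000G-BF-A")
```

Checking `aim_ids` first turns a mistyped identifier into an immediate, readable error rather than a failure deeper in the data access code.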

Parameters

aims : str

Identifier for a set of ancestry informative markers to use. For possible values see the aim_ids property.

sample_sets : sequence of str or str or None, optional

List of sample sets and/or releases. Can also be a single sample set or release.

sample_query : str or None, optional

A pandas query string to be evaluated against the sample metadata, to select samples to be included in the returned data.

sample_query_options : dict or None, optional

A dictionary of arguments that will be passed through to pandas query() or eval(), e.g. parser, engine, local_dict, global_dict, resolvers.
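The interplay between a query string and its options can be tried locally with plain pandas. The sketch below uses a toy metadata table, so the column names (`country`, `year`) and values are assumptions rather than the real sample metadata schema; it only illustrates how a `local_dict` supplies values for `@name` references in the query.

```python
import pandas as pd

# Toy stand-in for the sample metadata; columns and values are assumptions.
sample_metadata = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4"],
        "country": ["Burkina Faso", "Mali", "Kenya", "Mali"],
        "year": [2012, 2014, 2012, 2004],
    }
)

# The query refers to external values with the @ prefix...
sample_query = "country in @countries and year >= @min_year"

# ...and the options dictionary supplies them through local_dict.
sample_query_options = {
    "local_dict": {"countries": ["Burkina Faso", "Mali"], "min_year": 2010}
}

# The options are passed through to pandas query() as keyword arguments.
selected = sample_metadata.query(sample_query, **sample_query_options)
print(selected["sample_id"].tolist())
```

Keeping the values in `local_dict` rather than formatting them into the query string avoids quoting problems and lets the same query be reused with different values.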

Returns

Dataset

A dataset with 4 dimensions: variants, the number of AIM sites; samples, the number of samples; ploidy, the ploidy (2); and alleles, which is always 2, with each allele representing one of the species.

It contains 3 coordinates: sample_id has samples values and contains the identifier of each sample; variant_contig has variants values and contains the chromosome arm of each AIM; and variant_position has variants values and contains the position of each AIM.

It contains 2 data variables: call_genotype has (variants, samples, ploidy) values and contains both allele calls for each sample at each AIM; variant_allele has (variants, alleles) values and contains the discriminating alleles for each AIM.
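To make the array shapes concrete, here is a small NumPy sketch that summarises a toy `call_genotype` array of shape (variants, samples, ploidy). It rests on two assumptions not stated above: that genotype values 0 and 1 index the two alleles in `variant_allele`, and that a negative value marks a missing call. The function name `allele1_fraction` is hypothetical.

```python
import numpy as np

# Toy genotype calls: 3 AIM sites (variants), 2 samples, ploidy 2.
call_genotype = np.array(
    [
        [[0, 0], [1, 1]],
        [[0, 1], [1, 1]],
        [[0, 0], [-1, -1]],  # assumption: -1 marks a missing call
    ],
    dtype="int8",
)


def allele1_fraction(gt):
    """Per-sample fraction of called alleles that carry allele index 1."""
    called = gt >= 0
    # Sum over the variants and ploidy axes, leaving one value per sample.
    n_called = called.sum(axis=(0, 2))
    n_allele1 = (gt == 1).sum(axis=(0, 2))
    # Samples with no called alleles get NaN instead of a division error.
    return np.divide(
        n_allele1,
        n_called,
        out=np.full(n_called.shape, np.nan),
        where=n_called > 0,
    )


frac = allele1_fraction(call_genotype)
print(frac.shape)
```

With real data, the same reduction could be applied to the values of the `call_genotype` variable; which species allele index 1 corresponds to depends on the chosen AIM set and should be read from `variant_allele` rather than assumed.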