malariagen_data.adir1.Adir1.sample_metadata

Adir1.sample_metadata(sample_sets: Sequence[str] | str | None = None, sample_query: str | None = None, sample_query_options: dict | None = None, sample_indices: List[int] | None = None) -> DataFrame

Access sample metadata for one or more sample sets.

Parameters

sample_sets : sequence of str or str or None, optional

List of sample sets and/or releases. Can also be a single sample set or release.

sample_query : str or None, optional

A pandas query string to be evaluated against the sample metadata, to select samples to be included in the returned data. E.g., "country == 'Uganda'". If the query returns zero results, a warning will be emitted with fuzzy-match suggestions for possible typos or case mismatches.

sample_query_options : dict or None, optional

A dictionary of arguments that will be passed through to pandas query() or eval(), e.g. parser, engine, local_dict, global_dict, resolvers.

sample_indices : list of int or None, optional

Advanced usage parameter. A list of indices of samples to select, corresponding to the order in which the samples are found within the sample metadata. Either provide this parameter or sample_query, not both.
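The selection parameters can be illustrated on a small stand-in table. The sketch below uses plain pandas only and is not the library's implementation: the table, its rows, and the variable names are invented for illustration. It mimics what a query string, a query with extra options, and a list of positional indices would each select.

```python
import pandas as pd

# Hypothetical stand-in for a sample metadata table; all rows are invented.
metadata = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4"],
        "country": ["Uganda", "Kenya", "Uganda", "Tanzania"],
        "year": [2012, 2014, -1, 2016],
    }
)

# Analogue of sample_query="country == 'Uganda'": a pandas query string.
by_query = metadata.query("country == 'Uganda'")

# Analogue of sample_query_options={"engine": "python"}: extra keyword
# arguments are forwarded to pandas query().
by_query_with_options = metadata.query("year >= 0", engine="python")

# Analogue of sample_indices=[0, 2]: positions in metadata row order.
by_indices = metadata.iloc[[0, 2]]

print(by_query["sample_id"].tolist())
print(by_query_with_options["sample_id"].tolist())
print(by_indices["sample_id"].tolist())
```

In this toy table the query and the index list happen to pick the same two rows, which shows why only one of the two needs to be supplied.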

Returns

DataFrame

A dataframe of sample metadata, one row per sample.

Notes

Some samples in the dataset are lab crosses — mosquitoes bred in the laboratory that have no real collection date. These samples use year=-1 and month=-1 as sentinel values. They may cause unexpected results in date-based analyses (e.g., pd.to_datetime will fail on negative year values).

To exclude lab cross samples, use:

# api is an Adir1 instance
df = api.sample_metadata(sample_query="year >= 0")
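If lab crosses need to stay in the analysis, one alternative is to map the sentinel values to a missing date instead of dropping the rows. The following is a minimal sketch in plain Python, assuming year and month are available as integers. The helper name collection_date and the example records are hypothetical and not part of the library. Python's own datetime.date also rejects years below 1, so the sentinel check has to come before the date is built.

```python
from datetime import date
from typing import Optional


def collection_date(year: int, month: int) -> Optional[date]:
    """Return the first day of the collection month, or None for sentinel values."""
    # Lab crosses carry year=-1 and month=-1, which cannot form a valid date.
    if year < 0 or month < 0:
        return None
    return date(year, month, 1)


# Hypothetical (sample_id, year, month) records; the second is a lab cross.
records = [("S1", 2012, 3), ("S2", -1, -1), ("S3", 2014, 11)]

dates = {sample_id: collection_date(year, month) for sample_id, year, month in records}

for sample_id, value in dates.items():
    print(sample_id, value)
```

Downstream code can then treat None as "no collection date" rather than failing on a negative year.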