malariagen_data.af1.Af1.h1x_gwss

Af1.h1x_gwss(contig: str, window_size: int, cohort1_query: str, cohort2_query: str, analysis: str = 'default', sample_sets: Sequence[str] | str | None = None, cohort_size: int | None = None, min_cohort_size: int | None = 15, max_cohort_size: int | None = 50, random_seed: int = 42) → Tuple[ndarray, ndarray]

Run an H1X genome-wide scan to detect genome regions where two cohorts share a selective sweep.
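The idea behind the statistic can be illustrated with a minimal sketch. It assumes that H1X for a window is the sum, over distinct haplotypes, of the product of that haplotype's frequency in each cohort, so the value is high only when both cohorts carry the same haplotype at high frequency. The function names and the synthetic arrays below are illustrative and are not part of the malariagen_data API; the library's own implementation may differ, for example in how windows are stepped.

```python
from collections import Counter

import numpy as np


def haplotype_frequencies(h):
    """Map each distinct haplotype (a column of h) to its frequency."""
    n = h.shape[1]
    counts = Counter(h[:, i].tobytes() for i in range(n))
    return {key: count / n for key, count in counts.items()}


def h1x(ha, hb):
    """Sum, over haplotypes present in both cohorts, of the frequency product."""
    fa = haplotype_frequencies(ha)
    fb = haplotype_frequencies(hb)
    return sum(fa[key] * fb[key] for key in fa.keys() & fb.keys())


def moving_h1x(ha, hb, window_size):
    """Apply h1x over consecutive, non-overlapping windows of SNPs."""
    n_windows = ha.shape[0] // window_size
    return np.array([
        h1x(ha[i * window_size:(i + 1) * window_size],
            hb[i * window_size:(i + 1) * window_size])
        for i in range(n_windows)
    ])


# Synthetic haplotype arrays: rows are SNPs, columns are haplotypes.
# Both arrays must share a dtype so that identical haplotypes hash alike.
ha = np.array([[0, 0, 0, 1],
               [1, 1, 1, 0],
               [0, 1, 0, 1],
               [0, 1, 1, 0]], dtype="i1")
hb = np.array([[0, 0, 1, 1],
               [1, 1, 0, 0],
               [1, 1, 1, 1],
               [1, 1, 1, 1]], dtype="i1")

# Window 0 gives 0.75 * 0.5 + 0.25 * 0.5 = 0.5; window 1 gives 0.25 * 1.0 = 0.25.
result = moving_h1x(ha, hb, window_size=2)
```

In the first window both cohorts share a common haplotype, so the value is higher than in the second window, where the shared haplotype is rare in one cohort.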

Parameters

contig (str): Reference genome contig name. See the contigs property for valid contig names.
window_size (int): Window size, in number of SNPs, over which each statistic is calculated.
cohort1_query (str): A pandas query string, evaluated against the sample metadata, that selects the samples in the first cohort.
cohort2_query (str): A pandas query string, evaluated against the sample metadata, that selects the samples in the second cohort.
analysis (str, optional, default: 'default'): Which haplotype phasing analysis to use. See the phasing_analysis_ids property for available values.
sample_sets (sequence of str or str or None, optional): List of sample sets and/or releases. Can also be a single sample set or release.
cohort_size (int or None, optional): Randomly down-sample to this value if the number of samples in the cohort is greater. Raise an error if the number of samples is less than this value.
min_cohort_size (int or None, optional, default: 15): Minimum cohort size. Raise an error if the number of samples is less than this value.
max_cohort_size (int or None, optional, default: 50): Randomly down-sample to this value if the number of samples in the cohort is greater.
random_seed (int, optional, default: 42): Random seed used for reproducible down-sampling.
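The interaction between the three cohort size parameters and the random seed can be sketched as follows. This is a hypothetical helper that mirrors the rules documented above, not the library's code. In particular, it assumes that an explicit cohort_size acts as both the minimum and the maximum, and that the error raised is a ValueError.

```python
import numpy as np


def select_cohort(sample_indices, cohort_size=None, min_cohort_size=15,
                  max_cohort_size=50, random_seed=42):
    """Hypothetical helper applying the documented cohort size rules."""
    # Assumption: an explicit cohort_size overrides both bounds.
    if cohort_size is not None:
        min_cohort_size = cohort_size
        max_cohort_size = cohort_size

    n = len(sample_indices)

    # Too few samples: raise rather than compute a noisy statistic.
    if min_cohort_size is not None and n < min_cohort_size:
        raise ValueError(
            f"Not enough samples ({n}) for minimum cohort size ({min_cohort_size})"
        )

    # Too many samples: down-sample without replacement, reproducibly.
    if max_cohort_size is not None and n > max_cohort_size:
        rng = np.random.default_rng(seed=random_seed)
        chosen = rng.choice(n, size=max_cohort_size, replace=False)
        return [sample_indices[i] for i in sorted(chosen)]

    return list(sample_indices)


large = select_cohort(list(range(80)))        # down-sampled to 50
small = select_cohort(list(range(20)))        # kept as-is, within bounds
fixed = select_cohort(list(range(80)), cohort_size=30)  # exactly 30
```

Because the seed is fixed, repeated calls with the same inputs select the same samples, which is what makes down-sampled scans comparable between runs.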

Returns

x (ndarray): An array containing the window centre point genomic positions.
h1x (ndarray): An array with H1X statistic values for each window.
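Since the two returned arrays are aligned window by window, candidate sweep regions can be located by ranking the H1X values and reading off the matching positions. The sketch below uses synthetic stand-ins for the returned arrays, and top_windows is a hypothetical helper, not a library function.

```python
import numpy as np


def top_windows(x, h1x, n=3):
    """Return (position, value) pairs for the n highest H1X windows."""
    order = np.argsort(h1x)[::-1][:n]  # indices sorted by descending H1X
    return [(float(x[i]), float(h1x[i])) for i in order]


# Synthetic stand-ins for the two arrays returned by h1x_gwss.
x = np.array([1_000.0, 2_000.0, 3_000.0, 4_000.0, 5_000.0])
h1x = np.array([0.01, 0.40, 0.02, 0.75, 0.03])

peaks = top_windows(x, h1x, n=2)
# peaks == [(4000.0, 0.75), (2000.0, 0.4)]
```

With real data, x and h1x would instead be the tuple unpacked from a call to h1x_gwss on an Af1 instance.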