malariagen_data.af1.Af1.fst_gwss
- Af1.fst_gwss(contig: str, window_size: int, cohort1_query: str, cohort2_query: str, sample_sets: Sequence[str] | str | None = None, site_mask: str | None = 'default', cohort_size: int | None = None, min_cohort_size: int | None = 15, max_cohort_size: int | None = 50, random_seed: int = 42, inline_array: bool = True, chunks: str | Tuple[int, ...] | Callable[[Tuple[int, ...]], Tuple[int, ...]] = 'native') -> Tuple[ndarray, ndarray]
- Run an Fst genome-wide scan to investigate genetic differentiation between two cohorts.
- Parameters
- contig : str
- Reference genome contig name. See the contigs property for valid contig names. 
- window_size : int
- Size of each window, in number of sites, within which statistics are calculated.
- cohort1_query : str
- A pandas query string to be evaluated against the sample metadata, to select the samples that make up the first cohort.
- cohort2_query : str
- A pandas query string to be evaluated against the sample metadata, to select the samples that make up the second cohort.
- sample_sets : sequence of str or str or None, optional
- List of sample sets and/or releases. Can also be a single sample set or release. 
- site_mask : str or None, optional, default: 'default'
- Which site filters mask to apply. See the site_mask_ids property for available values. 
- cohort_size : int or None, optional
- Randomly down-sample to this value if the number of samples in the cohort is greater. Raise an error if the number of samples is less than this value. 
- min_cohort_size : int or None, optional, default: 15
- Minimum cohort size. Raise an error if the number of samples is less than this value. 
- max_cohort_size : int or None, optional, default: 50
- Randomly down-sample to this value if the number of samples in the cohort is greater. 
- random_seed : int, optional, default: 42
- Random seed used for reproducible down-sampling. 
- inline_array : bool, optional, default: True
- Passed through to dask from_array(). 
- chunks : str or tuple of int or Callable[[typing.Tuple[int, …]], tuple of int], optional, default: 'native'
- If 'auto', let dask decide the chunk size. If 'native', use native zarr chunks. Can also be a target size, e.g., '200 MiB', or a tuple of integers.
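The way cohort_size, min_cohort_size, max_cohort_size and random_seed interact can be sketched in plain Python. This is only an illustration of the rules documented above, not the library's implementation: the helper name `resolve_cohort`, the order of the checks, and the choice to treat an explicit cohort_size as both the minimum and the maximum are assumptions.

```python
import random
from typing import List, Optional, Sequence


def resolve_cohort(
    sample_ids: Sequence[str],
    cohort_size: Optional[int] = None,
    min_cohort_size: Optional[int] = 15,
    max_cohort_size: Optional[int] = 50,
    random_seed: int = 42,
) -> List[str]:
    """Hypothetical helper applying the documented cohort size rules."""
    n = len(sample_ids)

    # Assumption: an explicit cohort_size acts as both lower and upper bound.
    if cohort_size is not None:
        min_cohort_size = cohort_size
        max_cohort_size = cohort_size

    # Too few samples: raise an error, as described for min_cohort_size.
    if min_cohort_size is not None and n < min_cohort_size:
        raise ValueError(
            f"cohort has {n} samples, fewer than the required {min_cohort_size}"
        )

    # Too many samples: down-sample reproducibly using the seed.
    if max_cohort_size is not None and n > max_cohort_size:
        rng = random.Random(random_seed)
        # Sorted so the result does not depend on the sampling order.
        return sorted(rng.sample(list(sample_ids), max_cohort_size))

    return list(sample_ids)
```

Because the generator is seeded, calling the helper twice with the same inputs returns the same subset, which is the point of exposing random_seed as a parameter.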
- Returns
- x : ndarray
- An array containing the window centre point genomic positions. 
- fst : ndarray
- An array with Fst statistic values for each window.
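As a rough sketch of how the two returned arrays might be used together, the snippet below picks out the window with the highest Fst. The helper `strongest_window` is hypothetical, the numbers are made up to stand in for real output, and skipping NaN values is a precaution based on the assumption that some windows may not yield a defined statistic. It relies only on x and fst holding one entry per window, as the return description implies.

```python
import math
from typing import Sequence, Tuple


def strongest_window(
    x: Sequence[float], fst: Sequence[float]
) -> Tuple[float, float]:
    """Return (position, fst) for the window with the highest non-NaN Fst."""
    if len(x) != len(fst):
        raise ValueError("x and fst must have the same length")
    # Pair each Fst value with its window centre, ignoring undefined values.
    candidates = [
        (float(f), float(p)) for p, f in zip(x, fst) if not math.isnan(f)
    ]
    if not candidates:
        raise ValueError("no defined Fst values")
    best_fst, best_pos = max(candidates)
    return best_pos, best_fst


# Made-up values standing in for the arrays returned by fst_gwss().
x = [150_000.0, 450_000.0, 750_000.0, 1_050_000.0]
fst = [0.01, float("nan"), 0.12, 0.03]
pos, value = strongest_window(x, fst)
```

With a real `Af1` instance, the inputs would instead come from unpacking the call's return value, for example `x, fst = af1.fst_gwss(...)` with the parameters described above.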