
Ag3.pca(region: str | Region | Mapping | List[str | Region | Mapping] | Tuple[str | Region | Mapping, ...], n_snps: int, thin_offset: int = 0, sample_sets: Sequence[str] | str | None = None, sample_query: str | None = None, site_mask: str | None = 'default', min_minor_ac: int = 2, max_missing_an: int = 0, n_components: int = 20) → Tuple[DataFrame, ndarray]

Run a principal components analysis (PCA) using biallelic SNPs from the selected genome region and samples.


Parameters:

region : str or Region or Mapping or list of str or Region or Mapping or tuple of str or Region or Mapping

Region of the reference genome. Can be a contig name, region string (formatted like “{contig}:{start}-{end}”), or identifier of a genome feature such as a gene or transcript. Can also be a sequence (e.g., list) of regions.
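To make the region string format concrete, here is a minimal sketch of how a string like "{contig}:{start}-{end}" could be split into its parts. The names `ParsedRegion` and `parse_region` are hypothetical and are not the library's own parser. Tolerating commas inside coordinates is an assumption of this sketch. A bare name is returned unchanged, so gene or transcript identifiers are not resolved here.

```python
import re
from typing import NamedTuple, Optional


class ParsedRegion(NamedTuple):
    """Hypothetical container for illustration; not the library's Region class."""

    contig: str
    start: Optional[int]
    end: Optional[int]


# Matches "contig" or "contig:start-end". Commas in coordinates are tolerated
# (an assumption of this sketch).
_REGION_PATTERN = re.compile(r"^([^:]+)(?::([\d,]+)-([\d,]+))?$")


def parse_region(text: str) -> ParsedRegion:
    """Split a contig name or region string into contig, start and end."""
    match = _REGION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a contig name or region string: {text!r}")
    contig, start, end = match.groups()
    if start is None:
        # Bare name: the whole contig (or an unresolved feature identifier).
        return ParsedRegion(contig, None, None)
    start_pos = int(start.replace(",", ""))
    end_pos = int(end.replace(",", ""))
    if start_pos > end_pos:
        raise ValueError(f"start is greater than end in {text!r}")
    return ParsedRegion(contig, start_pos, end_pos)


whole_contig = parse_region("3L")
sub_region = parse_region("3L:15000000-16000000")
```

A sequence of regions would simply be a list of such strings, each handled in the same way.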


n_snps : int

The desired number of SNPs to use when running the analysis. SNPs will be evenly thinned to approximately this number.

thin_offset : int, optional, default: 0

Starting index for SNP thinning. Change this to repeat the analysis using a different set of SNPs.
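The interaction between n_snps and thin_offset can be sketched as follows. This is one plausible way to thin evenly, using a fixed step between kept sites; it is an assumption for illustration, not a statement of how the library thins internally. The function name `thin_indices` is hypothetical.

```python
from typing import List


def thin_indices(n_available: int, n_snps: int, thin_offset: int = 0) -> List[int]:
    """Return indices of SNPs kept after even thinning (illustrative only)."""
    if n_snps <= 0:
        raise ValueError("n_snps must be positive")
    # Integer step so that roughly n_snps sites are kept; never smaller than 1.
    step = max(n_available // n_snps, 1)
    # Start at thin_offset, then take every step-th site.
    return list(range(thin_offset, n_available, step))


first_pass = thin_indices(n_available=10, n_snps=3)
second_pass = thin_indices(n_available=10, n_snps=3, thin_offset=1)
```

Under this scheme the two passes select disjoint sets of sites, which is why changing thin_offset gives an independent repeat of the analysis. The number of sites kept is only approximately n_snps because the step is rounded down.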

sample_sets : sequence of str or str or None, optional

List of sample sets and/or releases. Can also be a single sample set or release.

sample_query : str or None, optional

A pandas query string to be evaluated against the sample metadata, to select samples to be included in the returned data.

site_mask : str or None, optional, default: ‘default’

Which site filters mask to apply. See the site_mask_ids property for available values.

min_minor_ac : int, optional, default: 2

The minimum minor allele count. SNPs with a minor allele count below this value will be excluded prior to thinning.

max_missing_an : int, optional, default: 0

The maximum number of missing allele calls to accept. SNPs with more missing allele calls than this value will be excluded prior to thinning. Set to 0 (default) to require no missing calls.
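The two filters min_minor_ac and max_missing_an can be illustrated with a small sketch over per-site allele counts. It assumes diploid samples and biallelic sites, with each site given as a (reference count, alternate count) pair; the function name `filter_sites` and the `ploidy` argument are hypothetical and not part of the library.

```python
from typing import List, Sequence, Tuple


def filter_sites(
    allele_counts: Sequence[Tuple[int, int]],
    n_samples: int,
    min_minor_ac: int = 2,
    max_missing_an: int = 0,
    ploidy: int = 2,  # assumption: diploid samples
) -> List[int]:
    """Return indices of sites passing both filters (illustrative only)."""
    expected_an = ploidy * n_samples  # allele calls if nothing is missing
    kept = []
    for index, (ref_count, alt_count) in enumerate(allele_counts):
        called_an = ref_count + alt_count
        missing_an = expected_an - called_an
        minor_ac = min(ref_count, alt_count)
        if minor_ac >= min_minor_ac and missing_an <= max_missing_an:
            kept.append(index)
    return kept


# Five sites observed in 10 diploid samples (20 allele calls when complete).
counts = [(18, 2), (19, 1), (10, 8), (20, 0), (12, 8)]
strict = filter_sites(counts, n_samples=10)
relaxed = filter_sites(counts, n_samples=10, max_missing_an=2)
```

With the defaults, the second site fails on minor allele count, the third on missing calls, and the fourth because it is not variable in these samples. Relaxing max_missing_an to 2 readmits the third site.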

n_components : int, optional, default: 20

Number of components to return.



Returns:

DataFrame

A dataframe of sample metadata, with columns “PC1”, “PC2”, “PC3”, etc., added.


ndarray

An array of explained variance ratios, one per component.
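As a reminder of what an explained variance ratio means, the sketch below uses the conventional definition: the variance captured by each component divided by the total variance across all components. The function names and the input values are hypothetical, and this does not show how the library computes the array.

```python
from typing import List, Sequence


def explained_variance_ratio(
    component_variances: Sequence[float], n_components: int
) -> List[float]:
    """Fraction of total variance captured by each of the first components."""
    total = sum(component_variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [variance / total for variance in component_variances[:n_components]]


def cumulative(ratios: Sequence[float]) -> List[float]:
    """Running total of the ratios, useful for judging how many PCs to inspect."""
    running = 0.0
    totals = []
    for ratio in ratios:
        running += ratio
        totals.append(running)
    return totals


# Made-up variances for four components, keeping only the first two.
ratios = explained_variance_ratio([4.0, 3.0, 2.0, 1.0], n_components=2)
running_totals = cumulative(ratios)
```

Because only the first n_components values are returned, the ratios generally sum to less than 1.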


This computation may take some time to run, depending on your computing environment. Results of this computation will be cached and re-used if the results_cache parameter was set when instantiating the Ag3 class.
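A minimal usage sketch follows, assuming the malariagen_data package is installed and the data are accessible from your environment. The cache directory name, the region, and the sample set identifier are illustrative values chosen for this example. The call is wrapped in a hypothetical function, `run_pca_example`, so that nothing is downloaded until you call it.

```python
def run_pca_example():
    """Run a small PCA and return the two results (illustrative only)."""
    # Imported here so the sketch can be loaded without the package installed.
    import malariagen_data

    # Setting results_cache enables re-use of results between runs;
    # the directory name is an arbitrary choice for this example.
    ag3 = malariagen_data.Ag3(results_cache="results_cache")

    df_pca, evr = ag3.pca(
        region="3L:15000000-16000000",  # illustrative region string
        n_snps=10_000,
        sample_sets="AG1000G-BF-A",  # illustrative sample set identifier
        n_components=10,
    )
    # df_pca holds sample metadata plus "PC1", "PC2", ... columns;
    # evr holds one explained variance ratio per component.
    return df_pca, evr
```

Calling `run_pca_example()` a second time with the same parameters should re-use the cached results rather than recomputing them.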