malariagen_data.ag3.Ag3.roh_hmm
- Ag3.roh_hmm(sample: str | int, region: str | Region | Mapping, window_size: int = 20000, site_mask: str | None = 'default', sample_set: str | None = None, phet_roh: float = 0.001, phet_nonroh: Tuple[float, ...] = (0.003, 0.01), transition: float = 0.001) -> DataFrame
Infer runs of homozygosity for a single sample over a genome region.
Parameters
- sample : str or int
Sample identifier or index within sample set.
- region : str or Region or Mapping
Region of the reference genome. Can be a contig name, region string (formatted like “{contig}:{start}-{end}”), or identifier of a genome feature such as a gene or transcript.
- window_size : int, optional, default: 20000
Number of sites per window.
- site_mask : str or None, optional, default: 'default'
Which site filters mask to apply. See the site_mask_ids property for available values.
- sample_set : str or None, optional
Sample set identifier.
- phet_roh : float, optional, default: 0.001
Probability of observing a heterozygote in a ROH.
- phet_nonroh : tuple of float, optional, default: (0.003, 0.01)
One or more probabilities of observing a heterozygote outside a ROH.
- transition : float, optional, default: 0.001
Probability of moving between states. A larger window size may call for a larger transition probability.
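To show how phet_roh, phet_nonroh, transition and window_size could fit together, here is a toy sketch of a hidden Markov model decoded with the Viterbi algorithm. It is not the library's implementation. The function name viterbi_roh, the Poisson emission model, the uniform starting probabilities, and the symmetric transition matrix are all assumptions made for illustration.

```python
import math


def viterbi_roh(het_counts, window_size=20_000, phet_roh=0.001,
                phet_nonroh=(0.003, 0.01), transition=0.001):
    """Return the most likely state per window: 0 = ROH, 1..k = non-ROH.

    Hypothetical helper: `het_counts` holds the number of heterozygous
    calls observed in each window of `window_size` sites.
    """
    if not het_counts:
        return []

    phets = (phet_roh, *phet_nonroh)
    n_states = len(phets)
    # Expected number of heterozygous calls per window in each state.
    rates = [p * window_size for p in phets]

    def log_emission(count, rate):
        # Assumed emission model: Poisson log-probability of `count`.
        return count * math.log(rate) - rate - math.lgamma(count + 1)

    # Assumed transition model: the same probability of switching to
    # any other state, with the remainder on staying put.
    log_switch = math.log(transition)
    log_stay = math.log(1.0 - transition * (n_states - 1))

    # Assumed uniform starting probabilities.
    score = [math.log(1.0 / n_states) + log_emission(het_counts[0], r)
             for r in rates]
    backptrs = []
    for count in het_counts[1:]:
        step_scores, step_ptrs = [], []
        for j in range(n_states):
            candidates = [score[i] + (log_stay if i == j else log_switch)
                          for i in range(n_states)]
            best_i = max(range(n_states), key=candidates.__getitem__)
            step_ptrs.append(best_i)
            step_scores.append(candidates[best_i] + log_emission(count, rates[j]))
        score = step_scores
        backptrs.append(step_ptrs)

    # Trace back from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


# Made-up counts: a stretch of low heterozygosity between two high stretches.
windows = [150] * 5 + [5] * 5 + [150] * 5
is_roh = [state == 0 for state in viterbi_roh(windows)]
```

In this sketch a larger window_size raises the expected heterozygote count in every state, and a larger transition value makes the decoded path switch states more readily.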
Returns
- DataFrame
A DataFrame where each row provides data about a single run of homozygosity.
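A minimal usage sketch follows, with every documented default spelled out as a keyword argument. The wrapper roh_for_sample is a hypothetical helper introduced only for illustration, and the sample identifier in the comment is a placeholder, not a real sample. Constructing the API object requires the malariagen_data package and access to the underlying data.

```python
def roh_for_sample(api, sample, region, sample_set=None):
    """Hypothetical wrapper: call roh_hmm with the documented defaults."""
    return api.roh_hmm(
        sample=sample,
        region=region,
        window_size=20_000,         # number of sites per window
        site_mask="default",
        sample_set=sample_set,
        phet_roh=0.001,             # P(heterozygote) inside a ROH
        phet_nonroh=(0.003, 0.01),  # P(heterozygote) outside a ROH
        transition=0.001,           # P(moving between states)
    )


# Typical use (not executed here; needs the package and data access):
#   import malariagen_data
#   ag3 = malariagen_data.Ag3()
#   df_roh = roh_for_sample(ag3, sample="<sample-id>", region="3L")
```

Each row of the returned DataFrame describes one run of homozygosity for the chosen sample.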