malariagen_data.ag3.Ag3.plot_njt
- Ag3.plot_njt(region: str | Region | Mapping | List[str | Region | Mapping] | Tuple[str | Region | Mapping, ...], n_snps: int, color: str | Mapping | None = None, symbol: str | Mapping | None = None, metric: Literal['cityblock', 'euclidean', 'sqeuclidean'] = 'cityblock', distance_sort: bool | None = None, count_sort: bool | None = None, center_x=0, center_y=0, arc_start=0, arc_stop=6.283185307179586, width: int | None = 800, height: int | None = 600, show: bool = True, renderer: str | None = None, render_mode: Literal['auto', 'svg', 'webgl'] = 'svg', title: str | bool | None = True, title_font_size: int = 14, line_width: int | float = 0.5, marker_size: int | float = 5, color_discrete_sequence: List | None = None, color_discrete_map: Mapping | None = None, category_orders: List | Mapping | None = None, edge_legend: bool = False, leaf_legend: bool = True, legend_sizing: Literal['constant', 'trace'] = 'constant', thin_offset: int = 0, sample_sets: Sequence[str] | str | None = None, sample_query: str | None = None, sample_indices: List[int] | None = None, site_mask: str | None = 'default', site_class: str | None = None, min_minor_ac: int | None = None, max_missing_an: int | None = None, cohort_size: int | None = None, min_cohort_size: int | None = None, max_cohort_size: int | None = None, random_seed: int = 42, inline_array: bool = True, chunks: str | Tuple[int, ...] | Callable[[Tuple[int, ...]], Tuple[int, ...]] = 'native') → Figure | None
Plot an unrooted neighbour-joining tree, computed from pairwise distances between samples using biallelic SNP genotypes.
The tree is displayed as an unrooted tree using the equal angles layout.
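To make "pairwise distances between samples" concrete, the following is a minimal pure-Python sketch of the three supported metric names applied to toy data. It assumes each sample is encoded as a vector of alternate allele counts (0, 1 or 2) per biallelic SNP; that encoding, the helper name `pairwise_distances` and the toy genotypes are illustrative assumptions, not the library's implementation.

```python
import math


def pairwise_distances(genotypes, metric="cityblock"):
    """Return a square distance matrix (list of lists) between samples.

    `genotypes` is a list of per-sample lists of alternate allele counts
    (0, 1 or 2). This encoding is an assumption made for illustration.
    """

    def dist(a, b):
        diffs = [x - y for x, y in zip(a, b)]
        if metric == "cityblock":
            # Sum of absolute differences.
            return sum(abs(d) for d in diffs)
        if metric == "sqeuclidean":
            # Sum of squared differences.
            return sum(d * d for d in diffs)
        if metric == "euclidean":
            # Square root of the sum of squared differences.
            return math.sqrt(sum(d * d for d in diffs))
        raise ValueError(f"unsupported metric: {metric!r}")

    n = len(genotypes)
    return [[dist(genotypes[i], genotypes[j]) for j in range(n)] for i in range(n)]


# Three hypothetical samples genotyped at four biallelic SNPs.
samples = [
    [0, 1, 2, 0],
    [0, 1, 0, 0],
    [2, 1, 2, 1],
]
D = pairwise_distances(samples)  # default metric is 'cityblock'
```

A neighbour-joining algorithm would then take a matrix like `D` as input to build the tree.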
Parameters
- region : str or Region or Mapping or list of str or Region or Mapping or tuple of str or Region or Mapping
Region of the reference genome. Can be a contig name, region string (formatted like “{contig}:{start}-{end}”), or identifier of a genome feature such as a gene or transcript. Can also be a sequence (e.g., list) of regions.
- n_snps : int
The desired number of SNPs to use when running the analysis. SNPs will be evenly thinned to approximately this number.
- color : str or Mapping or None, optional
Name of the variable to use to color the markers.
- symbol : str or Mapping or None, optional
Name of the variable to use to choose marker symbols.
- metric : {‘cityblock’, ‘euclidean’, ‘sqeuclidean’}, optional, default: ‘cityblock’
The metric used to compute the distance between the genotypes of two samples.
- distance_sort : bool or None, optional
If True, for each node n, the child with the minimum distance between its direct descendants is plotted first. Note that distance_sort and count_sort cannot both be True.
- count_sort : bool or None, optional
If True, for each node n, the child with the minimum number of descendants is plotted first. Note that distance_sort and count_sort cannot both be True.
- center_x : optional, default: 0
X coordinate where plotting is centered.
- center_y : optional, default: 0
Y coordinate where plotting is centered.
- arc_start : optional, default: 0
Angle, in radians, where the tree layout begins.
- arc_stop : optional, default: 6.283185307179586
Angle, in radians, where the tree layout ends. The default is 2π, i.e., a full circle.
- width : int or None, optional, default: 800
Plot width in pixels (px).
- height : int or None, optional, default: 600
Plot height in pixels (px).
- show : bool, optional, default: True
If True, show the plot. If False, do not show the plot, but return the figure.
- renderer : str or None, optional
The name of the renderer to use.
- render_mode : {‘auto’, ‘svg’, ‘webgl’}, optional, default: ‘svg’
The type of rendering backend to use. See also https://plotly.com/python/webgl-vs-svg/.
- title : str or bool or None, optional, default: True
If True, attempt to use metadata from the input dataset as the plot title. Otherwise, use the supplied value as the title.
- title_font_size : int, optional, default: 14
Font size for the plot title.
- line_width : int or float, optional, default: 0.5
Line width.
- marker_size : int or float, optional, default: 5
Marker size.
- color_discrete_sequence : List or None, optional
Provide a list of colours to use.
- color_discrete_map : Mapping or None, optional
Provide an explicit mapping from values to colours.
- category_orders : List or Mapping or None, optional
Control the order in which values appear in the legend.
- edge_legend : bool, optional, default: False
Show legend entries for the different edge (line) colors.
- leaf_legend : bool, optional, default: True
Show legend entries for the different leaf node (scatter) colors and symbols.
- legend_sizing : {‘constant’, ‘trace’}, optional, default: ‘constant’
Controls sizing of items in legends, either ‘trace’ or ‘constant’.
- thin_offset : int, optional, default: 0
Starting index for SNP thinning. Change this to repeat the analysis using a different set of SNPs.
- sample_sets : sequence of str or str or None, optional
List of sample sets and/or releases. Can also be a single sample set or release.
- sample_query : str or None, optional
A pandas query string to be evaluated against the sample metadata, to select samples to be included in the returned data.
- sample_indices : list of int or None, optional
Advanced usage parameter. A list of indices of samples to select, corresponding to the order in which the samples are found within the sample metadata. Either provide this parameter or sample_query, not both.
- site_mask : str or None, optional, default: ‘default’
Which site filters mask to apply. See the site_mask_ids property for available values.
- site_class : str or None, optional
Select sites belonging to one of the following classes: CDS_DEG_4 (4-fold degenerate coding sites), CDS_DEG_2_SIMPLE (2-fold simple degenerate coding sites), CDS_DEG_0 (non-degenerate coding sites), INTRON_SHORT (introns shorter than 100 bp), INTRON_LONG (introns longer than 200 bp), INTRON_SPLICE_5PRIME (intron within 2 bp of 5’ splice site), INTRON_SPLICE_3PRIME (intron within 2 bp of 3’ splice site), UTR_5PRIME (5’ untranslated region), UTR_3PRIME (3’ untranslated region), INTERGENIC (intergenic, more than 10 kbp from a gene).
- min_minor_ac : int or None, optional
The minimum minor allele count. SNPs with a minor allele count below this value will be excluded.
- max_missing_an : int or None, optional
The maximum number of missing allele calls to accept. SNPs with more missing allele calls than this value will be excluded. Set to 0 to require no missing calls.
- cohort_size : int or None, optional
Randomly down-sample to this value if the number of samples in the cohort is greater. Raise an error if the number of samples is less than this value.
- min_cohort_size : int or None, optional
Minimum cohort size. Raise an error if the number of samples is less than this value.
- max_cohort_size : int or None, optional
Randomly down-sample to this value if the number of samples in the cohort is greater.
- random_seed : int, optional, default: 42
Random seed used for reproducible down-sampling.
- inline_array : bool, optional, default: True
Passed through to dask from_array().
- chunks : str or tuple of int or Callable[[typing.Tuple[int, …]], tuple of int], optional, default: ‘native’
If ‘auto’, let dask decide chunk size. If ‘native’, use native zarr chunks. Can also be a target size, e.g., ‘200 MiB’, or a tuple of integers.
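The interaction between n_snps and thin_offset can be illustrated with a small sketch of even thinning. It assumes thinning takes every k-th available SNP, with k chosen by integer division and the selection starting at thin_offset; the helper `thin_indices` is hypothetical and the library's actual selection logic may differ in detail.

```python
def thin_indices(n_available, n_snps, thin_offset=0):
    """Pick roughly `n_snps` evenly spaced indices out of `n_available` sites.

    Hypothetical helper for illustration only: the step is the integer
    ratio of available to desired SNPs, and selection starts at
    `thin_offset`.
    """
    step = max(n_available // n_snps, 1)
    return list(range(thin_offset, n_available, step))


# With 100 candidate SNPs and a target of 10, every 10th SNP is kept.
first_pass = thin_indices(100, 10)

# Changing the offset selects a different, non-overlapping set of SNPs,
# which is useful for checking that the tree is robust to the SNP choice.
second_pass = thin_indices(100, 10, thin_offset=3)

# When the ratio is not exact, the result is only approximately n_snps.
approximate = thin_indices(105, 10)
```

Under this assumption, repeating the analysis with different thin_offset values gives independent SNP subsets of about the same size.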
Returns
- Figure or None
A plotly figure (only returned if show=False).
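A possible call might look like the sketch below. The region, sample set, and color column are illustrative choices, not values taken from this page, and running `make_njt_figure` requires the malariagen_data package and access to the underlying data. The parameters are collected in a dictionary first so they can be inspected without touching any data.

```python
import math

# Illustrative parameter choices; the region, sample set and color column
# are assumptions made for this example.
njt_params = dict(
    region="3L:15000000-16000000",
    n_snps=10_000,
    sample_sets="AG1000G-BF-A",
    color="taxon",
    metric="cityblock",
    count_sort=True,          # do not combine with distance_sort=True
    arc_start=0,
    arc_stop=2 * math.pi,     # full circle, same as the default
    show=False,               # return the figure instead of displaying it
)


def make_njt_figure(params):
    """Build the tree figure; needs malariagen_data and data access."""
    import malariagen_data  # imported lazily so the sketch loads without it

    ag3 = malariagen_data.Ag3()
    return ag3.plot_njt(**params)


# fig = make_njt_figure(njt_params)
# fig.show()
```

Because show=False is passed, the call returns a plotly figure that can be customised further before being displayed.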