Ag3 cohorts analysis version 20240924
A new cohorts analysis version 20240924 has been released for the
Ag3 data resource. This is now the default cohorts analysis version
when using the malariagen_data Ag3 API. This cohorts analysis will be
available for datasets up to and including Ag3.13.
Please note that the new cohorts analysis may change the values of
sample metadata columns including taxon, admin1_iso,
admin1_name, admin2_name, and derived columns beginning with cohorts_
relative to previous cohorts analysis versions.
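One way to see how such changes affect your own work is to tally which labels moved between two versions of the sample metadata. The following is a minimal sketch using only the Python standard library. The helper name tally_relabelings and the sample identifiers are hypothetical, and the data are toy values rather than real Ag3 samples. It assumes you have already extracted a mapping from sample identifier to taxon for each version.

```python
from collections import Counter


def tally_relabelings(old_labels, new_labels):
    """Count (old, new) label pairs for samples whose label changed.

    Both arguments map a sample identifier to a label. Samples missing
    from new_labels are ignored.
    """
    changes = Counter()
    for sample_id, old in old_labels.items():
        new = new_labels.get(sample_id)
        if new is not None and new != old:
            changes[(old, new)] += 1
    return changes


# Toy data for illustration only (hypothetical sample identifiers).
old_taxon = {"S1": "gcx1", "S2": "gcx1", "S3": "unassigned", "S4": "gambiae"}
new_taxon = {"S1": "bissau", "S2": "bissau", "S3": "melas", "S4": "gambiae"}

changes = tally_relabelings(old_taxon, new_taxon)
for (old, new), n in sorted(changes.items()):
    print(f"{old} -> {new}: {n}")
```

The same approach applies to any of the affected columns, such as admin1_iso or admin1_name, by swapping in the corresponding mappings.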
To pin this cohorts analysis when accessing data:

import malariagen_data

# Pin the cohorts analysis version explicitly.
ag3 = malariagen_data.Ag3(
    cohorts_analysis="20240924",
)

This new version introduces some key changes:
- Samples previously labeled as gcx1 in the taxon field have been renamed to bissau:
  - gcx (gambiae complex cryptic taxa) labels serve as placeholders for groups outside our usual taxonomic assignment.
  - Following Caputo et al. (2024), the gcx1 group has been renamed to Bissau molecular form.
  - 291 samples previously assigned as gcx1 are now labeled as bissau.
  - 5 previously unassigned samples are also relabeled as bissau.
  - Cohort names have been updated, e.g. GM-M_gcx1_2019 is now GM-M_biss_2019.
- 36 previously unassigned samples have been reclassified as: 32 melas, 2 gambiae, 1 fontenillei, 1 arabiensis.
- A location metadata error affecting the administrative region (level 1) of 119 samples has been corrected:
  - admin1_iso updated from UG-E to KE-04.
  - admin1_name updated from Eastern Region to Busia.
  - Cohort names have been updated, e.g. UG-E_arab_2013 has now been relabeled to KE-04_arab_2013.
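If you have stored cohort labels from an earlier analysis, it can help to check which part of a label has changed. The sketch below is illustrative only. It assumes cohort labels follow a three-part AREA_TAXON_PERIOD layout, which is inferred from the two examples above and may not hold for every cohort column. The function names are hypothetical and not part of malariagen_data.

```python
def split_cohort_label(label):
    """Split a cohort label into (area, taxon, period).

    Assumes the three-part layout seen in the examples above,
    e.g. "GM-M_gcx1_2019". Splitting from the right keeps hyphenated
    area codes such as "KE-04" intact.
    """
    area, taxon, period = label.rsplit("_", 2)
    return area, taxon, period


def changed_parts(old_label, new_label):
    """Return the parts that differ between two cohort labels."""
    names = ("area", "taxon", "period")
    old = split_cohort_label(old_label)
    new = split_cohort_label(new_label)
    return {n: (o, w) for n, o, w in zip(names, old, new) if o != w}


# The two renames quoted in the release notes above.
print(changed_parts("GM-M_gcx1_2019", "GM-M_biss_2019"))
print(changed_parts("UG-E_arab_2013", "KE-04_arab_2013"))
```

Note that the admin1 correction applies only to the 119 affected samples, so a blanket replacement of UG-E with KE-04 across all stored labels would not be appropriate.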
If you need to access the previous version of the cohorts analysis, you can pin it in the same way, by passing the earlier version string as cohorts_analysis.