vobs updates

Technical and scientific updates from the Malaria Vector Genome Observatory.

14 August 2024 | data

Terms of use metadata

Some new metadata columns have been added to help identify which sample sets have usage restrictions (e.g., publication embargo) and which are available for unrestricted use. The metadata columns are:

  • terms_of_use_expiry_date - Gives the date on which any terms of use will expire. After this date, there will be no usage restrictions on data relating to the sample set.
  • terms_of_use_url - Gives the address of a web page that describes any usage restrictions which apply to the sample set.

If either of these fields is empty, no terms of use apply to the sample set.

These new metadata columns can be accessed via the malariagen_data Python API. The API also adds a computed field:

  • unrestricted_use - A computed column added for convenience. The value is True if the terms of use have expired, or if no usage restrictions were ever applied.
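
The rule for unrestricted_use can be sketched in plain pandas. This is only an illustration of the logic described above, not the malariagen_data implementation: the function name compute_unrestricted_use and its today parameter are assumptions introduced here, and the toy rows reuse three sample sets and expiry dates from the example output in this post.

```python
import pandas as pd


def compute_unrestricted_use(df, today):
    """Illustrative only: True where terms of use are absent or have expired."""
    # Parse the expiry dates; missing values become NaT.
    expiry = pd.to_datetime(df["terms_of_use_expiry_date"], errors="coerce")
    # Assumption: a missing expiry date means no terms of use apply.
    # Restrictions lift after the expiry date, hence the strict comparison.
    return expiry.isna() | (expiry < pd.Timestamp(today))


# Toy frame; "today" is set to the date of this post.
df_toy = pd.DataFrame(
    {
        "sample_set": ["AG1000G-AO", "1237-VO-BJ-DJOGBENOU-VMF00050", "bergey-2019"],
        "terms_of_use_expiry_date": ["2025-01-01", "2024-07-22", None],
    }
)
df_toy["unrestricted_use"] = compute_unrestricted_use(df_toy, today="2024-08-14")
print(df_toy)
```

In practice there should be no need to compute this yourself, because the API supplies the column; the sketch is only meant to make the rule concrete.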

The metadata columns are available in the dataframes returned by the sample_sets() and sample_metadata() functions. Below are some examples for data from the Anopheles gambiae complex accessed via the Ag3 API. Similar code can be used for Anopheles funestus samples via the Af1 API.

import malariagen_data
ag3 = malariagen_data.Ag3()

Sample set metadata:

df_sample_sets = ag3.sample_sets()
df_sample_sets[["sample_set", "terms_of_use_expiry_date", "unrestricted_use"]]
    sample_set                      terms_of_use_expiry_date  unrestricted_use
0   AG1000G-AO                      2025-01-01                False
1   AG1000G-BF-A                    2025-01-01                False
2   AG1000G-BF-B                    2025-01-01                False
3   AG1000G-BF-C                    2025-01-01                False
4   AG1000G-CD                      2025-01-01                False
... ...                             ...                       ...
78  1323-VO-GM-NGWA-VMF00235        2026-04-09                False
79  1323-VO-GM-NGWA-VMF00242        2026-04-09                False
80  1329-VO-GA-CHRISTOPHE-VMF00228  2026-04-09                False
81  bergey-2019                     NaN                       True
82  campos-2021                     NaN                       True

83 rows × 3 columns

Query to find sample sets with no usage restrictions:

df_sample_sets.query("unrestricted_use")
sample_set sample_count study_id study_url terms_of_use_expiry_date terms_of_use_url release unrestricted_use
32 fontaine-2015-rebuild 72 fontaine-2015-rebuild https://doi.org/10.1126/science.1258524 NaN https://www.science.org/doi/10.1126/science.12... 3.10 True
34 1237-VO-BJ-DJOGBENOU-VMF00050 90 1237-VO-BJ-DJOGBENOU https://www.malariagen.net/partner_study/1237-... 2024-07-22 https://malariagen.github.io/vector-data/ag3/a... 3.2 True
35 1237-VO-BJ-DJOGBENOU-VMF00067 142 1237-VO-BJ-DJOGBENOU https://www.malariagen.net/partner_study/1237-... 2024-07-22 https://malariagen.github.io/vector-data/ag3/a... 3.2 True
36 1244-VO-GH-YAWSON-VMF00051 666 1244-VO-GH-YAWSON https://www.malariagen.net/partner_study/1244-... 2024-07-22 https://malariagen.github.io/vector-data/ag3/a... 3.2 True
37 1245-VO-CI-CONSTANT-VMF00054 38 1245-VO-CI-CONSTANT https://www.malariagen.net/partner_study/1245-... 2024-07-22 https://malariagen.github.io/vector-data/ag3/a... 3.2 True
38 1253-VO-TG-DJOGBENOU-VMF00052 179 1253-VO-TG-DJOGBENOU https://www.malariagen.net/partner_study/1253-... 2024-07-22 https://malariagen.github.io/vector-data/ag3/a... 3.2 True
39 1178-VO-UG-LAWNICZAK-VMF00025 57 1178-VO-UG-LAWNICZAK https://www.malariagen.net/partner_study/1178-... 2023-10-26 https://malariagen.github.io/vector-data/ag3/a... 3.3 True
65 barron-2019 4 barron-2019 https://doi.org/10.1038/s41598-019-49065-5 NaN https://www.nature.com/articles/s41598-019-490... 3.7 True
66 crawford-2016 25 crawford-2016 https://doi.org/10.1111/mec.13572 NaN https://onlinelibrary.wiley.com/doi/10.1111/me... 3.7 True
72 tennessen-2021 208 tennessen-2021 https://doi.org/10.1111/mec.15756 NaN https://onlinelibrary.wiley.com/doi/10.1111/me... 3.8 True
81 bergey-2019 113 bergey-2019 https://doi.org/10.1111/eva.12878 NaN https://onlinelibrary.wiley.com/doi/10.1111/ev... 3.9 True
82 campos-2021 163 campos-2021 https://doi.org/10.1038/s42003-021-02168-0, ht... NaN https://www.nature.com/articles/s42003-021-021... 3.9 True
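
A common next step is to collect the identifiers of the unrestricted sample sets into a list. The sketch below uses a small stand-in dataframe so that it runs without network access; the helper name unrestricted_sample_set_ids is hypothetical and not part of the malariagen_data API.

```python
import pandas as pd


def unrestricted_sample_set_ids(df_sample_sets):
    """Return the identifiers of sample sets flagged as unrestricted."""
    return df_sample_sets.query("unrestricted_use")["sample_set"].tolist()


# Toy stand-in for the dataframe returned by ag3.sample_sets().
df_toy = pd.DataFrame(
    {
        "sample_set": ["AG1000G-AO", "bergey-2019", "campos-2021"],
        "unrestricted_use": [False, True, True],
    }
)
ids = unrestricted_sample_set_ids(df_toy)
print(ids)
```

With the real dataframe, the resulting list should be usable wherever the API accepts a sample_sets argument, for example ag3.sample_metadata(sample_sets=ids).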

Sample metadata:

df_samples = ag3.sample_metadata()
df_samples[["sample_id", "sample_set", "terms_of_use_expiry_date", "terms_of_use_url", "unrestricted_use"]]
       sample_id                 sample_set                   terms_of_use_expiry_date  terms_of_use_url                                   unrestricted_use
0      VBS00256-4651STDY7017184  1177-VO-ML-LEHMANN-VMF00004  2025-11-17                https://malariagen.github.io/vector-data/ag3/a...  False
1      VBS00257-4651STDY7017185  1177-VO-ML-LEHMANN-VMF00004  2025-11-17                https://malariagen.github.io/vector-data/ag3/a...  False
2      VBS00259-4651STDY7017186  1177-VO-ML-LEHMANN-VMF00004  2025-11-17                https://malariagen.github.io/vector-data/ag3/a...  False
3      VBS00262-4651STDY7017187  1177-VO-ML-LEHMANN-VMF00004  2025-11-17                https://malariagen.github.io/vector-data/ag3/a...  False
4      VBS00277-4651STDY7017189  1177-VO-ML-LEHMANN-VMF00004  2025-11-17                https://malariagen.github.io/vector-data/ag3/a...  False
...    ...                       ...                          ...                       ...                                                ...
19766  SAMN15222632              tennessen-2021               NaN                       https://onlinelibrary.wiley.com/doi/10.1111/me...  True
19767  SAMN15222633              tennessen-2021               NaN                       https://onlinelibrary.wiley.com/doi/10.1111/me...  True
19768  SAMN15222634              tennessen-2021               NaN                       https://onlinelibrary.wiley.com/doi/10.1111/me...  True
19769  SAMN15222635              tennessen-2021               NaN                       https://onlinelibrary.wiley.com/doi/10.1111/me...  True
19770  SAMN15222636              tennessen-2021               NaN                       https://onlinelibrary.wiley.com/doi/10.1111/me...  True

19771 rows × 5 columns

Query to find samples with no usage restrictions:

df_samples.query("unrestricted_use")
sample_id partner_sample_id contributor country location year month latitude longitude sex_call ... admin1_name admin1_iso admin2_name taxon cohort_admin1_year cohort_admin1_month cohort_admin1_quarter cohort_admin2_year cohort_admin2_month cohort_admin2_quarter
670 VBS10116-4954STDY7089644 UG4A2016A1_96 Mara Lawniczak Uganda Busia 2013 1 0.466 34.089 F ... Eastern Region UG-E Busia gambiae UG-E_gamb_2013 UG-E_gamb_2013_01 UG-E_gamb_2013_Q1 UG-E_Busia_gamb_2013 UG-E_Busia_gamb_2013_01 UG-E_Busia_gamb_2013_Q1
671 VBS10117-4954STDY7089645 UG4A2016B1_95 Mara Lawniczak Uganda Busia 2016 6 0.466 34.089 F ... Eastern Region UG-E Busia gambiae UG-E_gamb_2016 UG-E_gamb_2016_06 UG-E_gamb_2016_Q2 UG-E_Busia_gamb_2016 UG-E_Busia_gamb_2016_06 UG-E_Busia_gamb_2016_Q2
672 VBS10118-4954STDY7089646 UG4A2016C1_94 Mara Lawniczak Uganda Busia 2016 6 0.466 34.089 F ... Eastern Region UG-E Busia gambiae UG-E_gamb_2016 UG-E_gamb_2016_06 UG-E_gamb_2016_Q2 UG-E_Busia_gamb_2016 UG-E_Busia_gamb_2016_06 UG-E_Busia_gamb_2016_Q2
673 VBS10119-4954STDY7089647 UG4A2016D1_93 Mara Lawniczak Uganda Busia 2016 6 0.466 34.089 F ... Eastern Region UG-E Busia gambiae UG-E_gamb_2016 UG-E_gamb_2016_06 UG-E_gamb_2016_Q2 UG-E_Busia_gamb_2016 UG-E_Busia_gamb_2016_06 UG-E_Busia_gamb_2016_Q2
674 VBS10120-4954STDY7089648 UG4A2016E1_92 Mara Lawniczak Uganda Busia 2016 6 0.466 34.089 F ... Eastern Region UG-E Busia gambiae UG-E_gamb_2016 UG-E_gamb_2016_06 UG-E_gamb_2016_Q2 UG-E_Busia_gamb_2016 UG-E_Busia_gamb_2016_06 UG-E_Busia_gamb_2016_Q2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
19766 SAMN15222632 D342 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19767 SAMN15222633 D343 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19768 SAMN15222634 D346 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19769 SAMN15222635 D347 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19770 SAMN15222636 D348 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016

1757 rows × 57 columns

Example query to combine with other filters:

df_samples.query("country == 'Burkina Faso' and unrestricted_use")
sample_id partner_sample_id contributor country location year month latitude longitude sex_call ... admin1_name admin1_iso admin2_name taxon cohort_admin1_year cohort_admin1_month cohort_admin1_quarter cohort_admin2_year cohort_admin2_month cohort_admin2_quarter
19466 SAMN03299607 GOUND_0022 Jacob E Crawford Burkina Faso Goundry 2008 11 12.518 -1.341 UKN ... Plateau Central BF-11 Oubritenga arabiensis BF-11_arab_2008 BF-11_arab_2008_11 BF-11_arab_2008_Q4 BF-11_Oubritenga_arab_2008 BF-11_Oubritenga_arab_2008_11 BF-11_Oubritenga_arab_2008_Q4
19467 SAMN03299611 GOUND_0103 Jacob E Crawford Burkina Faso Goundry 2008 11 12.518 -1.341 F ... Plateau Central BF-11 Oubritenga arabiensis BF-11_arab_2008 BF-11_arab_2008_11 BF-11_arab_2008_Q4 BF-11_Oubritenga_arab_2008 BF-11_Oubritenga_arab_2008_11 BF-11_Oubritenga_arab_2008_Q4
19468 SAMN03299612 GOUND_0105 Jacob E Crawford Burkina Faso Goundry 2008 11 12.518 -1.341 UKN ... Plateau Central BF-11 Oubritenga arabiensis BF-11_arab_2008 BF-11_arab_2008_11 BF-11_arab_2008_Q4 BF-11_Oubritenga_arab_2008 BF-11_Oubritenga_arab_2008_11 BF-11_Oubritenga_arab_2008_Q4
19469 SAMN03299614 GOUND_0137 Jacob E Crawford Burkina Faso Goundry 2008 11 12.518 -1.341 F ... Plateau Central BF-11 Oubritenga arabiensis BF-11_arab_2008 BF-11_arab_2008_11 BF-11_arab_2008_Q4 BF-11_Oubritenga_arab_2008 BF-11_Oubritenga_arab_2008_11 BF-11_Oubritenga_arab_2008_Q4
19470 SAMN03299615 KODOU_0009 Jacob E Crawford Burkina Faso Kodougou 2008 11 12.520 -3.607 F ... Boucle du Mouhoun BF-01 Kossi arabiensis BF-01_arab_2008 BF-01_arab_2008_11 BF-01_arab_2008_Q4 BF-01_Kossi_arab_2008 BF-01_Kossi_arab_2008_11 BF-01_Kossi_arab_2008_Q4
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
19766 SAMN15222632 D342 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19767 SAMN15222633 D343 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19768 SAMN15222634 D346 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19769 SAMN15222635 D347 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016
19770 SAMN15222636 D348 Jacob Tennessen Burkina Faso Tengrela 2016 -1 10.700 -4.800 F ... Cascades BF-02 Comoe coluzzii BF-02_colu_2016 BF-02_colu_2016 BF-02_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016 BF-02_Comoe_colu_2016

243 rows × 57 columns
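
The combined filter above can be wrapped in a small function, and the same column can be used to summarise how many samples per country are free of restrictions. This is a sketch on toy data with hypothetical sample identifiers; the function names select_unrestricted and count_unrestricted_by_country are introduced here for illustration only.

```python
import pandas as pd


def select_unrestricted(df_samples, country):
    """Samples from one country that carry no usage restrictions."""
    # "@country" refers to the local Python variable inside the query string.
    return df_samples.query("country == @country and unrestricted_use")


def count_unrestricted_by_country(df_samples):
    """Number of unrestricted samples and total samples per country."""
    grouped = df_samples.groupby("country")["unrestricted_use"]
    # Summing a boolean column counts the True values.
    return pd.DataFrame({"unrestricted": grouped.sum(), "total": grouped.size()})


# Toy stand-in for the dataframe returned by ag3.sample_metadata().
df_toy = pd.DataFrame(
    {
        "sample_id": ["s1", "s2", "s3", "s4"],
        "country": ["Burkina Faso", "Burkina Faso", "Uganda", "Uganda"],
        "unrestricted_use": [True, False, True, True],
    }
)
print(select_unrestricted(df_toy, "Burkina Faso"))
print(count_unrestricted_by_country(df_toy))
```

Passing the country as a variable rather than hard-coding it in the query string makes the filter easier to reuse across countries or inside loops.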

Note that all sample sets in the vector observatory can be accessed and analysed at any time for public health purposes. If any terms of use apply, they may restrict the public communication of any analysis results (publication embargo) for a period of time.

If you have any questions about usage restrictions, please get in touch via support@malariagen.net.