Vector Observatory Data Access

MalariaGEN data resources provide an integrated view of malaria vector genomes from across the globe. These data are available to everyone to benefit the science and surveillance of malaria. You can find more information on the vector data resources here.

Vector Observatory data are stored in Google Cloud Storage (GCS). The current set-up requires users to request access and authenticate prior to accessing data.

Please note that although all data are available for immediate access for public health and educational purposes, the releases accessible through the Vector Observatory are subject to different terms of use, including an embargo on public communications, which encompasses academic publications. Each release has specific terms of use, which are described on its release page.


To access data from the Vector Observatory, you will need to follow these steps:

Step 1. Make sure you have a Google Account

To allow us to configure data access permissions, you will need to provide us with an email address that is associated with a Google account. This could be a standard Google (i.e., Gmail) account, or it could be your work email if your employer uses Google Workspace.

Step 2. Fill out the data access request form

Please fill out and submit the following form:

All requests for data access will be granted, subject to verification checks and agreement to reasonable use. This is to ensure that the data resources remain accessible to everyone. Submitting this form will allow us to configure storage permissions and monitor storage for excessive network usage in future.

Step 3. Ensure you are using the latest version of the malariagen_data Python package

If you access data via the malariagen_data Python package, please upgrade to version 9.0 or higher. These versions will automatically use your authentication credentials when accessing data in Google Cloud.
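The version requirement above can be checked from Python before attempting to access data. The following is a minimal sketch using only the standard library; the helper names (`parse_major_minor`, `meets_minimum`, `check_installed`) are hypothetical and not part of the malariagen_data package, and the threshold of 9.0 is taken from the text above.

```python
import re
from importlib import metadata

# Minimum version stated in this guide ("version 9.0 or higher").
MINIMUM_VERSION = (9, 0)


def parse_major_minor(version):
    """Return (major, minor) as integers from a version string such as '9.0.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MINIMUM_VERSION):
    """Compare numerically, so that '10.0.0' counts as newer than '9.0.0'."""
    return parse_major_minor(version) >= minimum


def check_installed(package="malariagen_data"):
    """Return True/False for the installed version, or None if the package is missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed)


if __name__ == "__main__":
    result = check_installed()
    if result is None:
        print("malariagen_data is not installed")
    elif result:
        print("malariagen_data is recent enough")
    else:
        print("malariagen_data is older than 9.0; please upgrade")
```

If the check reports an older version, upgrading with `pip install --upgrade malariagen_data` is the usual route, assuming the package was installed with pip.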

Step 4. Set up Google Cloud authentication credentials

If you are only accessing data via the malariagen_data Python package from within Google Colab, you can skip this step, because authentication credentials will be obtained automatically.
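A notebook or script that may run in different environments can guess whether this step applies. The sketch below relies on a common heuristic, namely that a Colab runtime has the `google.colab` module loaded; this is an assumption about Colab rather than something stated in this guide, and the function names are hypothetical.

```python
import sys


def running_in_colab(modules=None):
    """Guess whether the current interpreter is a Google Colab runtime.

    Heuristic (an assumption): Colab loads the 'google.colab' module at start-up.
    The 'modules' argument exists only so the check can be exercised in isolation.
    """
    modules = sys.modules if modules is None else modules
    return "google.colab" in modules


def needs_manual_authentication(modules=None):
    """True when the gcloud set-up described in this step is likely to be required."""
    return not running_in_colab(modules)


if __name__ == "__main__":
    if needs_manual_authentication():
        print("Not in Colab: set up Google Cloud credentials before accessing data")
    else:
        print("Colab detected: credentials should be obtained automatically")
```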

If you are accessing data from any other location, you will need to authenticate with Google Cloud. To do this, you will need to:

  1. Install the Google Cloud CLI, following the installation instructions in the Google Cloud documentation.

  2. Check gcloud is installed correctly:

gcloud help

  3. Authenticate using gcloud:

  • To authenticate for use with the malariagen_data package, run:

gcloud auth application-default login

  • To authenticate for command-line access to Google Cloud Storage with gsutil, run:

gcloud auth login
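After running the commands above, it can be useful to confirm that the set-up left something for Python libraries to find. The sketch below checks whether `gcloud` is on the PATH and whether an application default credentials file exists in the locations Google documents for it (the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then the gcloud configuration directory). The function names are hypothetical, the fallback used when `APPDATA` is unset on Windows is an assumption, and the check only looks for the file; it does not verify that the credentials are valid or that access has been granted.

```python
import os
import shutil
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def gcloud_on_path():
    """True if a 'gcloud' executable can be found on the PATH."""
    return shutil.which("gcloud") is not None


def default_adc_path(environ=None, home=None, os_name=None):
    """Where 'gcloud auth application-default login' normally writes its file."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    os_name = os.name if os_name is None else os_name
    if os_name == "nt":
        # Windows: %APPDATA%\gcloud (falling back to home is an assumption).
        base = Path(environ.get("APPDATA", str(home))) / "gcloud"
    else:
        # Linux and macOS: ~/.config/gcloud
        base = home / ".config" / "gcloud"
    return base / ADC_FILENAME


def find_adc_file(environ=None, home=None, os_name=None):
    """Return the first existing credentials file, or None if none is found."""
    environ = os.environ if environ is None else environ
    candidates = []
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    candidates.append(default_adc_path(environ, home, os_name))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print("gcloud on PATH:", gcloud_on_path())
    found = find_adc_file()
    if found is None:
        print("No credentials file found; try: gcloud auth application-default login")
    else:
        print("Credentials file found at:", found)
```

Note that `gcloud auth login` on its own does not create this file; it authenticates the command-line tools, which is why the two commands are listed separately above.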

If you have any questions, please contact us at support@malariagen.net.