Using malariagen_data on Google Colab (TPU Runtime)#

Overview#

When using a Google Colab v2-8 TPU runtime, installing malariagen_data may fail due to a dependency conflict with a preinstalled system package.

Colab TPU runtimes ship with:

  • blinker==1.4 installed via distutils/system packages

During installation, Flask (installed as a dependency of dash) requires:

  • blinker>=1.6.2

Because the preinstalled version is a distutils-installed package, pip cannot uninstall it, and installation fails with:

error: uninstall-distutils-installed-package

× Cannot uninstall blinker 1.4
╰─> It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

This issue appears specific to the TPU runtime image.
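The conflict comes down to a version comparison: the preinstalled 1.4 does not satisfy blinker>=1.6.2, so pip has to replace it, and that is where the uninstall step fails. A minimal sketch of the comparison is shown here. The helper names are hypothetical, and the parsing only handles plain dotted release numbers, whereas pip applies the full PEP 440 rules.

```python
def parse_release(text: str) -> tuple:
    """Turn a plain dotted release string such as '1.6.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed: str, minimum: str = "1.6.2") -> bool:
    """Return True when `installed` is at least `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


print(satisfies_minimum("1.4"))    # False: the preinstalled version is too old
print(satisfies_minimum("1.6.2"))  # True: the minimum itself is acceptable
```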

Reproducing the Issue#

  1. Open Google Colab

  2. Select Runtime → Change runtime type

  3. Choose TPU

  4. Run:

pip install malariagen_data

Installation fails due to the blinker conflict described above.
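One possible workaround, which is an assumption and not something this page confirms, is to install a newer blinker with pip's --ignore-installed option so that pip skips the uninstall step, and then install malariagen_data as usual. The sketch below only builds and prints the two commands; the helper name pip_install_command is hypothetical. Note that --ignore-installed overwrites files without removing the old package, so treat it as a last resort on a disposable runtime.

```python
import shlex
import sys


def pip_install_command(packages, ignore_installed=False):
    """Build a `python -m pip install` command as a list of arguments."""
    command = [sys.executable, "-m", "pip", "install"]
    if ignore_installed:
        # Real pip flag: install over an existing package without uninstalling it.
        command.append("--ignore-installed")
    command.extend(packages)
    return command


steps = [
    # Assumed workaround: replace the system blinker first...
    pip_install_command(["blinker>=1.6.2"], ignore_installed=True),
    # ...then install the package normally.
    pip_install_command(["malariagen_data"]),
]

for step in steps:
    # Print each command; to execute it, pass `step` to subprocess.run(step, check=True).
    print(shlex.join(step))
```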

Cloud Data Access (GCS)#

Most datasets are hosted on Google Cloud Storage.

If you see errors such as:

403: Permission denied on storage.objects.get

authenticate your Colab session, then retry the failed call:

from google.colab import auth
auth.authenticate_user()

You may also need to request access to certain datasets: https://forms.gle/d1NV3aL3EoVQGSHYA
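If the same notebook is also run outside Colab, the authentication step above can be guarded so that it is skipped when google.colab is not importable. This is a small sketch; the function name authenticate_if_colab is hypothetical.

```python
def authenticate_if_colab() -> bool:
    """Authenticate the Colab session; return False when not running in Colab."""
    try:
        from google.colab import auth  # only importable inside a Colab runtime
    except ImportError:
        return False
    auth.authenticate_user()  # opens the Google sign-in prompt
    return True
```

Call authenticate_if_colab() once near the top of the notebook, before any data access.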

Troubleshooting#

Check which version of blinker is installed:

pip show blinker
python -c "from importlib.metadata import version; print(version('blinker'))"

If blinker 1.4 is reported with a location under /usr/lib/python3/dist-packages, it is the system package preinstalled on the TPU runtime image.
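The same checks can be done from Python in a notebook cell. The sketch below reports the installed version and the import location, and flags a location under the system dist-packages directory. The helper name describe_package is hypothetical; it uses importlib.metadata for the version because recent blinker releases no longer expose a __version__ attribute.

```python
import importlib.metadata
import importlib.util

SYSTEM_DIST_PACKAGES = "/usr/lib/python3/dist-packages"


def describe_package(name: str) -> dict:
    """Return the installed version and import location of a package."""
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # no installed distribution metadata under this name

    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(name)
    origin = spec.origin if spec is not None else None

    return {
        "version": version,
        "origin": origin,
        "system_package": origin is not None
        and origin.startswith(SYSTEM_DIST_PACKAGES),
    }


print(describe_package("blinker"))
```

A result with version 1.4 and system_package set to True matches the TPU system package described above.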