Dask clusters#

This page describes how to create, scale, use and shut down Dask clusters.

For an example of using Dask clusters to analyse data from Ag3, see the video below.

Accessing the gateway#

Dask clusters are provisioned through a Dask Gateway server, which creates and manages clusters on your behalf. To access the gateway, create a Gateway client object. Called with no arguments, it picks up the gateway address from your local Dask configuration:

from dask_gateway import Gateway
gateway = Gateway()

Creating a cluster#

When creating a cluster there are two choices to make: which software (conda) environment the workers will run, and which worker profile to use. The workers should run the same environment as your notebook kernel, otherwise computations may fail due to software version mismatches. The name of the currently active environment can be derived from the CONDA_PREFIX environment variable. The available worker profiles, which differ in memory and preemptibility, can be inspected via the gateway's cluster options:

import os
conda_environment = os.environ['CONDA_PREFIX'].split('/')[-1]
conda_environment
'binder-v3.0.0'
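The split('/') above simply takes the last component of the environment path; os.path.basename is an equivalent, slightly more robust way to do the same thing. A minimal stdlib-only sketch, using a hypothetical example path:

```python
import os

# CONDA_PREFIX points at the active conda environment directory,
# e.g. "/opt/conda/envs/binder-v3.0.0" (hypothetical path for illustration).
prefix = "/opt/conda/envs/binder-v3.0.0"

# basename() returns the final path component, i.e. the environment name.
env_name = os.path.basename(prefix)
```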
cluster_options = gateway.cluster_options()
cluster_options
cluster_options._fields['profile'].options
('Standard Worker',
 'High Memory Worker',
 'Very High Memory Worker',
 'Standard Worker Not Preemptible')
cluster = gateway.new_cluster(environment=conda_environment, profile="Standard Worker")
cluster

Scaling a cluster#

A new cluster starts with no workers, so it cannot run any computations until it is scaled up. Use scale() to request a fixed number of workers, scale(0) to release them all, or adapt() to let the cluster automatically add and remove workers between the given bounds depending on load:

cluster.scale(10)
cluster.scale(0)
cluster.adapt(minimum=0, maximum=10)
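To illustrate what adapt() promises: whatever worker count the scheduler decides it wants, the actual count stays within the bounds you gave. A stdlib-only sketch of that clamping behaviour (clamp_workers is a hypothetical helper, not Dask's internal logic):

```python
def clamp_workers(requested, minimum, maximum):
    # Adaptive mode keeps the worker count within [minimum, maximum],
    # however many workers the load would otherwise call for.
    return max(minimum, min(requested, maximum))

# Heavy load cannot push the cluster past the maximum...
at_peak = clamp_workers(25, minimum=0, maximum=10)

# ...and an idle cluster can drop all the way to the minimum.
idle = clamp_workers(0, minimum=0, maximum=10)
```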

Connecting to a cluster#

To run computations on the cluster you need a Dask distributed client connected to it. Creating the client from the cluster also registers it as the default scheduler, so subsequent Dask computations run on the cluster's workers:

client = cluster.get_client()

Running a computation#

With a client connected, Dask computations execute on the cluster's workers. For example, create a large random array and compute the mean over one axis:

import dask.array as da
x = da.random.random((500_000, 50_000), chunks=(1000, 1000))
y = x.mean(axis=1)
y
        Array        Chunk
Bytes   4.00 MB      8.00 kB
Shape   (500000,)    (1000,)
Count   59000 Tasks  500 Chunks
Type    float64      numpy.ndarray
# watch the Dask dashboard
y.compute()
array([0.50043175, 0.50108079, 0.50329407, ..., 0.50033652, 0.49935371,
       0.49858524])
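Dask computes partial sums and counts per chunk and then combines them, which is why the chunked result matches a single whole-array mean. A stdlib-only sketch of that idea for one row of values (chunked_mean is a hypothetical helper for illustration, not Dask's implementation):

```python
def chunked_mean(values, chunk_size):
    # Accumulate a partial sum and count per chunk, then combine —
    # the same shape of computation Dask distributes across workers.
    total = 0
    count = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        count += len(chunk)
    return total / count

row = [1, 2, 3, 4, 5, 6]

# Splitting into chunks of 4 gives the same mean as the whole row.
result = chunked_mean(row, chunk_size=4)
```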

Shutting down a cluster#

Clusters consume compute resources for as long as they are running, so shut your cluster down once you have finished with it. This stops all workers and the scheduler:

cluster.shutdown()

Reconnecting to an existing cluster#

If your notebook kernel restarts or you open a new notebook, any cluster you previously created will keep running until it is shut down, so you can reconnect to it rather than create a new one. First, list your running clusters via the gateway:

gateway.list_clusters()
[ClusterReport<name=dev.600f7e441656459ca1327e7287db951b, status=RUNNING>]

You can then reconnect to a running cluster by name, e.g.:

cluster = gateway.connect(gateway.list_clusters()[0].name)
client = cluster.get_client()