Dask clusters
This page describes how to create, scale, use and shut down Dask clusters.
For an example of using Dask clusters to analyse data from Ag3, see the video below.
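Accessing the gateway
Dask clusters are created and managed through Dask Gateway. Start by creating a Gateway object; calling Gateway() with no arguments takes the gateway address and authentication settings from the Dask configuration of the current environment.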
from dask_gateway import Gateway
gateway = Gateway()
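Creating a cluster
A new cluster is created with two options: the conda environment the workers should run, and a worker profile that selects the kind of worker (for example standard or high memory). Using the same environment as the notebook keeps package versions on the workers consistent with the client. The name of the current environment is the last component of the CONDA_PREFIX environment variable. The available profiles can be inspected through gateway.cluster_options().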
import os
conda_environment = os.environ['CONDA_PREFIX'].split('/')[-1]
conda_environment
'binder-v3.0.0'
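The lookup above raises a KeyError if CONDA_PREFIX is not set, and returns an empty string if the prefix ends with a slash. A slightly more defensive version might look like the sketch below. The helper name `current_conda_environment` and the example path are hypothetical, chosen only for illustration.

```python
import os


def current_conda_environment(environ=os.environ):
    """Return the name of the active conda environment, or None if unset.

    Hypothetical helper: takes the last path component of CONDA_PREFIX.
    """
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    # normpath drops any trailing slash before the last component is taken.
    return os.path.basename(os.path.normpath(prefix))


# Illustrative prefix only; the real value depends on the deployment.
print(current_conda_environment({"CONDA_PREFIX": "/srv/conda/envs/binder-v3.0.0"}))
```

Returning None rather than raising lets the caller decide whether to fall back to a fixed environment name or to stop with a clearer error.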
cluster_options = gateway.cluster_options()
cluster_options
cluster_options._fields['profile'].options
('Standard Worker',
'High Memory Worker',
'Very High Memory Worker',
'Standard Worker Not Preemptible')
cluster = gateway.new_cluster(environment=conda_environment, profile="Standard Worker")
cluster
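Scaling a cluster
Workers are requested by scaling the cluster. Use scale to ask for a fixed number of workers, scale(0) to release them all, or adapt to let the cluster add and remove workers automatically between a minimum and a maximum as the workload changes.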
cluster.scale(10)
cluster.scale(0)
cluster.adapt(minimum=0, maximum=10)
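Connecting to a cluster
To run computations on the cluster, create a client connected to it. Once the client exists, Dask computations in this session are sent to the cluster by default.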
client = cluster.get_client()
Running a computation
import dask.array as da
x = da.random.random((500_000, 50_000), chunks=(1000, 1000))
y = x.mean(axis=1)
y
# watch the Dask dashboard
y.compute()
array([0.50043175, 0.50108079, 0.50329407, ..., 0.50033652, 0.49935371,
0.49858524])
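To see why this computation is a reasonable test for a cluster, it helps to work out how large the array is. The arithmetic below is a sketch that assumes 8-byte float64 values, which is what `da.random.random` produces; the variable names are only illustrative.

```python
# Shape and chunk shape of the example array; float64 values take 8 bytes.
shape = (500_000, 50_000)
chunks = (1000, 1000)
itemsize = 8

# Both dimensions divide evenly by the chunk size, so integer division is exact.
n_chunks = (shape[0] // chunks[0]) * (shape[1] // chunks[1])
chunk_bytes = chunks[0] * chunks[1] * itemsize
total_bytes = shape[0] * shape[1] * itemsize

print(f"{n_chunks:,} chunks of {chunk_bytes / 1e6:.0f} MB each")  # 25,000 chunks of 8 MB each
print(f"{total_bytes / 1e9:.0f} GB in total")                     # 200 GB in total
```

The full array would not fit in the memory of a typical single machine, but each 8 MB chunk is generated and reduced independently, so the work spreads naturally across workers.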
Shutting down a cluster
When the work is finished, shut the cluster down to release its scheduler and workers.
cluster.shutdown()
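It is easy to forget the shutdown call when a computation fails partway through. One way to guard against that is to wrap the cluster lifecycle in a try/finally block, as in the sketch below. The helper `run_on_cluster` is hypothetical; it only uses the `new_cluster`, `scale`, `get_client` and `shutdown` calls shown on this page, and expects `gateway` to be a `dask_gateway.Gateway`.

```python
def run_on_cluster(gateway, work, n_workers=10, **cluster_options):
    """Create a cluster, run work(client), and always shut the cluster down.

    Hypothetical helper. `work` is any callable that takes a client;
    `cluster_options` are passed through to gateway.new_cluster().
    """
    cluster = gateway.new_cluster(**cluster_options)
    try:
        cluster.scale(n_workers)
        client = cluster.get_client()
        return work(client)
    finally:
        # Runs even if work() raises, so workers are not left running.
        cluster.shutdown()
```

With the objects defined earlier on this page, a call might look like `run_on_cluster(gateway, lambda client: y.compute(), environment=conda_environment, profile="Standard Worker")`.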
Reconnecting to an existing cluster
Clusters that are still running can be listed through the gateway:
gateway.list_clusters()
[ClusterReport<name=dev.600f7e441656459ca1327e7287db951b, status=RUNNING>]
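Each report in that list carries the cluster's name, and `gateway.connect` accepts a cluster name and returns a handle to the running cluster. A minimal sketch of reconnecting follows; the helper `connect_to_running_cluster` is hypothetical and simply picks the first cluster reported, which may not be the right choice if several are running.

```python
def connect_to_running_cluster(gateway):
    """Return a handle to the first cluster the gateway reports, or None.

    Hypothetical helper; `gateway` is expected to be a dask_gateway.Gateway.
    """
    reports = gateway.list_clusters()
    if not reports:
        return None
    # Each report exposes the cluster name, which gateway.connect() accepts.
    return gateway.connect(reports[0].name)
```

The returned object can then be used like a newly created cluster, for example `cluster = connect_to_running_cluster(gateway)` followed by `client = cluster.get_client()`.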