Add Dask documentation#529

Open
mohalkh5 wants to merge 7 commits into ResearchComputing:main from mohalkh5:dask

Conversation

@mohalkh5
Contributor

This PR adds new documentation for using Dask on Alpine, motivated by Issue 433. The goal of this addition is to provide a clear, end-to-end guide covering:

  • Core Dask concepts
  • Setting up and using a distributed cluster
  • Accessing the Dask dashboard in Open OnDemand
  • Practical examples for common workflows

Contributor

@SchneiderCode SchneiderCode left a comment

This is a great overview of utilizing Dask. Most of my comments below focus either on accessibility or personal preferences around readability/organization.

Comment thread docs/programming/dask.md

![](./dask_images/distributed_cluster.png)

### Setting up a Local Cluster
Contributor

We might want to add a pre-step that shows or references the Jupyter Notebook configuration in Open OnDemand. This matters because `n_workers=4` is likely tied to the resources in the provided configurations (i.e., 4 cores, 4 hours).

This doesn't have to be a thorough discussion - we can also just have a link that points to the relevant OnDemand page in the RTD docs.
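
As a rough illustration of why the session configuration matters, here is a minimal sketch that derives the worker count from the cores actually allocated instead of hard-coding `n_workers=4`. The helper names `allocated_cores` and `start_local_cluster` are hypothetical, and reading `SLURM_CPUS_PER_TASK` assumes the OnDemand session runs under Slurm and sets that variable; otherwise the code falls back to the CPUs visible to the process.

```python
import os


def allocated_cores(default=1):
    """Return the number of CPU cores this session appears to be allowed to use.

    Assumption: under Slurm, SLURM_CPUS_PER_TASK may hold the per-task core count.
    If it is missing or malformed, fall back to the CPUs visible to this process.
    """
    slurm_value = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if slurm_value.isdigit() and int(slurm_value) > 0:
        return int(slurm_value)
    if hasattr(os, "sched_getaffinity"):  # Linux only
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or default


def start_local_cluster():
    """Start a local Dask cluster with one single-threaded worker per core."""
    # Imported here so the helper above can be used without Dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=allocated_cores(), threads_per_worker=1)
    return Client(cluster)
```

In a notebook, `client = start_local_cluster()` would then give 4 workers in a 4-core session, matching the value used in the tutorial, without the reader needing to edit the code for other configurations.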

Comment thread docs/programming/dask.md Outdated
- Uses CPU cores and memory
- Can run on the same node or across multiple nodes

![](./dask_images/distributed_cluster.png)
Contributor

As a general note - we'll need to add alternative text for all of the screenshots. Also, for screenshots that include notes / commands, we'll need to ensure that information is still accessible in the tutorial. For example, we could add a caption that details this information (if not already covered in the nearby instructions).

Contributor

If time allows, consider reformatting this as a split-column layout, with the description of Dask on the left and the image on the right. This could be done with sphinx-design grids:
https://sphinx-design.readthedocs.io/en/stable/grids.html

Comment thread docs/programming/dask.md Outdated
```
This will generate a random array, and it will automatically create the tasks, and from there the sums will be parallelised. This is similar to what you would see in MPI, but much easier to implement.

![](./dask_images/dask-array-output2.png)
Contributor

Accessibility: Alt Text

Check if Dask provides alt-text for the visualization.
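
For context on the passage quoted in this thread, the chunked random-array sum it describes might look roughly like the sketch below. The code that produced the visualization is not shown here, so the array shape, chunk size, and the function names `count_chunks` and `chunked_random_sum` are assumptions for illustration only.

```python
import math


def count_chunks(shape, chunks):
    """Number of blocks an array of `shape` is split into for block size `chunks`."""
    return math.prod(math.ceil(n / c) for n, c in zip(shape, chunks))


def chunked_random_sum(shape=(10_000, 10_000), chunks=(1_000, 1_000)):
    """Sum a random Dask array; each chunk is summed as a separate task."""
    # Imported here so count_chunks can be used without Dask installed.
    import dask.array as da

    x = da.random.random(shape, chunks=chunks)  # lazy: only builds the task graph
    total = x.sum()                             # still lazy: adds reduction tasks
    return total.compute()                      # executes the graph in parallel
```

With the assumed defaults, `count_chunks((10_000, 10_000), (1_000, 1_000))` gives 100 blocks, which is the kind of detail a caption or alt text for the task-graph image could state in words.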

mohalkh5 and others added 6 commits May 3, 2026 22:44
Co-authored-by: Michael Schneider <m.schneider.programmer@gmail.com>
Co-authored-by: Michael Schneider <m.schneider.programmer@gmail.com>
2 participants