• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

dask / dask / 15781
53%

Build:
DEFAULT BRANCH: master
Ran 27 Oct 2020 04:16AM UTC
Jobs 1
Files 113
Run time 17s
Badge
Embed ▾
README BADGES
x

If you need to use a raster PNG badge, change the '.svg' to '.png' in the link

Markdown

Textile

RDoc

HTML

Rst

pending completion
15781

push

travis-ci

web-flow
Begin experimenting with parallel prefix scan for cumsum and cumprod (#6675)

* Begin experimenting with parallel prefix scan for cumsum and cumprod in dask.array

This is a WIP and needs benchmarked.  I think it's interesting, though, and want to share.
It's been a while since I've worked on dask.array, so feedback is most welcome.

This is a work-efficient parallel prefix scan.  It uses a Brent-Kung construction and
is known as the Blelloch algorithm.  We adapt it to work on chunks.

Previously, to do a cumsum across N chunks would require N levels of dependencies.
This PR takes approximately 2 * lg(N) levels of dependencies.  It exposes parallelism.
It is work-efficient and only requires a third more tasks than the previous method.
Scans on floating point values should also be more accurate.

A parallel cumsum works by first taking the sum of each block, then do a binary tree
merge followed by a fan-out (i.e., the Brent-Kung pattern).  We then take the cumsum
of each block and add the sum of the previous blocks.

NumPy calculates cumsum and cumprod very fast, but it calculates sum and prod
significantly faster.  This is why I think this approach will be faster.
Exposing parallelism and an efficient communication pattern is another reason I think
this should be faster (especially when communication costs are significant).

I also think this will be an interesting test for `dask.order` and the scheduler.

Q: Should we allow users to choose which method to use (i.e., prev or new in this PR)?
Does the answer to this depend on benchmarks?

Benchmarks and graph diagrams are forthcoming :)

* Choose cumsum/cumprod with `method=` keyword argument.

Current choices are "sequential", "blelloch", and "blelloch-split".
Default is "sequential".  I need to document these.

* black

* Add docstrings for "blelloch" method for cumsum/cumprod

80 of 80 new or added lines in 2 files covered. (100.0%)

21241 of 22596 relevant lines covered (94.0%)

0.94 hits per line

Jobs
ID Job ID Ran Files Coverage
3 15781.3 (PYTHON_VERSION=3.8 ENV_FILE=continuous_integration/environment-3.8.yaml TEST='true' COVERAGE='true' PARALLEL='false' XTRATESTARGS= TEST_IMPORTS='true' NUMPY_EXPERIMENTAL_ARRAY_FUNCTION='1') 27 Oct 2020 04:16AM UTC 0
94.0
Travis Job 15781.3
Source Files on build 15781
Detailed source file information is not available for this build.
  • Back to Repo
  • Travis Build #15781
  • a8329c20 on github
  • Prev Build on master (#15777)
  • Next Build on master (#15784)
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc