mrocklin / dask
92%
master: 94%

LAST BUILD BRANCH: astype-passthrough
DEFAULT BRANCH: master
Repo Added 19 Apr 2015 07:56PM UTC
Files 106

LAST BUILD ON BRANCH order-sorted
branch: order-sorted

pending completion
2219

push

travis-ci

mrocklin
Avoid sorting large stacks in order

When performing task ordering we sort tasks based on the
number of dependents/dependencies they have.  This is critical to
low-memory processing.

However, sometimes individual tasks have millions of dependencies,
for which an n*log(n) sort adds significant overhead.  In these cases
we give up on sorting and just hope that the tasks are already well
ordered (as is often the case in Python 3.6+, where dicts preserve
insertion order and common graphs are constructed in a natural
order).

See https://github.com/pangeo-data/pangeo/issues/150#issuecomment-373066066
for a real-world case
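
The idea described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the function name `order_dependencies`, the `max_sort` parameter, and its default of 1000 are all hypothetical, chosen only to show the "sort when small, skip when huge" pattern.

```python
def order_dependencies(deps, key, max_sort=1000):
    """Sort deps by key, unless there are too many to sort cheaply.

    max_sort is a hypothetical cutoff for illustration; above it we keep
    the incoming order and rely on it being reasonable already.
    """
    deps = list(deps)
    if len(deps) < max_sort:
        return sorted(deps, key=key)
    return deps  # too large: skip the n*log(n) sort


# Toy graph: each task maps to the set of tasks that depend on it.
dependents = {"a": {"x", "y", "z"}, "b": {"x"}, "c": {"x", "y"}}


def num_dependents(task):
    return len(dependents[task])


# Below the cutoff the tasks are sorted by number of dependents.
small = order_dependencies(["a", "b", "c"], key=num_dependents)

# With the cutoff lowered to 2, the same input is returned unsorted.
large = order_dependencies(["a", "b", "c"], key=num_dependents, max_sort=2)
```

Here `small` comes back ordered by dependent count, while `large` keeps its original order. The trade-off is that a badly ordered large input stays badly ordered, which is the "just hope" caveat in the message above.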

9 of 9 new or added lines in 1 file covered. (100.0%)

14635 of 15903 relevant lines covered (92.03%)

0.92 hits per line

Recent builds

Build | Branch | Commit | Type | Ran | Committer | Via | Coverage
2219 | order-sorted | Avoid sorting large stacks in order | push | 19 Mar 2018 02:54PM UTC | mrocklin | travis-ci | pending completion
2218 | order-sorted | Avoid sorting large stacks in order | push | 19 Mar 2018 02:39PM UTC | mrocklin | travis-ci | pending completion