mrocklin / dask
92%
master: 94%

LAST BUILD BRANCH: astype-passthrough
DEFAULT BRANCH: master
Repo Added 19 Apr 2015 07:56PM UTC
Files 106

LAST BUILD ON BRANCH bag-join-delayed

Build 2209 (pending completion)
Type: push
Via: travis-ci
Committer: mrocklin

Support delayed and single-partition bags in Bag.join

This can significantly improve performance when joining against larger
collections, because it reduces serialization overhead on the distributed
scheduler.

There is still more work to do here for multi-partition joins.

Experiments also show that GC is having a profound effect on performance
here.
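
As a rough illustration of the single-partition case described in the commit message, the sketch below joins each partition of a collection against one shared lookup collection. It is plain Python and is not dask's implementation; the helper name `join_partition`, the sample data, and the key function are all hypothetical.

```python
from collections import defaultdict


def join_partition(other, partition, on_other, on_self):
    """Hash-join one partition against the whole of `other` (illustrative only)."""
    # Index the shared collection once by its join key.
    index = defaultdict(list)
    for item in other:
        index[on_other(item)].append(item)
    # Emit (other_item, self_item) pairs for every matching key.
    return [
        (match, item)
        for item in partition
        for match in index.get(on_self(item), ())
    ]


def first_letter(word):
    return word[0]


# Hypothetical data: a two-partition collection and one small shared collection.
partitions = [["Alice", "Bob"], ["Charlie"]]
fruit = ("Apple", "Apricot", "Banana")

joined = [
    pair
    for part in partitions
    for pair in join_partition(fruit, part, first_letter, first_letter)
]
print(joined)
```

Because every partition joins against the same `fruit` collection, no data has to move between partitions. A join in which both sides have several partitions does not have that property, which may be why the commit message lists multi-partition joins as remaining work.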

17 of 17 new or added lines in 1 file covered. (100.0%)

14613 of 15876 relevant lines covered (92.04%)

0.92 hits per line
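
The coverage percentage above can be checked directly from the two line counts. This is a minimal sketch of that arithmetic; the variable names are illustrative and the rounding to two decimals is an assumption about how the page formats the figure.

```python
# Line counts as reported for this build.
covered_lines = 14613
relevant_lines = 15876

# Coverage is the share of relevant lines that were executed by the tests.
coverage_pct = round(100 * covered_lines / relevant_lines, 2)
missed_lines = relevant_lines - covered_lines

print(f"{coverage_pct}% covered, {missed_lines} lines missed")
```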

Source Files on bag-join-delayed
  • List 0
  • Changed 0
  • Source Changed 0
  • Coverage Changed 0
Coverage ∆ File Lines Relevant Covered Missed Hits/Line

Recent builds

Build | Branch | Commit | Type | Ran | Committer | Via | Coverage
2209 | bag-join-delayed | Support delayed and single-partition bags in Bag.join This can significantly improve performance when joining against larger collections due to serialization overhead on the distributed scheduler. There is still more work to do here for multi-pa... | push | 09 Mar 2018 07:24PM UTC | mrocklin | travis-ci | pending completion
2208 | bag-join-delayed | Support delayed and single-partition bags in Bag.join This can significantly improve performance when joining against larger collections due to serialization overhead on the distributed scheduler. There is still more work to do here for multi-pa... | push | 08 Mar 2018 11:59AM UTC | mrocklin | travis-ci | pending completion
See All Builds (1464)

© 2025 Coveralls, Inc