
SciCrunch / sparc-curation / 550
2%

Build:
DEFAULT BRANCH: master
Ran 22 May 2020 05:43AM UTC
Jobs 1
Files 33
Run time 6s

Build 550 · push · travis-ci · pending completion

tgbugs
massive improvements in spc clone time, setup.py ver and dep bumps

A fresh pull of all 200 remote datasets now takes about 3 minutes.

NOTE: `spc pull` should NOT BE USED unless you know exactly what
you are doing. In the future this functionality will be restored
with better performance, but for now it is almost always faster to
delete the contents of the dataset folder and express ds.rchildren.
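
A minimal sketch of that workaround, assuming `ds` is a dataset cache
whose `local` attribute is a `pathlib.Path` and whose `rchildren`
generator writes each remote child to disk as it is consumed; the
attribute names are assumptions, not the exact API:

    import shutil
    from pathlib import Path

    def refresh_dataset(ds):
        """Stand-in for `spc pull` on one dataset: wipe the folder, then re-express rchildren."""
        local = Path(ds.local)              # assumed: local path of the dataset folder
        for child in local.iterdir():       # delete the contents, keep the folder itself
            if child.is_dir():
                shutil.rmtree(child)
            else:
                child.unlink()
        for rc in ds.rchildren:             # consuming the generator repopulates the tree
            pass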

It only took me about 9 months to finally figure out that I had
actually fixed many of the pulling performance bottlenecks and that we
can almost entirely get rid of the current implementation of pull.

As it turns out I got almost everything sorted out so that it is
possible to just call `list(dataset_cache.rchildren)` and the entire
tree will populate itself. When we fix the cache constructor
this becomes `[rc.materialize() for rc in d.rchildren]` or similar,
depending on exactly what we name that method. Better yet, if we do
it using a bare for loop then the memory overhead will be zero.
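
For illustration, the three forms discussed above side by side;
`dataset_cache` and `d` are the objects named in this message, and
`materialize` stands in for whatever the method ends up being called:

    # builds the whole tree but also keeps every cache object in a list
    children = list(dataset_cache.rchildren)

    # once the cache constructor is fixed, something like this (method name still undecided)
    children = [rc.materialize() for rc in d.rchildren]

    # bare for loop: each child is handled and then dropped, so nothing accumulates in memory
    for rc in d.rchildren:
        rc.materialize()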

The other piece that makes this faster is the completed sparse pull
implementation. We now use the remote package count, with a default
cutoff of 10k packages, to mark a dataset as sparse, meaning that
only its metadata files and their parent directories are pulled. The
implementation of that is a bit slow, but still about 2 orders of
magnitude faster than the alternative. The approach for implementing
is_sparse also points the way toward being able to mark folders with
additional operational information, e.g. that they should not be
exported or that they should not be pulled at all.
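
A rough sketch of the sparse decision and the resulting path filter,
using the 10k default from above; `METADATA_STEMS` and the function
names here are placeholders for illustration, not the actual
implementation:

    from pathlib import PurePath

    SPARSE_CUTOFF = 10_000          # default: datasets with >= 10k remote packages go sparse
    METADATA_STEMS = {'dataset_description', 'subjects', 'samples', 'submission'}

    def is_sparse(package_count, cutoff=SPARSE_CUTOFF):
        # the real check reads the package count from the remote
        return package_count >= cutoff

    def sparse_members(remote_paths):
        """Keep only metadata files and the directories leading to them."""
        keep = set()
        for p in map(PurePath, remote_paths):
            if p.stem.lower() in METADATA_STEMS:
                keep.add(p)
                keep.update(p.parents)      # parent directories are pulled as well
        return keep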

Some tweaks to how spc rmeta works were also made so that existing
metadata will not be repulled in a bulk clone. This work also makes
the BlackfynnCache aware of the dataset metadata pulled from rmeta,
so we should be able to start comparing ttl file and bf:internal
metadata in the near future.
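
A hedged sketch of the skip-existing behaviour for rmeta during a bulk
clone; the on-disk layout and the fetch callable are assumptions made
for the example:

    import json
    from pathlib import Path

    def pull_rmeta(dataset_id, rmeta_dir, fetch_remote_metadata):
        """Fetch dataset metadata once; existing blobs are not repulled on a bulk clone.

        rmeta_dir              assumed directory holding per-dataset metadata blobs
        fetch_remote_metadata  placeholder callable that hits the remote API
        """
        target = Path(rmeta_dir) / f'{dataset_id}.json'
        if target.exists():
            return json.loads(target.read_text())   # already pulled, skip the network
        blob = fetch_remote_metadata(dataset_id)
        target.write_text(json.dumps(blob))
        return blob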

1713 of 8608 relevant lines covered (19.9%)

0.2 hits per line

Jobs
ID: 2
Job ID: 550.2 (SCIGRAPH_API=https://scicrunch.org/api/1/sparc-scigraph SCICRUNCH_API_KEY=[secure])
Ran: 22 May 2020 05:43AM UTC
Files: 0
Coverage: 19.9%
Travis Job 550.2
Source Files on build 550
Detailed source file information is not available for this build.
  • Travis Build #550
  • 113f6327 on github
  • Prev Build on master (#549)
  • Next Build on master (#551)