
apache / carbondata / 1311
76%

Build:
DEFAULT BRANCH: master
Ran 27 Nov 2018 08:47AM UTC
Jobs 1
Files 1017
Run time 47s

push

jenkins

ravipesala
[CARBONDATA-3118] Parallelize block pruning of default datamap in driver for filter query processing


Background:
Block pruning for filter queries is done on the driver side.
In real-time big data scenarios, a single carbon table can have millions of
carbon files.
Currently, block pruning takes around 5 seconds for 1 million carbon files,
as each carbon file takes around 0.005 ms to prune
(with only one filter column set in the 'column_meta_cache' tblproperty).
With more files, block pruning takes even longer.
Also, the Spark job is not launched until block pruning completes, so
the user cannot tell what is happening during that time or why the Spark job
has not started.
Block pruning is slow at present because segments are processed
sequentially; parallelizing it reduces the time.

Solution:
Keep the default number of threads for block pruning at 4.
Users can reduce this number with the carbon property
carbon.max.driver.threads.for.pruning, which accepts values from 1 to 4.
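
As an illustration of the rule above, here is a minimal Java sketch of resolving the thread count from a property set. Only the property name, the default of 4, and the 1 to 4 range come from this commit message. The class name, the helper name, and the choice to fall back to the default on a missing, malformed, or out-of-range value are assumptions, not the CarbonData implementation.

```java
import java.util.Properties;

public class PruningThreadConfig {
    // Property name taken from the commit message.
    public static final String KEY = "carbon.max.driver.threads.for.pruning";
    // Default (and maximum) thread count stated in the commit message.
    public static final int DEFAULT_THREADS = 4;

    // Hypothetical helper: returns the configured thread count, or the default
    // when the value is missing, not a number, or outside 1..4 (assumed policy).
    public static int resolveThreads(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_THREADS;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return (n >= 1 && n <= DEFAULT_THREADS) ? n : DEFAULT_THREADS;
        } catch (NumberFormatException e) {
            return DEFAULT_THREADS;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "2");
        System.out.println("pruning threads = " + resolveThreads(props));
    }
}
```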

In TableDataMap.prune():

Group the segments by thread, distributing an equal number of carbon files to
each thread.
Launch a thread for each group of segments to handle block pruning.
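
The two steps above can be sketched in Java as follows. This is not the code from TableDataMap.prune(): the Segment class, the method names, and the use of a plain string predicate as the "filter" are hypothetical stand-ins. The balancing strategy (largest segment first, always into the least-loaded group) is one assumed way to give each thread a roughly equal number of files; the actual commit may split the work differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelPruneSketch {
    // Hypothetical stand-in for a segment: an id plus its carbon file names.
    public static final class Segment {
        public final String id;
        public final List<String> files;
        public Segment(String id, List<String> files) {
            this.id = id;
            this.files = files;
        }
    }

    // Step 1: group segments so each thread gets a similar number of files.
    // Assumed strategy: largest segment first, into the least-loaded group.
    public static List<List<Segment>> groupByFileCount(List<Segment> segments, int threads) {
        List<List<Segment>> groups = new ArrayList<>();
        long[] load = new long[threads];
        for (int i = 0; i < threads; i++) {
            groups.add(new ArrayList<>());
        }
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingInt((Segment s) -> s.files.size()).reversed());
        for (Segment s : sorted) {
            int min = 0;
            for (int i = 1; i < threads; i++) {
                if (load[i] < load[min]) {
                    min = i;
                }
            }
            groups.get(min).add(s);
            load[min] += s.files.size();
        }
        return groups;
    }

    // Step 2: run one pruning task per group and merge the surviving files.
    public static List<String> prune(List<Segment> segments, int threads, Predicate<String> filter) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (List<Segment> group : groupByFileCount(segments, threads)) {
                tasks.add(() -> {
                    List<String> kept = new ArrayList<>();
                    for (Segment s : group) {
                        for (String f : s.files) {
                            if (filter.test(f)) {
                                kept.add(f);
                            }
                        }
                    }
                    return kept;
                });
            }
            List<String> result = new ArrayList<>();
            // invokeAll blocks until every group has been pruned.
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                result.addAll(f.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment("0", List.of("a-keep", "b-drop", "c-keep")));
        segments.add(new Segment("1", List.of("d-drop")));
        System.out.println(prune(segments, 2, f -> f.endsWith("keep")));
    }
}
```

Because the driver still waits for every group to finish, the time to prune is bounded by the slowest group, which is why balancing by file count rather than by segment count matters when segment sizes differ.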

This closes #2936

61617 of 77469 relevant lines covered (79.54%)

1.05 hits per line

Jobs
ID  Job ID  Ran                      Files  Coverage
1   1311.1  27 Nov 2018 08:47AM UTC  0      79.54
Source Files on build 1311
Detailed source file information is not available for this build.
  • Jenkins Build #1311
  • 0f5e898b on github