lightcopy / parquet-index / 288
96%
master: 97%

Build:
LAST BUILD BRANCH: v0.5.0
DEFAULT BRANCH: master
Ran 22 Mar 2017 06:57AM UTC
Jobs 5
Files 0
Run time 1s
pending completion
288

push

travis-ci

web-flow
add futures for pruning indexed partitions (#74)

This PR replaces the sequential resolution in `foldFilter` with futures, so the `resolveSupported` method runs in parallel for each file in a Parquet partition; the partitions themselves are still resolved sequentially.
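
The pattern described above can be sketched roughly as follows. This is an illustration in Python, not the project's Scala code: `resolve_supported` and `fold_filter` are hypothetical stand-ins named after the methods in the commit message, and the partition/file dictionaries, the `values` field, and the rule that a partition with no matching files is dropped are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the per-file check: decide whether a file may
# contain rows matching the filter. A "file" here is a dict carrying the set
# of values its index says it might hold (an assumption for illustration).
def resolve_supported(file_meta, wanted):
    return wanted in file_meta["values"]

def fold_filter(partitions, wanted, max_workers=4):
    """Visit partitions one after another; resolve the files inside each
    partition concurrently through futures."""
    kept = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partition in partitions:  # sequential over partitions
            futures = [pool.submit(resolve_supported, f, wanted)
                       for f in partition["files"]]  # parallel over files
            # Block until every file in this partition has been resolved.
            matches = [f for f, fut in zip(partition["files"], futures)
                       if fut.result()]
            if matches:  # assumed pruning rule: drop partitions with no hits
                kept.append({"name": partition["name"], "files": matches})
    return kept

partitions = [
    {"name": "p0", "files": [{"path": "a.parquet", "values": {1, 2}},
                             {"path": "b.parquet", "values": {3}}]},
    {"name": "p1", "files": [{"path": "c.parquet", "values": {4}}]},
]
result = fold_filter(partitions, wanted=3)
print([p["name"] for p in result])
```

Submitting all futures for a partition before waiting on any of them is what makes the per-file work overlap; the fixed cost of scheduling those futures is also a plausible source of the extra time seen on an already cached index.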

Manual testing shows roughly a 1.5-2x speedup when filtering indexed partitions where a large portion of the files is scanned. It also adds a small overhead when filtering partitions on a cached index (20 ms vs 35 ms). The approach is similar to eager loading.

Benchmarks:
Each test includes a cold start and a warm start (the same query, but with all necessary column filters already loaded by the previous step).

### Dataset with 1000 partitions
#### Search 1 record
`master`
```
Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 438.53 ms
Post-Scan filters: isnotnull(strid#13),(strid#13 = 35732)

Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 37.332 ms
Post-Scan filters: isnotnull(strid#28),(strid#28 = 35732)
```

`filter-resolution`
```
Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 365.721 ms
Post-Scan filters: isnotnull(strid#7),(strid#7 = 35732)

Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 39.74 ms
Post-Scan filters: isnotnull(strid#22),(strid#22 = 35732)
```

#### Search all records
`master`
```
Applying index filters: IsNotNull(col4),EqualTo(col4,value)
Filtered indexed partitions in 641.424 ms
Post-Scan filters: isnotnull(col4#11),(col4#11 = value)
```

`filter-resolution`
```
Applying index filters: IsNotNull(col4),EqualTo(col4,value)
Filtered indexed partitions in 390.783 ms
Post-Scan filters: isnotnull(col4#11),(col4#11 = value)
```
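
As a quick check on the timings above for the 1000-partition dataset, the following sketch pulls the durations out of the log lines and computes the ratio of `master` to `filter-resolution`. The helper names `timings_ms` and `speedup` are hypothetical; the log strings are copied from the output above.

```python
import re

# Matches the duration in lines such as
# "Filtered indexed partitions in 438.53 ms".
TIMING = re.compile(r"Filtered indexed partitions in ([\d.]+) ms")

def timings_ms(log_text):
    """Return every reported filtering time, in log order."""
    return [float(value) for value in TIMING.findall(log_text)]

def speedup(baseline_ms, candidate_ms):
    """Ratio above 1 means the candidate branch is faster."""
    return baseline_ms / candidate_ms

# "Search 1 record": first line is the cold start, second the warm start.
master_one = timings_ms(
    "Filtered indexed partitions in 438.53 ms\n"
    "Filtered indexed partitions in 37.332 ms\n"
)
branch_one = timings_ms(
    "Filtered indexed partitions in 365.721 ms\n"
    "Filtered indexed partitions in 39.74 ms\n"
)
# "Search all records": a single run per branch.
master_all = timings_ms("Filtered indexed partitions in 641.424 ms\n")
branch_all = timings_ms("Filtered indexed partitions in 390.783 ms\n")

cold = speedup(master_one[0], branch_one[0])
warm = speedup(master_one[1], branch_one[1])
scan_all = speedup(master_all[0], branch_all[0])
print(f"cold: {cold:.2f}x, warm: {warm:.2f}x, all records: {scan_all:.2f}x")
```

For these runs the ratios come out at about 1.2x (cold, one record), 0.94x (warm, one record) and 1.64x (all records), in line with the commit message: the gain is largest when many files are scanned, and the warm-start case is marginally slower.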

### Dataset with 400 partitions
#### Search 1 record
`master`
```
Applying index filters: IsNotNull(code),EqualTo(co... (continued)
```

1058 of 1106 relevant lines covered (95.66%)

0.96 hits per line

Jobs
| ID | Job ID | Ran | Files | Coverage (%) | Travis |
|----|--------|-----|-------|--------------|--------|
| 1 | 288.1 (TEST_SPARK_VERSION="2.0.0" TEST_SPARK_RELEASE="NA") | 22 Mar 2017 06:58AM UTC | 0 | 95.57 | Travis Job 288.1 |
| 2 | 288.2 (TEST_SPARK_VERSION="2.0.1" TEST_SPARK_RELEASE="NA") | 22 Mar 2017 06:58AM UTC | 0 | 95.66 | Travis Job 288.2 |
| 3 | 288.3 (TEST_SPARK_VERSION="2.0.0" TEST_SPARK_RELEASE="spark-2.0.0-bin-hadoop2.6") | 22 Mar 2017 06:58AM UTC | 0 | 95.57 | Travis Job 288.3 |
| 4 | 288.4 (TEST_SPARK_VERSION="2.0.1" TEST_SPARK_RELEASE="spark-2.0.1-bin-hadoop2.6") | 22 Mar 2017 06:57AM UTC | 0 | 95.66 | Travis Job 288.4 |
| 5 | 288.5 (TEST_SPARK_VERSION="2.0.2" TEST_SPARK_RELEASE="spark-2.0.2-bin-hadoop2.6") | 22 Mar 2017 06:58AM UTC | 0 | 95.66 | Travis Job 288.5 |
Source Files on build 288
Detailed source file information is not available for this build.
  • Travis Build #288
  • bc3b5028 on github
  • Prev Build on dictionary-filter-2 (#186)