lightcopy / parquet-index / 283
97%

Build:
DEFAULT BRANCH: master
Ran 18 Mar 2017 12:16AM UTC
Jobs 5
Files 22
Run time 2min
pending completion
283

push

travis-ci

web-flow
add futures for pruning indexed partitions (#74)

This PR replaces the sequential resolution in `foldFilter` with futures, so that the `resolveSupported` method runs in parallel for each file in a Parquet partition; the partitions themselves are still resolved sequentially.
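The scheme described above (files within a partition resolved concurrently, partitions visited one at a time) can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's own Scala code: `resolve_supported`, `prune_partitions`, the `min`/`max` file statistics and the sample data are all hypothetical stand-ins, not the actual `foldFilter` or `resolveSupported` implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_supported(file_status, value):
    """Hypothetical per-file check: can this file contain `value`?"""
    return file_status["min"] <= value <= file_status["max"]


def prune_partitions(partitions, value, max_workers=8):
    """Keep only files that may match; drop partitions left with no files."""
    pruned = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partition in partitions:  # partitions are visited sequentially
            files = partition["files"]
            # One future per file, so the per-file checks run in parallel.
            futures = [pool.submit(resolve_supported, f, value) for f in files]
            kept = [f for f, fut in zip(files, futures) if fut.result()]
            if kept:
                pruned.append({"values": partition["values"], "files": kept})
    return pruned


# Hypothetical sample data: two partitions with simple min/max statistics.
partitions = [
    {"values": "p=1", "files": [{"path": "a", "min": 0, "max": 100},
                                {"path": "b", "min": 30000, "max": 40000}]},
    {"values": "p=2", "files": [{"path": "c", "min": 50000, "max": 60000}]},
]
result = prune_partitions(partitions, 35732)
```

With this sample data, only file `b` in partition `p=1` survives pruning. Waiting on `fut.result()` inside the loop is what keeps partition order sequential while still overlapping the per-file work.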

Manual testing shows about a 1.5-2x performance improvement when filtering index partitions where a large portion of the files is scanned. It also introduces a small overhead when filtering partitions on a cached index (20ms vs 35ms). The approach is similar to eager loading.

Benchmarks:
Each test includes a cold start and a warm start (the same query, but with all necessary column filters already loaded by the previous step).
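To make the cold/warm distinction concrete, here is a small Python sketch of the measurement pattern, under the assumption that column filters are loaded once and then served from a cache. The names `load_column_filter` and `timed_query`, the file paths and the simulated 10 ms load delay are all hypothetical; the real benchmark ran Spark queries against the index.

```python
import time
from functools import lru_cache

load_count = 0  # number of times a filter was actually loaded


@lru_cache(maxsize=None)
def load_column_filter(path):
    """Hypothetical loader: simulates reading a column filter from storage."""
    global load_count
    load_count += 1
    time.sleep(0.01)  # stand-in for I/O cost on a cold start
    return frozenset({35732})


def timed_query(paths, value):
    """Return the matching paths and the elapsed time in milliseconds."""
    start = time.perf_counter()
    hits = [p for p in paths if value in load_column_filter(p)]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return hits, elapsed_ms


paths = [f"part-{i}.filter" for i in range(5)]  # hypothetical file names

cold_hits, cold_ms = timed_query(paths, 35732)  # cold: every filter is loaded
loads_after_cold = load_count
warm_hits, warm_ms = timed_query(paths, 35732)  # warm: served from the cache
loads_after_warm = load_count
```

The second call performs no additional loads, which is why the warm-start timings in the logs below are much smaller than the cold-start ones.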

### Dataset with 1000 partitions
#### Search 1 record
`master`
```
Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 438.53 ms
Post-Scan filters: isnotnull(strid#13),(strid#13 = 35732)

Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 37.332 ms
Post-Scan filters: isnotnull(strid#28),(strid#28 = 35732)
```

`filter-resolution`
```
Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 365.721 ms
Post-Scan filters: isnotnull(strid#7),(strid#7 = 35732)

Applying index filters: IsNotNull(strid),EqualTo(strid,35732)
Filtered indexed partitions in 39.74 ms
Post-Scan filters: isnotnull(strid#22),(strid#22 = 35732)
```

#### Search all records
`master`
```
Applying index filters: IsNotNull(col4),EqualTo(col4,value)
Filtered indexed partitions in 641.424 ms
Post-Scan filters: isnotnull(col4#11),(col4#11 = value)
```

`filter-resolution`
```
Applying index filters: IsNotNull(col4),EqualTo(col4,value)
Filtered indexed partitions in 390.783 ms
Post-Scan filters: isnotnull(col4#11),(col4#11 = value)
```

### Dataset with 400 partitions
#### Search 1 record
`master`
```
Applying index filters: IsNotNull(code),EqualTo(code,339382)
Filtered indexed partitions in 1178.762 ms
Post-Scan filters: isnotnull(code#8),(code#8 = 339382)

Applying index filters: IsNotNull(code),EqualTo(code,339382)
Filtered indexed partitions in 17.909 ms
Post-Scan filters: isnotnull(code#21),(code#21 = 339382)
```

`filter-resolution`
```
Applying index filters: IsNotNull(code),EqualTo(code,339382)
Filtered indexed partitions in 509.551 ms
Post-Scan filters: isnotnull(code#8),(code#8 = 339382)

Applying index filters: IsNotNull(code),EqualTo(code,339382)
Filtered indexed partitions in 12.801 ms
Post-Scan filters: isnotnull(code#21),(code#21 = 339382)
```
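The speedups implied by the logs above can be recomputed directly from the `Filtered indexed partitions in ... ms` lines. The following Python sketch extracts the timings and divides the `master` time by the `filter-resolution` time. The helper names `timings_ms` and `speedup` are illustrative, and the log strings are trimmed to the timing lines only.

```python
import re

TIMING = re.compile(r"Filtered indexed partitions in ([0-9.]+) ms")


def timings_ms(log_text):
    """Extract every reported filtering time, in log order (cold, then warm)."""
    return [float(t) for t in TIMING.findall(log_text)]


def speedup(before_ms, after_ms):
    """Ratio > 1 means the branch is faster than master."""
    return before_ms / after_ms


# (master log, filter-resolution log) per benchmark, timing lines only.
runs = {
    "1000 partitions, 1 record": (
        "Filtered indexed partitions in 438.53 ms\n"
        "Filtered indexed partitions in 37.332 ms",
        "Filtered indexed partitions in 365.721 ms\n"
        "Filtered indexed partitions in 39.74 ms",
    ),
    "1000 partitions, all records": (
        "Filtered indexed partitions in 641.424 ms",
        "Filtered indexed partitions in 390.783 ms",
    ),
    "400 partitions, 1 record": (
        "Filtered indexed partitions in 1178.762 ms\n"
        "Filtered indexed partitions in 17.909 ms",
        "Filtered indexed partitions in 509.551 ms\n"
        "Filtered indexed partitions in 12.801 ms",
    ),
}

ratios = {
    name: [round(speedup(b, a), 2)
           for b, a in zip(timings_ms(master), timings_ms(branch))]
    for name, (master, branch) in runs.items()
}

for name, values in ratios.items():
    print(name, values)
```

Each list holds the cold-start ratio first and, where a second run was logged, the warm-start ratio second. These are single manual runs, so the ratios are indicative rather than statistically robust.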

Closes #56.

1060 of 1106 relevant lines covered (95.84%)

4.78 hits per line

Jobs
| ID | Job ID | Ran | Files | Coverage |
|----|--------|-----|-------|----------|
| 1 | 283.1 (TEST_SPARK_VERSION="2.0.0" TEST_SPARK_RELEASE="NA") | 18 Mar 2017 12:16AM UTC | 0 | 95.57 |
| 2 | 283.2 (TEST_SPARK_VERSION="2.0.1" TEST_SPARK_RELEASE="NA") | 18 Mar 2017 12:16AM UTC | 0 | 95.66 |
| 3 | 283.3 (TEST_SPARK_VERSION="2.0.0" TEST_SPARK_RELEASE="spark-2.0.0-bin-hadoop2.6") | 18 Mar 2017 12:17AM UTC | 0 | 95.57 |
| 4 | 283.4 (TEST_SPARK_VERSION="2.0.1" TEST_SPARK_RELEASE="spark-2.0.1-bin-hadoop2.6") | 18 Mar 2017 12:16AM UTC | 0 | 95.66 |
| 5 | 283.5 (TEST_SPARK_VERSION="2.0.2" TEST_SPARK_RELEASE="spark-2.0.2-bin-hadoop2.6") | 18 Mar 2017 12:16AM UTC | 0 | 95.66 |
Source Files on build 283
Detailed source file information is not available for this build.
  • Back to Repo
  • Travis Build #283
  • bc3b5028 on github
  • Prev Build on master (#279)
  • Next Build on master (#291)