wesm / parquet-cpp / 452
88%

Build:
DEFAULT BRANCH: master
Ran 25 Dec 2016 03:11PM UTC
Jobs 2
Files 135
Run time 6min
pending completion
452

push

travis-ci

wesm
c2d8df9fb
PARQUET-816: Workaround for incorrect column chunk metadata in parquet-mr <= 1.2.8

This turned up while reading old data files generated by parquet-mr in 2013. There's a bug in parquet-mr 1.2.8 and lower in which the column chunk metadata in the Parquet file is incorrect. Impala inserted an explicit workaround for this (see https://github.com/apache/incubator-impala/blob/88448d1d4ab31eaaf82f764b36dc7d11d4c63c32/be/src/exec/hdfs-parquet-scanner.cc#L1227).

In this particular file, the dictionary page header is 15 bytes, and the entire column chunk is: 15 (dict page header) + 277 (dictionary) + 17 (data page header) + 28 (data page) bytes, making 337 bytes.

But the metadata says the column chunk is only 322 bytes – the dict page header size got dropped from the accounting.
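
The accounting above can be checked with a few lines of C++. This is only a sketch of the arithmetic for this one file, not the parquet-cpp or Impala workaround itself; the names `ColumnChunkLayout`, `ActualChunkSize`, `MissingBytes`, `kExample`, and `kMetadataSize` are hypothetical and do not come from either code base.

```cpp
#include <cstdint>

// Hypothetical helper type: page sizes in bytes for one column chunk.
struct ColumnChunkLayout {
  int64_t dict_page_header;
  int64_t dictionary;
  int64_t data_page_header;
  int64_t data_page;
};

// Size the column chunk really occupies on disk: the sum of all four parts.
constexpr int64_t ActualChunkSize(const ColumnChunkLayout& c) {
  return c.dict_page_header + c.dictionary + c.data_page_header + c.data_page;
}

// Number of bytes the metadata fails to account for.
constexpr int64_t MissingBytes(const ColumnChunkLayout& c, int64_t metadata_size) {
  return ActualChunkSize(c) - metadata_size;
}

// Values quoted in the commit message for the example file.
constexpr ColumnChunkLayout kExample{15, 277, 17, 28};
constexpr int64_t kMetadataSize = 322;

// The shortfall equals the dictionary page header size.
static_assert(ActualChunkSize(kExample) == 337, "15 + 277 + 17 + 28");
static_assert(MissingBytes(kExample, kMetadataSize) == kExample.dict_page_header,
              "metadata omits the dictionary page header");
```

A reader that trusts the 322-byte figure would stop 15 bytes short of the end of the data page, which is why the commit list includes a change to decode truncated data pages optimistically.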

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #209 from wesm/PARQUET-816 and squashes the following commits:

21fdcbe [Wes McKinney] Move FileVersion to an inner class in FileMetaData
64e7f95 [Wes McKinney] Remove unnecessary std::move causing clang warning
bacb815 [Wes McKinney] Fix compilation error in benchmarks
f4c259e [Wes McKinney] cpplint
1e8c160 [Wes McKinney] clang-format
d2aa9a8 [Wes McKinney] Do not continue reading data pages in SerializedPageReader reading the indicated number of rows in a row group
2638490 [Wes McKinney] Bring in IMPALA-694 workaround for PARQUET-816
bd3e949 [Wes McKinney] Optimistically decode truncated data pages. Add example data file

10362 of 10749 relevant lines covered (96.4%)

64589.77 hits per line

Jobs
ID  Job ID  Ran                      Files  Coverage  Link
1   452.1   25 Dec 2016 03:17PM UTC  0      96.4      Travis Job 452.1
2   452.2   25 Dec 2016 03:11PM UTC  0      0.0       Travis Job 452.2
Source Files on build 452
Detailed source file information is not available for this build.