apache / parquet-cpp / 1715

Build:
DEFAULT BRANCH: master
Ran 20 Sep 2017 02:01AM UTC
Jobs 3
Files 0
Run time –
pending completion
1715

push

travis-ci

wesm
PARQUET-1100: Introduce RecordReader interface to better support nested data, refactor parquet/arrow/reader

We did not have very consistent logic around reading values from leaf nodes versus reading semantic records where the repetition level is greater than zero. This introduces a reader class that reads from column chunks until it identifies the end of records. It also reads values (with spaces, if required by the schema) into internal buffers. This permitted a substantial refactoring and simplification of the code in parquet::arrow where we were handling the interpretation of batch reads as records manually.
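The two ideas in the paragraph above, delimiting records by repetition level and reading values "with spaces", can be illustrated with a small sketch. This is not the parquet-cpp implementation: `RecordScan`, `DelimitRecords`, and `PlaceSpaced` are hypothetical names chosen for illustration, the batch is assumed to start on a record boundary, and the spaced placement is simplified to a flat nullable column of `int32_t` values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type (not part of parquet-cpp).
struct RecordScan {
  int64_t records;          // complete records found in the batch
  int64_t levels_consumed;  // level entries belonging to those records
};

// A repetition level of 0 marks the first entry of a new record, so a record
// is only known to be complete once the next 0 is seen, or once the column
// chunk is exhausted. Assumes the batch begins on a record boundary.
inline RecordScan DelimitRecords(const std::vector<int16_t>& rep_levels,
                                 bool at_end_of_chunk) {
  assert(rep_levels.empty() || rep_levels[0] == 0);
  RecordScan out{0, 0};
  bool in_record = false;
  for (std::size_t i = 0; i < rep_levels.size(); ++i) {
    if (rep_levels[i] == 0) {
      if (in_record) {
        // The record that was open ends just before entry i.
        ++out.records;
        out.levels_consumed = static_cast<int64_t>(i);
      }
      in_record = true;
    }
  }
  if (in_record && at_end_of_chunk) {
    // No more data can follow, so the trailing record is complete too.
    ++out.records;
    out.levels_consumed = static_cast<int64_t>(rep_levels.size());
  }
  return out;
}

// Expands densely stored values into one slot per level entry. Slots whose
// definition level is below max_def_level are nulls; they stay
// zero-initialized so no uninitialized memory is ever read back.
inline std::vector<int32_t> PlaceSpaced(const std::vector<int32_t>& values,
                                        const std::vector<int16_t>& def_levels,
                                        int16_t max_def_level) {
  std::vector<int32_t> spaced(def_levels.size(), 0);
  std::size_t next = 0;
  for (std::size_t i = 0; i < def_levels.size(); ++i) {
    if (def_levels[i] == max_def_level) {
      assert(next < values.size());
      spaced[i] = values[next++];
    }
  }
  return spaced;
}
```

With repetition levels `{0, 1, 1, 0, 0, 1}` and more data still to come, the scan reports two complete records spanning four level entries; the third record stays open because a later batch could extend it.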

As a follow-up patch, we should be able to take a collection of record readers from the same "tree" in a nested type, reassemble the intermediate Arrow structure, and deal with any redundant structure information in repetition and definition levels. This should allow a unification of our nested data read code path so that we can read arbitrary nested structures.

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #398 from wesm/PARQUET-1100 and squashes the following commits:

9ea85d9 [Wes McKinney] Revert to const args
f4dc0fe [Wes McKinney] Make parquet::schema::Node non-copyable. Use const-refs instead of const-ptr for non-nullable argument
0d859cc [Wes McKinney] Code review comments, scrubbing some flakes
1368415 [Wes McKinney] Fix more MSVC warnings
eccb84c [Wes McKinney] Give macro more accurate name
0eaada0 [Wes McKinney] Use int64_t instead of int for batch sizes
79c3709 [Wes McKinney] Add documentation. Remove RecordReader from public API
8fa619b [Wes McKinney] Initialize memory in DecodeSpaced to avoid undefined behavior
5a0c860 [Wes McKinney] Remove non-repeated branch from DelimitRecords
c754e6e [Wes McKinney] Refactor to skip record delimiting for non-repeated data
ed2a03f [Wes McKinney] Move more code into TypedRecordReader
2e934e9 [Wes McKinney] Set some integers as const
58d3a0f [Wes McKinney] Do not index into levels arrays
b766371 [Wes McKinney] Add RecordReader::Reserve to preallocate, fixing perf regression. cpplint
1bf3e8f [Wes McKinney] Refactor to create stateful parquet::RecordReader class to better support nested data. Shift value buffering logic from parquet/arrow/reader into RecordReader. Fix bug described in PARQUET-1100
Jobs
ID  Job ID  Ran                      Files  Coverage
1   1715.1  20 Sep 2017 02:02AM UTC  0
2   1715.2  20 Sep 2017 02:01AM UTC  0
3   1715.3  20 Sep 2017 02:02AM UTC  0
Source Files on build 1715
Detailed source file information is not available for this build.
  • 4b09ac70 on github