jharwell / sierra / 14916511482

08 May 2025 08:55PM UTC coverage: 80.239% (+0.05%) from 80.194%
push

github

jharwell
feature(#326): Arrow storage

- Start updating docs/code to say "output files" instead of "csv"

- Move flattening to be a platform callback so it can be done before scaffolding
  a batch experiment.

- Start hacking at statistics generation to support arrow and CSV. Things seem
  to work with arrow, but need to re-run some imagizing/csv tests to verify
  things aren't broken in other ways.

- Add a placeholder for fleshing out SIERRA's dataflow model, which is a really
  important aspect of usage which currently isn't documented.

- Remove excessive class usage in DataFrame{Reader,Writer}

- Overhaul collation and fix nasty bug where data was only being gathered from 1
  run per sim; no idea how long that has been in there. Added an assert so that
  can't happen again.
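
The last bullet describes guarding collation with an assert so that data from only one run can no longer be gathered silently. A minimal sketch of that kind of guard follows. It is not SIERRA's actual code: `collate_runs`, its parameters, and the run names are hypothetical, and the real check may compare different quantities.

```python
import typing as tp

import pandas as pd


def collate_runs(run_dfs: tp.Dict[str, pd.DataFrame],
                 n_expected: int,
                 column: str) -> pd.DataFrame:
    """Gather one column from every run into a single dataframe.

    Hypothetical helper: ``run_dfs`` maps a run name to that run's output
    dataframe, and ``n_expected`` is the number of runs per simulation.
    """
    # Fail loudly if some runs were skipped, rather than collating a subset.
    assert len(run_dfs) == n_expected, (
        f"Gathered {len(run_dfs)} runs, expected {n_expected}"
    )
    # One output column per run, in insertion order.
    return pd.DataFrame({name: df[column] for name, df in run_dfs.items()})
```

With three runs and `n_expected=3` the helper returns a dataframe with three columns; passing a single run with `n_expected=3` raises `AssertionError` instead of quietly producing a one-column result.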

349 of 385 new or added lines in 28 files covered. (90.65%)

3 existing lines in 3 files now uncovered.

5441 of 6781 relevant lines covered (80.24%)

0.8 hits per line
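
The percentages above follow directly from the line counts. As a quick check, the sketch below recomputes them; `coverage_pct` is a name introduced here for illustration only and is not part of Coveralls or SIERRA.

```python
def coverage_pct(covered: int, relevant: int) -> float:
    """Return covered lines as a percentage of relevant lines."""
    return 100.0 * covered / relevant


# Counts taken from the build summary above.
new_lines_pct = coverage_pct(349, 385)    # new or added lines
overall_pct = coverage_pct(5441, 6781)    # all relevant lines

print(f"new/added: {new_lines_pct:.2f}%  overall: {overall_pct:.3f}%")
```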

Source File
/sierra/plugins/storage/arrow/plugin.py (75.0% covered)
Uncovered new lines: the return statement in suffixes(), the pd.read_feather() call in df_read(), and the df.to_feather() call in df_write(). All other executable lines were hit once.

# Copyright 2025 John Harwell, All rights reserved.
#
#  SPDX-License-Identifier: MIT
"""
Plugin for reading/writing Apache .arrow files.
"""

# Core packages
import pathlib
import typing as tp

# 3rd party packages
from retry import retry
import pandas as pd

# Project packages


def suffixes() -> tp.Set[str]:
    return {'.arrow'}


@retry(pd.errors.ParserError, tries=10, delay=0.100, backoff=1.1)  # type:ignore
def df_read(path: pathlib.Path, **kwargs) -> pd.DataFrame:
    """
    Read a pandas dataframe from an Apache .arrow file.
    """
    return pd.read_feather(path)


@retry(pd.errors.ParserError, tries=10, delay=0.100, backoff=1.1)  # type:ignore
def df_write(df: pd.DataFrame, path: pathlib.Path, **kwargs) -> None:
    """
    Write a pandas dataframe to an Apache .arrow file.
    """
    df.to_feather(path)
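
The three uncovered statements are the function bodies themselves, so a single write-then-read round trip would exercise two of them. The sketch below shows such a check using the same pandas calls as `df_read()` and `df_write()`. It is an assumption-laden illustration rather than a test from the SIERRA repository: `roundtrip` and `check_roundtrip` are hypothetical names, the `retry` decorator is omitted, and pandas needs `pyarrow` installed for Feather I/O.

```python
import pathlib
import tempfile

import pandas as pd


def roundtrip(df: pd.DataFrame, path: pathlib.Path) -> pd.DataFrame:
    """Write ``df`` to ``path`` in Feather/Arrow format and read it back."""
    df.to_feather(path)            # same call as the body of df_write()
    return pd.read_feather(path)   # same call as the body of df_read()


def check_roundtrip() -> bool:
    """Return True if a small dataframe survives the round trip unchanged."""
    # Feather requires string column names and a default index.
    df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "data.arrow"
        return roundtrip(df, path).equals(df)
```

One thing worth checking when adding such a test: `pd.errors.ParserError` is the exception pandas raises for malformed delimited text, so the `retry` decorators in the listing may never trigger for Feather files; whether that is intended is not stated in the source.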