pirl-unc / hitlist / 24699661514
80%

Build:
DEFAULT BRANCH: main
Ran 21 Apr 2026 01:44AM UTC
Jobs 1
Files 21
Run time 1min
21 Apr 2026 01:43AM UTC coverage: 47.309% (+1.8%) from 45.555%
push

github

web-flow
v1.12.2: Wire bulk_proteomics.parquet into builder + harmonize metadata (#90)

Makes bulk proteomics a first-class sibling of observations.parquet /
binding.parquet. `hitlist data build` now emits a third parquet at
~/.hitlist/bulk_proteomics.parquet — a long-form table with rows for
both protein- and peptide-level measurements (distinguished by
`granularity`), with per-source acquisition metadata denormalized onto
every row so the file is self-contained for MS-bias modeling.
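As a rough illustration of the long-form layout described above, the sketch below groups a few made-up rows by `granularity`. The values `"protein"` and `"peptide"`, the `feature` and `intensity` keys, and the placeholder instrument strings are assumptions for illustration only; the commit message names the `granularity` column but does not list its values.

```python
# Hypothetical rows standing in for bulk_proteomics.parquet; not real hitlist data.
# Acquisition metadata (here just `instrument`) is repeated on every row,
# mirroring the "denormalized onto every row" design.
ROWS = [
    {"granularity": "protein", "feature": "P1", "cell_line_name": "HeLa",
     "instrument": "instrument-A", "intensity": 10.0},
    {"granularity": "protein", "feature": "P2", "cell_line_name": "HeLa",
     "instrument": "instrument-A", "intensity": 20.0},
    {"granularity": "peptide", "feature": "PEPTIDEK", "cell_line_name": "HeLa",
     "instrument": "instrument-A", "intensity": 5.0},
]


def split_by_granularity(rows):
    """Group long-form rows by their `granularity` value."""
    groups = {}
    for row in rows:
        groups.setdefault(row["granularity"], []).append(row)
    return groups


by_granularity = split_by_granularity(ROWS)
print({name: len(group) for name, group in by_granularity.items()})
```

Because each row carries its own metadata, either subset can be analyzed without joining back to a separate source table.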

Schema is harmonized with the per-sample schema in hitlist.export for
observations.parquet — same column names for instrument, instrument_type,
fragmentation, acquisition_mode, labeling, search_engine, fdr,
cell_line_name, sample_label, pmid, study_label, species — so the same
column list extracts from either index for joint analysis. Bulk-specific
prep fields (digestion, digestion_enzyme, fractionation, n_fractions,
quantification) sit alongside. See #89 for row-level digestion_enzyme
follow-up with non-tryptic digests.
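A minimal sketch of what the shared column list makes possible: one projection that works on records from either table. The column names come from the paragraph above. The `project` helper, the `source_table` tag, and the toy records are hypothetical and are not part of hitlist.

```python
# Harmonized columns listed in the commit message.
HARMONIZED_COLUMNS = [
    "instrument", "instrument_type", "fragmentation", "acquisition_mode",
    "labeling", "search_engine", "fdr", "cell_line_name", "sample_label",
    "pmid", "study_label", "species",
]


def project(records, source_table, columns=HARMONIZED_COLUMNS):
    """Keep only the harmonized columns and tag each record with its origin.

    Missing keys become None so both tables yield the same shape.
    """
    return [
        {**{col: rec.get(col) for col in columns}, "source_table": source_table}
        for rec in records
    ]


# Toy records (assumed values); each table also has columns the other lacks.
observation_rows = [{"instrument": "instrument-A", "species": "human", "peptide": "AAA"}]
bulk_rows = [{"instrument": "instrument-B", "species": "human", "digestion": "in-solution"}]

joint = project(observation_rows, "observations") + project(bulk_rows, "bulk_proteomics")
print(sorted(joint[0]) == sorted(joint[1]))  # both records share one set of keys
```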

Also:
- Bekker-Jensen protein-level abundance (71,520 rows across A549 /
  HCT116 / HEK293 / HeLa / MCF7) joined into load_bulk_proteomics()
  alongside CCLE. Use abundance_percentile (rank within cell line) for
  cross-source comparisons; intensity values are not directly comparable
  because CCLE is TMT-normalized and BJ is label-free.
- sources.yaml enriched with pmid, study_label, species, fragmentation,
  acquisition_mode, labeling, n_fractions so the harmonized fields are
  curated, not inferred.
- Loaders (load_bulk_proteomics, load_bulk_peptides) prefer the built
  parquet when present (fast + full metadata), fall back to packaged
  CSVs otherwise, so they keep working without data build.
- `cell_line` → `cell_line_name` column rename to match observations.
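To illustrate why a within-cell-line rank is comparable across sources when raw intensities are not, here is a small sketch. The formula (`100 * rank / n`, ties broken by sort order) and the `intensity` and `protein` keys are assumptions; the commit message says only that `abundance_percentile` is a rank within cell line.

```python
from collections import defaultdict


def add_abundance_percentile(rows, value_key="intensity"):
    """Return copies of rows with a percentile rank computed per cell line.

    Assumed definition: 100 * rank / n, with rank 1 for the lowest value.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["cell_line_name"]].append(row)

    ranked = []
    for group in groups.values():
        ordered = sorted(group, key=lambda r: r[value_key])
        n = len(ordered)
        for rank, row in enumerate(ordered, start=1):
            ranked.append({**row, "abundance_percentile": 100.0 * rank / n})
    return ranked


# Toy data on deliberately different intensity scales (hypothetical values).
toy = [
    {"cell_line_name": "HeLa", "protein": "P1", "intensity": 10.0},
    {"cell_line_name": "HeLa", "protein": "P2", "intensity": 40.0},
    {"cell_line_name": "A549", "protein": "P1", "intensity": 1000.0},
    {"cell_line_name": "A549", "protein": "P2", "intensity": 4000.0},
]
percentiles = {
    (r["cell_line_name"], r["protein"]): r["abundance_percentile"]
    for r in add_abundance_percentile(toy)
}
```

In the toy data, `P2` lands at the top percentile in both cell lines even though its raw intensities differ by two orders of magnitude.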

Bumps cache_is_valid to require all three parquets. Adds round-trip
tests for the harmonized columns, build artifact, and cross-source
loader behavior. Old loaders' `cell_line` filter arg keeps work... (continued)
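The cache check and the loader fallback described in this message can be sketched roughly as follows. The function names `cache_is_valid_sketch` and `pick_bulk_source` are hypothetical, as is the assumption that all three parquets sit in the same directory; the real `cache_is_valid` signature is not shown in this message.

```python
from pathlib import Path

# File names taken from the commit message; a shared directory is assumed.
REQUIRED_PARQUETS = (
    "observations.parquet",
    "binding.parquet",
    "bulk_proteomics.parquet",
)


def cache_is_valid_sketch(data_dir):
    """True only if every required parquet exists in data_dir."""
    data_dir = Path(data_dir)
    return all((data_dir / name).is_file() for name in REQUIRED_PARQUETS)


def pick_bulk_source(data_dir, packaged_csv):
    """Prefer the built parquet; otherwise fall back to the packaged CSV path."""
    built = Path(data_dir) / "bulk_proteomics.parquet"
    return built if built.is_file() else Path(packaged_csv)
```

With this shape, a cache built before the third parquet existed is reported as invalid, while the loaders still have a CSV path to fall back on.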

1424 of 3010 relevant lines covered (47.31%)

0.47 hits per line
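The headline figures can be reproduced from the line counts reported on this page. This is a quick arithmetic check, not Coveralls' own computation:

```python
covered, relevant = 1424, 3010   # lines covered / relevant lines in this build
previous = 45.555                # coverage reported for the previous build

coverage = 100 * covered / relevant
delta = coverage - previous

print(f"{coverage:.3f}% ({delta:+.1f}%)")  # prints "47.309% (+1.8%)"
```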

Coverage Regressions

Lines  Coverage  ∆     File
159    47.81     7.6%  builder.py

Jobs
ID  Job ID         Ran                      Files  Coverage
1   24699661514.1  21 Apr 2026 01:44AM UTC  21     47.31
  • 128813fe on github