pirl-unc / hitlist
75%

Default branch: main
Repo added: 30 Mar 2026 02:32PM UTC
Builds: 348
Files: 28
LAST BUILD ON BRANCH main

09 May 2026 10:59PM UTC coverage: 74.898% (+0.4%) from 74.531%
Build 25613983293 (push via github, committer web-flow)

v1.30.46: split gene/protein columns out of observations.parquet (closes #238 partial) (#243)

Drops ``gene_names``, ``gene_ids``, ``protein_ids``, and
``n_source_proteins`` from the observations.parquet schema.  The same
information has always been in ``peptide_mappings.parquet`` (one row
per peptide x protein); we were storing it in TWO places and merging
the long-form mapping back onto every observation row at build time.

## Wins (measured against the v1.30.45 build of the same source data)

| Metric | v1.30.45 | v1.30.46 | Delta |
|---|---|---|---|
| observations.parquet size | 192.2 MB | **117.6 MB** | **-39%, -74.5 MB** |
| Gene/protein columns combined size | 71.1 MB (38.8% of obs) | 0 MB | -71 MB |
| Gene-annotation merge step in build | full-frame merge | skipped | (eliminates the largest single transient memory step in build_observations) |
| ``hitlist pmhc --gene PRAME`` rows loaded | 4.4M (full corpus) | 257 (matched peptides only) | ~17,000x reduction |
| ``hitlist pmhc --gene PRAME`` parquet load time | ~6s | 0.8s | ~7.5x faster |

Build-side memory wins are harder to capture as a single number — the
v1.30.45 build held the full peptide_mappings (~65 MB) AND obs (~190 MB)
in pandas form simultaneously through ``annotate_observations_with_genes``,
which is the largest transient step.  Eliminating that step cuts the
single biggest memory blip in the build pipeline.

## What changed

### Build path (``builder.py``)

Skip ``annotate_observations_with_genes()``.  The peptide_mappings.parquet
sidecar continues to be built independently as it always was.

### Reader path (``observations.py``)

- ``load_observations`` now AUTO-ATTACHES gene/protein columns from
  peptide_mappings.parquet when the caller requests them but the
  parquet doesn't carry them (post-v1.30.46).  It joins only the
  matched-peptides slice, so the join is cheap on filtered loads and
  expensive only on full-corpus loads (same complexity as the old
  build-time merge).
- New entries in ``_DERI... (continued)
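
The reader-side auto-attach described above can be sketched roughly as follows. This is a minimal illustration in pandas, not hitlist's actual implementation: the function name ``attach_gene_columns``, the helper ``_join_unique``, the mapping column names (``peptide``, ``gene_name``, ``gene_id``, ``protein_id``), and the ``;`` separator are all assumptions made for the example. Only the four output column names come from the commit message.

```python
import pandas as pd


def _join_unique(values: pd.Series) -> str:
    """Collapse a column of IDs into one sorted, ';'-separated string (assumed format)."""
    return ";".join(sorted(set(values.dropna().astype(str))))


def attach_gene_columns(obs: pd.DataFrame, mappings: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: join per-peptide gene/protein summaries onto obs rows.

    `obs` has one row per observation with a `peptide` column.
    `mappings` is long-form: one row per (peptide, protein) pair.
    """
    # Keep only mapping rows for peptides present in the (already filtered) obs,
    # so the cost scales with the matched slice rather than the full corpus.
    matched = mappings[mappings["peptide"].isin(obs["peptide"].unique())]

    # Collapse the long-form mapping to one summary row per peptide.
    per_peptide = matched.groupby("peptide", as_index=False).agg(
        gene_names=("gene_name", _join_unique),
        gene_ids=("gene_id", _join_unique),
        protein_ids=("protein_id", _join_unique),
        n_source_proteins=("protein_id", "nunique"),
    )

    # Left join keeps every observation row, including unmapped peptides.
    return obs.merge(per_peptide, on="peptide", how="left")
```

With a gene-filtered ``obs`` frame, the ``isin`` step shrinks the mapping table before the groupby and merge, which is the "cheap on filtered loads" case; with the full corpus the same code degenerates to a full-frame merge.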

4234 of 5653 relevant lines covered (74.9%)

0.75 hits per line


Recent builds

| Build | Branch | Commit | Type | Ran | Committer | Via | Coverage |
|---|---|---|---|---|---|---|---|
| 25613983293 | main | v1.30.46: split gene/protein columns out of observations.parquet (closes #238 partial) (#243) Drops ``gene_names``, ``gene_ids``, ``protein_ids``, and ``n_source_proteins`` from the observations.parquet schema. The same information has always be... | push | 09 May 2026 11:01PM UTC | web-flow | github | 74.9 |
| 25613896952 | split-gene-mapping-from-observations | Merge 12f276a1a into ea2cae856 | Pull #243 | 09 May 2026 10:56PM UTC | web-flow | github | 74.9 |
| 25601432313 | main | align version.py with PyPI: 1.30.44 → 1.30.45 (#242) PR #240's amended commit accidentally dropped its version bump (I ran `git commit --amend --message ...` without re-staging hitlist/version.py). The squash merge therefore landed v1.30.45's co... | push | 09 May 2026 12:46PM UTC | web-flow | github | 74.53 |
| 25587678819 | align-version-with-pypi | Merge fc979af1d into f83694dd9 | Pull #242 | 09 May 2026 01:27AM UTC | web-flow | github | 74.53 |
| 25587641606 | main | v1.30.45: predictor-driven multi-allele narrowing in pmhc query (closes #239) (#240) When ``hitlist pmhc --predictor netmhcpan\|mhcflurry`` runs against a multi-allele row (every per-donor row from #236, every IEDB row whose ``host_mhc_types`` pop... | push | 09 May 2026 01:25AM UTC | web-flow | github | 74.53 |
| 25587100236 | predictor-multi-allele-narrowing | Merge 60b2d941d into a5200ad85 | Pull #240 | 09 May 2026 01:04AM UTC | web-flow | github | 74.53 |
| 25587063807 | main | v1.30.44: pre-add "" to compressed-categorical categories so fillna("") never TypeErrors (#241) Same bug class as v1.30.41 / v1.30.43, surfaced by ./test.sh --all (integration suite that loads the real observations.parquet). 12 test_export error... | push | 09 May 2026 01:02AM UTC | web-flow | github | 74.4 |
| 25586985473 | fix-allele-resolution-categorical-fillna | Merge 488358408 into 75b189cc0 | Pull #241 | 09 May 2026 12:59AM UTC | web-flow | github | 74.4 |
| 25564289983 | predictor-multi-allele-narrowing | Merge fc3d204ac into 75b189cc0 | Pull #240 | 08 May 2026 03:34PM UTC | web-flow | github | 74.52 |
| 25563211782 | main | v1.30.43: per-donor row split for attributed peptides (closes #236) (#237) * v1.30.43: per-donor row split for attributed peptides (closes #236) When the per-peptide attribution CSV maps a peptide to N matched donors, the scanner now emits N obs... | push | 08 May 2026 03:12PM UTC | web-flow | github | 74.39 |
See All Builds (348)
