
pirl-unc / hitlist / 25450907293

06 May 2026 05:31PM UTC coverage: 72.108% (+0.6%) from 71.522%

v1.30.38: per-peptide sample attribution for Sarkizova 2020 patient cohort (#45) (#231)

* v1.30.38: per-peptide sample attribution for Sarkizova 2020 patient cohort (#45)

IEDB stored Sarkizova 2020's 36k patient-tumor MS rows with
``mhc_restriction="HLA class I"`` and ``host_mhc_types`` = the
disease-wide allele union (14-18 alleles across GBM7+9+11, etc.),
which over-broadens the candidate bag. The paper's Sup_Data2
(MOESM4) attributes each peptide to specific patient samples.

This PR wires that attribution into the curation pipeline:

- New ``peptide_attributions: <csv>`` key in pmid_overrides.yaml.
- ``hitlist/data/peptide_attributions/sarkizova_2020_patient_cohort.csv``:
  29,357 peptide → sample_label rows from MOESM4.
- ``attribute_peptide_to_sample_alleles(pmid, peptide)``: looks up
  the peptide and returns the union of its samples' typed alleles.
- ``expand_allele_bag`` accepts a hashable ``attributed_alleles``
  frozenset; new ``"peptide_attribution"`` provenance.
- New ``"donor_bag"`` resolution tier (between four_digit and
  two_digit) for already-bagged restrictions.
- Scanner + supplement: when the bag is non-empty and not "exact",
  promote the bag to ``mhc_restriction``. ``mhc_restriction`` now
  carries either a single 4-digit allele or a semicolon-joined
  donor bag. ``"HLA class I"`` is reserved for true unknowns.
- ``load_observations`` / ``load_binding`` gain
  ``mhc_allele_in_bag`` and ``mhc_allele_provenance`` filters.
  ``mhc_restriction`` filter does bag-membership matching so
  ``"HLA-A*02:01"`` recovers multi-allelic patient cohorts.
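
The attribution lookup described above can be sketched roughly as
follows. This is a minimal illustration, not the code in this PR: the
helper names ``build_peptide_index`` and ``attributed_alleles`` are
hypothetical, and the peptides and allele typings are invented toy
values rather than rows from MOESM4. It only shows the idea of mapping
a peptide to its sample labels and returning the union of those
samples' typed alleles as a hashable frozenset.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins (assumptions): these peptides and allele typings are
# invented for illustration and are NOT taken from MOESM4.
ATTRIBUTION_CSV = """peptide,sample_label
SLYNTVATL,GBM7
SLYNTVATL,GBM9
KVAELVHFL,GBM11
"""

SAMPLE_ALLELES = {
    "GBM7": frozenset({"HLA-A*02:01", "HLA-B*07:02"}),
    "GBM9": frozenset({"HLA-A*02:01", "HLA-C*07:02"}),
    "GBM11": frozenset({"HLA-A*03:01"}),
}


def build_peptide_index(csv_text):
    """Map each peptide to the set of sample labels it was observed in."""
    index = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        index[row["peptide"]].add(row["sample_label"])
    return index


def attributed_alleles(index, sample_alleles, peptide):
    """Return the union of typed alleles over the peptide's samples.

    An unknown peptide yields an empty frozenset, so a caller can fall
    back to the broader disease-wide bag.
    """
    alleles = set()
    for sample in index.get(peptide, ()):
        alleles |= sample_alleles.get(sample, frozenset())
    return frozenset(alleles)


PEPTIDE_INDEX = build_peptide_index(ATTRIBUTION_CSV)
```

Returning a frozenset keeps the result hashable, which is what lets it
be passed as an ``attributed_alleles`` argument to a cached function.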

Tests: 9 new curation tests + 3 new observation filter tests
covering the bag-membership / provenance paths.
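
As a rough illustration of the bag-membership path, the sketch below
matches a queried allele against a semicolon-joined
``mhc_restriction`` value by exact token, not by substring. The helper
names ``split_bag`` and ``restriction_matches`` and the sample rows
are hypothetical and exist only for this example; the real loaders
may implement the filter differently.

```python
def split_bag(mhc_restriction):
    """Split a semicolon-joined restriction into a set of allele tokens."""
    return frozenset(
        part.strip() for part in mhc_restriction.split(";") if part.strip()
    )


def restriction_matches(mhc_restriction, allele):
    """True if `allele` is one of the tokens in the restriction string.

    Exact token comparison means "HLA class I" (a true unknown) never
    matches a specific allele query.
    """
    return allele in split_bag(mhc_restriction)


# Hypothetical rows: one single-allele row, two donor bags, one unknown.
rows = [
    {"peptide": "PEPTIDE_A", "mhc_restriction": "HLA-A*02:01"},
    {"peptide": "PEPTIDE_B",
     "mhc_restriction": "HLA-A*02:01;HLA-B*07:02;HLA-C*07:02"},
    {"peptide": "PEPTIDE_C", "mhc_restriction": "HLA class I"},
    {"peptide": "PEPTIDE_D", "mhc_restriction": "HLA-A*03:01;HLA-B*07:02"},
]

hits = [
    r["peptide"]
    for r in rows
    if restriction_matches(r["mhc_restriction"], "HLA-A*02:01")
]
```

Under these toy rows, the query for ``"HLA-A*02:01"`` keeps both the
single-allele row and the donor-bag row that contains it.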

Closes #45.

* review feedback: precompute peptide→alleles, vectorize bag filter, rename

Address review feedback on PR #231:

- Rename loader query parameter ``mhc_allele_in_bag`` →
  ``mhc_allele_in_set`` to match the underlying ``mhc_allele_set``
  column.
- Precompute ``peptide → fro... (continued)

4064 of 5636 relevant lines covered (72.11%)

0.72 hits per line


54.93
/cli.py


