pirl-unc / hitlist / 25820271026 / 1

Build:
DEFAULT BRANCH: main
Ran 13 May 2026 07:04PM UTC
Files 29
Run time 1s

13 May 2026 07:02PM UTC coverage: 76.466% (+0.5%) from 76.015%
25820271026.1

push

github

web-flow
v1.30.54: cell_name parser library (#261 stage 1) (#265)

First slice of #261: a pure parser library that decomposes IEDB's
``Cell Name`` field into structured components.  This PR makes no
scanner changes; scanner integration and the new obs.parquet columns
land in stage 2, and pmhc_query filtering lands in stage 3.

## Parser

``hitlist.cell_name_parser.parse_cell_name(cell_name, *, ...)``
returns a frozen ``CellNameInfo`` with:

  is_cell_line          bool   line vs primary-cell sample
  cell_line_name        str    canonical name (HEK293T, HAP1, ...)
  cell_line_input       str    synonym actually present in the input
                                (293-T) for traceability
  cell_type             str    tissue / cell type (B cell, Myeloid cell)
  donor_id              str    patient/donor ID extracted from
                                attributed_sample_label (e.g. "13240-005")
  genetic_modification  str    CALR KO / wildtype / MAPTAC / etc.
  raw_cell_name         str    original input

Auxiliary inputs (attributed_sample_label, monoallelic_host,
src_cell_line) are optional but recommended: they sharpen the parse
on ambiguous strings.
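
As a rough illustration of the result shape described above, the sketch below models the listed fields as a frozen dataclass. The class name ``CellNameInfoSketch`` is hypothetical, the empty-string placeholders for unknown fields are an assumption (the real ``CellNameInfo`` may use ``None`` or defaults), and the example values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellNameInfoSketch:
    """Hypothetical stand-in mirroring the fields listed in the commit message."""

    is_cell_line: bool         # line vs primary-cell sample
    cell_line_name: str        # canonical name, e.g. "HEK293T"
    cell_line_input: str       # synonym actually present in the input
    cell_type: str             # tissue / cell type
    donor_id: str              # patient/donor ID from attributed_sample_label
    genetic_modification: str  # e.g. "CALR KO", "wildtype"
    raw_cell_name: str         # original input


# Illustrative values only: "293-T" is a synonym the commit message cites for HEK293T.
info = CellNameInfoSketch(
    is_cell_line=True,
    cell_line_name="HEK293T",
    cell_line_input="293-T",
    cell_type="",              # assumption: empty string when not known
    donor_id="",
    genetic_modification="",
    raw_cell_name="293-T",
)
```

Because the dataclass is frozen, assigning to a field after construction raises an error, so a parsed result can be passed around without being mutated by accident.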

## Cell-line registry

``hitlist/data/cell_lines.yaml`` carries 21 cell lines and 4
engineering systems, each with synonyms and cell_type of origin.  The
seed list covers the top observed values in the IEDB / CEDAR / PRIDE
corpus.  Cell lines: HAP1, K562, HeLa, MDA-MB-231, HCT 116, A549, C1R,
721.221, JY, Raji, MAVER-1, LM-MEL-44, LM-MEL-33, HROG17, HROG02,
MCF-7, MCF-7/LY2, THP-1, HEK293, HEK293T, Expi293F.  Engineering
systems: MAPTAC, Strep-tag II, sHLA, HLA-ABC-CRISPR-KO.
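
One way a synonym registry like this could be used for canonical resolution is sketched below. The in-memory dict stands in for the YAML file, whose actual schema is not shown here; the names ``REGISTRY_SKETCH``, ``build_synonym_index`` and ``resolve_cell_line`` are hypothetical, and case-insensitive matching is an assumption rather than confirmed behavior.

```python
# Hypothetical excerpt: canonical name -> synonyms named in the commit message.
REGISTRY_SKETCH = {
    "HEK293T": ["293T", "293-T", "HEK 293T"],
    "THP-1": ["THP1"],
    "MCF-7": ["MCF7"],
    "HCT 116": ["HCT-116"],
}


def build_synonym_index(registry):
    """Map every canonical name and synonym (case-folded) to its canonical name."""
    index = {}
    for canonical, synonyms in registry.items():
        for name in (canonical, *synonyms):
            index[name.casefold()] = canonical
    return index


def resolve_cell_line(name, index):
    """Return the canonical cell-line name, or None if the input is not registered."""
    return index.get(name.strip().casefold())


INDEX = build_synonym_index(REGISTRY_SKETCH)
```

Under these assumptions, ``resolve_cell_line("293-T", INDEX)`` yields ``"HEK293T"``, and an unregistered string yields ``None`` so the caller can fall back to cell-type parsing.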

## Tests

86 new tests across 14 parametrized + per-case scenarios:
- 9 pure cell-line names → canonical resolution
- 13 synonyms (293T / 293-T / HEK 293T / THP1 / MCF7 / HCT-116 / ...)
- 2 "<line> cells" suffix-stripping cases (JY cells / C1R cells)
- 15 pure cell-type names → primary-cell tag
- 12 hybrid strings → split into line + cell_type
- 8 HAP1 genetic-modification variants (wildtype / WT /... (continued)
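
For the "<line> cells" suffix-stripping cases, a minimal table-driven sketch might look like the following. The helper ``strip_cells_suffix`` and its regex are hypothetical and not taken from the library; the expected values for JY and C1R follow the two cases named in the list.

```python
import re

# Assumption: the suffix is the literal word "cells" preceded by whitespace.
_CELLS_SUFFIX = re.compile(r"\s+cells$", re.IGNORECASE)


def strip_cells_suffix(cell_name: str) -> str:
    """Drop a trailing ' cells' so the remainder can be looked up as a cell line."""
    return _CELLS_SUFFIX.sub("", cell_name.strip())


# (input, expected) pairs; the first two mirror the cases named in the commit message.
CASES = [
    ("JY cells", "JY"),
    ("C1R cells", "C1R"),
    ("HAP1", "HAP1"),  # no suffix: returned unchanged
]

for raw, expected in CASES:
    assert strip_cells_suffix(raw) == expected
```

Note that this helper alone would also shorten a cell-type string such as "B cells", so a real parser would presumably accept the stripped form only when it matches a registered cell line.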

4669 of 6106 relevant lines covered (76.47%)

0.76 hits per line
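
The headline percentage follows directly from the line counts above. The short check below reproduces it and the reported change from the previous build; the variable names are illustrative.

```python
covered, relevant = 4669, 6106
previous_pct = 76.015  # coverage reported for the previous build

coverage_pct = 100 * covered / relevant
delta = coverage_pct - previous_pct

print(f"{coverage_pct:.3f}%")  # 76.466%
print(f"{delta:+.1f}%")        # +0.5%
```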

  • 884e0a14 on github

© 2026 Coveralls, Inc