
pirl-unc / tcrsift / 26512622428
84%

Build:
DEFAULT BRANCH: main
Ran 27 May 2026 01:03PM UTC
Jobs 0
Files 0
Run time –
pending completion
26512622428

push

github

web-flow
Fix #94: pick representative cell for coupled per-clone columns (#96)

Closes #94. ``_aggregate_clone_data`` was calling ``safe_mode`` on
``VDJ_*_aa`` and ``VDJ_*_nt`` (and the V/J/C gene calls) independently
per column. When a clonotype's cells produced tied modes (common
when cells span a heterozygous V allele plus other polymorphism),
``pd.Series.mode().iloc[0]`` picked the lexicographically first value
in each column. AA and NT strings sort independently, so AA and NT
often ended up sourced from DIFFERENT cells. Result: the stored
``VDJ_*_nt`` no longer translated to the stored ``VDJ_*_aa``.
17/3839 clonotypes in B1-2 + B1-3 hit this case, 12 of them on
TRAV12-2, where donor B1 is heterozygous at position 48 (F vs S).
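
The failure mode above can be reproduced on a toy example. This is a minimal sketch, not the project's code: ``lex_first_mode`` is a hypothetical pure-Python stand-in for ``pd.Series.mode().iloc[0]`` (most frequent value, ties broken by sort order), and the sequences and the four-codon table are made up for illustration.

```python
from collections import Counter

# Minimal codon table covering only the codons used in this toy example.
CODONS = {"TGT": "C", "GCT": "A", "TTC": "F", "AGC": "S"}


def translate(nt):
    """Translate an in-frame nucleotide string using the toy codon table."""
    return "".join(CODONS[nt[i:i + 3]] for i in range(0, len(nt), 3))


def lex_first_mode(values):
    """Hypothetical stand-in for pd.Series.mode().iloc[0]:
    the most frequent value, with ties broken by sort order."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


# Two cells of one clonotype that differ at a single residue (F vs S).
# Each cell's own (aa, nt) pair is internally consistent.
cells = [
    {"aa": "CAF", "nt": "TGTGCTTTC"},
    {"aa": "CAS", "nt": "TGTGCTAGC"},
]

# Per-column aggregation: each column breaks its tie independently.
picked_aa = lex_first_mode(c["aa"] for c in cells)  # taken from the first cell
picked_nt = lex_first_mode(c["nt"] for c in cells)  # taken from the second cell

# The stored NT no longer translates to the stored AA.
roundtrip_ok = translate(picked_nt) == picked_aa
print(picked_aa, picked_nt, translate(picked_nt), roundtrip_ok)
```

Here "CAF" sorts before "CAS", but the NT string encoding "CAS" sorts before the one encoding "CAF", so the two columns come from different cells.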

The new ``_validate_nt_aa_roundtrip`` from #91 surfaced this. It is the
same class of integration bug that the pre-fix unit tests couldn't catch.

Fix: new ``tcrsift.validation.pick_representative_cell`` helper
that ranks per-cell rows by summed UMI evidence across the given
chain columns and returns the top row. ``_aggregate_clone_data``
now picks a per-chain representative (highest TRA_1_umis for the α
chain, TRB_1_umis for the β chain) and copies every coupled per-
chain column (V/J/C gene calls, VDJ AA, VDJ NT) from that single
row — so the (AA, NT) pair always describes the same molecule, AND
each chain gets the cell with the strongest read support for it
even when α and β were captured in different cells.
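
The representative-row idea described above can be sketched as follows. This is an illustration under stated assumptions, not the actual ``tcrsift.validation.pick_representative_cell``: the function name ``pick_representative_row``, the use of plain dicts instead of a DataFrame, and the ``alpha_*`` / ``beta_*`` column names are all hypothetical. Only ``TRA_1_umis`` and ``TRB_1_umis`` come from the commit message.

```python
def pick_representative_row(rows, umi_columns):
    """Return the row with the highest summed UMI count across umi_columns.

    Missing or None UMI values count as 0. On ties, the first row wins.
    """
    def evidence(row):
        return sum((row.get(col) or 0) for col in umi_columns)

    return max(rows, key=evidence)


# Hypothetical coupled columns for each chain.
ALPHA_COLUMNS = ("alpha_aa", "alpha_nt")
BETA_COLUMNS = ("beta_aa", "beta_nt")

# Toy cells: the first has stronger alpha support, the second stronger beta.
cells = [
    {"TRA_1_umis": 9, "TRB_1_umis": 1,
     "alpha_aa": "CAF", "alpha_nt": "TGTGCTTTC",
     "beta_aa": "CAS", "beta_nt": "TGTGCTAGC"},
    {"TRA_1_umis": 2, "TRB_1_umis": 7,
     "alpha_aa": "CAS", "alpha_nt": "TGTGCTAGC",
     "beta_aa": "CAF", "beta_nt": "TGTGCTTTC"},
]

alpha_rep = pick_representative_row(cells, ["TRA_1_umis"])
beta_rep = pick_representative_row(cells, ["TRB_1_umis"])

# Copy every coupled column for a chain from that chain's single row,
# so each (AA, NT) pair describes one molecule.
clone = {}
clone.update({col: alpha_rep[col] for col in ALPHA_COLUMNS})
clone.update({col: beta_rep[col] for col in BETA_COLUMNS})
print(clone)
```

Because all coupled columns for a chain are read from one row, the AA and NT values cannot drift apart, while α and β may still come from different cells.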

The CDR3_alpha pick under ``group_by="CDR3b_only"`` got the same
treatment so future additions of more α columns at that site stay
coupled.

Audit + fix in til_select.py for the same anti-pattern at three
sites:
- ``aggregate_clone_consensus`` (line 770) — coupled metadata
  (CDR3 NT, full AA/NT, V/D/J/C genes, all TCR segments) per CDR3.
  Now picks the row with the highest ``umis_sum`` and reads all
  metadata from there; the additive sum columns (cell_count,
  umis_sum, reads_sum) still aggregate independently.
- ``match_clonotypes`` DB join (line 292) — db_epitope a... (continued)
Source Files on build 26512622428
Detailed source file information is not available for this build.
  • Back to Repo
  • Github Actions Build #26512622428
  • f9bbd389 on github
  • Prev Build on main (#26508828495)
  • Next Build on main (#26522745536)

© 2026 Coveralls, Inc