
pirl-unc / tcrsift / 26642287264
84%

Build:
DEFAULT BRANCH: main
Ran 29 May 2026 02:14PM UTC
Jobs 4
Files 31
Run time 1min

29 May 2026 02:12PM UTC coverage: 78.367% (+0.1%) from 78.237%
push

github

web-flow
2.1.0: fix #57 — Levenshtein-1 fuzzy CDR3 matching (#112)

Adds a fuzzy fallback for β-only matching in ``match_clonotypes``.
Before 2.1, only exact CDR3 string equality recorded a hit. In real TIL
cohorts exact β matches are <10% of clones, so most true hits to
known antigen-specific TCRs were being missed.

New parameters on ``match_clonotypes`` and ``annotate_clonotypes``:

  match_mode: "exact" (default; pre-2.1 behaviour) | "levenshtein"
  max_distance: edit-distance threshold for fuzzy mode (currently
                clamped to 1; values >1 warn and clamp)
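
The clamp rule for ``max_distance`` can be pictured with a small sketch.
``resolve_max_distance`` is a hypothetical helper written for this
illustration, not a function from tcrsift; the warning text and the
choice to return 0 in exact mode are assumptions.

```python
import warnings

# The commit states that fuzzy mode is currently clamped to distance 1.
SUPPORTED_MAX_DISTANCE = 1


def resolve_max_distance(match_mode: str, max_distance: int = 1) -> int:
    """Hypothetical helper: return the edit-distance threshold actually used."""
    if match_mode == "exact":
        # Assumption: exact mode ignores max_distance entirely.
        return 0
    if match_mode != "levenshtein":
        raise ValueError(f"unknown match_mode: {match_mode!r}")
    if max_distance > SUPPORTED_MAX_DISTANCE:
        # Values above the supported threshold warn, then clamp.
        warnings.warn(
            f"max_distance={max_distance} is not supported; "
            f"clamping to {SUPPORTED_MAX_DISTANCE}",
            UserWarning,
            stacklevel=2,
        )
        return SUPPORTED_MAX_DISTANCE
    return max_distance
```

Warning and clamping, rather than raising, keeps existing call sites
working if the supported threshold is raised later.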

Implementation: deletion-canonical neighbor index over the database
β CDR3 column. Each DB entry contributes up to ``len(cdr3) + 1``
canonical variants (the string itself plus a one-character deletion
at each position; repeated residues can yield duplicates). For a query
CDR3 of length L, lookup takes O(L) index probes: generate its
canonical variants and union the indexed sets to get the candidate
set. A pairwise Levenshtein check on the candidates then filters out
canonical-set collisions at distance >1 (two strings that share a
deletion variant can be two edits apart). Tractable at the actual
VDJdb/IEDB scale (~300K β entries).
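
A minimal sketch of the deletion-neighbor index described above, assuming
the database is a plain list of β CDR3 strings. The names
``canonical_variants``, ``build_index`` and ``fuzzy_lookup`` are
illustrative and not taken from tcrsift, and the filter here uses a
textbook dynamic-programming Levenshtein distance rather than the
project's own bounded check.

```python
from collections import defaultdict


def canonical_variants(cdr3: str) -> set:
    """The string itself plus every one-character deletion."""
    variants = {cdr3}
    for i in range(len(cdr3)):
        variants.add(cdr3[:i] + cdr3[i + 1:])
    return variants


def levenshtein(a: str, b: str) -> int:
    """Textbook row-by-row dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def build_index(db_cdr3s):
    """Map each canonical variant to the DB row numbers that produce it."""
    index = defaultdict(set)
    for row, cdr3 in enumerate(db_cdr3s):
        for variant in canonical_variants(cdr3):
            index[variant].add(row)
    return index


def fuzzy_lookup(query, db_cdr3s, index):
    """Return {row: distance} for DB entries within Levenshtein distance 1."""
    candidates = set()
    for variant in canonical_variants(query):
        candidates |= index.get(variant, set())
    hits = {}
    for row in candidates:
        # Sharing a variant only guarantees distance <= 2, so verify.
        distance = levenshtein(query, db_cdr3s[row])
        if distance <= 1:
            hits[row] = distance
    return hits


# Made-up CDR3 strings, for illustration only.
db = ["CASSLGQYF", "CASSLGQFF", "CASSLGYF", "CSARDRGYF"]
index = build_index(db)
hits = fuzzy_lookup("CASSLGQYF", db, index)
```

Every pair within one edit shares at least one variant (a substitution
is caught by deleting the changed position from both strings, an indel
by deleting the extra character), so the candidate set misses no true
Levenshtein-1 neighbor.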

αβ matching stays strict-exact in both modes. Fuzzy αβ is too noisy
biologically — the paired-chain prior is the strong signal and we
don't want a Lev-1 hit on β to dragnet a wrong α.

Output:
  - db_match_distance: int — 0 for exact β, 1 for Levenshtein-1
    fuzzy β, None when unmatched.
  - db_match_strength: extended with ``b_only_near`` (and
    ``b_only_near_cross`` when host is non-human, per #83 + #54).

TCRdist is intentionally out-of-scope here — it would pull an
optional dependency tree (tcrdist3 or numpy-only tcrdist-rs port).
Worth a follow-up issue if the Levenshtein-1 hit rate isn't enough.

Tests: 31 new in tests/test_annotate_fuzzy.py covering:
  - Direct ``_levenshtein_distance_at_most_1`` unit cases:
    identical / substitution / insertion / deletion / two-edits /
    substitution-then-deletion / length-diff-too-large / empties.
  - Canonical-variant generator: self-inclusion, e... (continued)
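
The unit cases listed for the bounded distance check can be illustrated
with a short sketch. ``within_one_edit`` is a hypothetical stand-in
written for this illustration; the signature and return type of the
project's ``_levenshtein_distance_at_most_1`` are not shown in this
commit message and may differ.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one substitution, insertion or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False  # length difference alone needs two or more edits
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Advance past the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution at position i: the remainders must agree.
        return a[i + 1:] == b[i + 1:]
    # b has one extra character at position i.
    return a[i:] == b[i + 1:]


# One made-up example per case named in the list above.
cases = {
    "identical": ("CASSF", "CASSF"),
    "substitution": ("CASSF", "CATSF"),
    "insertion": ("CASSF", "CASSLF"),
    "deletion": ("CASSLF", "CASSF"),
    "two-edits": ("CASSF", "CTSSY"),
    "substitution-then-deletion": ("CASSLF", "CTSSF"),
    "length-diff-too-large": ("CASSF", "CASSLGF"),
    "empties": ("", ""),
}
results = {name: within_one_edit(a, b) for name, (a, b) in cases.items()}
```

A single prefix scan plus one slice comparison is linear in the string
length, which is why a bounded check of this kind is cheaper than a full
edit-distance table when only "distance at most 1" matters.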

6720 of 8575 relevant lines covered (78.37%)

3.13 hits per line

Coverage Regressions

Lines  Coverage  ∆       File
34     93.51     -0.19%  annotate.py
Jobs
ID  Job ID                       Ran                      Files  Coverage
1   python-3.10 - 26642287264.1  29 May 2026 02:15PM UTC  31     78.36
2   python-3.9 - 26642287264.2   29 May 2026 02:15PM UTC  31     78.33
3   python-3.11 - 26642287264.3  29 May 2026 02:14PM UTC  31     78.36
4   python-3.12 - 26642287264.4  29 May 2026 02:15PM UTC  31     78.36
Source Files on build 26642287264
  • List 31
  • Changed 1
  • Source Changed 0
  • Coverage Changed 1
  • Github Actions Build #26642287264
  • 3b3e6d97 on github
  • Prev Build on main (#26640054702)
  • Next Build on main (#26644524356)