pirl-unc / hitlist / 25517476149
75%

Build:
DEFAULT BRANCH: main
Ran 07 May 2026 07:31PM UTC
Jobs 1
Files 28
Run time 1min
07 May 2026 07:29PM UTC coverage: 74.071% (+1.9%) from 72.217%
25517476149

push

github

web-flow
v1.30.41: fix categorical fillna crash + obliterate legacy index cache (#234)

* v1.30.41: fix categorical fillna crash + obliterate legacy index cache

Two fixes in one PR:

1. **Categorical ``fillna`` crash on load.**  Surfaced by
   ``tsarina hits --gene PRAME`` against the freshly-rebuilt v1.30.40
   corpus:

       TypeError: Cannot setitem on a Categorical with a new category
       (), set the categories first

   Triggered by ``df["mhc_class"].fillna("")`` and the per-load
   ``mhc_restriction`` normalization path.  Post-#137 the columns are
   categorical (``{"I", "II", "non classical"}`` for mhc_class), and
   pandas refuses to ``fillna`` with a value outside the category set.

   Fix: cast to ``StringDtype`` before ``fillna`` in the three affected
   spots: ``observations.py:617`` (mhc_class severity check),
   ``observations.py:587`` (mhc_restriction normalization), and
   ``supplement.py:291`` (mhc_species fallback).  ``StringDtype``
   accepts arbitrary string fills.

   Regression tests:
   - ``test_load_observations_handles_categorical_mhc_class_with_nan``
   - ``test_load_observations_normalizes_categorical_mhc_restriction``
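
   The crash and the cast-first fix described in item 1 can be reproduced in isolation. The following is a minimal sketch, not the code in ``observations.py``: the series contents are made up for illustration, and both ``TypeError`` and ``ValueError`` are caught because the exception type has varied across pandas versions.

```python
import pandas as pd

# Illustrative stand-in for a categorical mhc_class column with a missing value.
s = pd.Series(["I", None, "II"], dtype="category")

# "" is not one of the categories, so filling with it is rejected.
try:
    s.fillna("")
    crashed = False
except (TypeError, ValueError) as exc:
    crashed = True
    print(f"fillna on categorical failed: {exc}")

# Casting to StringDtype first removes the category restriction.
fixed = s.astype("string").fillna("")
print(fixed.tolist())
```

   An alternative would be to add ``""`` to the category set before filling; the cast shown here follows the approach this commit describes.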

2. **Obliterate the legacy ``~/.hitlist/index/`` cache.**  ``hitlist
   data list`` showed an "Index" column whose ``cached`` / ``stale``
   status was based on the OLD CSV-scan fallback path in
   ``indexer.py``.  When ``observations.parquet`` is built, the actual
   ``get_index()`` API derives counts from the parquet directly — the
   per-source cache is never read.  The dated cache files just made
   ``hitlist data list`` look misleadingly stale.

   Changes:
   - ``hitlist/indexer.py`` rewritten: only the parquet-derived
     ``get_index()`` and ``validate_alleles_from_index`` survive.
     The CSV-scan fallback (``_index_from_csv``, ``_scan_single``,
     ``_cache_dir``, ``_cache_is_valid``, ``_resolve_source_paths``,
     ``_cache_key``) is gone.  ``get_index()`` now raises
     ``FileNotFoundError`` when the parquet isn't built ... (continued)
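
The fail-fast behaviour described for ``get_index()`` can be sketched as follows. This is an illustration under stated assumptions, not the code in ``hitlist/indexer.py``: the function name ``get_index_sketch``, the injected ``read_rows`` loader, and the ``source`` column are all hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Iterable, Mapping


def get_index_sketch(
    parquet_path: Path,
    read_rows: Callable[[Path], Iterable[Mapping[str, str]]],
) -> Counter:
    """Count rows per source directly from the built parquet (hypothetical)."""
    if not parquet_path.exists():
        # No CSV-scan fallback and no per-source cache: fail loudly instead.
        raise FileNotFoundError(
            f"{parquet_path} not found; build the observations parquet first"
        )
    # read_rows is a stand-in for whatever actually loads the parquet rows.
    return Counter(row["source"] for row in read_rows(parquet_path))
```

Injecting the loader keeps the sketch free of third-party dependencies; a real implementation would presumably read the parquet with a library such as pandas or pyarrow.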

4108 of 5546 relevant lines covered (74.07%)

0.74 hits per line

Coverage Regressions

Lines  Coverage  ∆      File
169    56.44     1.66%  cli.py
29     0.0       0.0%   indexer.py
2      95.16     0.03%  observations.py
2      89.29     0.0%   supplement.py

Jobs
ID  Job ID         Ran                      Files  Coverage
1   25517476149.1  07 May 2026 07:31PM UTC  28     74.07

  • Back to Repo
  • Github Actions Build #25517476149
  • 29f7e615 on github
  • Prev Build on main (#25493184752)
  • Next Build on main (#25562739614)
© 2026 Coveralls, Inc