• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

openvax / topiary / 25870749548 / 3
90%
master: 90%

Build:
Build:
LAST BUILD BRANCH: v5.16.2
DEFAULT BRANCH: master
Ran 14 May 2026 04:13PM UTC
Files 35
Run time 2s
Badge
Embed ▾
README BADGES
x

If you need to use a raster PNG badge, change the '.svg' to '.png' in the link

Markdown

Textile

RDoc

HTML

Rst

14 May 2026 04:04PM UTC coverage: 89.791% (+0.2%) from 89.625%
25870749548.3

Pull #167

github

iskandr
DSL: native categorical equality for string columns

Adds an `IsIn` DSL node plus `Column.eq(value)` / `Column.ne(value)` /
`Column.isin(values)` methods so non-numeric columns (mhc_class,
source, gene, ...) are filterable natively in the DSL without
pre-masking via pandas.  The string parser additionally accepts
string literals on the right-hand side of `==` and `!=` so
`parse('mhc_class == "I"')` works end-to-end.

`DSLNode.__eq__` stays unoverridden (nodes remain hashable for use in
sets/dicts) — the explicit `.eq()` / `.ne()` / `.isin()` methods plus
the parser path are the supported entry points.  `IsIn` reads its
column raw, bypassing the float cast in `Column.eval`, so any dtype
works.

Why:

- pVACseq loader (5.16.0 first commit) emits a per-row `mhc_class`
  column ("I" / "II") so concat-ed multi-class results can be split.
  Filtering by class previously required pre-masking via pandas
  because the DSL couldn't compare strings; vaxrank-shape filters
  that combine numeric + categorical clauses needed two paths.
- Same path now supports filtering by `source` (provenance) and any
  other string column added by future loaders.

Implementation:

- New `IsIn(col_name, values, negate=False)` node in `ranking/nodes.py`.
  Constructor accepts scalar or iterable values; reads the column
  raw (no float cast); evaluates to a boolean Series indexed by
  `group_index`.  `__invert__` flips `negate`.
- `Column.eq` / `Column.ne` / `Column.isin` produce `IsIn` instances.
- Parser's `_cmp` detects a STRING token on the RHS of `==` / `!=`
  and produces `IsIn` directly; comparison with `<` / `<=` / `>` /
  `>=` against a string raises with a clear error.
- `apply._collect_column_names` recognizes `IsIn` so column-existence
  validation runs the same way as for `Column`.
- Exports: `IsIn` surfaces through `topiary.ranking` and the top-level
  `topiary` package.

Tests (138 across DSL + pvacseq suites, 696 across all relevant io /
dsl / ranking / wide suites; ... (continued)
Pull Request #167: Add pVACseq report loader (closes #94)

4547 of 5064 relevant lines covered (89.79%)

0.9 hits per line

Source Files on job python-3.11 - 25870749548.3
  • Tree
  • List 35
  • Changed 6
  • Source Changed 5
  • Coverage Changed 6
Coverage ∆ File Lines Relevant Covered Missed Hits/Line
  • Back to Build 25870749548
  • da6028ad on github
  • Prev Job for on pvacseq-loader (#25821572339.3)
  • Next Job for on pvacseq-loader (#25872231615.3)
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc