
pirl-unc / hitlist / 24937746010

Build:
DEFAULT BRANCH: main
Ran 25 Apr 2026 06:34PM UTC
Jobs 1
Files 23
Run time 1min

25 Apr 2026 06:32PM UTC coverage: 60.497% (+0.8%) from 59.684%
24937746010 · push · github · web-flow
v1.20.0: Transcript-aware peptide_mappings (#141) (#159)

Issue #141: ProteomeIndex.from_ensembl picked one longest protein-coding
transcript per gene and used best_t.id (an ENST) as protein_id, so
peptide_mappings.parquet:
  - silently collapsed transcript diversity away before the mapping
    table was built
  - mixed transcript-vs-protein semantics in the protein_id column
    depending on backend (ENST for Ensembl, FASTA-header strings for
    UniProt) so downstream consumers couldn't distinguish gene-level
    ambiguity from transcript-isoform ambiguity from same-protein
    repeated occurrences

Fix: index every protein-coding transcript per gene, expose
transcript_id and is_canonical_transcript as first-class mapping
columns, and use the ENSP (when pyensembl surfaces it) as protein_id.

- hitlist/proteome.py:
  - from_ensembl now iterates every protein-coding transcript per
    gene rather than only the longest.  It indexes by t.protein_id
    (ENSP) when available, falling back to t.id (ENST) for older
    pyensembl releases or species where protein_id is not surfaced.
  - Canonical transcript = longest valid protein-coding translation
    (a stable, pyensembl-version-independent definition).  Each
    proteins[] entry's meta records gene_name, gene_id, transcript_id
    (always the ENST), and is_canonical_transcript bool.
  - from_fasta records transcript_id='' and is_canonical_transcript=
    False so the meta schema is uniform across backends.
  - map_peptides now emits transcript_id and is_canonical_transcript
    columns alongside protein_id.  Empty-result schema also includes
    them.
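
A minimal sketch of the indexing rule described for from_ensembl above,
not the actual hitlist code. The Transcript dataclass is a hypothetical
stand-in for a pyensembl transcript object, index_gene is an invented
helper name, and the tie-break on transcript ID is an assumption (the
commit message only says "longest valid protein-coding translation").

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Transcript:
    """Hypothetical stand-in for a pyensembl transcript (not the real class)."""
    id: str                             # ENST identifier
    gene_id: str
    gene_name: str
    protein_sequence: Optional[str]     # None when there is no valid translation
    protein_id: Optional[str] = None    # ENSP; may be missing


def index_gene(transcripts: List[Transcript]) -> Dict[str, dict]:
    """Index every protein-coding transcript of one gene."""
    coding = [t for t in transcripts if t.protein_sequence]
    if not coding:
        return {}
    # Canonical = longest translation. Ties go to the lexicographically
    # smallest transcript ID (assumed tie-break, for determinism only).
    canonical = min(coding, key=lambda t: (-len(t.protein_sequence), t.id))
    proteins = {}
    for t in coding:
        key = t.protein_id or t.id      # prefer ENSP, fall back to ENST
        proteins[key] = {
            "seq": t.protein_sequence,
            "meta": {
                "gene_name": t.gene_name,
                "gene_id": t.gene_id,
                "transcript_id": t.id,  # always the ENST
                "is_canonical_transcript": t is canonical,
            },
        }
    return proteins


example = index_gene([
    Transcript("ENST1", "ENSG1", "GENE1", "MKT", "ENSP1"),
    Transcript("ENST2", "ENSG1", "GENE1", "MKTAA"),   # no ENSP surfaced
    Transcript("ENST3", "ENSG1", "GENE1", None),      # not protein-coding
])
```

In this sketch both coding transcripts are kept, and only ENST2 (the
longer translation) is flagged canonical.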

- hitlist/mappings.py:
  - _MAPPING_COLUMNS adds transcript_id + is_canonical_transcript.
  - _flanking_rows_to_mapping_rows propagates the new columns when
    present (real Ensembl-backed builds) and fills defaults for
    legacy fixtures so the parquet schema stays uniform.
  - load_peptide_mappings exposes transcript_id and
    is_canonical_transcript as pyarrow-p... (continued)
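
The default-filling behaviour described for
_flanking_rows_to_mapping_rows above could look roughly like the sketch
below. This is an illustration, not the hitlist implementation:
with_transcript_columns and MAPPING_DEFAULTS are hypothetical names, and
the default values ('' and False) are assumed to match what from_fasta
records.

```python
from typing import Dict, Iterable, List

# Assumed defaults for rows that predate the new columns.
MAPPING_DEFAULTS = {"transcript_id": "", "is_canonical_transcript": False}


def with_transcript_columns(rows: Iterable[Dict]) -> List[Dict]:
    """Return copies of rows that all carry the two new columns.

    Values already present in a row win over the defaults, so
    Ensembl-backed rows pass through unchanged.
    """
    return [{**MAPPING_DEFAULTS, **row} for row in rows]


rows = with_transcript_columns([
    {"peptide": "SIINFEKL", "protein_id": "ENSP1",
     "transcript_id": "ENST1", "is_canonical_transcript": True},
    {"peptide": "AAAGIGILTV", "protein_id": "legacy_fixture_protein"},
])
```

Every output row has the same keys, which is what keeps the parquet
schema uniform when the rows are written out.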

2533 of 4187 relevant lines covered (60.5%)

0.6 hits per line
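
The headline percentage and the change against the previous build can be
re-derived from the figures reported on this page. A small check, using
only those numbers (the variable names are illustrative):

```python
covered, relevant = 2533, 4187   # "2533 of 4187 relevant lines covered"
previous_pct = 59.684            # coverage of the previous build

pct = 100 * covered / relevant
delta = pct - previous_pct

print(f"{pct:.3f}% ({delta:+.1f}%)")   # prints: 60.497% (+0.8%)
```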

Coverage Regressions

Lines  Coverage  ∆      File
91     24.77     8.02%  mappings.py
70     80.78     0.0%   export.py
19     88.07     8.57%  proteome.py

Jobs

ID  Job ID         Ran                      Files  Coverage
1   24937746010.1  25 Apr 2026 06:34PM UTC  23     60.5

Source Files on build 24937746010
  • List 23
  • Changed 3
  • Source Changed 0
  • Coverage Changed 3
  • 1f7bcbda on github

© 2026 Coveralls, Inc