
freeeve / roaringrange
84%

Default branch: main
Repo added: 29 May 2026 10:49PM UTC
Builds: 142
Files: 16
01 Jul 2026 07:07PM UTC coverage: 84.284% (-0.03%) from 84.318%
Build 28541272305 (push via github, committer freeeve)
feat(terms)!: per-language stop words + decouple stemming from language

Stop-word removal is now keyed on the index language (18 Snowball languages)
instead of a fixed English list, and the RRTI stem filter is decoupled from the
language so an index can strip a language's stop words without stemming.

Header semantics (RRTI): the `language` byte is meaningful when bit0 (stemmed)
OR bit1 (stop-words) is set. The reader builds the stemmer only under bit0 but
reads the language under bit0 | bit1. Enabling either filter requires a language
-- a filter set with no language is a build error (there is no `language == 0` =>
English fallback). Defaults (no filter) stay byte-identical; the English list is the
unchanged 31-word set, so existing English-stopword indexes are unaffected.
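
The header rules above can be sketched as follows. This is a minimal illustration in Python, not the project's code: `FLAG_STOPWORDS`, `NO_LANGUAGE`, both function names, and the choice to write a zero language byte when no filter is set are assumptions; only the bit0/bit1 meanings and the "filter without language is an error" rule come from the commit message.

```python
from typing import Optional, Tuple

FLAG_STEMMED = 1 << 0    # bit0: terms were stemmed
FLAG_STOPWORDS = 1 << 1  # bit1: stop words were removed (hypothetical name)
NO_LANGUAGE = 0          # assumed encoding of "no language set"


def encode_header(language: int, stem: bool, stopwords: bool) -> Tuple[int, int]:
    """Return (flags, language_byte); reject a filter that has no language."""
    flags = (FLAG_STEMMED if stem else 0) | (FLAG_STOPWORDS if stopwords else 0)
    if flags and language == NO_LANGUAGE:
        raise ValueError("a stem or stop-word filter requires a language")
    # Assumption: with no filter set, the language byte is written as 0.
    return flags, (language if flags else NO_LANGUAGE)


def decode_header(flags: int, language_byte: int) -> dict:
    """Read the language under bit0 | bit1, but build a stemmer only under bit0."""
    has_filter = bool(flags & (FLAG_STEMMED | FLAG_STOPWORDS))
    language: Optional[int] = language_byte if has_filter else None
    return {
        "language": language,
        "build_stemmer": bool(flags & FLAG_STEMMED),
        "strip_stopwords": bool(flags & FLAG_STOPWORDS),
    }
```

For example, `decode_header(*encode_header(3, stem=False, stopwords=True))` yields a language of 3 with stop-word stripping on and no stemmer, which is the "strip stop words without stemming" case the commit describes.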

Per-language lists live once in stopwords/<lang>.txt at the repo root (sorted,
lowercased, de-duplicated). Rust embeds them with include_str! and Go with
//go:embed -- the same physical files, so the two ports' lists are byte-identical
by construction. English is the fixed list; the other 17 come from NLTK, except
Tamil, which comes from spaCy.
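
As a rough illustration of the "sorted, lowercased, de-duplicated" file convention, the sketch below normalizes a word list and looks terms up by binary search. Both function names are hypothetical, and the binary-search lookup is an assumption about how a sorted list could be used, not a description of either port.

```python
from bisect import bisect_left
from typing import Iterable, List


def normalize_stop_words(words: Iterable[str]) -> List[str]:
    """Lowercase, drop blanks and duplicates, and sort by code point."""
    return sorted({w.strip().lower() for w in words if w.strip()})


def contains_stop_word(term: str, sorted_words: List[str]) -> bool:
    """Binary-search an already normalized (sorted, unique) list."""
    i = bisect_left(sorted_words, term)
    return i < len(sorted_words) and sorted_words[i] == term


words = normalize_stop_words(["The", "a", "the ", "", "And"])
print(words)                             # ['a', 'and', 'the']
print(contains_stop_word("the", words))  # True
print(contains_stop_word("cat", words))  # False
```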

API:
- Rust: stop_words(lang) / is_stop_word(t, lang); Tokenizer::with(language, stem,
  stopwords, case_fold) with the old new(..) kept as a stem = language.is_some()
  shim; spec() widened to (language, stem, stopwords, case_fold); from_header
  reads the language under bit0 | bit1; TermIndexConfig / TermSplitBuildConfig
  gain a stem field; the stream writer sets FLAG_STEMMED from stem and errors on
  a filter with no language.
- Go: all 18 TermLanguage constants + a stopwordFile map; termStopWordList /
  isTermStopWord(t, lang); TermTokenizer.language; NewTermTokenizerFull and
  WriteTermIndexFull, with the old *With funcs kept as shims;
  TermSplitBuildConfig.Stem.
- Python: TermBuilder / TermSplitSetBuilder take stem=None (which defaults to
  whether a language was given) and raise a ValueError when a filter has no language.

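The `stem=None` default described for the Python builders could be resolved along these lines. `resolve_filters` is a hypothetical helper written for illustration; the real `TermBuilder` signature is not shown in the commit message and is not reproduced here.

```python
from typing import Optional, Tuple


def resolve_filters(
    language: Optional[str],
    stem: Optional[bool] = None,
    stopwords: bool = False,
) -> Tuple[bool, bool]:
    """Resolve the effective (stem, stopwords) pair for a build."""
    if stem is None:
        # Old behaviour kept as the default: stem exactly when a language is given.
        stem = language is not None
    if (stem or stopwords) and language is None:
        raise ValueError("a filter was requested but no language was given")
    return stem, stopwords


print(resolve_filters("english"))                             # (True, False)
print(resolve_filters("german", stem=False, stopwords=True))  # (False, True)
print(resolve_filters(None))                                  # (False, False)
```

Keeping `None` as the default lets existing callers that pass only a language keep stemming, while new callers can opt out with `stem=False`.
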
Go multilingual stemming stays out of scope: only English ste... (continued)

33 of 38 new or added lines in 2 files covered. (86.84%)

1 existing line in 1 file now uncovered.

1507 of 1788 relevant lines covered (84.28%)

33.25 hits per line
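
The percentages reported above follow directly from the line counts. A small check, assuming coverage is covered lines divided by relevant lines (the `percent` helper is illustrative only):

```python
def percent(covered: int, relevant: int, digits: int = 2) -> float:
    """Coverage as a percentage, rounded to the given number of digits."""
    return round(100.0 * covered / relevant, digits)


overall = percent(1507, 1788, 3)   # 84.284
new_lines = percent(33, 38)        # 86.84
delta = round(84.284 - 84.318, 2)  # -0.03
print(overall, new_lines, delta)
```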


Recent builds

Builds | Branch | Commit | Type | Ran | Committer | Via | Coverage
28541272305 | main | feat(terms)!: per-language stop words + decouple stemming from language Stop-word removal is now keyed on the index language (18 Snowball languages) instead of a fixed English list, and the RRTI stem filter is decoupled from the language so an in... | push | 01 Jul 2026 07:08PM UTC | freeeve | github | 84.28
28279688357 | main | feat(index)!: optional case-sensitive indexing (default case-fold) Add a case_sensitive index-creation setting across every text surface. The default (false) case-folds, so all existing builds stay byte-identical -- every pre-existing golden test... | push | 27 Jun 2026 07:02PM UTC | freeeve | github | 84.32
28146543173 | main | fix(reader): harden index parsers against malicious inputs + add mutation fuzz harness Treat index bytes as untrusted (hostile origin or a corrupt/partial upload) and close the denial-of-service vectors a crafted file could trigger in the wasm re... | push | 25 Jun 2026 04:18AM UTC | freeeve | github | 84.2
28134418554 | main | perf(facet): meta-only RrfFacets boot — facets() in 2 reads not ~191 (task 053) RrfFacets::open eagerly loaded the per-field top category head postings (~191 scattered range reads on the 108k-category DeepLibby .rrf -> 15-20s before the browse fa... | push | 24 Jun 2026 10:44PM UTC | freeeve | github | 84.2
28122848213 | main | fix(demo): show N+ for depth-capped split/term search totals (task 052 leftover) The browse path (RrfFacets.filterIds) returns all survivors -> an exact total, while the tiered split-set search short-circuits at the ranked depth and has no exact-... | push | 24 Jun 2026 07:05PM UTC | freeeve | github | 84.2
28121431909 | main | docs(tasks): capture the browse-vs-search totals discrepancy leftover in task 052 | push | 24 Jun 2026 06:41PM UTC | freeeve | github | 84.2
28121213214 | main | feat(facet): counts_for — exact filtered counts for specific categories on demand Companion to the top-N counts cap: counts_full prices only the top categories per field, so the long tail is head-only/approximate. counts_for(result, &pairs) fetch... | push | 24 Jun 2026 06:38PM UTC | freeeve | github | 84.2
28120638272 | main | perf(facet): cap counts_full to top-N categories per field (task 052 follow-up) FilteredIds.facetCounts fetched every category's tail, so on a wide sidecar (DeepLibby: 108,126 categories) one drill-down did 216,068 range-reads / 26.5 MB — unusabl... | push | 24 Jun 2026 06:28PM UTC | freeeve | github | 84.2
28116869071 | main | ci: derive the wheel version from the git tag (PyPI tracks releases) The PyPI wheel version came from pyproject's static `version`, so it drifted from the git tags (tag v0.24.0/v0.24.1 -> PyPI 0.1.0/0.1.1). The release now rewrites python/pyproje... | push | 24 Jun 2026 05:24PM UTC | freeeve | github | 84.2
28116321279 | main | chore(python): bump package version 0.1.0 -> 0.1.1 for the v0.24.1 patch release | push | 24 Jun 2026 05:14PM UTC | freeeve | github | 84.2

© 2026 Coveralls, Inc