deepset-ai / haystack / 23149578349
93%

Build:
DEFAULT BRANCH: main
Ran 16 Mar 2026 02:48PM UTC
Jobs 1
Files 258
Run time 1min
16 Mar 2026 02:44PM UTC coverage: 92.906% (+0.01%) from 92.894%
23149578349 · push · github · web-flow
fix!: Update BM25 tokenization regex to match single char tokens (#10814)

* fix: correct off-by-one in BM25 average document length calculation

In `write_documents`, the average document length formula used
`len(self._bm25_attr)` after the new document was already inserted,
leading to an incorrect calculation:

  Before (wrong): (new_len + avg * N) / (N + 1), where N already includes the new doc
  After  (fixed): (new_len + avg * (N - 1)) / N

This caused _avg_doc_len to be systematically underestimated, which
affects BM25 scoring accuracy (the length normalization term `b *
doc_len / avg_doc_len` would be too large, over-penalizing longer
documents).

The delete path was already correct since it adjusts `len` after
popping from `_bm25_attr`.
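
To make the off-by-one concrete, here is a minimal sketch of the two update rules quoted above, applied to a store that starts empty. The function names and the sample lengths are hypothetical and are not taken from the Haystack source; only the two formulas come from the commit message. Under these assumptions the buggy rule ends up at `total_len / (N + 1)` rather than `total_len / N`, which is the underestimate described above.

```python
def buggy_update(avg: float, count_after_insert: int, new_len: int) -> float:
    """'Before' formula: N already includes the new doc, yet is treated as the old count."""
    n = count_after_insert
    return (new_len + avg * n) / (n + 1)


def fixed_update(avg: float, count_after_insert: int, new_len: int) -> float:
    """'After' formula: the previous average is weighted by the N - 1 older docs."""
    n = count_after_insert
    return (new_len + avg * (n - 1)) / n


def running_average(lengths, update) -> float:
    """Insert documents one at a time, starting from an empty store (assumed avg = 0)."""
    avg = 0.0
    for count_after_insert, new_len in enumerate(lengths, start=1):
        avg = update(avg, count_after_insert, new_len)
    return avg


# Hypothetical token counts for four documents.
lengths = [4, 10, 7, 3]
true_mean = sum(lengths) / len(lengths)

fixed = running_average(lengths, fixed_update)
buggy = running_average(lengths, buggy_update)

print(f"true mean: {true_mean}, fixed: {fixed}, buggy: {buggy}")
```

With the fixed rule the running value matches the true mean of the sample; with the buggy rule it stays below it, so `doc_len / avg_doc_len` is inflated for every document.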

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update BM25 tokenization regex to match single char tokens

* Add tests

* update BM25 tokenization regex across multiple retriever tests

* add release note

* Update score values in tests

* Update release note
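
The tokenization change can be illustrated with a small sketch. The two patterns below are assumptions chosen to show the difference between "two or more word characters" and "one or more word characters"; the commit message does not quote the actual regexes, so treat `OLD_PATTERN`, `NEW_PATTERN`, and `tokenize` as hypothetical names rather than the library's API.

```python
import re

# Assumed patterns, not quoted in the commit message.
OLD_PATTERN = r"(?u)\b\w\w+\b"  # needs at least two word characters per token
NEW_PATTERN = r"(?u)\b\w+\b"    # accepts single-character tokens as well


def tokenize(text: str, pattern: str) -> list[str]:
    """Lowercase the text and return every substring matching the pattern."""
    return re.findall(pattern, text.lower())


text = "Vitamin C and a B12 supplement"

old_tokens = tokenize(text, OLD_PATTERN)
new_tokens = tokenize(text, NEW_PATTERN)

print(old_tokens)
print(new_tokens)
```

Under these assumed patterns, a query such as "vitamin c" loses its "c" token with the old pattern, so single-character terms cannot contribute to the BM25 score. A change of this kind also shifts document lengths and therefore scores, which is consistent with the "Update score values in tests" step.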

---------

Co-authored-by: gambletan <tan@echooo.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

15807 of 17014 relevant lines covered (92.91%)

0.93 hits per line

Jobs
ID  Job ID         Ran                      Files  Coverage
1   23149578349.1  16 Mar 2026 02:48PM UTC  258    92.91%
GitHub Action Run
Source Files on build 23149578349
  • Tree
  • List 258
  • Changed 1
  • Source Changed 0
  • Coverage Changed 1
  • Back to Repo
  • Github Actions Build #23149578349
  • fdcfca58 on github
  • Prev Build on main (#23149543586)
  • Next Build on main (#23156518770)