IBM / unitxt / 26155722481
81%

Build:
DEFAULT BRANCH: main
Ran 20 May 2026 10:12AM UTC
Jobs 1
Files 64
Run time 1min
20 May 2026 10:07AM UTC coverage: 80.863% (-0.04%) from 80.903%
push

github

web-flow
fix: CI compatibility fixes (HF_TOKEN, arena-hard migration, datasets 4.8.5) (#1966)

* fix: Replace huggingface-cli login with HF_TOKEN env var in catalog_preparation CI

The huggingface-cli command was not found on PATH in CI, causing the
login step to fail. Using the HF_TOKEN environment variable is the
recommended approach for CI and avoids PATH issues entirely.

Signed-off-by: Yoav Katz <katz@il.ibm.com>
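
The environment-variable approach described above can be sketched as a fail-fast check that a CI step might run before touching the Hub. This is a minimal illustration, not the workflow's actual code: the helper name `require_hf_token` is hypothetical, and the only thing taken from the commit message is the `HF_TOKEN` variable name.

```python
import os


def require_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it in the CI job")
    return token
```

Because the token is read from the environment, no executable has to be found on PATH, which is the failure mode the commit describes.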

* fix: Use keyword arguments in DatasetBuilder.as_dataset() call

The newer datasets library changed as_dataset() to accept fewer
positional arguments. Pass all arguments as keyword arguments for
forward compatibility.

Signed-off-by: Yoav Katz <katz@il.ibm.com>
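
To show why keyword arguments are the safer calling style when a signature shrinks, here is a small sketch. `build_v1` and `build_v2` are hypothetical stand-ins invented for this example; they are not the `datasets` API and only mimic a function that drops a middle parameter between releases.

```python
def build_v1(split=None, post_process=True, in_memory=False):
    """Older, hypothetical signature with three parameters."""
    return {"split": split, "post_process": post_process, "in_memory": in_memory}


def build_v2(split=None, in_memory=False):
    """Newer, hypothetical signature: `post_process` was removed."""
    return {"split": split, "in_memory": in_memory}


# Positional call written against v1: the second value was meant as
# post_process, but v2 silently binds it to in_memory instead.
silently_wrong = build_v2("train", True)

# Keyword call: each value is bound by name, so it cannot drift.
explicit = build_v2(split="train", in_memory=False)
```

With positional arguments the call either rebinds values to the wrong parameter or fails with a `TypeError` once too many are passed; with keywords, a removed parameter fails loudly by name.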

* fix: Migrate arena-hard card to lmarena-ai/arena-hard-viewer and fix WeightedWinRateCorrelation metric

The old lmsys/arena-hard-browser HF space is no longer available. This
migrates to the replacement space lmarena-ai/arena-hard-viewer with
adapted processing steps for its different data format (flat prompt
field, messages-based answers, uid instead of question_id).

Also fixes a bug in WeightedWinRateCorrelation where pd.DataFrame
columns initialized as object dtype caused scipy pearsonr to fail
with newer numpy/scipy versions.

Signed-off-by: Yoav Katz <katz@il.ibm.com>
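
The dtype issue described above can be illustrated with a short sketch. The column names and values below are made up for the example, and the conversion shown (`astype(float)`) is one plausible remedy rather than a copy of the actual fix in `WeightedWinRateCorrelation`.

```python
import pandas as pd
from scipy.stats import pearsonr

# Made-up scores stored in object-dtype columns, as can happen when a
# DataFrame is created without numeric dtypes and filled afterwards.
scores = pd.DataFrame(
    {"model_a": [0.1, 0.4, 0.35, 0.8], "model_b": [0.2, 0.5, 0.3, 0.9]},
    dtype=object,
)

# Cast to float before handing the data to SciPy.
numeric = scores.astype(float)

r, p_value = pearsonr(numeric["model_a"].to_numpy(), numeric["model_b"].to_numpy())
```

Casting up front keeps the correlation call independent of how a given numpy or scipy release treats object arrays.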

* fix: Force float32 in Perplexity metric to prevent NaN with float16 models

Models like bloom-560M default to float16, causing numerical overflow
in attention computations with padded inputs. Forcing float32 ensures
stable perplexity scores regardless of model's default dtype.

Signed-off-by: Yoav Katz <yoavkatz@il.ibm.com>
Signed-off-by: Yoav Katz <katz@il.ibm.com>
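
As a toy illustration of how float16 overflow turns into NaN, the sketch below runs a deliberately naive softmax in both precisions. It is not the Perplexity metric's code and not how a real attention layer is implemented (those normally subtract the maximum first); it only shows that float16 tops out at 65504, so `exp(12)` already overflows.

```python
import numpy as np


def naive_softmax(logits):
    """Softmax without max-subtraction, kept naive to expose overflow."""
    exps = np.exp(logits)
    return exps / exps.sum()


logits = [12.0, 0.0]

# Silence the overflow/invalid warnings that the float16 path triggers.
with np.errstate(over="ignore", invalid="ignore"):
    half = naive_softmax(np.array(logits, dtype=np.float16))
    single = naive_softmax(np.array(logits, dtype=np.float32))
```

In float16, `exp(12)` becomes `inf`, and `inf / inf` yields NaN; the same computation in float32 stays finite.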

* fix: Remove run_post_process and verification_mode params for datasets>=4.8.5

The `datasets` library removed `run_post_process` and `verification_mode`
parameters from `DatasetBuilder.as_dataset()` in version 4.8.5. These
parameters were already non-functional in 4.8.4 (the `_post_process`
method and verification logic had been removed from the i... (continued)
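
The commit removes the two parameters outright. A different, more defensive pattern for straddling library versions is to drop any keyword arguments the callee no longer accepts. The sketch below shows that alternative under stated assumptions: `supported_kwargs`, `as_dataset_old`, and `as_dataset_new` are hypothetical names invented here, not part of `datasets` or unitxt.

```python
import inspect


def supported_kwargs(func, **kwargs):
    """Keep only the keyword arguments that `func` accepts by name."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all in the callee means everything is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


def as_dataset_old(split=None, run_post_process=True, verification_mode=None):
    """Hypothetical stand-in for an older signature."""
    return split


def as_dataset_new(split=None):
    """Hypothetical stand-in for a newer, smaller signature."""
    return split


wanted = {"split": "train", "run_post_process": False, "verification_mode": None}
old_call = supported_kwargs(as_dataset_old, **wanted)
new_call = supported_kwargs(as_dataset_new, **wanted)
```

Filtering hides removed options silently, so simply deleting the arguments, as the commit does, is the clearer choice when the old behavior is no longer needed.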

1607 of 2007 branches covered (80.07%)

Branch coverage included in aggregate %.

10955 of 13528 relevant lines covered (80.98%)

0.81 hits per line
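
The figures above are consistent with an aggregate that pools lines and branches into a single ratio. The short check below reproduces the reported percentages from the raw counts; the pooling formula is inferred from the numbers on this page rather than taken from Coveralls documentation.

```python
def percent(covered, total):
    """Coverage as a percentage of `total`."""
    return 100.0 * covered / total


lines_covered, lines_total = 10955, 13528
branches_covered, branches_total = 1607, 2007

line_pct = percent(lines_covered, lines_total)
branch_pct = percent(branches_covered, branches_total)
# Pooled ratio: covered lines plus covered branches over both totals.
aggregate_pct = percent(
    lines_covered + branches_covered, lines_total + branches_total
)

print(f"lines: {line_pct:.2f}%")          # lines: 80.98%
print(f"branches: {branch_pct:.2f}%")     # branches: 80.07%
print(f"aggregate: {aggregate_pct:.3f}%") # aggregate: 80.863%
```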

Coverage Regressions

Lines  Coverage  ∆       File
429    75.71     -0.16%  unitxt/metrics.py
14     80.88     0.0%    unitxt/api.py
Jobs
ID  Job ID         Ran                      Files  Coverage
1   26155722481.1  20 May 2026 10:12AM UTC  64     80.86
GitHub Action Run
Source Files on build 26155722481
  • Tree
  • List 64
  • Changed 2
  • Source Changed 0
  • Coverage Changed 2
  • Github Actions Build #26155722481
  • 09366916 on github
  • Prev Build on main (#21828313409)
  • Next Build on main (#26172686025)
© 2026 Coveralls, Inc