kubeflow / trainer / 24566439746 / 1
62%
master: 62%

Build:
DEFAULT BRANCH: master
Ran 17 Apr 2026 01:05PM UTC
Files 40
Run time 2s

17 Apr 2026 01:01PM UTC coverage: 58.057%. Remained the same
24566439746.1

push

github

web-flow
fix(examples): unblock Megatron TP notebook on GPU E2E (#3434)

* fix(e2e): bump Megatron notebook Complete-wait timeout to 10m

The Megatron TP notebook waits for the TrainJob to reach Complete with
timeout=120. On the oracle-vm-gpu-a10-1 CI runner the happy-path time
from TrainJob creation to Complete is ~96s (measured on the last passing
GPU E2E run, 2026-04-15). Any runner slowdown, image-pull delay, or
GPU-advertisement latency on top of that pushes the test past the 120s
budget and papermill raises TimeoutError, even though the TrainJob is
still on track to finish.

Every GPU E2E run has been failing here since 2026-04-16 ~16:00 UTC on
every branch, with no functional repo change between the last pass and
first fail. Bumping to 600s gives enough headroom for cold image pulls
and scheduling variance without masking real failures (papermill's
outer --execution-timeout is 1800s).

Signed-off-by: XploY04 <2004agarwalyash@gmail.com>
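
A minimal sketch of the timeout behaviour described above, assuming a generic polling loop rather than the notebook's actual wait call. `wait_for_status`, `get_status`, and the injectable `clock`/`sleep` parameters are hypothetical names introduced for illustration only; they are not the Kubeflow Trainer SDK API.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "Complete",
    timeout: float = 600.0,
    polling_interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns target or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if clock() >= deadline:
            # The job may still be progressing; only the wait budget is spent.
            raise TimeoutError(
                f"status is {status!r} after {timeout:.0f}s, wanted {target!r}"
            )
        sleep(polling_interval)
```

With the ~96s happy path a 120s budget leaves only about 24s of slack, so a run slowed to, say, 150s (an illustrative figure, not a measurement) times out under `timeout=120` but completes under `timeout=600`, still well inside papermill's 1800s outer limit.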

* fix(examples): update Megatron tokenizer library to null-text for 0.17.0

megatron-core 0.17.0 (published 2026-04-16 20:22 UTC) tightened the set
of accepted tokenizer library names in MegatronTokenizer.from_pretrained.
The bare "null" value is no longer accepted; the null-tokenizer keys are
now "null-text" and "null-multimodal":

  0.16.1: if library not in ['byte-level', 'null']: assert tokenizer_path
  0.17.0: if library not in ['byte-level', 'null-text', 'null-multimodal']:
              assert tokenizer_path

The notebook's call with metadata_path={"library": "null"} therefore
triggers `AssertionError: Tokenizer path must be specified.` on the GPU
E2E runner, which now installs 0.17.0 by default. Renaming to
"null-text" routes through the same NullTokenizer(vocab_size) library
class the old "null" key used, so behavior is unchanged on 0.17.0.
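
The version difference can be modelled in isolation. The sketch below is a standalone re-creation of the two quoted checks, not megatron-core's source: `ACCEPTED_WITHOUT_PATH` and `check_tokenizer_args` are hypothetical names, and only the accepted-name lists and the assertion message come from the text above.

```python
from typing import Optional

# Library names that may omit tokenizer_path, per the two checks quoted above.
ACCEPTED_WITHOUT_PATH = {
    "0.16.1": ("byte-level", "null"),
    "0.17.0": ("byte-level", "null-text", "null-multimodal"),
}


def check_tokenizer_args(
    version: str, library: str, tokenizer_path: Optional[str] = None
) -> None:
    """Mimic the quoted check: other libraries must supply a tokenizer path."""
    if library not in ACCEPTED_WITHOUT_PATH[version]:
        if not tokenizer_path:
            raise AssertionError("Tokenizer path must be specified.")


check_tokenizer_args("0.16.1", "null")       # accepted on 0.16.1
check_tokenizer_args("0.17.0", "null-text")  # accepted on 0.17.0
try:
    check_tokenizer_args("0.17.0", "null")   # bare "null" now needs a path
except AssertionError as err:
    print(err)
```

Under this model the bare "null" key falls through to the path assertion on 0.17.0, which matches the failure seen on the GPU E2E runner.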

This explains why GPU E2E ran green on 2026-04-15 (installed 0.16.1)
and started failing with the tokenizer assertion once 0.17.0 landed on
PyPI. The previous commit's 120s ... (continued)

2032 of 3500 relevant lines covered (58.06%)

0.67 hits per line
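
As a quick check of the figures above, the reported percentage follows directly from the covered and relevant line counts. The variable names are illustrative.

```python
covered, relevant = 2032, 3500  # counts reported for this job

pct = 100 * covered / relevant
print(f"{pct:.3f}%")  # 58.057%
print(f"{pct:.2f}%")  # 58.06%
```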

  • 71335f7d on github