databendlabs / openraft / 24661569127
91%
main: 87%

LAST BUILD BRANCH: release-0.10
DEFAULT BRANCH: main
Ran 20 Apr 2026 10:39AM UTC
Jobs 1
Files 150
Run time 1min

20 Apr 2026 10:29AM UTC coverage: 90.496% (+0.07%) from 90.43%
24661569127

push

github

drmingdrmer
Fix: bound snapshot chunk retries and propagate stale-snapshot errors

`Chunked::send_snapshot` used to swallow every transient `RPCError`
variant (`Timeout`, `Unreachable`, `Network`, and even remote `Fatal`)
and then `continue` the loop without changing `offset`. The leader
therefore kept streaming the same snapshot to a flaky target forever,
even after it had built a newer one. When the outer
`C::timeout(hard_ttl, ...)` expired, the code also just `continue`d,
producing a tight loop throttled only by a 1 ms per-iteration sleep.

Retry policy:

- `Timeout` / `Network`: module-local exponential backoff
  (`SNAPSHOT_CHUNK_RETRY_BASE` = 10 ms doubling to
  `SNAPSHOT_CHUNK_RETRY_MAX` = 200 ms). These errors typically clear
  within a packet-loss burst — a tight curve rides it out without
  involving the caller.
- `Unreachable`: caller's `RaftNetwork::backoff()` iterator, cached
  per outage and dropped on the next successful chunk. An unreachable
  target usually stays so for seconds to minutes, a cadence the
  application should pick.
- `PayloadTooLarge`: fail fast. Retrying the same chunk at the same
  size cannot make progress, and the append-entries shrink path in
  `ReplicationCore` is not reusable here yet.
- `RemoteError::Fatal`: propagate verbatim.
- `SnapshotMismatch`: reset offset + retry state and continue.
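
The policy above can be read as a small decision table. The following is a minimal sketch of it, not the code from the commit: `ChunkError`, `RetryAction`, `local_backoff`, and `classify` are hypothetical names introduced for illustration, and the schedule assumes the first retry waits the base delay and each later retry doubles it up to the cap.

```rust
use std::time::Duration;

// Values quoted in the commit message.
const SNAPSHOT_CHUNK_RETRY_BASE: Duration = Duration::from_millis(10);
const SNAPSHOT_CHUNK_RETRY_MAX: Duration = Duration::from_millis(200);

/// Hypothetical, simplified stand-in for the error kinds listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChunkError {
    Timeout,
    Network,
    Unreachable,
    PayloadTooLarge,
    RemoteFatal,
    SnapshotMismatch,
}

/// Hypothetical description of what the sender does next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RetryAction {
    /// Sleep for the module-local exponential backoff, then retry.
    LocalBackoff(Duration),
    /// Sleep for the next delay from the caller-supplied backoff iterator.
    CallerBackoff,
    /// Give up on this chunk immediately.
    FailFast,
    /// Return the remote error unchanged.
    Propagate,
    /// Reset offset and retry state, then continue from the start.
    ResetAndContinue,
}

/// Delay before retry number `retry` (0-based): base * 2^retry, capped at the max.
fn local_backoff(retry: u32) -> Duration {
    // `checked_shl` returns None once the shift would overflow; saturate instead.
    let factor = 1u32.checked_shl(retry).unwrap_or(u32::MAX);
    SNAPSHOT_CHUNK_RETRY_BASE
        .saturating_mul(factor)
        .min(SNAPSHOT_CHUNK_RETRY_MAX)
}

/// Map an error to the action described in the retry policy.
fn classify(err: ChunkError, retry: u32) -> RetryAction {
    match err {
        ChunkError::Timeout | ChunkError::Network => {
            RetryAction::LocalBackoff(local_backoff(retry))
        }
        ChunkError::Unreachable => RetryAction::CallerBackoff,
        ChunkError::PayloadTooLarge => RetryAction::FailFast,
        ChunkError::RemoteFatal => RetryAction::Propagate,
        ChunkError::SnapshotMismatch => RetryAction::ResetAndContinue,
    }
}
```

Under these assumptions the local delays run 10, 20, 40, 80, 160 ms and then stay at 200 ms.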

Bail out on `SNAPSHOT_CHUNK_MAX_RETRIES` (5) consecutive transient
failures and surface the underlying error. The replication layer then
unwinds and drives the next attempt with a fresh snapshot — exactly
what the new integration test
`t91_snapshot_retry_uses_latest_snapshot` verifies end-to-end.
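
As an illustration of the bail-out rule, here is a synchronous sketch of a chunk loop that counts consecutive transient failures. It is not the actual `Chunked::send_snapshot`: `SendError` and `send_all_chunks` are hypothetical, sleeping between retries is omitted, and the sketch assumes the loop gives up on the fifth consecutive failure and that any success resets the counter.

```rust
// Value quoted in the commit message.
const SNAPSHOT_CHUNK_MAX_RETRIES: u32 = 5;

/// Hypothetical error type: transient errors are retried, fatal ones are not.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SendError {
    Transient(String),
    Fatal(String),
}

/// Send chunks `0..total_chunks` through `send`, retrying transient failures.
/// Returns the number of send attempts made, or the error that ended the run.
fn send_all_chunks<F>(total_chunks: u64, mut send: F) -> Result<u64, SendError>
where
    F: FnMut(u64) -> Result<(), SendError>,
{
    let mut offset = 0u64;
    let mut consecutive_failures = 0u32;
    let mut attempts = 0u64;

    while offset < total_chunks {
        attempts += 1;
        match send(offset) {
            Ok(()) => {
                // Progress: advance and forget earlier failures.
                offset += 1;
                consecutive_failures = 0;
            }
            Err(SendError::Transient(msg)) => {
                consecutive_failures += 1;
                if consecutive_failures >= SNAPSHOT_CHUNK_MAX_RETRIES {
                    // Surface the underlying error so the caller can start
                    // over, possibly with a newer snapshot.
                    return Err(SendError::Transient(msg));
                }
                // A real sender would sleep for its backoff delay here.
            }
            // Fatal errors are returned unchanged.
            Err(fatal) => return Err(fatal),
        }
    }
    Ok(attempts)
}
```

Because the loop returns instead of spinning, the caller decides what to send next, which is what lets a later attempt pick up a fresh snapshot.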

The outer `hard_ttl` timeout also returns `NetworkError` immediately
rather than looping: stacking in-flight RPCs under the same deadline
cannot make progress, and the replication layer drives the next
attempt on its own timer.

Changes:

- Add `SNAPSHOT_CHUNK_MAX_RETRIES`, `SNAPSHOT_CHUNK_RETRY_BASE`,
  `SNAPSHOT_CHUNK_RETRY_MAX`, `SNAPSHOT_CHUNK_UNREACHABLE_FALLBACK`
  consta... (continued)

23 of 52 new or added lines in 1 file covered. (44.23%)

1 existing line in 1 file now uncovered.

9712 of 10732 relevant lines covered (90.5%)

7957.89 hits per line

Uncovered Changes

Lines  Coverage  ∆       File
29     74.59%    -3.27%  openraft/src/network/snapshot_transport.rs

Coverage Regressions

Lines  Coverage  ∆       File
1      74.59%    -3.27%  openraft/src/network/snapshot_transport.rs

Jobs

ID  Job ID         Ran                      Files  Coverage
1   24661569127.1  20 Apr 2026 10:39AM UTC  150    90.5%
GitHub Action Run
Source Files on build 24661569127
  • Tree
  • List 150
  • Changed 7
  • Source Changed 1
  • Coverage Changed 7
  • 29941565 on github