lsm / neokai / 26411444620
81%

Build:
DEFAULT BRANCH: dev
Ran 25 May 2026 05:02PM UTC
Jobs 28
Files 571
Run time 2min

25 May 2026 05:01PM UTC coverage: 81.169%. First build
push

github

web-flow
Benchmark graph context tools on task 394 (#2009)

* docs: benchmark graph context tools

Compare CodeGraph, code-review-graph, Graphify, and baseline on task #394 to guide optional NeoKai integration priority.

* docs: correct graph benchmark findings

Address review feedback by fixing the task #394 answer key, MCP tool counts, and Graphify runtime notes.

* docs: add ast-grep benchmark comparison

Benchmark ast-grep as a structural search baseline alongside the graph context tools for task #394.

* docs: add unseeded graph benchmark round

* docs: add plain unseeded GLM baseline

* docs: add mixed graph benchmark round

* test: add graph tool benchmark as agent session integration test

Proper benchmark using NeoKai daemon sessions with MCP tool servers
attached, not raw Python HTTP calls. 12 test cases (describe.skip by
default): baseline GLM, 4 unseeded tool cases, 4 mixed discovery cases,
plus mixed baseline. Outputs JSON results to /tmp/.

Run: cd packages/daemon && GLM_API_KEY=xxx bun test tests/online/benchmark/benchmark-graph-tools.test.ts

* docs: add agent session benchmark results and fix GLM-SDK compatibility

Run graph tool benchmark through real NeoKai daemon sessions with MCP
servers attached. Key findings:

- GLM-5.x tool_use responses incompatible with Claude Agent SDK context-fetcher
- GLM-4.7 works for text-only and single-tool MCP sessions
- Mixed multi-tool sessions hang due to same SDK incompatibility
- GLM-4.7 did not voluntarily invoke MCP tools in any test case
- All 4 completed tests (baseline, CodeGraph, CRG, ast-grep) produced
  text-only plans with zero tool calls

Restructure benchmark: drop mixed round, keep unseeded tests only,
add text-only baseline prompt, increase timeouts, build indexes before
daemon start to avoid transport PONG timeout.

* fix: address benchmark PR review feedback

- Use BENCHMARK_PROMPT_UNSEDED for MCP cases (not TEXT_ONLY) so tools
  are not suppressed
- Record real commit SHA via git rev-parse... (continued)

9317 of 13703 branches covered (67.99%)

Branch coverage included in aggregate %.

77645 of 93434 relevant lines covered (83.1%)

295.38 hits per line
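
The percentages above can be cross-checked with a little arithmetic. The sketch below assumes the aggregate figure is computed as (covered lines + covered branches) / (relevant lines + total branches), which is consistent with the note that branch coverage is included in the aggregate and which reproduces the reported 81.169%. The variable names and the `percent` helper are illustrative only and are not taken from Coveralls or NeoKai.

```typescript
// Totals copied from the build summary above.
const coveredLines = 77645;
const relevantLines = 93434;
const coveredBranches = 9317;
const totalBranches = 13703;

// Hypothetical helper: ratio expressed as a percentage, guarding against a zero total.
function percent(covered: number, total: number): number {
  return total === 0 ? 0 : (covered / total) * 100;
}

const lineCoverage = percent(coveredLines, relevantLines);
const branchCoverage = percent(coveredBranches, totalBranches);

// Assumption: the aggregate pools lines and branches into a single ratio.
const aggregateCoverage = percent(
  coveredLines + coveredBranches,
  relevantLines + totalBranches,
);

console.log(lineCoverage.toFixed(1));      // 83.1
console.log(branchCoverage.toFixed(2));    // 67.99
console.log(aggregateCoverage.toFixed(3)); // 81.169
```

Under this assumption the aggregate sits between the line and branch figures and is weighted toward line coverage, because relevant lines outnumber branches by roughly seven to one.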

Jobs
ID | Job ID | Ran | Files | Coverage (%)
1 | daemon-0-shared - 26411444620.1 | 25 May 2026 05:02PM UTC | 32 | 81.29
2 | daemon-5-space-other - 26411444620.2 | 25 May 2026 05:02PM UTC | 115 | 40.08
3 | daemon-5-space-agent - 26411444620.3 | 25 May 2026 05:02PM UTC | 167 | 24.46
4 | daemon-online-rewind-1 - 26411444620.4 | 25 May 2026 05:03PM UTC | 325 | 22.49
5 | daemon-online-features-1 - 26411444620.5 | 25 May 2026 05:03PM UTC | 325 | 23.13
6 | daemon-online-git - 26411444620.6 | 25 May 2026 05:03PM UTC | 325 | 19.09
7 | daemon-online-rpc-1 - 26411444620.7 | 25 May 2026 05:03PM UTC | 325 | 19.34
8 | daemon-online-rpc-3 - 26411444620.8 | 25 May 2026 05:03PM UTC | 325 | 19.74
9 | daemon-online-components - 26411444620.9 | 25 May 2026 05:02PM UTC | 325 | 18.07
10 | daemon-online-space-2 - 26411444620.10 | 25 May 2026 05:04PM UTC | 325 | 33.32
11 | daemon-5-space-runtime - 26411444620.11 | 25 May 2026 05:03PM UTC | 180 | 45.9
12 | daemon-4-space-storage - 26411444620.12 | 25 May 2026 05:04PM UTC | 156 | 60.63
13 | daemon-online-mcp - 26411444620.13 | 25 May 2026 05:02PM UTC | 325 | 18.45
14 | daemon-online-features-2 - 26411444620.14 | 25 May 2026 05:03PM UTC | 325 | 22.68
15 | daemon-2-handlers - 26411444620.15 | 25 May 2026 05:02PM UTC | 298 | 28.51
16 | daemon-online-sdk - 26411444620.16 | 25 May 2026 05:03PM UTC | 325 | 22.35
17 | daemon-online-websocket - 26411444620.17 | 25 May 2026 05:02PM UTC | 325 | 18.18
18 | daemon-online-convo - 26411444620.18 | 25 May 2026 05:03PM UTC | 325 | 22.23
19 | daemon-online-agent-sdk - 26411444620.19 | 25 May 2026 05:03PM UTC | 325 | 22.34
20 | daemon-online-coordinator - 26411444620.20 | 25 May 2026 05:02PM UTC | 325 | 7.67
21 | daemon-5-space-workflow - 26411444620.21 | 25 May 2026 05:02PM UTC | 111 | 30.98
22 | daemon-online-rewind-2 - 26411444620.22 | 25 May 2026 05:03PM UTC | 325 | 22.96
23 | daemon-online-rpc-2 - 26411444620.23 | 25 May 2026 05:03PM UTC | 325 | 23.51
24 | daemon-online-space-1 - 26411444620.24 | 25 May 2026 05:03PM UTC | 325 | 33.8
25 | web - 26411444620.25 | 25 May 2026 05:03PM UTC | 237 | 73.25
26 | daemon-1-core - 26411444620.26 | 25 May 2026 05:03PM UTC | 331 | 35.95
27 | daemon-online-rpc-4 - 26411444620.27 | 25 May 2026 05:04PM UTC | 325 | 23.36
28 | daemon-online-lifecycle - 26411444620.28 | 25 May 2026 05:03PM UTC | 325 | 22.75
  • 41158616 on github