• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

supabase / supabase / 20451647687
66%

Build:
DEFAULT BRANCH: master
Ran 23 Dec 2025 04:49AM UTC
Jobs 1
Files 67
Run time 72min
Badge
Embed ▾
README BADGES
x

If you need to use a raster PNG badge, change the '.svg' to '.png' in the link

Markdown

Textile

RDoc

HTML

Rst

23 Dec 2025 04:45AM UTC coverage: 63.776% (-4.3%) from 68.053%
20451647687

push

github

web-flow
feat: assistant evals (#41311)

* chore: bump `supabase` CLI

* chore: stricter message types in `generate-v4.ts`

* feat: tutorial eval

https://www.braintrust.dev/docs/evaluation

* feat: project ID for eval

* refactor: `generateAssistantResponse` out of `handlePost`

* refactor: generateAssistantResponse to lib/ai

* feat: factuality eval with assistant response

* chore: upgrade braintrust to v1.0.1

* chore: silence tsconfig warning

* feat: assertion scorer

* fix: aggregate tools across all steps

* refactor: strict tool names, remove need for `as const`

* refactor: generic tool name type in assertions

* feat: transfer mocks from `feature/braintrust`

* feat: LLM criteria assertion

* feat: braintrust evals workflow

* fix: BRAINTRUST_PROJECT_ID

* feat: `sql_similar` assertion

* fix: `OPENAI_API_KEY` in workflow env

* feat: split AssertionScorer into separate scorers

* feat: remove tutorial eval

* feat: 20 minute CI timeout

* feat: category in test case metadata

* feat: score with gpt-5

* refactor: dataset to own file, colocate scorers

* feat: "gpt-5.2-2025-12-11" for llm as a judge

* feat: SQL syntax scorer with `libpg-query`

* feat: `evals:setup` and `evals:run` scripts

* feat: `evals:setup` in CI

* feat: human readable scorer names

* chore: rename to "SQL Validity"

* feat: add 2 "sql_generation" test cases

* feat: update requiredTools in test cases

* chore: ignore Cursor MCP config

* feat: "Conciseness" score

* feat: "Completeness" scorer

* fix: generate-v4 test mocks

* feat: serialize "steps" for scorer inputs

* updated node mem options for typecheck

* updated runner

* remove ram update as actions handle this

* feat: read `BRAINTRUST_PROJECT_ID` from secrets

* feat: score helpfulness, remove old scorers

* feat: separate `evals:run` and `evals:upload` scripts

* feat: passthrough entire classifier result

* feat: use live `search_docs` impl, store docs result in metadata

* feat: reduce classifier options

* fea... (continued)

504 of 604 branches covered (83.44%)

Branch coverage included in aggregate %.

77 of 362 new or added lines in 2 files covered. (21.27%)

2114 of 3501 relevant lines covered (60.38%)

126.53 hits per line

New Missed Lines in Diff

Lines Coverage ∆ File
9
81.0
apps/studio/lib/ai/generate-assistant-response.ts
276
0.0
apps/studio/lib/ai/tools/mock-tools.ts
Subprojects
ID Flag name Job ID Ran Files Coverage
1 studio-tests 20451647687.1 23 Dec 2025 04:49AM UTC 67
63.78
GitHub Action Run
Source Files on build 20451647687
  • Tree
  • List 67
  • Changed 0
  • Source Changed 0
  • Coverage Changed 0
Coverage ∆ File Lines Relevant Covered Missed Hits/Line Branch Hits Branch Misses
  • Back to Repo
  • Github Actions Build #20451647687
  • 072883bc on github
  • Prev Build on master (#20447671292)
  • Next Build on master (#20458216576)
  • Delete
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2026 Coveralls, Inc