• Home
  • Features
  • Pricing
  • Docs
  • Announcements
  • Sign In

sisl / astra-rl / 19730796005 / 2
36%
main: 36%

Build:
DEFAULT BRANCH: main
Ran 27 Nov 2025 09:03AM UTC
Files 31
Run time 1s
Badge
Embed ▾
README BADGES
x

If you need to use a raster PNG badge, change the '.svg' to '.png' in the link

Markdown

Textile

RDoc

HTML

Rst

25 Nov 2025 11:48PM UTC coverage: 37.298%. Remained the same
19730796005.2

push

github

web-flow
Llamaguard 3 Granular Scorer and Wildguard Scorer (#31)

# Llamaguard 3 Harm Categories and Wildguard Scorer

## Description
New support for granular harm categories for Llamaguard 3 and added
support for Wildguard toxicity classifier.


### Added
- The LlamaguardScorer now takes at initialization the `harm_category`
parameter which lets the user set the harm category (S1-14) or "all"
- WildguardScorer is an implementation of Wildguard by AI2 which takes
the `scoring_target` parameter (`harmful_request`, `response_refusal`,
`harmful_response`) based on the scoring criteria of the Wildguard
toxicity classifier
- Documentation for both LlamaguardScorer and WildguardScorer has been
added

### Changed
- Both the Llamaguard and Wildguard scorers now take either a list of
strings as input `x` or a list of conversation histories. This is to
support future multi-turn red-teaming.
- Changes to `astra_rl.scorers` module init to import the Wildguard
Scorer
- Added/changed the files so the WildguardScorer shows up in the docs

### Fixed
None

### Removed
None

## Related Issue
None

## Note to Reviewers
Did testing by passing in lists of strings to see that output is
correct. Also passed incorrect input to see if safeguards trigger.

323 of 866 relevant lines covered (37.3%)

0.37 hits per line

Source Files on job 19730796005.2
  • Tree
  • List 31
  • Changed 0
  • Source Changed 0
  • Coverage Changed 0
Coverage ∆ File Lines Relevant Covered Missed Hits/Line
  • Back to Build 19730796005
  • 4873e87f on github
  • Prev Job for on main (#19698147273.1)
  • Next Job for on main (#19758995753.1)
STATUS · Troubleshooting · Open an Issue · Sales · Support · CAREERS · ENTERPRISE · START FREE · SCHEDULE DEMO
ANNOUNCEMENTS · TWITTER · TOS & SLA · Supported CI Services · What's a CI service? · Automated Testing

© 2025 Coveralls, Inc