
pytorch / opacus / 17750090532
Coverage: 80%

Build:
DEFAULT BRANCH: main
Ran: 16 Sep 2025 12:09AM UTC
Jobs: 3
Files: 133
Run time: 1min

16 Sep 2025 12:01AM UTC coverage: 80.307%. Remained the same
Build 17750090532 · push · github · committed by facebook-github-bot:
Add support for passing args and kwargs to per-sample loss functions (#786)

Summary:
## Types of changes

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Docs change / refactoring / dependency upgrade

## Motivation and Context / Related issue

This change prevents the `TypeError: DPLossFastGradientAdaptiveClipping.__call__() got an unexpected keyword argument 'vocab_size'` error from being raised when `DPLossFastGradientAdaptiveClipping` or `DPLossFastGradientClipping` is assigned to the `.loss_function` property of any `PreTrainedModel`.
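
As an illustration, the failure reduces to a callable whose `__call__` does not accept extra keyword arguments; `StrictLoss` below is a hypothetical stand-in, not an Opacus class:
```
# Minimal sketch of the failure mode (hypothetical class, not Opacus code).
# A __call__ signature without **kwargs rejects the vocab_size keyword that
# PreTrainedModel.loss_function callers pass along.
class StrictLoss:
    def __call__(self, logits, labels):
        return 0.0


loss_fn = StrictLoss()
loss_fn(None, None, vocab_size=50257)
# TypeError: StrictLoss.__call__() got an unexpected keyword argument 'vocab_size'
```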

Every `PreTrainedModel.loss_function()` call expects `vocab_size` among its keyword arguments:
```
# transformers.models.gpt2.modeling_gpt2.py:1099
# Flatten the tokens
loss = self.loss_function(
    logits,
    labels,
    vocab_size=self.config.vocab_size,
    **kwargs,
)
```
Meanwhile, `DPLossFastGradientAdaptiveClipping.__call__` and `DPLossFastGradientClipping.__call__` do not accept a `vocab_size` keyword argument in their signatures, even though `vocab_size` is needed later for tensor flattening (see the sketch after the snippet below):
```
def ForCausalLMLoss(
    logits,
    labels,
    vocab_size: int,
    num_items_in_batch: Optional[torch.Tensor] = None,
    ignore_index: int = -100,
    shift_labels: Optional[torch.Tensor] = None,
    **kwargs,
) -> torch.Tensor:
    # Upcast to float if we need to compute the loss to avoid potential precision issues
    logits = logits.float()

    if shift_labels is None:
        # Shift so that tokens < n predict n
        labels = nn.functional.pad(labels, (0, 1), value=ignore_index)
        shift_labels = labels[..., 1:].contiguous()

    # Flatten the tokens
    logits = logits.view(-1, vocab_size)  # <-- vocab_size is used here
```
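
As referenced above, here is a minimal sketch of the kwargs-forwarding pattern this change enables. The names `KwargsForwardingLoss` and `per_sample_cross_entropy` are hypothetical illustrations, not the actual Opacus implementation:
```
import torch
import torch.nn.functional as F


def per_sample_cross_entropy(logits, labels, vocab_size, ignore_index=-100, **kwargs):
    # Hypothetical per-sample criterion; flattens the tokens the same way
    # ForCausalLMLoss does and keeps a per-token loss (reduction="none").
    return F.cross_entropy(
        logits.view(-1, vocab_size),
        labels.view(-1),
        ignore_index=ignore_index,
        reduction="none",
    )


class KwargsForwardingLoss:
    # Hypothetical wrapper illustrating the pattern: accept and forward
    # *args/**kwargs so callers such as PreTrainedModel.loss_function can
    # pass extras like vocab_size without triggering a TypeError.
    def __init__(self, base_criterion):
        self.base_criterion = base_criterion

    def __call__(self, logits, labels, *args, **kwargs):
        return self.base_criterion(logits, labels, *args, **kwargs)


# Extra keyword arguments are now forwarded instead of rejected.
loss_fn = KwargsForwardingLoss(per_sample_cross_entropy)
logits = torch.randn(2, 5, 10)           # (batch, seq_len, vocab_size)
labels = torch.randint(0, 10, (2, 5))
per_token_loss = loss_fn(logits, labels, vocab_size=10, num_items_in_batch=None)
```
The point is only the `*args, **kwargs` pass-through in `__call__`; the underlying per-sample criterion decides which keyword arguments it consumes.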

## How Has This Been Tested (if it applies)
Tested and trained on transformers' `GPT2LMHeadModel` with LoRA and 4B parameter... (continued)
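
For context, a hedged sketch of how such a wrapper could be attached to a small randomly initialized GPT-2, building on the hypothetical helpers above; this is not the PR's actual test code, and it assumes `loss_function` is settable as described in the motivation:
```
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random GPT-2 so the example runs without downloading weights.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64)
model = GPT2LMHeadModel(config)

# Attach the kwargs-tolerant wrapper from the previous sketch.
model.loss_function = KwargsForwardingLoss(per_sample_cross_entropy)

input_ids = torch.randint(0, config.vocab_size, (2, 8))
# The model forwards vocab_size to loss_function without raising a TypeError.
out = model(input_ids, labels=input_ids)
```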

3 of 3 new or added lines in 2 files covered. (100.0%)

5591 of 6962 relevant lines covered (80.31%)

1.79 hits per line

Jobs

| ID | Job ID | Ran | Files | Coverage |
|----|--------|-----|-------|----------|
| 1 | run-2 - 17750090532.1 | 16 Sep 2025 12:16AM UTC | 132 | 80.08% |
| 2 | run-1 - 17750090532.2 | 16 Sep 2025 12:17AM UTC | 132 | 80.08% |
| 3 | run-3 - 17750090532.3 | 16 Sep 2025 12:09AM UTC | 70 | 47.81% |
Source Files on build 17750090532: 133 files total, 2 changed (2 source changed, 0 coverage changed).
Commit c9032e95 on github