| Jobs | Files | Run time |
|---|---|---|
| 3 | 136 | 1min |
push
github
Fix GitHub issue #792: Fast gradient clipping ignores ignore_index masking (#808)

Summary:
Pull Request resolved: https://github.com/meta-pytorch/opacus/pull/808

Context/Motivation:
Fixes https://github.com/meta-pytorch/opacus/issues/792

When using fast/ghost gradient clipping for NLP tasks, `DPLossFastGradientClipping` computes per-sample mean loss via `.mean(dim=1)`, which divides by the full sequence length. This ignores the `ignore_index` parameter from the criterion (e.g., `CrossEntropyLoss(ignore_index=-100)`), causing masked/padded positions to dilute the loss. For tasks like SQuAD where only a few tokens are real targets out of a long sequence, the loss becomes orders of magnitude too small, preventing training.

This diff:
- Modified `DPLossFastGradientClipping.__call__()` to check for `ignore_index` on the criterion and compute mean only over non-ignored positions when present
- Added regression test `github_issue_test.py` verifying ignore_index is respected for both mean and sum reductions, plus a backwards-compatibility test for the no-masking case

Reviewed By: aparna-aketi

Differential Revision: D95489302

fbshipit-source-id: d02146a71
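
The dilution described in the commit message can be illustrated with a small framework-free sketch. This is not the Opacus implementation: the helper names `per_sample_mean_naive` and `per_sample_mean_masked` and the toy batch are hypothetical, and plain Python lists stand in for tensors. It assumes per-token losses are already computed and that ignored positions contribute zero loss.

```python
IGNORE_INDEX = -100  # same sentinel value as in CrossEntropyLoss(ignore_index=-100)


def per_sample_mean_naive(token_losses):
    """Mean over the full sequence length, analogous to .mean(dim=1)."""
    return [sum(row) / len(row) for row in token_losses]


def per_sample_mean_masked(token_losses, targets, ignore_index=IGNORE_INDEX):
    """Mean over non-ignored positions only (hypothetical helper)."""
    means = []
    for loss_row, target_row in zip(token_losses, targets):
        kept = [l for l, t in zip(loss_row, target_row) if t != ignore_index]
        # Guard against a fully masked sample to avoid division by zero.
        means.append(sum(kept) / len(kept) if kept else 0.0)
    return means


# Toy batch (made up for illustration): 2 samples, 8 positions each.
# Sample 0 has only two real targets; sample 1 has no masking.
targets = [
    [-100, -100, -100, 5, 7, -100, -100, -100],
    [3, 1, 4, 1, 5, 9, 2, 6],
]
token_losses = [
    [0.0, 0.0, 0.0, 2.0, 4.0, 0.0, 0.0, 0.0],
    [1.0] * 8,
]

naive = per_sample_mean_naive(token_losses)
masked = per_sample_mean_masked(token_losses, targets)
print(naive)   # [0.75, 1.0]
print(masked)  # [3.0, 1.0]
```

In this toy batch the naive mean for the masked sample is four times smaller than the masked mean, while the unmasked sample is unchanged, which matches the backwards-compatibility case the commit mentions.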
0 of 9 new or added lines in 1 file covered. (0.0%)
5895 of 7445 relevant lines covered (79.18%)
1.76 hits per line
| Lines | Coverage | ∆ | File |
|---|---|---|---|
| 9 | 71.91 | -7.1% | opacus/utils/fast_gradient_clipping_utils.py |
| ID | Job ID | Ran | Files | Coverage | |
|---|---|---|---|---|---|
| 1 | run-3 - 22867323270.1 | | 72 | 47.49 | GitHub Action Run |
| 2 | run-1 - 22867323270.2 | | 135 | 78.96 | GitHub Action Run |
| 3 | run-2 - 22867323270.3 | | 135 | 78.95 | GitHub Action Run |