
OverLordGoldDragon / keras-adamw
97%
master: 95%

LAST BUILD BRANCH: 114-fix
DEFAULT BRANCH: master
Repo Added 27 Sep 2019 10:20PM UTC
Files 6

LAST BUILD ON BRANCH v1.36
branch: v1.36

pending completion
196

push

travis-ci-com

web-flow
Correct normalization scheme; deprecate `batch_size`

The existing code normalized as `norm = sqrt(batch_size / total_iterations)`, where `total_iterations` = (number of fits per epoch) * (number of epochs in restart). However, since `total_iterations = total_samples / batch_size`, this gives `norm = batch_size * sqrt(1 / (total_samples_per_epoch * epochs))`, so `norm` scales _linearly_ with `batch_size`, whereas the authors' scheme scales with its square root.

Users who never changed `batch_size` throughout training will be unaffected. In the authors' formula, λ = λ_norm * sqrt(b / (B * T)), λ_norm is the value we pick, our "guess". The idea of normalization is that a guess which works well for `batch_size=32` should also work well for `batch_size=16`; if `batch_size` never changes, performance depends only on the guess.
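The difference in scaling can be checked numerically. The sketch below is illustrative only and is not the library's code: the function names and the sample and epoch counts are assumptions, and `paper_norm` follows the authors' formula quoted above, reading `b` as the batch size, `B` as samples per epoch, and `T` as the number of epochs.

```python
import math

def old_norm(batch_size, samples_per_epoch, epochs):
    # Previous scheme as described above: sqrt(batch_size / total_iterations),
    # with total_iterations = total_samples / batch_size.
    total_iterations = samples_per_epoch * epochs / batch_size
    return math.sqrt(batch_size / total_iterations)

def paper_norm(batch_size, samples_per_epoch, epochs):
    # Authors' scheme: sqrt(b / (B * T)).
    return math.sqrt(batch_size / (samples_per_epoch * epochs))

# Hypothetical setup: 3200 samples per epoch, 10 epochs; batch size doubled from 32 to 64.
old_ratio = old_norm(64, 3200, 10) / old_norm(32, 3200, 10)
paper_ratio = paper_norm(64, 3200, 10) / paper_norm(32, 3200, 10)

# Doubling the batch size roughly doubles the old norm (linear scaling),
# but multiplies the authors' norm by roughly sqrt(2).
print(old_ratio, paper_ratio)
```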

Main change [here](https://github.com/OverLordGoldDragon/keras-adamw/pull/53/files#diff-220519926b87c12115d2f727803fbe6bR19), closing #52.

**Updating existing code**: for a choice of λ_norm that previously worked well, multiply it by `sqrt(batch_size)`. Ex: with `batch_size=32`, `Dense(bias_regularizer=l2(1e-4))` --> `Dense(bias_regularizer=l2(1e-4 * sqrt(32)))`.
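As a minimal sketch of that update, the conversion could be wrapped in a small helper. `migrate_lambda_norm` is a hypothetical name used for illustration and is not part of keras-adamw.

```python
import math

def migrate_lambda_norm(old_lambda_norm, batch_size):
    # Hypothetical helper: rescale a previously tuned lambda_norm by
    # sqrt(batch_size), as the update note above suggests.
    return old_lambda_norm * math.sqrt(batch_size)

# Value from the example above: l2(1e-4) tuned with batch_size=32.
new_l2 = migrate_lambda_norm(1e-4, 32)
print(new_l2)
```

The result would then be passed to the regularizer in place of the old value, e.g. `l2(new_l2)`.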

1317 of 1351 relevant lines covered (97.48%)

1.99 hits per line


Recent builds

| Builds | Branch | Commit | Type | Ran | Committer | Via | Coverage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 196 | v1.36 | Correct normalization scheme; deprecate `batch_size` Existing code normalized as: `norm = sqrt(batch_size / total_iterations)`, where `total_iterations` = (number of fits per epoch) * (number of epochs in restart). However, `total_iterations = to... | push | 13 Jul 2020 07:00PM UTC | web-flow | travis-ci-com | pending completion |
| 194 | v1.36 | Correct normalization scheme; deprecate `batch_size` | push | 13 Jul 2020 06:30PM UTC | OverLordGoldDragon | travis-ci-com | pending completion |
See All Builds (149)