
SwamyDev / reinforcement / 69
Coverage: 100% (master: 100%)

LAST BUILD BRANCH: dependabot/pip/tensorflow-1.15.4
DEFAULT BRANCH: master
Ran 15 Sep 2019 02:19PM UTC
Jobs 1
Files 21
Run time 1s
Build #69 · push · travis-ci

SwamyDev
Test baseline propagation & normalization

For the REINFORCE algorithm to converge, it is important that the value
estimates of the baseline also contain the accumulated reward which the
agent uses as a signal. Otherwise, the baseline adjustment is too far
off to be useful. Hence we introduced the one-dimensional walk
environment, where the reward needs to propagate back to the start
position for the agent to stably learn the optimal policy. Additionally,
the environment tests whether the baseline estimates are done on
trajectory batches. This is important because of the edge case where the
baseline would estimate perfect values for each state, reducing the
advantage signals to almost 0 and resulting in no useful signal at all.
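
A minimal sketch of the two properties described above, assuming nothing
about the repository's actual API (the OneDimensionalWalk,
discounted_returns, and batch_advantages names are hypothetical, and a
plain mean return stands in for whatever baseline estimator the project
actually uses):

    import numpy as np

    class OneDimensionalWalk:
        """Toy 1-D walk: the agent starts at position 0 and the only
        reward is +1 at the goal, so learning a stable policy at the
        start depends on that reward propagating back through the
        returns. Hypothetical, not the repository's environment."""

        def __init__(self, length=5):
            self.length = length
            self.pos = 0

        def reset(self):
            self.pos = 0
            return self.pos

        def step(self, action):
            # action: 0 moves left (clamped at 0), 1 moves right
            self.pos = max(self.pos + (1 if action == 1 else -1), 0)
            done = self.pos >= self.length
            return self.pos, (1.0 if done else 0.0), done

    def discounted_returns(rewards, gamma=0.99):
        # Accumulate backwards so the terminal reward reaches the start.
        returns = np.zeros(len(rewards))
        acc = 0.0
        for t in reversed(range(len(rewards))):
            acc = rewards[t] + gamma * acc
            returns[t] = acc
        return returns

    def batch_advantages(reward_trajectories, gamma=0.99):
        # Fit the baseline on the whole batch of trajectories. A baseline
        # fitted per state could match each return exactly, shrinking
        # every advantage to ~0 and leaving REINFORCE with no signal.
        returns = [discounted_returns(r, gamma) for r in reward_trajectories]
        baseline = np.concatenate(returns).mean()
        return [g - baseline for g in returns]  # advantage = G_t - b

Because the baseline is a single batch-level estimate, the returns still
differ from it and the advantages stay informative; a per-state oracle
baseline would drive every advantage to zero.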

470 of 470 relevant lines covered (100.0%)

1.0 hits per line

Jobs
ID   Job ID   Ran                        Files   Coverage
1    69.1     15 Sep 2019 02:19PM UTC    0       100.0%
Travis Job 69.1
Source Files on build 69
Detailed source file information is not available for this build.
  • Travis Build #69
  • 2e924382 on github
  • Prev Build on learning-tests (#68)
  • Next Build on learning-tests (#71)