SwamyDev / udacity-deep-rl-navigation / 62

Builds	Branch	Commit	Type	Ran	Committer	Via	Coverage
62	master	Merge pull request #5 from SwamyDev/dependabot/pip/resources/unity-mlagent/tensorflow-1.15.4 Bump tensorflow from 1.15.2 to 1.15.4 in /resources/unity-mlagent	push	03 Oct 2020 06:34AM UTC	web-flow	travis-ci-com	pending completion
58	priority-sampling	Extract the replay buffer logic This should make it easier to integrate the priority replay logic.	push	17 Mar 2020 08:00PM UTC	SwamyDev	travis-ci-com	pending completion
57	priority-sampling	Implement SumTree data structure This data structure will be used by the priority sampling algorithm. It provides an efficient means of keeping track of experience weights, to be used in sampling.	push	17 Mar 2020 07:30PM UTC	SwamyDev	travis-ci-com	pending completion
56	master	Merge pull request #4 from SwamyDev/p3 Integrate final multi-agent project	push	14 Mar 2020 10:14PM UTC	web-flow	travis-ci-com	pending completion
53	p3	Update report and readme draft Improved wording and corrected some spelling mistakes. Also corrected for the score issue, where I've used the mean of both agent rewards before. Now I use the correct max score.	push	14 Mar 2020 09:37PM UTC	SwamyDev	travis-ci-com	pending completion
54	p3	Update report and readme draft Improved wording and corrected some spelling mistakes. Also corrected for the score issue, where I've used the mean of both agent rewards before. Now I use the correct max score.	Pull #4	14 Mar 2020 09:26PM UTC	web-flow	travis-ci-com	pending completion
51	p3	Add P3 report first draft The report is necessary to finish the final project.	push	14 Mar 2020 02:17PM UTC	SwamyDev	travis-ci-com	pending completion
50	p3	Rename MADDPG to NDDPG As the current implementation does not really follow the MADDPG algorithm as described in the paper (link at the bottom), but rather just trains two independent DDPG algorithms, I decided to rename it, to avoid confusion. ...	push	14 Mar 2020 10:00AM UTC	SwamyDev	travis-ci-com	pending completion
49	p3	Fix NDDPG tests and coverage report With the new agent setup the basic learning tests failed, because it now required different observation, action and reward shapes. Also the multiple inheritance approach didn't work that well, because of the or...	push	14 Mar 2020 09:40AM UTC	SwamyDev	travis-ci-com	pending completion
47	p3	Implement keyboard interrupt and snapshots I want to keep the state of the agent when I interrupt training. Often because it already reached a very high value. Also I wanted a way of investigating agent behaviour during training to get a bet...	push	11 Mar 2020 08:46PM UTC	SwamyDev	travis-ci-com	pending completion
46	p3	Remove MultiAgentWrapper It is good practise to remove code that is not used anymore. This particular implementation has been removed, because I decided to treat multi agents as their own algorithms. Mostly because it is not so easy to keep agent...	push	11 Mar 2020 07:43PM UTC	SwamyDev	travis-ci-com	pending completion
45	p3	Improve exploration logic of MADDPG agent By allowing preheat steps which are completely random the agent can observe a more diverse set of states. Otherwise it wouldn't see that much because it tends to crash early on into a local minimum. With ...	push	11 Mar 2020 06:59PM UTC	SwamyDev	travis-ci-com	pending completion
43	p3	Implement multi agent wrapper This allows me to use the CLI interface to start and train a multi agent setup. By wrapping existing agents, I make minimal changes to the code and keep the architecture flexible.	push	08 Mar 2020 03:42PM UTC	SwamyDev	travis-ci-com	pending completion
42	p3	Add multi agent handling to unity env adapter To solve the third project the unity environment gym adapter needs to handle multi agent environments properly, hence the adapter has been extended. Also another test mode for the new environment has ...	push	07 Mar 2020 10:10AM UTC	SwamyDev	travis-ci-com	pending completion
41	master	Add link to spinning up RL	push	24 Feb 2020 09:26PM UTC	web-flow	travis-ci-com	pending completion
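
The commit message for build 57 describes a SumTree that keeps track of experience weights for priority sampling. The following is a minimal sketch of that general idea, not the repository's implementation: the class name `SumTree` comes from the commit message, while the methods `update`, `total`, `find`, and `sample`, and the flat-list layout, are assumptions made for illustration.

```python
import random


class SumTree:
    """Binary tree in a flat list; each internal node stores the sum of its children.

    Hypothetical sketch: method names and layout are illustrative assumptions.
    """

    def __init__(self, capacity):
        # Round the leaf count up to a power of two so the top-down
        # descent in find() always walks a complete binary tree.
        self.size = 1
        while self.size < capacity:
            self.size *= 2
        # Index 0 is unused, 1 is the root, leaves live at size..2*size-1.
        self.nodes = [0.0] * (2 * self.size)

    def update(self, index, weight):
        """Set the weight of one experience and refresh the sums above it."""
        pos = index + self.size
        self.nodes[pos] = weight
        pos //= 2
        while pos >= 1:
            self.nodes[pos] = self.nodes[2 * pos] + self.nodes[2 * pos + 1]
            pos //= 2

    def total(self):
        """Sum of all stored weights (value held by the root)."""
        return self.nodes[1]

    def find(self, value):
        """Return the leaf index whose cumulative weight range contains value.

        Expects 0 <= value < total().
        """
        pos = 1
        while pos < self.size:
            left = 2 * pos
            if value < self.nodes[left]:
                pos = left
            else:
                value -= self.nodes[left]
                pos = left + 1
        return pos - self.size

    def sample(self, rng=random):
        """Draw an index with probability proportional to its weight."""
        return self.find(rng.random() * self.total())


tree = SumTree(4)
for i, w in enumerate([1.0, 2.0, 3.0, 4.0]):
    tree.update(i, w)
print(tree.total())    # sum of all weights
print(tree.find(3.0))  # index of the experience covering cumulative weight 3.0
```

Both `update` and `find` touch one node per tree level, so each costs O(log n) in the number of stored experiences, which is the efficiency the commit message alludes to.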
