Merge pull request #3 from findmyway/fix_fig_13_2

findmyway · web-flow · commit 938522d48c14 · 2019-01-17T21:44:22.000+08:00
Modify learner parameters in chapter13/short_corridor.jl to fix Figure 13.2
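
For reviewers comparing the old and new numbers, here is a minimal sketch of what the new literals evaluate to. The values are copied from the diff below; the variable names are hypothetical and do not appear in the package. It assumes Julia 0.7 or later, where a literal negative exponent on an integer base produces a Float64 instead of raising an error.

```julia
# Values copied from the diff; the variable names are illustrative only.
old_reinforce = 2e-4
new_reinforce = 2^-13        # literal negative exponent: evaluates to a Float64

new_baseline_first  = 2^-6   # first numeric argument passed to ReinforceBaselineLearner
new_baseline_second = 2^-9   # second numeric argument passed to ReinforceBaselineLearner

@assert new_reinforce isa Float64
@assert new_reinforce == 1 / 8192          # 0.0001220703125, smaller than the old 2e-4
@assert new_baseline_first == 0.015625     # much larger than the old 1e-4
@assert new_baseline_second == 0.001953125

# The literal form matters: with a negative Int held in a variable,
# 2^n throws a DomainError, so the base has to be a float in that case.
n = -13
@assert 2.0^n ≈ new_reinforce
```

So the REINFORCE parameter drops from 2e-4 to roughly 1.22e-4, while the two baseline parameters rise from 1e-4 to roughly 1.56e-2 and 1.95e-3. If these values are ever moved into variables or keyword arguments, writing the base as `2.0` avoids the DomainError shown in the last lines.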
diff --git a/README.md b/README.md
@@ -68,7 +68,7 @@ If you would like to make some improvements, I'd suggest the following workflow:
 | | [fig_10_3](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_10_3.png), [fig_10_4](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_10_4.png), [fig_10_5](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_10_5.png)| |
 |Chapter11 | [fig_11_2](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_11_2.png) | |
 |Chapter12 | [fig_12_3](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_12_3.png)| Other figures in Chapter12 are not that easy to reproduce by using the Ju.jl package. You may take a try and correct me with a PR.|
-| Chapter13 | [fig_13_1](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_13_1.png), [fig_13_2](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_13_2.png) | Figure_13_2 is a slightly different to the original figure on the book.|
+| Chapter13 | [fig_13_1](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_13_1.png), [fig_13_2](https://raw.githubusercontent.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/master/docs/src/assets/figures/figure_13_2.png) | ~~Figure_13_2 is slightly different from the original figure in the book.~~ Thanks to Eric Graves's clarification, this was fixed in [#3](https://github.com/Ju-jl/ReinforcementLearningAnIntroduction.jl/pull/3).|
 
 # Related Packages
 
diff --git a/docs/src/assets/figures/figure_13_2.png b/docs/src/assets/figures/figure_13_2.png
diff --git a/src/chapter13/short_corridor.jl b/src/chapter13/short_corridor.jl
@@ -28,7 +28,7 @@ function run_once_RL()
         features[i, :, :] .= [0 1; 1 0]
     end
     agent = Agent(ReinforceLearner(LinearPolicy(features, [-1.47, 1.47]),
-                                2e-4,
+                                2^-13,
                                 1.),
                 EpisodeSARDBuffer())
     callbacks = (stop_at_episode(1000, false),  rewards_of_each_episode())
@@ -56,8 +56,8 @@ function run_once_RLBaseline()
     end
     agent = Agent(ReinforceBaselineLearner(TabularV(zeros(length(observationspace(env)))),
                                            LinearPolicy(features, [-1.47, 1.47]),
-                                           1e-4,
-                                           1e-4,
+                                           2^-6,
+                                           2^-9,
                                            1.),
                 EpisodeSARDBuffer())
     callbacks = (stop_at_episode(1000, false),  rewards_of_each_episode())