首页 > 代码库 > Incentivizing exploration in reinforcement learning with deep predictive models

Incentivizing exploration in reinforcement learning with deep predictive models

Stadie, Bradly C., Sergey Levine, and Pieter Abbeel. "Incentivizing exploration in reinforcement learning with deep predictive models." arXiv preprint arXiv:1507.00814 (2015).

 

作者通过模拟(状态,动作)的不确定性,从而修改reward,帮助agent进行探索。作者说用了他们的方法不用进行随机探索。该方法比较通用,适用于多种RL模型,但是要训练auto-encoder,所以也稍微有点繁琐。

 

实用指数:3颗星

理论指数:1颗星

创新指数:4颗星

Incentivizing exploration in reinforcement learning with deep predictive models