This software package lets the user apply the PILCO framework to a wide range of continuous-valued reinforcement learning (RL) and control problems with only a small amount of problem-specific code. It includes five example scenarios that demonstrate what is possible and serve as starting points for the user's own problems. The high-level steps of the PILCO algorithm are: (1) learn a Gaussian process (GP) model of the system dynamics; (2) perform deterministic approximate inference for policy evaluation; (3) update the policy parameters using exact gradient information; and (4) apply the learned controller to the system. The package provides an interface for setting up novel tasks without requiring familiarity with the intricate details of model learning, policy evaluation, and policy improvement.
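The four steps above form a loop that can be illustrated with a toy sketch. The following is not the package's code or interface: it assumes a hypothetical noise-free 1-D linear system, and it substitutes a least-squares linear model for the GP, a deterministic rollout of that model for the approximate inference step, and finite-difference gradients for the exact gradients. All names (`rollout_system`, `fit_model`, `predicted_cost`, `improve_policy`, `run`) are made up for illustration.

```python
import random

# Hypothetical 1-D system x' = A_TRUE*x + B_TRUE*u; the learner never reads these directly.
A_TRUE, B_TRUE = 0.9, 0.5
X0, HORIZON = 1.0, 10


def rollout_system(action_fn):
    """Apply a controller to the real system and record (x, u, x_next) transitions."""
    x, data = X0, []
    for _ in range(HORIZON):
        u = action_fn(x)
        x_next = A_TRUE * x + B_TRUE * u
        data.append((x, u, x_next))
        x = x_next
    return data


def fit_model(data):
    """Model learning stand-in: least-squares fit of x' = a*x + b*u (PILCO uses a GP)."""
    sxx = sum(x * x for x, u, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for x, u, _ in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def predicted_cost(k, model):
    """Policy evaluation stand-in: roll the learned model forward under u = -k*x."""
    a, b = model
    x, cost = X0, 0.0
    for _ in range(HORIZON):
        x = a * x + b * (-k * x)
        cost += x * x
    return cost


def improve_policy(k, model, steps=200, lr=0.05, eps=1e-5):
    """Policy improvement stand-in: gradient descent with central finite differences."""
    for _ in range(steps):
        grad = (predicted_cost(k + eps, model) - predicted_cost(k - eps, model)) / (2 * eps)
        k -= lr * grad
    return k


def run(iterations=3, seed=0):
    rng = random.Random(seed)
    # Initial data comes from random actions, so states and actions are not collinear.
    data = rollout_system(lambda x: rng.uniform(-1.0, 1.0))
    k = 0.0
    for _ in range(iterations):
        model = fit_model(data)                   # step 1: learn the dynamics model
        k = improve_policy(k, model)              # steps 2-3: evaluate and improve the policy
        data += rollout_system(lambda x: -k * x)  # step 4: apply the controller, gather data
    return k, model


if __name__ == "__main__":
    gain, (a_hat, b_hat) = run()
    print(f"learned gain k={gain:.3f}, model a={a_hat:.3f}, b={b_hat:.3f}")
```

Because this toy system is linear and noise-free, the fitted model matches the true dynamics after the first rollout. The point of the sketch is only the structure of the loop: data from each controller run is added to the training set before the model is refit.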
- Changes to previous version:
Initial Announcement on mloss.org.