RLPy is a framework for performing sequential decision-making experiments in Python. It provides a fine-grained view of learning agents, breaking them into modular components and supplying a library of implementations for each. RLPy also includes a wide variety of problem domains for testing these agents; they are listed at the bottom.
Parallelization: Easily scale experiments by running them in parallel, either on multiple cores of a single machine (the user only needs to specify the number of cores) or on an HTCondor computing cluster.
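The core idea is that independent runs of an experiment can be farmed out to worker processes. A minimal sketch of that pattern follows, using the standard-library multiprocessing module; `run_experiment` and its return value are illustrative stand-ins, not RLPy's actual API.

```python
# Hypothetical sketch of multi-core experiment runs; RLPy's real entry
# points differ, but the parallelization pattern is the same.
from multiprocessing import Pool


def run_experiment(seed):
    """Stand-in for one experiment run; returns (seed, final return)."""
    return seed, 2 * seed  # dummy result in place of a learning curve


if __name__ == "__main__":
    num_cores = 4  # the user only specifies the core count
    with Pool(processes=num_cores) as pool:
        # Ten independent runs, distributed across the worker processes.
        results = dict(pool.map(run_experiment, range(10)))
    print(results)
```

Because runs differ only in their random seed, no inter-process communication is needed beyond collecting the results.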
Hyperparameter Optimization: Built-in support for optimizing hyperparameters with state-of-the-art methods (via the hyperopt package); the user only needs to specify the parameters and their bounds.
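To show what "specify the parameters and their bounds" amounts to, here is a self-contained sketch using plain random search as a stand-in for hyperopt's smarter search strategies; the objective, parameter names, and bounds are all illustrative.

```python
# Hedged sketch: hyperparameter search over named bounds. RLPy delegates
# the real search to hyperopt; random search stands in here so the
# example has no external dependencies.
import random


def random_search(objective, bounds, n_trials=100, seed=0):
    """Minimize objective over a dict of {name: (low, high)} bounds."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Illustrative use: tune a learning rate and an exploration rate against
# a dummy objective whose minimum lies at (0.1, 0.1).
bounds = {"learn_rate": (1e-4, 1.0), "epsilon": (0.0, 0.5)}
obj = lambda p: (p["learn_rate"] - 0.1) ** 2 + (p["epsilon"] - 0.1) ** 2
best, loss = random_search(obj, bounds)
```

Swapping in hyperopt replaces the loop with a guided search (e.g. tree-structured Parzen estimators), but the user-facing contract is the same: an objective plus named bounds.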
Code Profiling: Easily identify performance bottlenecks with built-in profiling support; a color-coded call graph of the execution reveals slow functions.
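The data behind such a call graph can be gathered with Python's standard-library profiler; the following sketch (with a made-up workload) shows the kind of per-function timing RLPy's color-coded graph is built from.

```python
# Sketch: collect profiling data with the standard-library cProfile and
# report the functions with the highest cumulative time. The workload
# below is a placeholder, not RLPy code.
import cProfile
import io
import pstats


def slow_step():
    return sum(i * i for i in range(10_000))


def experiment():
    return [slow_step() for _ in range(50)]


profiler = cProfile.Profile()
profiler.enable()
experiment()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()  # top five functions by cumulative time
print(report)
```

A tool like gprof2dot can then render such statistics as the color-coded call graph the listing describes.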
Plotting: The user specifies the experimental configuration and the number of runs needed for statistical significance. Then, using the merger.py tool, the user only needs to specify the quantities to appear on the graph; runs of the same configuration are automatically matched and averaged, and multiple configurations can be plotted together with confidence intervals.
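The averaging step can be sketched in a few lines; this is an illustration of the computation (mean curve plus a normal-approximation 95% confidence interval across runs), not merger.py's actual interface.

```python
# Illustrative sketch of what merging runs computes: per-evaluation-point
# mean and 95% confidence half-width across runs of one configuration.
from math import sqrt
from statistics import mean, stdev


def summarize_runs(runs):
    """runs: list of per-run learning curves (equal-length lists of returns).
    Returns a list of (mean, 95% CI half-width) pairs, one per point."""
    summary = []
    for step_values in zip(*runs):
        m = mean(step_values)
        se = stdev(step_values) / sqrt(len(step_values))  # standard error
        summary.append((m, 1.96 * se))  # normal-approximation 95% interval
    return summary


curves = [[1.0, 2.0, 3.0], [1.0, 4.0, 5.0]]  # two runs, three eval points
print(summarize_runs(curves))
```

The resulting mean curve and interval half-widths are what gets drawn as a line with a shaded band when several configurations are plotted together.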
Learning Agent Components:
Value Function Representations:
- Bellman Error Basis Functions
- Fourier Basis Functions
- Incremental Feature Dependency Discovery (iFDD)
- Radial Basis Functions
- Tile Coding
- Uniform Random
Learning Algorithms:
- Least-Squares Policy Iteration
- Natural Actor-Critic
- Policy Iteration
- Trajectory-based Value Iteration
- Value Iteration
Problem Domains:
- Bicycle Balancing
- CartPole Balancing (2-state or 4-state)
- CartPole Swingup (2-state or 4-state)
- Fifty-State ChainMDP
- HIV Treatment
- Helicopter Hovering
- Intruder Monitoring
- Persistent Search and Track
- RC Car
- System Administrator
Changes to previous version:
Initial Announcement on mloss.org.