Custom implementation of unavailable baseline algorithms

shubham0704/RL_baselines

RL_baselines

Custom implementation of unavailable RL SOTA baseline algorithms.

Currently available PyTorch implementations -

  • POIS (Policy Optimization via Importance Sampling) (NeurIPS 2018) - paper

In progress -

  • Minimum-Variance Policy Evaluation for Policy Improvement (UAI 2023) - paper
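
For orientation, the kind of objective POIS optimizes can be sketched in a few lines. This is a minimal, framework-free illustration, assuming the surrogate has the form described in the POIS paper: an importance-weighted return estimate minus a penalty proportional to sqrt(d2 / N), where d2 is the exponentiated Rényi divergence of order 2 between the target and behavioral distributions. The function names, the `lam` penalty coefficient, and the caller-supplied `d2` estimate are all hypothetical and are not taken from this repository's code.

```python
import math


def d2_equal_variance_gaussians(mu_new, mu_old, sigma):
    """Exponentiated order-2 Renyi divergence between two univariate
    Gaussians that share the same standard deviation sigma."""
    return math.exp((mu_new - mu_old) ** 2 / sigma ** 2)


def importance_weights(logp_new, logp_old):
    """Per-trajectory weights w_i = p(tau_i | new) / p(tau_i | old),
    computed from log-probabilities."""
    return [math.exp(a - b) for a, b in zip(logp_new, logp_old)]


def pois_surrogate(logp_new, logp_old, returns, d2, lam):
    """Importance-sampling estimate of the return minus a penalty
    lam * sqrt(d2 / N). `d2` and `lam` are supplied by the caller
    (assumption: how they are estimated is left out of this sketch)."""
    n = len(returns)
    weights = importance_weights(logp_new, logp_old)
    is_estimate = sum(w * r for w, r in zip(weights, returns)) / n
    return is_estimate - lam * math.sqrt(d2 / n)
```

When the target and behavioral distributions coincide, every weight is 1 and d2 is 1, so the surrogate reduces to the sample mean of the returns minus lam / sqrt(N).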

Installation

pip install -r requirements.txt

Replicating POIS results -

python evaluate_pois.py

Results for 500 iterations

We evaluate POIS against PPO and TRPO using linear Gaussian and MLP Gaussian policies, each with a learnable mean and a fixed variance.
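
To make "learnable mean and fixed variance" concrete, here is a minimal sketch of a linear Gaussian policy for a scalar action. It uses only the Python standard library for illustration. The class name, attributes, and defaults are hypothetical and may differ from the PyTorch policies used in this repository.

```python
import math
import random


class LinearGaussianPolicy:
    """Scalar-action policy a ~ N(w . s + b, sigma^2).

    Only the mean parameters (w, b) are meant to be learned;
    sigma is fixed at construction time (hypothetical helper).
    """

    def __init__(self, state_dim, sigma=1.0):
        self.w = [0.0] * state_dim  # learnable mean weights
        self.b = 0.0                # learnable mean bias
        self.sigma = sigma          # fixed standard deviation

    def mean(self, state):
        """Linear mean of the action distribution for this state."""
        return sum(wi * si for wi, si in zip(self.w, state)) + self.b

    def sample(self, state, rng=random):
        """Draw one action from N(mean(state), sigma^2)."""
        return rng.gauss(self.mean(state), self.sigma)

    def log_prob(self, state, action):
        """Log-density of `action` under the Gaussian at `state`."""
        z = (action - self.mean(state)) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2.0 * math.pi)
```

An MLP Gaussian policy would follow the same pattern, with the linear map in `mean` replaced by a small neural network while sigma stays fixed.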

[Image: POIS evaluation results compared with PPO and TRPO]
