Custom implementations of state-of-the-art RL baseline algorithms that have no publicly available implementation.
Currently available PyTorch implementations:
- POIS (Policy Optimization via Importance Sampling) (NeurIPS 2018) - paper
In progress:
- Minimum-Variance Policy Evaluation for Policy Improvement (UAI 2023) - paper
pip install -r requirements.txt
python evaluate_pois.py
We show evaluation results against PPO and TRPO for linear Gaussian and MLP Gaussian policies with a learnable mean and fixed variance.
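For readers unfamiliar with this policy class, the following is a minimal PyTorch sketch of a Gaussian policy with a learnable mean and a fixed variance, covering both the linear and the MLP case. It is an illustration only, not the code used in this repository: the names `FixedVarianceGaussianPolicy`, `hidden_sizes`, `std`, and `trajectory_importance_weight` are hypothetical, and the Tanh activation and unit default standard deviation are assumptions.

```python
import math

import torch
import torch.nn as nn


class FixedVarianceGaussianPolicy(nn.Module):
    """Gaussian policy with a learnable mean and a fixed standard deviation.

    hidden_sizes=() gives a linear mean; a non-empty tuple gives an MLP mean.
    """

    def __init__(self, obs_dim, act_dim, hidden_sizes=(), std=1.0):
        super().__init__()
        layers, in_dim = [], obs_dim
        for hidden in hidden_sizes:
            layers += [nn.Linear(in_dim, hidden), nn.Tanh()]  # activation is an assumption
            in_dim = hidden
        layers.append(nn.Linear(in_dim, act_dim))
        self.mean_net = nn.Sequential(*layers)
        # Registered as a buffer, not a Parameter, so the optimizer never updates it.
        self.register_buffer("log_std", torch.full((act_dim,), math.log(std)))

    def distribution(self, obs):
        return torch.distributions.Normal(self.mean_net(obs), self.log_std.exp())

    def log_prob(self, obs, act):
        # Sum over action dimensions: one log-probability per time step.
        return self.distribution(obs).log_prob(act).sum(dim=-1)


def trajectory_importance_weight(target, behavior, obs, act):
    """Product over time steps of pi_target(a|s) / pi_behavior(a|s), in log space."""
    with torch.no_grad():
        behavior_log_prob = behavior.log_prob(obs, act)
    return torch.exp((target.log_prob(obs, act) - behavior_log_prob).sum())
```

Under these assumptions, `FixedVarianceGaussianPolicy(obs_dim, act_dim)` would be the linear Gaussian policy and `FixedVarianceGaussianPolicy(obs_dim, act_dim, hidden_sizes=(64, 64))` an MLP Gaussian policy. The helper shows the kind of per-trajectory importance weight that importance-sampling methods such as POIS build on; when the target and behavior policies are identical it evaluates to 1.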