How to do hyperparameter tuning #76
Hi @simsurace, sorry I did not answer (and save you some time), but I got struck by Covid. Actually …
Thanks, I did not notice that this is an implicit optimization. So is it independent of the hyperparameters? If so, this and PR #77 would be unnecessary. I will give it a try and see if the results change! Sorry to hear about your illness, hope you get better soon.
If it works out with the ignore statements, I could convert the PR into a documentation change where this is explained.
That's an interesting question; it actually depends on the parametrization. Right now I am parametrizing with m and S, the mean and covariance. But one could parametrize the covariance as (K^{-1} + D)^{-1} (and similarly for the mean); there one could optimize the hyperparameters as well, but that's a more complicated matter. In summary, for full GPs the kernel parameters only matter for KL(q(f)||p(f)), and for sparse GPs they also influence the expected log-likelihood, but that's it.
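For reference, the decomposition this refers to is the standard (S)VGP bound, with q(u) = N(m, S):

$$
\mathrm{ELBO} = \mathbb{E}_{q(f)}\big[\log p(y \mid f)\big] - \mathrm{KL}\big(q(u)\,\|\,p(u)\big)
$$

For a full GP (u = f), the kernel K enters only through the prior p(f) = N(0, K) inside the KL term; for a sparse GP, q(f) = ∫ p(f | u) q(u) du also depends on K through the projection p(f | u), which is why the expected log-likelihood term is affected as well.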
Just to clarify my understanding: …
Oh yeah sorry, somehow I got confused with the updates on …
EDIT: Ah, I think I understand now: one should not expose the variational parameters to the optimizer, but run an internal CAVI loop for them. Still struggling to make it work, though. Do you have a working example of hyperparameter optimization of the augmented ELBO? No hurry, but it would be nice to make this work and compare it against the normal SVGP optimization loop for speed.
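To make that structure concrete, a minimal sketch of such a loop: the hyperparameters θ live in an outer gradient loop, while the variational state is updated by inner CAVI steps that AD never sees. Here `init_qf`, `cavi_update!`, and the `aug_elbo` signature are hypothetical placeholders, not this package's API:

```julia
using ForwardDiff

# Hypothetical alternating scheme: the variational state `qf` is updated by
# closed-form CAVI steps and treated as a constant when differentiating the
# augmented ELBO with respect to the kernel hyperparameters θ.
function optimize_hyperparams(θ, x, y; steps = 100, inner = 10, lr = 1e-3)
    qf = init_qf(x)                       # placeholder: initialize q(f)
    for _ in 1:steps
        for _ in 1:inner
            cavi_update!(qf, θ, x, y)     # placeholder: closed-form update, no AD
        end
        g = ForwardDiff.gradient(t -> aug_elbo(t, qf, x, y), θ)
        θ = θ .+ lr .* g                  # gradient *ascent*: maximize the ELBO
    end
    return θ, qf
end
```

Since `qf` is held fixed inside the gradient call, the optimizer only ever sees θ, which matches the "do not expose the variational parameters to the optimizer" reading.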
`aux_posterior` is not AD-ready. I tried to AD `aug_elbo` in the `NegBinomialLikelihood` example, i.e. … (removed unnecessary bits), purposefully avoiding ParameterHandling.jl and trying only with `ForwardDiff.gradient`.
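The attempted call was presumably of roughly this shape (the toy data and the `aug_elbo` signature below are illustrative guesses, not the example's actual code):

```julia
using ForwardDiff
using KernelFunctions

x, y = randn(50), rand(0:10, 50)  # toy stand-in for the example's data

# Differentiate the augmented ELBO with respect to a raw log-lengthscale,
# bypassing ParameterHandling.jl entirely.
function loss(logℓ)
    k = with_lengthscale(SqExponentialKernel(), exp(logℓ[1]))
    return aug_elbo(k, x, y)      # placeholder signature
end

ForwardDiff.gradient(loss, [0.0])  # errors while `aux_posterior` is not AD-ready
```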
There is an easy fix (happy to open a PR): change the definition of `aux_posterior` as …
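The proposed snippet is not shown above. For ForwardDiff, the typical reason code is "not AD-ready" is a buffer with a hard-coded `Float64` element type, which cannot hold `Dual` numbers. Purely as an illustration of that class of fix (the functions below are toys, not the real `aux_posterior`):

```julia
using ForwardDiff

# Toy stand-in, NOT the real aux_posterior.
# Breaks under ForwardDiff: `zeros` fixes the eltype to Float64,
# so writing Dual numbers into `out` throws a MethodError.
function toy_posterior_bad(θ, y)
    out = zeros(length(y))
    for i in eachindex(y)
        out[i] = θ * y[i]^2
    end
    return out
end

# AD-ready: broadcasting lets the output eltype follow the inputs,
# so Dual numbers flow through untouched.
toy_posterior_ok(θ, y) = θ .* y .^ 2

y = randn(5)
ForwardDiff.derivative(θ -> sum(toy_posterior_ok(θ, y)), 1.0)    # works
# ForwardDiff.derivative(θ -> sum(toy_posterior_bad(θ, y)), 1.0) # MethodError
```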
BTW: is it expected that the values of the augmented ELBO are so much larger in magnitude than the normal ELBO?