Add BijectiveSimplexLink in the docs #66
base: master
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master      #66   +/-   ##
=======================================
  Coverage   96.40%   96.40%
=======================================
  Files          11       11
  Lines         139      139
=======================================
  Hits          134      134
  Misses          5        5
src/links.jl
Outdated
For example with the [`SoftMaxLink`](@ref), to obtain a `n-1`-simplex leading to
`n` categories for the [`CategoricalLikelihood`](@ref),
one needs to pass `n` latent GP.
However, by wrapping the link into a `BijectiveSimplexLink`, only `n-1` latent GP are needed.
What effect does this reparametrisation have on the model? Does it break the symmetry amongst classes (i.e., would changing the order of classes change the resulting fit)? Would it be useful to mention the 0 added at the end in the docstring, or is that irrelevant?
Also, a harder question that you may not have the answer to: I would be curious whether this makes it easier to fit the model (because there is no more redundancy through the overall level, and hence it becomes identifiable).
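For reference, the reparametrisation being asked about can be sketched in plain Julia. This is only an illustration of the "append a 0" idea using Base functions; `softmax` and `bijective_softmax` here are hypothetical stand-ins written for this note, not the package's implementation.

```julia
# Numerically stable softmax over a vector of latent values.
function softmax(f::AbstractVector{<:Real})
    e = exp.(f .- maximum(f))
    return e ./ sum(e)
end

# Hypothetical stand-in for the wrapped link: append a fixed 0 to the
# n-1 latent values, then apply the simplex link to all n values.
bijective_softmax(f::AbstractVector{<:Real}) = softmax(vcat(f, 0))

f = [0.5, -1.0]             # n-1 = 2 latent values
p = bijective_softmax(f)    # n = 3 class probabilities

# Plain softmax is invariant to a common shift of all latents,
# which is the redundancy "through the overall level".
g = [0.5, -1.0, 0.0]
@assert all(isapprox.(softmax(g), softmax(g .+ 3.0)))

# With the last latent pinned to 0 the map can be inverted:
# f_i = log(p_i) - log(p_n).
f_recovered = log.(p[1:end-1]) .- log(p[end])
@assert all(isapprox.(f_recovered, f))
```

The pinned 0 always belongs to the last class, so that class acts as the reference; the sketch does not by itself say whether reordering the classes changes a fitted model.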
In my thesis I ran some quick 1-D experiments with a few classes: I generated data via the [fs, 0] process and tried to recover the latent GPs with both the bijective and the non-bijective link.
I consistently observed that the bijective link produced probability distributions closer to the true generating probabilities, but that the non-bijective likelihood had a better log-likelihood on the training data.
This was with the logistic-softmax link. Also, the augmentation in the non-bijective case creates improper priors, whereas in the bijective case everything is well behaved.
In terms of speed there does not seem to be a difference, but I did not check thoroughly.
Co-authored-by: st-- <[email protected]>
@st-- Did I address your comments?
Bumpity bump
Since this is just a minor doc change, I think it can safely be merged?
No description provided.