Adaptive Metropolis algorithm #57
Is it possible to just add a new proposal and extend the existing sampler, if needed, but not add a new sampler? I think this would simplify the code and also be easier if one wants to combine different proposals. I am also wondering if |
src/adaptivemetropolis.jl
Outdated
mutable struct AMProposal <: Proposal{MvNormal}
    epsilon::Symmetric
    scalefactor::Float64
    proposal::MvNormal
    samplemean::AbstractArray
    samplesqmean::AbstractMatrix
    N::Int
end
It would be good to add type parameters such that the fields are concretely typed: https://docs.julialang.org/en/v1/manual/performance-tips/index.html#Avoid-fields-with-abstract-type
This should be fixed now.
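As a rough illustration of the suggestion above, the struct could carry one type parameter per field so that every field is concretely typed once an instance is built. The name `TypedAMProposal` and the stand-in value for the `proposal` field are hypothetical; in the PR that field holds an `MvNormal`, which is left out here so the sketch runs with the standard library alone.

```julia
using LinearAlgebra

# Hypothetical parametric variant of the struct under review.
# Each abstract field type becomes a type parameter, so the field types
# are concrete for any given instance.
mutable struct TypedAMProposal{E<:Symmetric,P,V<:AbstractVector,M<:AbstractMatrix}
    epsilon::E
    scalefactor::Float64
    proposal::P
    samplemean::V
    samplesqmean::M
    N::Int
end

d = 2
epsilon = Symmetric(Matrix{Float64}(1e-6 * I, d, d))
standin = (mean = zeros(d), cov = epsilon)  # placeholder for the MvNormal field
p = TypedAMProposal(epsilon, 2.38^2 / d, standin, zeros(d), zeros(d, d), 0)

# Check that no field of the instantiated type is abstractly typed.
all_concrete = all(isconcretetype, fieldtypes(typeof(p)))
```

The default constructor infers the type parameters from its arguments, so callers do not have to spell them out.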
src/adaptivemetropolis.jl
Outdated
    N::Int
end

function AMProposal(epsilon::AbstractMatrix, scalefactor=2.38^2 / size(epsilon, 1))
What's the motivation for this specific scale factor? Is there a reference for it?
The motivation is from this classic:
I will add a reference!
I didn't want to modify existing code in order to not break something but the changes could be trivially implemented in MetropolisHastings! I can move things there if that'd be better.
I wanted to leave users the option of querying the sampler freely for new proposals without changing the state, hence why I thought updating the proposal in |
My suggestion was not to update the proposal based on the proposed sample but on the previous sample - this should always be the accepted sample. And since the previous sample is available in the |
I see, so like the approach in #39 ! I personally think having a separate callback interface for adaptation would be cleaner, and being able to track acceptances/rejections might be useful for some algorithms. And compared to the |
I'm not strictly against adding such a function but there is one major problem: transitions contain all parameters but only for a subset of them (e.g. just one parameter) one might want to use the adaptive proposal. Thus in general it is not correct to call One way to achieve this is to handle it in In particular due to these mixed cases, I think it would be better to only add a new proposal but not a new sampler. |
Another advantage of handling it in the |
Slightly unrelated thought:
I think probably one would not want to keep track of these statistics in a separate |
src/adaptivemetropolis.jl
Outdated
# When the proposal is initialised the empirical posterior covariance is zero
function trackstep(proposal::AMProposal, trans::Transition)
    proposal.samplemean .= trans.params
    proposal.samplesqmean .= trans.params * trans.params'
This still computes `trans.params * trans.params'` first and then copies it to `proposal.samplesqmean`. You can avoid this by writing
- proposal.samplesqmean .= trans.params * trans.params'
+ mul!(proposal.samplesqmean, trans.params, trans.params')
src/adaptivemetropolis.jl
Outdated

# Recompute the empirical posterior covariance matrix
function trackstep(proposal::AMProposal, trans::Transition, ::Union{Val{true}, Val{false}})
    proposal.samplemean .= (proposal.samplemean .* proposal.N .+ trans.params) ./ (proposal.N + 1)
IIRC it is numerically more stable to write
- proposal.samplemean .= (proposal.samplemean .* proposal.N .+ trans.params) ./ (proposal.N + 1)
+ proposal.N += 1
+ proposal.samplemean .+= (trans.params .- proposal.samplemean) ./ proposal.N
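The incremental form of the mean update can be checked in isolation. The following is a minimal sketch, with a hypothetical helper `running_mean` that is not part of the PR, applying the same recurrence to a small list of sample vectors.

```julia
# Running mean via the recurrence mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n.
# `running_mean` is a hypothetical helper used only for illustration.
function running_mean(samples)
    runningmean = zeros(length(first(samples)))
    n = 0
    for x in samples
        n += 1
        # Move the mean towards the new sample by a 1/n fraction of the gap.
        runningmean .+= (x .- runningmean) ./ n
    end
    return runningmean
end

samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 1.0]]
m = running_mean(samples)
```

Because each step only adds a correction of size `(x - mean) / n`, the update never forms the large intermediate product `mean * N` that the original expression does.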
src/adaptivemetropolis.jl
Outdated
    proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./
        (proposal.N + 1)
This would fuse all operations (might require Compat.jl on older Julia versions) but is not as stable as the version for the mean above:
- proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./
-     (proposal.N + 1)
+ mul!(proposal.samplesqmean, trans.params, trans.params', true, proposal.N)
+ ldiv!(proposal.N + 1, proposal.samplesqmean)
src/adaptivemetropolis.jl
Outdated
    proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./
        (proposal.N + 1)

    samplecov = proposal.samplesqmean .- proposal.samplemean * proposal.samplemean'
One should be aware that this is the "naive" algorithm for computing the variance which can lead to catastrophic cancellation (see e.g. https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Na%C3%AFve_algorithm). Maybe it would be better to use Welford's algorithm (https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)?
I will have a look at this later. In combination with creating a new `MvNormal` distribution every round, which is not very efficient, I think we could make this whole step more efficient by using the rank-one Cholesky update algorithm, although I am not sure how best to do this given that `MvNormal`s are immutable.
I think the Welford implementation in AdvancedHMC is pretty good. I think it's an adaptation of the Stan version.
Added!
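For readers unfamiliar with Welford's algorithm, a minimal multivariate sketch is given here. It is neither the AdvancedHMC implementation nor the code added in this PR; the type `WelfordCov` and the functions `update!` and `samplecov` are hypothetical names chosen for illustration.

```julia
using LinearAlgebra
using Statistics

# Hypothetical online covariance accumulator based on Welford's recurrence.
mutable struct WelfordCov
    n::Int
    mean::Vector{Float64}
    M2::Matrix{Float64}   # running sum of outer products of deviations
end

WelfordCov(d::Int) = WelfordCov(0, zeros(d), zeros(d, d))

function update!(w::WelfordCov, x::AbstractVector)
    w.n += 1
    delta = x .- w.mean        # deviation from the old mean
    w.mean .+= delta ./ w.n
    delta2 = x .- w.mean       # deviation from the updated mean
    w.M2 .+= delta .* delta2'  # rank-one update, no subtraction of large terms
    return w
end

# Unbiased sample covariance; only meaningful once w.n >= 2.
samplecov(w::WelfordCov) = w.M2 ./ (w.n - 1)

xs = [[1.0, 2.0], [3.0, 6.0], [5.0, 1.0], [2.0, 2.0]]
w = WelfordCov(2)
foreach(x -> update!(w, x), xs)

# Batch reference: rows are samples, columns are dimensions.
reference = cov(reduce(hcat, xs)')
```

Unlike the naive formula `E[xx'] - E[x]E[x]'`, the accumulator works with deviations from the running mean, which is what avoids the catastrophic cancellation mentioned above.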
I agree, which is why I think implementing #38 would be useful! It shouldn't be a big change but probably belongs in another pull request |
I have just merged |
Awesome stuff -- I'm happy with this for the moment but will defer to David's judgement if he has further suggestions.
Any way we could reuse the Welford implementation from AdvancedHMC? |
By the way, it would be easy to combine #39 with this pull request, but we should maybe fix nomenclature for the methods as the names almost overlap - any suggestions? |
Only if it's moved to a separate package I assume. |
There's a subpackage called One idea would be basically copying the interface in |
How about moving the AdvancedHMC mass matrix adaption code into |
It might make sense to define an abstract |
There is a PR in progress - see TuringLang/AdvancedHMC.jl#259 |
No problem at all, that PR got stalled due to myself not really finding the time to finish it off decently (it worked for what I needed, and I got sidetracked). |
I created a new branch Main changes:
Not sure if the type interface is the cleanest, I would appreciate any feedback! Note: I'm not sure if we really want to support multiple adaptives for arrays/tuples of proposals, this feature could be removed... |
Intuitively I would have thought |
Not really to be honest! I tried to see if it would be cleaner but at least my new "solution" is more complicated than what we have already. Unless you or @cpfiffer see any advantages in separating out the adaptor logic (I don't really) we can just keep it as it is. |
For right now, I think it's fine to just go as-is. We can add in more general adaptation interfaces later on. |
I have uploaded a new version that uses Welford's algorithm. The code now more or less covers the |
Cool. I like it. I've turned on the test suite so we can see how the tests run. |
There was an error due to my using a different version of LinearAlgebra.jl, the |
I'll approve for now. Could you add a little bit of usage info to the README file? |
@@ -5,6 +5,8 @@ version = "0.6.0"
[deps]
AbstractMCMC = "80f14c24-f653-4e6a-9b94-39d6b0f70001"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
PDMats = "90014a1f-27ba-587c-ab20-58faa44d9150"
Can you add a `[compat]` entry for PDMats?
src/mh-core.jl
Outdated
# Called to update proposal when the first sample is drawn
trackstep!(proposal::Proposal, params) = nothing
trackstep!(proposal::AbstractArray, params) = foreach(trackstep!, proposal, params)
trackstep!(proposal::NamedTuple, params) = foreach(trackstep!, proposal, params)
Parameters in the `NamedTuple` `params` are not necessarily in the same order as the `proposal`.
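The ordering concern can be reproduced with plain `NamedTuple`s. In this sketch the strings stand in for proposal objects and `record!` is a hypothetical helper that only logs which proposal was paired with which parameter; neither is part of AdvancedMH.

```julia
# Hypothetical stand-in for a proposal's update hook: it only records its inputs.
record!(log::Vector, prop, par) = push!(log, prop => par)

proposals = (a = "proposal for a", b = "proposal for b")
params    = (b = 2.0, a = 1.0)   # same keys, different order

# Positional iteration pairs the proposal for `a` with the value of `b`.
positional = []
foreach((prop, par) -> record!(positional, prop, par), proposals, params)

# Key-based lookup pairs each proposal with the parameter of the same name.
bykey = []
for k in keys(proposals)
    record!(bykey, proposals[k], params[k])
end
```

Iterating over `keys(proposals)` and indexing both tuples by name is one way to make the pairing independent of field order.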
src/mh-core.jl
Outdated
trackstep!(proposal::Proposal, params,
           ::Union{Val{true}, Val{false}}) = nothing

trackstep!(proposal::AbstractArray, params, accept::Union{Val{true},Val{false}}) =
    foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)

trackstep!(proposal::NamedTuple, params, accept::Union{Val{true},Val{false}}) =
    foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
Can we remove the type annotations on `accept`? They are not needed.
- trackstep!(proposal::Proposal, params,
-            ::Union{Val{true}, Val{false}}) = nothing
- trackstep!(proposal::AbstractArray, params, accept::Union{Val{true},Val{false}}) =
-     foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
- trackstep!(proposal::NamedTuple, params, accept::Union{Val{true},Val{false}}) =
-     foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
+ trackstep!(proposal::Proposal, params, accept) = nothing
+ function trackstep!(proposal::AbstractArray, params, accept)
+     return foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
+ end
+ function trackstep!(proposal::NamedTuple, params, accept)
+     return foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
+ end
Additionally, again the `NamedTuple` implementation is not completely correct.
src/mh-core.jl
Outdated
@@ -157,6 +173,7 @@ function AbstractMCMC.step(
        transition = AdvancedMH.transition(spl, model, init_params)
    end

    trackstep!(spl.proposal, transition.params)
The `step` implementations are written for generic `AbstractTransition`s and `MHSampler`, so `spl.proposal` and `transition.params` might not exist.
Co-authored-by: Cameron Pfiffer
Co-authored-by: David Widmann
@arzwa We should also incorporate your univariate adaptive proposal. Any suggestions for a name? We can also change |
@kaandocal Are you still interested in getting this PR merged? |
[Diff since v0.6.7](TuringLang/AdvancedMH.jl@v0.6.7...v0.6.8) **Merged pull requests:** - Remove type restriction of log density function (TuringLang#68) (@devmotion)
We should probably get this PR out of the way. I did not pursue this much further because I had trouble getting it to work on my sampling problems - this approach needs a lot of samples to get a decent estimate of the covariance matrix and seems quite fickle in practice. Due to the nature of MH it just takes a while to explore the target distribution even if you have a decent covariance matrix... |
Yep, I totally get that. The reason an AMH sampler would be useful is less to do with me wanting to use AMH itself, and more to do with AMH being a useful building block in other algorithms. I think @fipelle mentioned an interest in this earlier? |
I've implemented a simple version of the standard AM algorithm as a small extension to #39 (this PR is independent of #39). The pull request features a multivariate Gaussian proposal which uses a rescaled version of the posterior covariance (Haario et al.'s famous 2.38 rule) + a small epsilon term for positive-definiteness and initial exploration. In order to update the proposal I had to create a new sampler class, `AMSampler`, which is basically `MetropolisHastings` but updates the proposal at each sampling step via the `trackstep` function. The default implementation of `trackstep` does nothing.

The update mechanism is slightly different from that in #39: this one performs adaptation in `AbstractMCMC.step` whereas #39 does it in `propose`. I think the new `AMSampler` class should be able to handle both, and since it consists of minimal modifications to the original `MetropolisHastings` it could provide a general interface for adaptive MH algorithms.

Notes:
- … `propose` for `AMSampler` but this should not be an issue.
- … `trackstep` interface more uniform.
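To make the description of the proposal concrete, here is a small standalone sketch of how such a covariance could be assembled and sampled from. It assumes the proposal covariance is `scalefactor * samplecov + epsilon`, with the default scale factor `2.38^2 / d` taken from the constructor in this PR; the exact combination used in the PR may differ, and the helper names `am_scalefactor`, `am_covariance` and `am_propose` are hypothetical.

```julia
using LinearAlgebra
using Random

# Dimension-dependent scale factor (the default used by the constructor in this PR).
am_scalefactor(d::Integer) = 2.38^2 / d

# Assumed form of the proposal covariance: scaled empirical covariance plus epsilon.
function am_covariance(samplecov::AbstractMatrix, epsilon::AbstractMatrix)
    d = size(samplecov, 1)
    return Symmetric(am_scalefactor(d) .* samplecov .+ epsilon)
end

# Gaussian random-walk draw centred on the current point, via a Cholesky factor.
function am_propose(rng::AbstractRNG, current::AbstractVector, Sigma::Symmetric)
    L = cholesky(Sigma).L
    return current .+ L * randn(rng, length(current))
end

d = 2
epsilon = Matrix{Float64}(1e-6 * I, d, d)

# Before any samples are tracked the empirical covariance is zero,
# so only the epsilon term remains and keeps the matrix positive definite.
Sigma0 = am_covariance(zeros(d, d), epsilon)
candidate = am_propose(MersenneTwister(1), zeros(d), Sigma0)
```

The epsilon term is what allows the Cholesky factorisation to succeed at the start of sampling, when the empirical covariance is still zero.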