-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Affine distributions #1442
base: master
Are you sure you want to change the base?
Add Affine distributions #1442
Conversation
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Some more high-level comments:
|
I agree on the array-like variates bit and actually started work on that yesterday. I agree we should encourage and use I'm not sure what benefits a split would give, and it might make type-checking and conversions more of a chore since |
There are default fallbacks for |
Unrelated to the previous comments, we might want some field name other than |
@devmotion it turns out that this is already implemented in MeasureTheory.jl, and quite nicely too. Chad offered to move it to MeasuresBase, since you mentioned you'd be interested in adding MeasuresBase as a dependency. Should we maybe just add MeasuresBase as a dependency and reexport this feature? |
No, this is not sufficient. We want to create and work with |
If we go with |
We have to support To me it seems an additional function would not have any advantage but only extend the API, which is possibly confusing for users, developers, and an additional source for breaking changes in the future. The main argument above for it, it seems, was that it would be easier to implement. Actually, I think this is not correct:
|
Got it; I'll rewrite this to handle |
Should we memoize |
No, I don't think it's worth it. In my experience it just causes problems, eg with AD and undesired promotions (e.g. if sigma is an |
Hmm; could we memoize in cases where |
No, |
I'm very certain memoization just causes problems here and is not worth it. As an example in a different package, similar issues occurred with memoization of inverses in PDMats and hence in the latest release it was removed. |
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Maybe we could use the following design, which seems to address also your concerns about complicated dispatches and constructions of affine transformations above:
Then we could support all transformations of the form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional suggestion: Just copy the existing methods for LocationScale
for univariate AffineDistribution
s. This makes it also less likely that we accidentally introduce bugs.
src/univariate/affine.jl
Outdated
if iszero(σ) | ||
throw(ArgumentError("scale must be non-zero")) | ||
elseif size(σ)[end] ≠ size(ρ)[1] | ||
throw(DimensionMismatch( | ||
"scaling matrix and distribution have incompatible dimensions" * | ||
"$(size(σ)) and $(size(ρ)))" | ||
)) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not possible to check iszero
here, I assume? The second check only makes sense if the type is defined more generally and allows to increase or decrease the number of dimensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the second check. The first check is possible -- it returns true
if σ
is the zero matrix.
src/univariate/affine.jl
Outdated
mean(d::AffineDistribution) = d.μ + d.σ * mean(d.ρ) | ||
median(d::AffineDistribution) = d.μ + d.σ * median(d.ρ) | ||
mode(d::AffineDistribution) = d.μ + d.σ * mode(d.ρ) | ||
modes(d::AffineDistribution) = d.μ .+ d.σ .* modes(d.ρ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong e.g. for MultivariateDistribution
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how median
might be wrong (since that's not uniquely defined in multiple dimensions), although that should just throw an error when it tries to call median
on a multivariate distribution. But what's wrong with the rest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The broadcast operations are not compatible, e.g.,
julia> rand(4) .+ rand(4, 3) .* [rand(3) for _ in 1:5]
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 4 and 5")
Stacktrace:
[1] _bcs1
@ ./broadcast.jl:516 [inlined]
[2] _bcs
@ ./broadcast.jl:510 [inlined]
[3] broadcast_shape
@ ./broadcast.jl:504 [inlined]
[4] combine_axes
@ ./broadcast.jl:499 [inlined]
[5] _axes
@ ./broadcast.jl:224 [inlined]
[6] axes
@ ./broadcast.jl:222 [inlined]
[7] combine_axes
@ ./broadcast.jl:499 [inlined]
[8] instantiate
@ ./broadcast.jl:281 [inlined]
[9] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(+), Tuple{Vector{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(*), Tuple{Matrix{Float64}, Vector{Vector{Float64}}}}}})
@ Base.Broadcast ./broadcast.jl:860
[10] top-level scope
@ REPL[1]:1
julia> rand(3) .+ rand() .* [rand(3) for _ in 1:5]
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 3 and 5")
Stacktrace:
[1] _bcs1
@ ./broadcast.jl:516 [inlined]
[2] _bcs
@ ./broadcast.jl:510 [inlined]
[3] broadcast_shape
@ ./broadcast.jl:504 [inlined]
[4] combine_axes
@ ./broadcast.jl:499 [inlined]
[5] instantiate
@ ./broadcast.jl:281 [inlined]
[6] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Vector{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Vector{Float64}}}}}})
@ Base.Broadcast ./broadcast.jl:860
[7] top-level scope
@ REPL[2]:1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm; would map
work?
src/univariate/affine.jl
Outdated
skewness(d::UnivariateAffine) = Base.sign(d) * skewness(d.ρ) | ||
kurtosis(d::UnivariateAffine) = kurtosis(d.ρ) | ||
|
||
cov(d::AffineDistribution) = d.σ * cov(d.ρ) * d.σ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should be defined only for multivariate distributions, and we should have different implementations based on the type of the scale to reduce allocations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that, but do we want to deviate from Base
's behavior (where cov
returns the variance for univariate outcomes)?
return d.μ + d.σ * quantile(d.ρ, q) | ||
end | ||
|
||
rand(rng::AbstractRNG, d::AffineDistribution) = d.μ + d.σ * rand(rng, d.ρ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For higher-dimensional distributions we should define _rand!
.
|
||
rand(rng::AbstractRNG, d::AffineDistribution) = d.μ + d.σ * rand(rng, d.ρ) | ||
|
||
function gradlogpdf(d::ContinuousAffine, x::Real) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should restrict it to univariate distributions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it is only defined for univariate distributions. In fact it's completely undocumented but I learnt just that it is also defined for MvNormal
and MvTDist
. So maybe it's fine to keep it. Then the bigger issue is that the implementation is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? I'm not sure I see the mistake.
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: Moritz Schauer <[email protected]>
All tests now pass, except for |
I hope you don't take this personal but I don't think we can merge the PR in its current state. In my opinion it is difficult to review and, in particular, it is difficult to assess if there are any breaking changes. Distributions is a central package and hence we have to be careful about not breaking downstream packages (since there are so many). To me it still seems there are too many changes and too general definitions, even for the univariate case currently handled by As a way forward, I think we should break up this PR (or maybe start from scratch) and
|
If you'd like, I can break this into two PRs. The first can add a |
I also would like to point out there have been essentially no changes to the behavior, of
All bugs were related to dealing with negative transformations of discrete distributions, where I forgot to account for the probability that If you'd like to rewrite the code for multivariate affine transformations from scratch, work out a new interface to replace the old |
Just to be sure there's no misunderstanding: We should not have separate distributions but at most type aliases (maybe they re not needed though). As mentioned above, it would be preferable to allow negative scales only in a second PR to keep changes in the different PRs minimal and make them easier to review.
Unfortunately, while the behaviour might not have changed (I don't know) the implementation has to a larger extent than what seemed necessary. During the different rounds of preliminary review I noticed and commented on methods that were modified slightly or generalized (sometimes incorrectly). Due to the renaming and other unrelated changes it is not possible for me to check with certainty if changes are needed, are bug fixes, are necessary or unnecessary generalizations, or are just style changes. |
Ok; if you'd like to implement renaming and all the different changes in a different PR, go ahead, then. I'll make a single PR that only changes LocationScale to allow negative scale values, or else (if you don't like this idea) I'll add reflections directly to a different package instead of spending more and more time making changes completely unrelated to this feature at your request. |
Sure, it's seemingly simple changes and of course it could be done in fewer PRs - but Distributions has > 300 direct dependents and > 700 indirect ones and hence I won't merge any PRs, in particular no refactorings, for which I feel I can't review them properly and guarantee with reasonable confidence that they don't break anything. Therefore I suggest breaking it up into small PRs that are easy to review. |
No description provided.