Use case
We have pixel data in R1G1B1A1R2G2B2A2... format. For compositing we need to deinterleave it into R1R2R3..., G1G2G3..., B1B2B3..., A1A2A3... format.
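To make the target layout concrete, here is a minimal scalar sketch of the transform. It is only a reference for what the SIMD version has to produce, not the code from our compositing path; `PlanarPixels` and `deinterleaveRGBA` are hypothetical names, and float channels are an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Planar (deinterleaved) layout: one contiguous array per channel.
// Hypothetical helper type, used for illustration only.
struct PlanarPixels {
    std::vector<float> r, g, b, a;
};

// Scalar reference: split R1G1B1A1R2G2B2A2... into four channel arrays.
// `interleaved` must point to at least 4 * pixelCount floats.
inline PlanarPixels deinterleaveRGBA(const float* interleaved, std::size_t pixelCount) {
    PlanarPixels out;
    out.r.resize(pixelCount);
    out.g.resize(pixelCount);
    out.b.resize(pixelCount);
    out.a.resize(pixelCount);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        // Pixel i occupies elements [4 * i, 4 * i + 3] of the input.
        out.r[i] = interleaved[4 * i + 0];
        out.g[i] = interleaved[4 * i + 1];
        out.b[i] = interleaved[4 * i + 2];
        out.a[i] = interleaved[4 * i + 3];
    }
    return out;
}
```

Calling `deinterleaveRGBA(pixels, n)` on `n` interleaved pixels gives four arrays of length `n`, one per channel, which is the layout the compositing code wants to load into vectors.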
Implementation in Vc and std-simd
In the Vc library we can use Vc::InterleavedMemoryWrapper to achieve this. std-simd doesn't provide an efficient solution for this task. We could use the generator constructor, but it doesn't seem to generate the code we want:
Old version with Vc::InterleavedMemoryWrapper:
https://godbolt.org/z/fEQ3s1
Generator syntax with std-simd:
https://godbolt.org/z/pUtMXm
You can see that for the generator-based version the compiler emits more than twice as many instructions, which is really suboptimal.
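For readers who can't open the links, the generator-based approach looks roughly like the sketch below. It is not the exact code behind the Godbolt link: `PixelBlock`, `loadChannel` and `deinterleaveBlock` are hypothetical names, and the float element type and the fixed width of four lanes are assumptions made to keep the example independent of the target ISA. It needs a standard library that ships `<experimental/simd>` (for example libstdc++ from GCC 11 on).

```cpp
#include <cstddef>
#include <experimental/simd>

namespace stdx = std::experimental;

// Four pixels per vector (assumption for this sketch).
using float_v = stdx::fixed_size_simd<float, 4>;

// One vector per channel for a block of four pixels (hypothetical helper type).
struct PixelBlock {
    float_v r, g, b, a;
};

// Build one channel vector with the generator constructor: the lambda is
// called once per lane with an integral_constant holding the lane index,
// and lane i reads the element at offset 4 * i + channel.
inline float_v loadChannel(const float* src, std::size_t channel) {
    return float_v([&](auto i) {
        const std::size_t lane = i;  // integral_constant converts to its value
        return src[4 * lane + channel];
    });
}

// Deinterleave four RGBA pixels (16 floats) into four channel vectors.
inline PixelBlock deinterleaveBlock(const float* src) {
    return {loadChannel(src, 0), loadChannel(src, 1),
            loadChannel(src, 2), loadChannel(src, 3)};
}
```

Each lane is described as an independent strided load here, so the compiler has to recognise the overall pattern itself before it can turn it into shuffles. That is the kind of code the comparison above is about.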
PS:
I think this issue may be somewhat related: VcDevel/std-simd#4