Possibility to use IndicesFromVec with std::array / Simplify API for TableLookups? #2419

Open
FabianSchuetze opened this issue Dec 31, 2024 · 2 comments

Comments

@FabianSchuetze

I wonder whether the API for TableLookup could be simplified to accept a std::array or an initializer list for the indices. When I currently use a table lookup, creating the indices is a bit convoluted (in my mind, at least). Consider, for example, the following snippet:

    constexpr int32_t front[4] = {0, 1, 4, 5};
    // LoadU rather than Load: `front` is not guaranteed to be vector-aligned.
    VecOutT indices_vec = hn::LoadU(d, front);
    auto idx = hn::IndicesFromVec(d, indices_vec);
    hn::TwoTablesLookupLanes(c01x01, c01x23, idx);

I looked at some examples but didn't find a concise way to create the indices. Maybe my code is not idiomatic for HWY. In that case, my question is obsolete.
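For readers unfamiliar with the lookup, here is a minimal scalar sketch of what the snippet above computes, assuming the usual semantics that an index below N selects from the first table and an index in [N, 2N) selects from the second. `TwoTablesLookupModel` is a hypothetical helper written for illustration; it is not part of Highway and does not use SIMD.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference model (hypothetical, not a Highway function).
// For each output lane i: pick a[idx[i]] if idx[i] < N, else b[idx[i] - N].
template <typename T, std::size_t N>
std::array<T, N> TwoTablesLookupModel(const std::array<T, N>& a,
                                      const std::array<T, N>& b,
                                      const std::array<std::int32_t, N>& idx) {
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    // Indices must address one of the 2 * N lanes of the two tables.
    assert(idx[i] >= 0 && static_cast<std::size_t>(idx[i]) < 2 * N);
    const std::size_t k = static_cast<std::size_t>(idx[i]);
    out[i] = (k < N) ? a[k] : b[k - N];
  }
  return out;
}
```

With four lanes, the indices {0, 1, 4, 5} therefore pick the two lowest lanes of the first table followed by the two lowest lanes of the second.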

I would like to use an initializer list (which converts to a std::array), like so:

    hn::TwoTablesLookupLanes(c01x01, c01x23, {0, 1, 4, 5});

I am not sure whether that is possible, because the IndicesFromVec function requires the type tag d as an input argument, but it seems to me that IndicesFromVec(d, {0, 1, 4, 5}) should be a feasible API. The function signature is:

    template <class D, typename TI, HWY_IF_T_SIZE_D(D, 1)>
    HWY_API Indices128<TFromD<D>, MaxLanes(D())> IndicesFromVec(
        D d, Vec128<TI, MaxLanes(D())> vec);

A std::array<T, N> exposes both the element type and the length, which seems to be the same information that Vec128<TI, MaxLanes(D())> carries. I looked at the Neon implementation of IndicesFromVec and must confess that I don't fully understand it.

Would it be possible to use a std::array for the indices?

@jan-wassenberg
Member

This is another consequence of the non-constexpr RVV/SVE vectors. If Lanes(d) is not known at compile time, then passing an initializer list would not work, or would at least require passing MaxLanes elements (the upper bound), which is so huge on RVV as to be impractical, right?
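To illustrate the constraint, here is a minimal sketch of building the indices from a lane count that is only known at run time, instead of from a fixed-size list. It assumes that the {0, 1, 4, 5} pattern in the question means "lower half of the first table, then lower half of the second"; that reading, and the helper name `MakeLowerHalvesIndices`, are assumptions for illustration and not part of Highway.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: generates two-table lane indices for a lane count
// that is a run-time value (as it would be on SVE/RVV).
// Output lanes [0, lanes/2) read lanes [0, lanes/2) of the first table;
// output lanes [lanes/2, lanes) read lanes [0, lanes/2) of the second table,
// which are numbered starting at `lanes` in the combined index space.
inline std::vector<std::int32_t> MakeLowerHalvesIndices(std::size_t lanes) {
  assert(lanes >= 2 && lanes % 2 == 0);
  std::vector<std::int32_t> idx(lanes);
  const std::size_t half = lanes / 2;
  for (std::size_t i = 0; i < lanes; ++i) {
    idx[i] = static_cast<std::int32_t>(i < half ? i : lanes + (i - half));
  }
  return idx;
}
```

For four lanes this reproduces {0, 1, 4, 5}; for eight lanes it yields {0, 1, 2, 3, 8, 9, 10, 11}. A literal four-element list cannot express the second case, which is the difficulty with an initializer-list API when the vector length varies.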

We also do not pass the indices directly to Table(s)LookupLanes because there might be nontrivial conversion effort from int32 lane indices to the actual byte indices required by SSE4 PSHUFB, and those might be reused across several table lookups.
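As a rough illustration of that conversion, the following scalar sketch expands four 32-bit lane indices into the sixteen byte indices that a byte-granular shuffle such as PSHUFB consumes. It assumes 4-byte lanes in a 16-byte vector; `LaneToByteIndices` and `ByteShuffleModel` are hypothetical names, and this is not Highway's actual implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: lane index k maps to bytes 4*k, 4*k+1, 4*k+2, 4*k+3.
inline std::array<std::uint8_t, 16> LaneToByteIndices(
    const std::array<std::int32_t, 4>& lane_idx) {
  std::array<std::uint8_t, 16> bytes{};
  for (std::size_t i = 0; i < 4; ++i) {
    assert(lane_idx[i] >= 0 && lane_idx[i] < 4);
    for (std::size_t b = 0; b < 4; ++b) {
      bytes[4 * i + b] = static_cast<std::uint8_t>(4 * lane_idx[i] + b);
    }
  }
  return bytes;
}

// Scalar model of a byte shuffle, assuming no index has its high bit set
// (the real instruction zeroes the output byte in that case).
inline std::array<std::uint8_t, 16> ByteShuffleModel(
    const std::array<std::uint8_t, 16>& table,
    const std::array<std::uint8_t, 16>& byte_idx) {
  std::array<std::uint8_t, 16> out{};
  for (std::size_t i = 0; i < 16; ++i) {
    out[i] = table[byte_idx[i] & 15];
  }
  return out;
}
```

The byte indices depend only on the lane indices, not on the table contents, so computing them once and reusing them across several lookups amortizes the conversion cost. That is consistent with keeping index preparation as a separate step.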

Does that make sense? Given those constraints, do you see any possible simplifications?

@FabianSchuetze
Author

> This is another consequence of the non-constexpr RVV/SVE vectors.

Hmm, that's what I was beginning to think after you replied to #2418. Thanks to your answer, I realized that the return type of IndicesFromVec on SVE is sizeless. I also wrote some code specifically for Neon, but as you indicated, initializing a vector with an initializer list is clumsy (and thus discouraged?); see https://godbolt.org/z/PW8sWYejT. Equally important, basing table lookups on static information doesn't seem to be a great idea on SVE/RVV either...

Ok, thanks for all the great info! If I cannot think of something better in the next few days, I'll close the issue. Btw: Happy 2025!
