-
Notifications
You must be signed in to change notification settings - Fork 61
Ranges: reduce interface
Reduce has a couple of tricky interface issues.
Every 'plus' operation needs a zero associated with that operation.
On top of it: this zero
can be different for the same operation.
Consider: min_value
- it's zero is the first element in the range.
Proposed solution:
Instead of an operation accept a std::pair<Op, value_type<I>>
. Second argument is zero
. (the zero is scalar)
Alternative 1:
specialise eve::zero
for operation. I don't think this works for min_value
Alternative 2:
pass two values: init and zero to reduce
algorithm. Seems a touch confusing.
Alternative 3:
require that init
and zero
are the same value. This works quite well for reduce, there is no need for it to use the init value in the algorithm, you can add it afterwards.
However, something like inclusive_scan
needs an init value as well as zero
. So we'd need some other solution.
We can add up numbers to a bigger number type.
The standard resolves it by adding up to std::common_type_t<value_type<I>, Init>
.
We could sort of do that.
The problem is with the number of lanes
. The correct way to add let's say short
to int
on avx2 is to load by 8 shorts and convert them to 8 ints. However, the way we use to specify the number of lanes
to use in the algorithm, is with iterator::cardinal
.
But that cardinal
is for shorts not ints. So it seems like we don't respect what user asked of us.
Proposed solution:
Decide that it's OK. The user passed number of lane
is respected in a sense of bytes
. So we convert 16 shorts
to 8 ints
but it's still 32 bytes.
We can also force the number of lanes through traits somehow.
Alternative solution: try to respect the number of lanes passed in, when it's passed in via iterator The problem is: in the natural interface we start to create unnatural results.
Let's say we write count_if
for logical::is_wide
using converting to -1 trick
struct count_if_ : one_range_interface<count_if_>
auto impl(auto preprocessed_range, auto pred)
{
using test_t = decltype(pred(load(preprocessed_range.begin());
if constexpr (test_t::is_wide_logical) {
return -eve::transform_reduce(
preprocessed_range.traits(), preprocessed_range.begin(), preprocessed_range.end(),
[&](auto x) {
using signed_int = std::make_signed_t<test_t::value_type>>;
using ints = eve::wide<signed_int, typename test_t::cardinal>;
return eve::bit_cast(test(x), eve::as_<ints>{});
},
int(-1)
);
}
}
} count_if;
This transform reduce loop will be using aggregated wides for no reason. In order to avoid it, we'd have to cast down here. I think preserving the cardinal is a less needed requirement.