[LV][VPlan] Add initial support for CSA vectorization #106560

Closed
wants to merge 4 commits into from
58 changes: 57 additions & 1 deletion llvm/include/llvm/Analysis/IVDescriptors.h
@@ -6,7 +6,8 @@
//
//===----------------------------------------------------------------------===//
//
// This file "describes" induction and recurrence variables.
// This file "describes" induction, recurrence, and conditional scalar
// assignment (CSA) variables.
//
//===----------------------------------------------------------------------===//

@@ -423,6 +424,61 @@ class InductionDescriptor {
SmallVector<Instruction *, 2> RedundantCasts;
};

/// A Conditional Scalar Assignment (CSA) is an assignment from an initial
/// scalar that may or may not occur.
class CSADescriptor {
Contributor

I don't think CSA is a very common term; it would be good to have a more descriptive name if possible.

Contributor Author

Intel has used the term conditional scalar assignment. I have abbreviated it as CSA and documented the acronym in several places in this patch:

/// A Conditional Scalar Assignment (CSA) is an assignment from an initial
/// scalar that may or may not occur.

// This file "describes" induction, recurrence, and conditional scalar
// assignment (CSA) variables.

STATISTIC(CSAsVectorized,
"Number of conditional scalar assignments vectorized");

For example, I thought that ConditionalScalarAssignmentDescriptor, createConditionalScalarAssignmentMaskPhi, and VPConditionalScalarAssignmentDescriptorExtractScalarRecipe were quite long.

Do you have any suggestion on what you'd like it to be named? Is expanding CSA to ConditionalScalarAssignment everywhere your preference?

For now, I've tried to be proactive in ab17128.

/// If the conditional assignment occurs inside a loop, then Phi chooses
/// the value of the assignment from the entry block or the loop body block.
PHINode *Phi = nullptr;

/// The initial value of the CSA. If the condition guarding the assignment is
/// not met, then the assignment retains this value.
Value *InitScalar = nullptr;

/// The instruction that is conditionally assigned to inside the loop.
Instruction *Assignment = nullptr;

/// Create a CSA Descriptor that models a valid CSA with its members
/// initialized correctly.
CSADescriptor(PHINode *Phi, Instruction *Assignment, Value *InitScalar)
: Phi(Phi), InitScalar(InitScalar), Assignment(Assignment) {}

public:
/// Create a CSA Descriptor that models an invalid CSA.
CSADescriptor() = default;

/// If Phi is the root of a CSA, set CSADesc to the CSA rooted by Phi and
/// return true. Otherwise, return false, leaving CSADesc unmodified.
static bool isCSAPhi(PHINode *Phi, Loop *TheLoop, CSADescriptor &CSADesc);

operator bool() const { return isValid(); }

/// Return whether SI is the Assignment of the CSA described by Desc.
static bool isCSASelect(CSADescriptor Desc, SelectInst *SI) {
return Desc.getAssignment() == SI;
}

/// Return whether this CSADescriptor models a valid CSA.
bool isValid() const { return Phi && InitScalar && Assignment; }

/// Return the PHI that roots this CSA.
PHINode *getPhi() const { return Phi; }

/// Return the initial value of the CSA. This is the value if the conditional
/// assignment does not occur.
Value *getInitScalar() const { return InitScalar; }

/// Return the instruction that performs the conditional assignment. Its
/// value is the one used after the loop.
Instruction *getAssignment() const { return Assignment; }

/// Return the condition that this CSA is conditional upon.
Value *getCond() const {
if (auto *SI = dyn_cast_or_null<SelectInst>(Assignment))
return SI->getCondition();
return nullptr;
}
};

} // end namespace llvm

#endif // LLVM_ANALYSIS_IVDESCRIPTORS_H
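
Since the term came up in review, here is a minimal sketch of the kind of source loop a CSADescriptor is meant to model. The function name `lastNegative` and its parameters are hypothetical and do not appear in the patch. The mapping to a header phi and a select described in the comments is the usual lowering for such a loop, stated here as an assumption rather than something the patch guarantees.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example, not taken from the patch.
// Result starts as a loop-invariant value (the InitScalar) and is overwritten
// only when the condition holds. After optimization such a loop is typically
// a header phi for Result fed by `select (A[I] < 0), A[I], Result`, and
// Result is only read after the loop.
int lastNegative(const std::vector<int> &A, int Init) {
  int Result = Init;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (A[I] < 0)
      Result = A[I]; // the conditional scalar assignment
  return Result;
}
```

If the condition never holds, the function returns `Init`, which corresponds to the documented behavior of `getInitScalar()`.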
9 changes: 9 additions & 0 deletions llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1828,6 +1828,10 @@ class TargetTransformInfo {
: EVLParamStrategy(EVLParamStrategy), OpStrategy(OpStrategy) {}
};

/// \returns true if the loop vectorizer should vectorize conditional
/// scalar assignments for the target.
bool enableCSAVectorization() const;

/// \returns How the target needs this vector-predicated operation to be
/// transformed.
VPLegalization getVPLegalizationStrategy(const VPIntrinsic &PI) const;
@@ -2266,6 +2270,7 @@ class TargetTransformInfo::Concept {
SmallVectorImpl<Use *> &OpsToSink) const = 0;

virtual bool isVectorShiftByScalarCheap(Type *Ty) const = 0;
virtual bool enableCSAVectorization() const = 0;
virtual VPLegalization
getVPLegalizationStrategy(const VPIntrinsic &PI) const = 0;
virtual bool hasArmWideBranch(bool Thumb) const = 0;
@@ -3077,6 +3082,10 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
return Impl.isVectorShiftByScalarCheap(Ty);
}

bool enableCSAVectorization() const override {
return Impl.enableCSAVectorization();
}

VPLegalization
getVPLegalizationStrategy(const VPIntrinsic &PI) const override {
return Impl.getVPLegalizationStrategy(PI);
2 changes: 2 additions & 0 deletions llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -1016,6 +1016,8 @@ class TargetTransformInfoImplBase {

bool isVectorShiftByScalarCheap(Type *Ty) const { return false; }

bool enableCSAVectorization() const { return false; }

TargetTransformInfo::VPLegalization
getVPLegalizationStrategy(const VPIntrinsic &PI) const {
return TargetTransformInfo::VPLegalization(
17 changes: 17 additions & 0 deletions llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
@@ -269,6 +269,10 @@ class LoopVectorizationLegality {
/// induction descriptor.
using InductionList = MapVector<PHINode *, InductionDescriptor>;

/// CSAList contains the CSA descriptors for all the CSAs that were found
/// in the loop, rooted by their phis.
using CSAList = MapVector<PHINode *, CSADescriptor>;

/// RecurrenceSet contains the phi nodes that are recurrences other than
/// inductions and reductions.
using RecurrenceSet = SmallPtrSet<const PHINode *, 8>;
@@ -321,6 +325,12 @@ class LoopVectorizationLegality {
/// Returns True if V is a Phi node of an induction variable in this loop.
bool isInductionPhi(const Value *V) const;

/// Returns the CSAs found in the loop.
const CSAList &getCSAs() const { return CSAs; }

/// Returns true if Phi is the root of a CSA in the loop.
bool isCSAPhi(PHINode *Phi) const { return CSAs.count(Phi) != 0; }

/// Returns a pointer to the induction descriptor, if \p Phi is an integer or
/// floating point induction.
const InductionDescriptor *getIntOrFpInductionDescriptor(PHINode *Phi) const;
@@ -550,6 +560,10 @@ class LoopVectorizationLegality {
void addInductionPhi(PHINode *Phi, const InductionDescriptor &ID,
SmallPtrSetImpl<Value *> &AllowedExit);

// Updates the vectorization state by adding \p Phi to the CSA list.
Contributor

Suggested change
// Updates the vectorization state by adding \p Phi to the CSA list.
/// Updates the vectorization state by adding \p Phi to the CSA list.

void addCSAPhi(PHINode *Phi, const CSADescriptor &CSADesc,
SmallPtrSetImpl<Value *> &AllowedExit);

/// The loop that we evaluate.
Loop *TheLoop;

@@ -594,6 +608,9 @@ class LoopVectorizationLegality {
/// variables can be pointers.
InductionList Inductions;

/// Holds the conditional scalar assignments (CSAs) found in the loop.
CSAList CSAs;

/// Holds all the casts that participate in the update chain of the induction
/// variables, and that have been proven to be redundant (possibly under a
/// runtime guard). These casts can be ignored when creating the vectorized
58 changes: 57 additions & 1 deletion llvm/lib/Analysis/IVDescriptors.cpp
@@ -6,7 +6,8 @@
//
//===----------------------------------------------------------------------===//
//
// This file "describes" induction and recurrence variables.
// This file "describes" induction, recurrence, and conditional scalar
// assignment (CSA) variables.
//
//===----------------------------------------------------------------------===//

@@ -1570,3 +1571,58 @@ bool InductionDescriptor::isInductionPHI(
D = InductionDescriptor(StartValue, IK_PtrInduction, Step);
return true;
}

/// Return true and set \p CSA if \p Phi roots a CSA that matches one of these
/// patterns:
/// phi loop_inv, (select cmp, value, phi)
/// phi loop_inv, (select cmp, phi, value)
/// phi (select cmp, value, phi), loop_inv
/// phi (select cmp, phi, value), loop_inv
/// If \p Phi does not match any of these patterns, return false and leave
/// \p CSA unmodified.
bool CSADescriptor::isCSAPhi(PHINode *Phi, Loop *TheLoop, CSADescriptor &CSA) {

// Must be a scalar
Contributor

Suggested change
// Must be a scalar
// Must be a scalar.

Type *Type = Phi->getType();
if (!Type->isIntegerTy() && !Type->isFloatingPointTy() &&
!Type->isPointerTy())
return false;

// Match phi loop_inv, (select cmp, value, phi)
// or phi loop_inv, (select cmp, phi, value)
// or phi (select cmp, value, phi), loop_inv
// or phi (select cmp, phi, value), loop_inv
if (Phi->getNumIncomingValues() != 2)
return false;
auto SelectInstIt = find_if(Phi->incoming_values(), [&Phi](const Use &U) {
return match(U.get(), m_Select(m_Value(), m_Specific(Phi), m_Value())) ||
match(U.get(), m_Select(m_Value(), m_Value(), m_Specific(Phi)));
});
if (SelectInstIt == Phi->incoming_values().end())
return false;
auto LoopInvIt = find_if(Phi->incoming_values(), [&](Use &U) {
return U.get() != *SelectInstIt && TheLoop->isLoopInvariant(U.get());
});
if (LoopInvIt == Phi->incoming_values().end())
return false;

// Phi and Sel must be used only outside the loop, except for Phi's use of
// Sel and Sel's use of Phi.
auto IsOnlyUsedOutsideLoop = [&](Value *V, Value *Ignore) {
return all_of(V->users(), [Ignore, TheLoop](User *U) {
if (U == Ignore)
return true;
if (auto *I = dyn_cast<Instruction>(U))
return !TheLoop->contains(I);
return true;
});
};
Instruction *Select = cast<SelectInst>(SelectInstIt->get());
Value *LoopInv = LoopInvIt->get();
if (!IsOnlyUsedOutsideLoop(Phi, Select) ||
!IsOnlyUsedOutsideLoop(Select, Phi))
return false;

CSA = CSADescriptor(Phi, Select, LoopInv);
return true;
}
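
To make the four accepted shapes easier to see, here is a toy model of the structural part of this check. It is only a sketch: `Kind`, `Node`, `selectUsesPhi`, and `looksLikeCSAPhi` are hypothetical stand-ins, not LLVM types or functions, and the sketch leaves out the scalar-type test and the used-only-outside-the-loop test that the real function performs.

```cpp
#include <array>

// Toy stand-ins for IR values. Everything here is hypothetical.
enum class Kind { LoopInvariant, Select, Phi, Other };

struct Node {
  Kind K;
  // For a Select: {condition, true value, false value}.
  // For a Phi:    {incoming 0, incoming 1, unused}.
  std::array<const Node *, 3> Ops;
};

// True if Sel is a select that has P as its true or false operand.
bool selectUsesPhi(const Node *Sel, const Node *P) {
  return Sel && Sel->K == Kind::Select &&
         (Sel->Ops[1] == P || Sel->Ops[2] == P);
}

// Shape check only: one incoming value is a select that feeds P back into
// itself, and the other incoming value is loop invariant. Either order of
// the two incoming values is accepted, which gives the four patterns.
bool looksLikeCSAPhi(const Node *P) {
  if (!P || P->K != Kind::Phi)
    return false;
  const Node *A = P->Ops[0];
  const Node *B = P->Ops[1];
  if (!A || !B)
    return false;
  auto Matches = [P](const Node *Sel, const Node *Inv) {
    return selectUsesPhi(Sel, P) && Inv->K == Kind::LoopInvariant;
  };
  return Matches(A, B) || Matches(B, A);
}
```

A phi whose select does not refer back to the phi is rejected, because the select would then not preserve the previous value when the condition is false.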
4 changes: 4 additions & 0 deletions llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1351,6 +1351,10 @@ bool TargetTransformInfo::preferEpilogueVectorization() const {
return TTIImpl->preferEpilogueVectorization();
}

bool TargetTransformInfo::enableCSAVectorization() const {
return TTIImpl->enableCSAVectorization();
}

TargetTransformInfo::VPLegalization
TargetTransformInfo::getVPLegalizationStrategy(const VPIntrinsic &VPI) const {
return TTIImpl->getVPLegalizationStrategy(VPI);
5 changes: 5 additions & 0 deletions llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2361,6 +2361,11 @@ bool RISCVTTIImpl::isLegalMaskedExpandLoad(Type *DataTy, Align Alignment) {
return true;
}

bool RISCVTTIImpl::enableCSAVectorization() const {
return ST->hasVInstructions() &&
ST->getProcFamily() == RISCVSubtarget::SiFive7;
}

bool RISCVTTIImpl::isLegalMaskedCompressStore(Type *DataTy, Align Alignment) {
auto *VTy = dyn_cast<VectorType>(DataTy);
if (!VTy || VTy->isScalableTy())
4 changes: 4 additions & 0 deletions llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -306,6 +306,10 @@ class RISCVTTIImpl : public BasicTTIImplBase<RISCVTTIImpl> {
return TLI->isVScaleKnownToBeAPowerOfTwo();
}

/// \returns true if the loop vectorizer should vectorize conditional
/// scalar assignments for the target.
bool enableCSAVectorization() const;

/// \returns How the target needs this vector-predicated operation to be
/// transformed.
TargetTransformInfo::VPLegalization
35 changes: 31 additions & 4 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -83,6 +83,10 @@ static cl::opt<bool> EnableHistogramVectorization(
"enable-histogram-loop-vectorization", cl::init(false), cl::Hidden,
cl::desc("Enables autovectorization of some loops containing histograms"));

static cl::opt<bool>
EnableCSA("enable-csa-vectorization", cl::init(false), cl::Hidden,
cl::desc("Control whether CSA loop vectorization is enabled"));

/// Maximum vectorization interleave count.
static const unsigned MaxInterleaveFactor = 16;

@@ -750,6 +754,15 @@ bool LoopVectorizationLegality::setupOuterLoopInductions() {
return llvm::all_of(Header->phis(), IsSupportedPhi);
}

void LoopVectorizationLegality::addCSAPhi(
PHINode *Phi, const CSADescriptor &CSADesc,
SmallPtrSetImpl<Value *> &AllowedExit) {
assert(CSADesc.isValid() && "Expected Valid CSADescriptor");
LLVM_DEBUG(dbgs() << "LV: found legal CSA opportunity" << *Phi << "\n");
AllowedExit.insert(Phi);
CSAs.insert({Phi, CSADesc});
}

/// Checks if a function is scalarizable according to the TLI, in
/// the sense that it should be vectorized and then expanded in
/// multiple scalar calls. This is represented in the
@@ -867,14 +880,24 @@ bool LoopVectorizationLegality::canVectorizeInstrs() {
continue;
}

// As a last resort, coerce the PHI to a AddRec expression
// and re-try classifying it a an induction PHI.
// Try to coerce the PHI to an AddRec expression and re-try classifying
// it as an induction PHI.
if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID, true) &&
!IsDisallowedStridedPointerInduction(ID)) {
addInductionPhi(Phi, ID, AllowedExit);
continue;
}

// Check if the PHI can be classified as a CSA PHI.
if (EnableCSA || (TTI->enableCSAVectorization() &&
EnableCSA.getNumOccurrences() == 0)) {
CSADescriptor CSADesc;
if (CSADescriptor::isCSAPhi(Phi, TheLoop, CSADesc)) {
addCSAPhi(Phi, CSADesc, AllowedExit);
continue;
}
}
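
The gating condition above combines the hidden option with the target hook, and the precedence is easy to misread. Here is a sketch of the same decision written as a standalone function. `shouldTryCSA` is a hypothetical helper, and the `std::optional` parameter is an assumed way of modelling "the flag was not passed" (that is, `getNumOccurrences() == 0`).

```cpp
#include <optional>

// Hypothetical helper, not part of the patch.
// FlagValue is empty when -enable-csa-vectorization was not given on the
// command line. TargetEnables models TTI->enableCSAVectorization().
bool shouldTryCSA(std::optional<bool> FlagValue, bool TargetEnables) {
  if (FlagValue.has_value())
    return *FlagValue; // an explicit flag always wins, on or off
  return TargetEnables; // otherwise defer to the target's preference
}
```

Under this reading, passing `-enable-csa-vectorization=false` turns the feature off even on a target whose hook returns true, and omitting the flag leaves the decision to the target.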

reportVectorizationFailure("Found an unidentified PHI",
"value that could not be identified as "
"reduction is used outside the loop",
@@ -1858,11 +1881,15 @@ bool LoopVectorizationLegality::canFoldTailByMasking() const {
for (const auto &Reduction : getReductionVars())
ReductionLiveOuts.insert(Reduction.second.getLoopExitInstr());

SmallPtrSet<const Value *, 8> CSALiveOuts;
for (const auto &CSA : getCSAs())
CSALiveOuts.insert(CSA.second.getAssignment());

// TODO: handle non-reduction outside users when tail is folded by masking.
for (auto *AE : AllowedExit) {
// Check that all users of allowed exit values are inside the loop or
// are the live-out of a reduction.
if (ReductionLiveOuts.count(AE))
// are the live-out of a reduction or a CSA.
if (ReductionLiveOuts.count(AE) || CSALiveOuts.count(AE))
continue;
for (User *U : AE->users()) {
Instruction *UI = cast<Instruction>(U);
35 changes: 33 additions & 2 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
@@ -174,8 +174,8 @@ class VPBuilder {
new VPInstruction(Opcode, Operands, WrapFlags, DL, Name));
}

VPValue *createNot(VPValue *Operand, DebugLoc DL = {},
const Twine &Name = "") {
VPInstruction *createNot(VPValue *Operand, DebugLoc DL = {},
const Twine &Name = "") {
return createInstruction(VPInstruction::Not, {Operand}, DL, Name);
}
Comment on lines -177 to 180
Contributor

@fhahn: see also this, for #108858.


@@ -231,6 +231,37 @@ class VPBuilder {
Ptr, Offset, VPRecipeWithIRFlags::GEPFlagsTy(true), DL, Name));
}

VPInstruction *createCSAMaskPhi(VPValue *InitMask, DebugLoc DL,
const Twine &Name) {
return createInstruction(VPInstruction::CSAMaskPhi, {InitMask}, DL, Name);
}

VPInstruction *createAnyOf(VPValue *Cond, DebugLoc DL, const Twine &Name) {
return createInstruction(VPInstruction::AnyOf, {Cond}, DL, Name);
}

VPInstruction *createCSAMaskSel(VPValue *Cond, VPValue *MaskPhi,
VPValue *AnyOf, DebugLoc DL,
const Twine &Name) {
return createInstruction(VPInstruction::CSAMaskSel, {Cond, MaskPhi, AnyOf},
DL, Name);
}

VPInstruction *createAnyOfEVL(VPValue *Cond, VPValue *EVL, DebugLoc DL,
const Twine &Name) {
return createInstruction(VPInstruction::AnyOfEVL, {Cond, EVL}, DL, Name);
}

VPInstruction *createCSAVLPhi(DebugLoc DL, const Twine &Name) {
return createInstruction(VPInstruction::CSAVLPhi, {}, DL, Name);
}

VPInstruction *createCSAVLSel(VPValue *AnyOfEVL, VPValue *VLPhi, VPValue *EVL,
DebugLoc DL, const Twine &Name) {
return createInstruction(VPInstruction::CSAVLSel, {AnyOfEVL, VLPhi, EVL},
DL, Name);
}

VPDerivedIVRecipe *createDerivedIV(InductionDescriptor::InductionKind Kind,
FPMathOperator *FPBinOp, VPValue *Start,
VPCanonicalIVPHIRecipe *CanonicalIV,
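
The recipes behind the new CSA builder helpers are not shown in this part of the diff, so the following is only a guess at the semantics they are meant to implement, written as a scalar emulation with a fixed vector width. It assumes that the mask select keeps the previous mask unless some lane of the current condition is set, that a data vector is updated in the same way, and that the final scalar is the last active lane of the saved data, or the initial value if no lane was ever active. `VF`, `lastNegativeVectorized`, `MaskPhi`, and `DataPhi` are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // assumed fixed vector width for this sketch

// Emulates a vectorized "last negative element" loop lane by lane.
int lastNegativeVectorized(const std::vector<int> &A, int Init) {
  std::array<bool, VF> MaskPhi{}; // starts all-false (the initial mask)
  std::array<int, VF> DataPhi{};  // lanes of the last chunk that had a match
  std::size_t I = 0;
  for (; I + VF <= A.size(); I += VF) {
    std::array<bool, VF> Cond{};
    bool AnyOf = false; // true if any lane of Cond is set
    for (std::size_t L = 0; L < VF; ++L) {
      Cond[L] = A[I + L] < 0;
      AnyOf = AnyOf || Cond[L];
    }
    // Mask and data select: replace the saved state only if a lane matched.
    if (AnyOf) {
      MaskPhi = Cond;
      for (std::size_t L = 0; L < VF; ++L)
        DataPhi[L] = A[I + L];
    }
  }
  // Extract the last active lane, falling back to the initial scalar.
  int Result = Init;
  for (std::size_t L = 0; L < VF; ++L)
    if (MaskPhi[L])
      Result = DataPhi[L];
  // Scalar epilogue for the remaining elements.
  for (; I < A.size(); ++I)
    if (A[I] < 0)
      Result = A[I];
  return Result;
}
```

Keeping the whole mask and data vector from the most recent chunk with a match, instead of blending lane by lane, is what lets the final extract pick the last assignment in program order.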