[LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs #121330

Open
wants to merge 1 commit into
base: users/zhaoqi5/add-relax-relocs-for-tls-le

zhaoqi5
Contributor

@zhaoqi5 zhaoqi5 commented Dec 30, 2024

If linker relaxation is enabled, relaxable code sequences expanded from pseudos should not be separated by instruction scheduling. This commit marks them as scheduling boundaries so that they are not reordered. (The exceptions are tls_le and call36/tail36: tls_le can be scheduled without affecting relaxation, and call36/tail36 are expanded later, in the LoongArchExpandPseudo pass.)
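
As a rough illustration of the boundary check described above, here is a minimal standalone sketch. It does not use LLVM types: `Opcode`, `Reloc`, `Inst`, and `isRelaxBoundary` are hypothetical stand-ins invented for this example, and only two of the patterns handled by the patch (`%pc_hi20`/`%pc_lo12` and `%got_pc_hi20`/`%got_pc_lo12`) are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for MachineInstr opcodes and target flags.
enum class Opcode { PCALAU12I, ADDI_D, LD_D, Other };
enum class Reloc { None, PcHi20, PcLo12, GotPcHi20, GotPcLo12 };

struct Inst {
  Opcode Op;
  Reloc Flag;
};

// Returns true if Insts[I] is part of a relaxable hi20/lo12 pair and should
// therefore act as a scheduling boundary when relaxation is enabled.
bool isRelaxBoundary(const std::vector<Inst> &Insts, std::size_t I,
                     bool RelaxEnabled) {
  assert(I < Insts.size() && "instruction index out of range");
  if (!RelaxEnabled)
    return false;

  const Inst &Cur = Insts[I];
  switch (Cur.Op) {
  case Opcode::PCALAU12I: {
    // The high part only counts if the matching low part follows directly.
    if (I + 1 == Insts.size())
      return false;
    const Inst &Next = Insts[I + 1];
    if (Cur.Flag == Reloc::PcHi20)
      return Next.Op == Opcode::ADDI_D && Next.Flag == Reloc::PcLo12;
    if (Cur.Flag == Reloc::GotPcHi20)
      return Next.Op == Opcode::LD_D && Next.Flag == Reloc::GotPcLo12;
    return false;
  }
  case Opcode::ADDI_D:
    // The low part is treated as a boundary on its own.
    return Cur.Flag == Reloc::PcLo12;
  case Opcode::LD_D:
    return Cur.Flag == Reloc::GotPcLo12;
  case Opcode::Other:
    return false;
  }
  return false;
}
```

In this model both halves of a pair report themselves as boundaries, so a scheduler that respects the predicate cannot move another instruction between them.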

A new mask target-flag is added to attach relax relocs to the relaxable code sequences. (It is not needed for tls_le and call36/tail36, because relax relocs can simply be added for them based on their relocs. For other code sequences, such as PCALA_{HI20/LO12}, the mask flag is required, mainly because relax relocs must not be added when the code model is large.)

Because of the new mask target-flag, the "direct" flags must be extracted whenever target-flags are used. In addition, a code sequence optimized by the MergeBaseOffset pass may no longer be relaxable, so the relax "bitmask" flag should be removed from it.
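
To make the "direct" versus "bitmask" distinction concrete, here is a small standalone sketch that operates on plain `unsigned` values instead of `MachineOperand`. The values `0x3f` and `0x40` are the ones used in the patch; `kExampleDirectFlag` and `stripBitmaskFlags` are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Values taken from the patch (LoongArchBaseInfo.h).
constexpr unsigned MO_DIRECT_FLAG_MASK = 0x3f;
constexpr unsigned MO_RELAX = 0x40;

// Hypothetical direct flag value, chosen only for illustration.
constexpr unsigned kExampleDirectFlag = 0x02;

// Extract the single "direct" flag, ignoring any "bitmask" bits.
constexpr unsigned getDirectFlags(unsigned TF) {
  return TF & MO_DIRECT_FLAG_MASK;
}

// Test for the relax "bitmask" bit.
constexpr bool hasRelaxFlag(unsigned TF) { return (TF & MO_RELAX) != 0; }

// Tag a plain direct flag as relaxable.
inline unsigned addRelaxFlag(unsigned DirectFlag) {
  assert((DirectFlag & ~MO_DIRECT_FLAG_MASK) == 0 &&
         "expected a plain direct flag");
  return DirectFlag | MO_RELAX;
}

// Models what happens after a fold in MergeBaseOffset: keep the direct flag,
// drop the bitmask bits.
constexpr unsigned stripBitmaskFlags(unsigned TF) { return getDirectFlags(TF); }
```

With these values, a tagged operand carries `0x42` rather than `0x02`, so a plain equality test against the direct flag would fail. That is why comparisons have to go through `getDirectFlags` once the relax bit exists.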

… relocs

If linker relaxation is enabled, relaxable code sequences expanded
from pseudos should not be separated by instruction scheduling.
This commit marks them as scheduling boundaries so that they are
not reordered. (The exceptions are `tls_le` and `call36/tail36`:
`tls_le` can be scheduled without affecting relaxation, and
`call36/tail36` are expanded later, in the `LoongArchExpandPseudo` pass.)

A new mask target-flag is added to attach relax relocs to the
relaxable code sequences. (It is not needed for `tls_le` and
`call36/tail36`, for the reasons given above.) Because of this flag,
the "direct" flags must be extracted whenever target-flags are used.
In addition, a code sequence optimized by the `MergeBaseOffset`
pass may no longer be relaxable, so the relax "bitmask" flag should
be removed from it.
@llvmbot
Member

llvmbot commented Dec 30, 2024

@llvm/pr-subscribers-backend-loongarch

Author: ZhaoQi (zhaoqi5)

Changes

If linker relaxation is enabled, relaxable code sequences expanded from pseudos should not be separated by instruction scheduling. This commit marks them as scheduling boundaries so that they are not reordered. (The exceptions are tls_le and call36/tail36: tls_le can be scheduled without affecting relaxation, and call36/tail36 are expanded later, in the LoongArchExpandPseudo pass.)

A new mask target-flag is added to attach relax relocs to the relaxable code sequences. (It is not needed for tls_le and call36/tail36, for the reasons given above.) Because of this flag, the "direct" flags must be extracted whenever target-flags are used. In addition, a code sequence optimized by the MergeBaseOffset pass may no longer be relaxable, so the relax "bitmask" flag should be removed from it.


Patch is 28.44 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121330.diff

11 Files Affected:

  • (modified) llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp (+26-8)
  • (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp (+96-3)
  • (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.h (+3)
  • (modified) llvm/lib/Target/LoongArch/LoongArchMCInstLower.cpp (+2-2)
  • (modified) llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp (+25-5)
  • (modified) llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp (+1)
  • (modified) llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchBaseInfo.h (+22)
  • (modified) llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp (+1)
  • (added) llvm/test/CodeGen/LoongArch/linker-relaxation.ll (+102)
  • (added) llvm/test/CodeGen/LoongArch/mir-relax-flags.ll (+64)
  • (modified) llvm/test/CodeGen/LoongArch/mir-target-flags.ll (+28-3)
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
index 0218934ea3344a..be60de3d63d061 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
@@ -187,18 +187,23 @@ bool LoongArchPreRAExpandPseudo::expandPcalau12iInstPair(
   MachineInstr &MI = *MBBI;
   DebugLoc DL = MI.getDebugLoc();
 
+  const auto &STI = MF->getSubtarget<LoongArchSubtarget>();
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
+
   Register DestReg = MI.getOperand(0).getReg();
   Register ScratchReg =
       MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass);
   MachineOperand &Symbol = MI.getOperand(1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg)
-      .addDisp(Symbol, 0, FlagsHi);
+      .addDisp(Symbol, 0,
+               EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi);
 
   MachineInstr *SecondMI =
       BuildMI(MBB, MBBI, DL, TII->get(SecondOpcode), DestReg)
           .addReg(ScratchReg)
-          .addDisp(Symbol, 0, FlagsLo);
+          .addDisp(Symbol, 0,
+                   EnableRelax ? LoongArchII::addRelaxFlag(FlagsLo) : FlagsLo);
 
   if (MI.hasOneMemOperand())
     SecondMI->addMemOperand(*MF, *MI.memoperands_begin());
@@ -481,6 +486,7 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
   unsigned ADD = STI.is64Bit() ? LoongArch::ADD_D : LoongArch::ADD_W;
   unsigned ADDI = STI.is64Bit() ? LoongArch::ADDI_D : LoongArch::ADDI_W;
   unsigned LD = STI.is64Bit() ? LoongArch::LD_D : LoongArch::LD_W;
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
 
   Register DestReg = MI.getOperand(0).getReg();
   Register Tmp1Reg =
@@ -488,7 +494,10 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
   MachineOperand &Symbol = MI.getOperand(Large ? 2 : 1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), Tmp1Reg)
-      .addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_HI);
+      .addDisp(Symbol, 0,
+               (EnableRelax && !Large)
+                   ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_HI)
+                   : LoongArchII::MO_DESC_PC_HI);
 
   if (Large) {
     // Code Sequence:
@@ -526,19 +535,28 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
     // pcalau12i $a0, %desc_pc_hi20(sym)
     // addi.w/d  $a0, $a0, %desc_pc_lo12(sym)
     // ld.w/d    $ra, $a0, %desc_ld(sym)
-    // jirl      $ra, $ra, %desc_ld(sym)
-    // add.d     $dst, $a0, $tp
+    // jirl      $ra, $ra, %desc_call(sym)
+    // add.w/d   $dst, $a0, $tp
     BuildMI(MBB, MBBI, DL, TII->get(ADDI), LoongArch::R4)
         .addReg(Tmp1Reg)
-        .addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_LO);
+        .addDisp(Symbol, 0,
+                 EnableRelax
+                     ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_LO)
+                     : LoongArchII::MO_DESC_PC_LO);
   }
 
   BuildMI(MBB, MBBI, DL, TII->get(LD), LoongArch::R1)
       .addReg(LoongArch::R4)
-      .addDisp(Symbol, 0, LoongArchII::MO_DESC_LD);
+      .addDisp(Symbol, 0,
+               (EnableRelax && !Large)
+                   ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_LD)
+                   : LoongArchII::MO_DESC_LD);
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PseudoDESC_CALL), LoongArch::R1)
       .addReg(LoongArch::R1)
-      .addDisp(Symbol, 0, LoongArchII::MO_DESC_CALL);
+      .addDisp(Symbol, 0,
+               (EnableRelax && !Large)
+                   ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_CALL)
+                   : LoongArchII::MO_DESC_CALL);
   BuildMI(MBB, MBBI, DL, TII->get(ADD), DestReg)
       .addReg(LoongArch::R4)
       .addReg(LoongArch::R2);
diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp
index 7d0e4f9d58a16d..13c8a5a39b6f4a 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp
@@ -443,6 +443,89 @@ bool LoongArchInstrInfo::isSchedulingBoundary(const MachineInstr &MI,
     break;
   }
 
+  const auto &STI = MF.getSubtarget<LoongArchSubtarget>();
+  if (STI.hasFeature(LoongArch::FeatureRelax)) {
+    // When linker relaxation enabled, the following instruction patterns are
+    // prohibited from being reordered:
+    //
+    // * pcalau12i $a0, %pc_hi20(s)
+    //   addi.w/d $a0, $a0, %pc_lo12(s)
+    //
+    // * pcalau12i $a0, %got_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %got_pc_lo12(s)
+    //
+    // * pcalau12i $a0, %ie_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %ie_pc_lo12(s)
+    //
+    // * pcalau12i $a0, %ld_pc_hi20(s) | %gd_pc_hi20(s)
+    //   addi.w/d $a0, $a0, %got_pc_lo12(s)
+    //
+    // * pcalau12i $a0, %desc_pc_hi20(s)
+    //   addi.w/d  $a0, $a0, %desc_pc_lo12(s)
+    //   ld.w/d    $ra, $a0, %desc_ld(s)
+    //   jirl      $ra, $ra, %desc_call(s)
+    unsigned AddiOp = STI.is64Bit() ? LoongArch::ADDI_D : LoongArch::ADDI_W;
+    unsigned LdOp = STI.is64Bit() ? LoongArch::LD_D : LoongArch::LD_W;
+    switch (MI.getOpcode()) {
+    case LoongArch::PCALAU12I: {
+      auto MO0 = LoongArchII::getDirectFlags(MI.getOperand(1));
+      auto SecondOp = std::next(MII);
+      if (MO0 == LoongArchII::MO_DESC_PC_HI) {
+        if (SecondOp == MIE || SecondOp->getOpcode() != AddiOp)
+          break;
+        auto Ld = std::next(SecondOp);
+        if (Ld == MIE || Ld->getOpcode() != LdOp)
+          break;
+        auto MO1 = LoongArchII::getDirectFlags(SecondOp->getOperand(2));
+        auto MO2 = LoongArchII::getDirectFlags(Ld->getOperand(2));
+        if (MO1 == LoongArchII::MO_DESC_PC_LO && MO2 == LoongArchII::MO_DESC_LD)
+          return true;
+        break;
+      }
+      if (SecondOp == MIE ||
+          (SecondOp->getOpcode() != AddiOp && SecondOp->getOpcode() != LdOp))
+        break;
+      auto MO1 = LoongArchII::getDirectFlags(SecondOp->getOperand(2));
+      if (MO0 == LoongArchII::MO_PCREL_HI && SecondOp->getOpcode() == AddiOp &&
+          MO1 == LoongArchII::MO_PCREL_LO)
+        return true;
+      if (MO0 == LoongArchII::MO_GOT_PC_HI && SecondOp->getOpcode() == LdOp &&
+          MO1 == LoongArchII::MO_GOT_PC_LO)
+        return true;
+      if (MO0 == LoongArchII::MO_IE_PC_HI && SecondOp->getOpcode() == LdOp &&
+          MO1 == LoongArchII::MO_IE_PC_LO)
+        return true;
+      if ((MO0 == LoongArchII::MO_LD_PC_HI ||
+           MO0 == LoongArchII::MO_GD_PC_HI) &&
+          SecondOp->getOpcode() == AddiOp && MO1 == LoongArchII::MO_GOT_PC_LO)
+        return true;
+      break;
+    }
+    case LoongArch::ADDI_W:
+    case LoongArch::ADDI_D: {
+      auto MO = LoongArchII::getDirectFlags(MI.getOperand(2));
+      if (MO == LoongArchII::MO_PCREL_LO || MO == LoongArchII::MO_GOT_PC_LO)
+        return true;
+      break;
+    }
+    case LoongArch::LD_W:
+    case LoongArch::LD_D: {
+      auto MO = LoongArchII::getDirectFlags(MI.getOperand(2));
+      if (MO == LoongArchII::MO_GOT_PC_LO || MO == LoongArchII::MO_IE_PC_LO)
+        return true;
+      break;
+    }
+    case LoongArch::PseudoDESC_CALL: {
+      auto MO = LoongArchII::getDirectFlags(MI.getOperand(2));
+      if (MO == LoongArchII::MO_DESC_CALL)
+        return true;
+      break;
+    }
+    default:
+      break;
+    }
+  }
+
   return false;
 }
 
@@ -618,7 +701,8 @@ bool LoongArchInstrInfo::reverseBranchCondition(
 
 std::pair<unsigned, unsigned>
 LoongArchInstrInfo::decomposeMachineOperandsTargetFlags(unsigned TF) const {
-  return std::make_pair(TF, 0u);
+  const unsigned Mask = LoongArchII::MO_DIRECT_FLAG_MASK;
+  return std::make_pair(TF & Mask, TF & ~Mask);
 }
 
 ArrayRef<std::pair<unsigned, const char *>>
@@ -644,20 +728,29 @@ LoongArchInstrInfo::getSerializableDirectMachineOperandTargetFlags() const {
       {MO_IE_PC_LO, "loongarch-ie-pc-lo"},
       {MO_IE_PC64_LO, "loongarch-ie-pc64-lo"},
       {MO_IE_PC64_HI, "loongarch-ie-pc64-hi"},
+      {MO_LD_PC_HI, "loongarch-ld-pc-hi"},
+      {MO_GD_PC_HI, "loongarch-gd-pc-hi"},
+      {MO_CALL36, "loongarch-call36"},
       {MO_DESC_PC_HI, "loongarch-desc-pc-hi"},
       {MO_DESC_PC_LO, "loongarch-desc-pc-lo"},
       {MO_DESC64_PC_LO, "loongarch-desc64-pc-lo"},
       {MO_DESC64_PC_HI, "loongarch-desc64-pc-hi"},
       {MO_DESC_LD, "loongarch-desc-ld"},
       {MO_DESC_CALL, "loongarch-desc-call"},
-      {MO_LD_PC_HI, "loongarch-ld-pc-hi"},
-      {MO_GD_PC_HI, "loongarch-gd-pc-hi"},
       {MO_LE_HI_R, "loongarch-le-hi-r"},
       {MO_LE_ADD_R, "loongarch-le-add-r"},
       {MO_LE_LO_R, "loongarch-le-lo-r"}};
   return ArrayRef(TargetFlags);
 }
 
+ArrayRef<std::pair<unsigned, const char *>>
+LoongArchInstrInfo::getSerializableBitmaskMachineOperandTargetFlags() const {
+  using namespace LoongArchII;
+  static const std::pair<unsigned, const char *> TargetFlags[] = {
+      {MO_RELAX, "loongarch-relax"}};
+  return ArrayRef(TargetFlags);
+}
+
 // Returns true if this is the sext.w pattern, addi.w rd, rs, 0.
 bool LoongArch::isSEXT_W(const MachineInstr &MI) {
   return MI.getOpcode() == LoongArch::ADDI_W && MI.getOperand(1).isReg() &&
diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.h b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.h
index ef9970783107ea..a5b31878bfa1c2 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.h
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.h
@@ -91,6 +91,9 @@ class LoongArchInstrInfo : public LoongArchGenInstrInfo {
   ArrayRef<std::pair<unsigned, const char *>>
   getSerializableDirectMachineOperandTargetFlags() const override;
 
+  ArrayRef<std::pair<unsigned, const char *>>
+  getSerializableBitmaskMachineOperandTargetFlags() const override;
+
 protected:
   const LoongArchSubtarget &STI;
 };
diff --git a/llvm/lib/Target/LoongArch/LoongArchMCInstLower.cpp b/llvm/lib/Target/LoongArch/LoongArchMCInstLower.cpp
index d1de0609f24ce2..d87ed068ebff8a 100644
--- a/llvm/lib/Target/LoongArch/LoongArchMCInstLower.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchMCInstLower.cpp
@@ -27,7 +27,7 @@ static MCOperand lowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym,
   MCContext &Ctx = AP.OutContext;
   LoongArchMCExpr::VariantKind Kind;
 
-  switch (MO.getTargetFlags()) {
+  switch (LoongArchII::getDirectFlags(MO)) {
   default:
     llvm_unreachable("Unknown target flag on GV operand");
   case LoongArchII::MO_None:
@@ -134,7 +134,7 @@ static MCOperand lowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym,
         ME, MCConstantExpr::create(MO.getOffset(), Ctx), Ctx);
 
   if (Kind != LoongArchMCExpr::VK_LoongArch_None)
-    ME = LoongArchMCExpr::create(ME, Kind, Ctx);
+    ME = LoongArchMCExpr::create(ME, Kind, Ctx, LoongArchII::hasRelaxFlag(MO));
   return MCOperand::createExpr(ME);
 }
 
diff --git a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
index e9455fdd23ba54..7f98f7718a538d 100644
--- a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
@@ -105,7 +105,7 @@ bool LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
     return false;
 
   const MachineOperand &Hi20Op1 = Hi20.getOperand(1);
-  if (Hi20Op1.getTargetFlags() != LoongArchII::MO_PCREL_HI)
+  if (LoongArchII::getDirectFlags(Hi20Op1) != LoongArchII::MO_PCREL_HI)
     return false;
 
   auto isGlobalOrCPIOrBlockAddress = [](const MachineOperand &Op) {
@@ -157,7 +157,7 @@ bool LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
 
   const MachineOperand &Lo12Op2 = Lo12->getOperand(2);
   assert(Hi20.getOpcode() == LoongArch::PCALAU12I);
-  if (Lo12Op2.getTargetFlags() != LoongArchII::MO_PCREL_LO ||
+  if (LoongArchII::getDirectFlags(Lo12Op2) != LoongArchII::MO_PCREL_LO ||
       !(isGlobalOrCPIOrBlockAddress(Lo12Op2) || Lo12Op2.isMCSymbol()) ||
       Lo12Op2.getOffset() != 0)
     return false;
@@ -597,9 +597,28 @@ bool LoongArchMergeBaseOffsetOpt::foldIntoMemoryOps(MachineInstr &Hi20,
   if (!isInt<32>(NewOffset))
     return false;
 
+  // If optimized by this pass successfully, MO_RELAX bitmask target-flag should
+  // be removed from the code sequence.
+  //
+  // For example:
+  //   pcalau12i $a0, %pc_hi20(symbol)
+  //   addi.d $a0, $a0, %pc_lo12(symbol)
+  //   ld.w $a0, $a0, 0
+  //
+  //   =>
+  //
+  //   pcalau12i $a0, %pc_hi20(symbol)
+  //   ld.w $a0, $a0, %pc_lo12(symbol)
+  //
+  // Code sequence optimized before can be relax by linker. But after being
+  // optimized, it cannot be relaxed any more. So MO_RELAX flag should not be
+  // carried by them.
   Hi20.getOperand(1).setOffset(NewOffset);
+  Hi20.getOperand(1).setTargetFlags(
+      LoongArchII::getDirectFlags(Hi20.getOperand(1)));
   MachineOperand &ImmOp = Lo12.getOperand(2);
   ImmOp.setOffset(NewOffset);
+  ImmOp.setTargetFlags(LoongArchII::getDirectFlags(ImmOp));
   if (Lo20 && Hi12) {
     Lo20->getOperand(2).setOffset(NewOffset);
     Hi12->getOperand(2).setOffset(NewOffset);
@@ -617,15 +636,16 @@ bool LoongArchMergeBaseOffsetOpt::foldIntoMemoryOps(MachineInstr &Hi20,
         switch (ImmOp.getType()) {
         case MachineOperand::MO_GlobalAddress:
           MO.ChangeToGA(ImmOp.getGlobal(), ImmOp.getOffset(),
-                        ImmOp.getTargetFlags());
+                        LoongArchII::getDirectFlags(ImmOp));
           break;
         case MachineOperand::MO_MCSymbol:
-          MO.ChangeToMCSymbol(ImmOp.getMCSymbol(), ImmOp.getTargetFlags());
+          MO.ChangeToMCSymbol(ImmOp.getMCSymbol(),
+                              LoongArchII::getDirectFlags(ImmOp));
           MO.setOffset(ImmOp.getOffset());
           break;
         case MachineOperand::MO_BlockAddress:
           MO.ChangeToBA(ImmOp.getBlockAddress(), ImmOp.getOffset(),
-                        ImmOp.getTargetFlags());
+                        LoongArchII::getDirectFlags(ImmOp));
           break;
         default:
           report_fatal_error("unsupported machine operand type");
diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp b/llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp
index b611365f608af9..62b08be5435cda 100644
--- a/llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp
@@ -38,6 +38,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeLoongArchTarget() {
   initializeLoongArchMergeBaseOffsetOptPass(*PR);
   initializeLoongArchOptWInstrsPass(*PR);
   initializeLoongArchPreRAExpandPseudoPass(*PR);
+  initializeLoongArchExpandPseudoPass(*PR);
   initializeLoongArchDAGToDAGISelLegacyPass(*PR);
 }
 
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchBaseInfo.h b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchBaseInfo.h
index 23699043b9926a..371ae580419b21 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchBaseInfo.h
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchBaseInfo.h
@@ -17,6 +17,7 @@
 #include "MCTargetDesc/LoongArchMCTargetDesc.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
+#include "llvm/CodeGen/MachineOperand.h"
 #include "llvm/MC/MCInstrDesc.h"
 #include "llvm/TargetParser/SubtargetFeature.h"
 
@@ -58,8 +59,29 @@ enum {
   MO_LE_ADD_R,
   MO_LE_LO_R,
   // TODO: Add more flags.
+
+  // Used to differentiate between target-specific "direct" flags and "bitmask"
+  // flags. A machine operand can only have one "direct" flag, but can have
+  // multiple "bitmask" flags.
+  MO_DIRECT_FLAG_MASK = 0x3f,
+
+  MO_RELAX = 0x40
 };
 
+// Given a MachineOperand that may carry out "bitmask" flags, such as MO_RELAX,
+// return LoongArch target-specific "direct" flags.
+static inline unsigned getDirectFlags(const MachineOperand &MO) {
+  return MO.getTargetFlags() & MO_DIRECT_FLAG_MASK;
+}
+
+// Add MO_RELAX "bitmask" flag when FeatureRelax is enabled.
+static inline unsigned addRelaxFlag(unsigned Flags) { return Flags | MO_RELAX; }
+
+// \returns true if the given MachineOperand has MO_RELAX "bitmask" flag.
+static inline bool hasRelaxFlag(const MachineOperand &MO) {
+  return MO.getTargetFlags() & MO_RELAX;
+}
+
 // Target-specific flags of LAInst.
 // All definitions must match LoongArchInstrFormats.td.
 enum {
diff --git a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
index 187869bfa241b1..71f044dadf8be5 100644
--- a/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
+++ b/llvm/lib/Target/LoongArch/MCTargetDesc/LoongArchMCCodeEmitter.cpp
@@ -249,6 +249,7 @@ LoongArchMCCodeEmitter::getExprOpValue(const MCInst &MI, const MCOperand &MO,
       break;
     case LoongArchMCExpr::VK_LoongArch_CALL36:
       FixupKind = LoongArch::fixup_loongarch_call36;
+      RelaxCandidate = true;
       break;
     case LoongArchMCExpr::VK_LoongArch_TLS_DESC_PC_HI20:
       FixupKind = LoongArch::fixup_loongarch_tls_desc_pc_hi20;
diff --git a/llvm/test/CodeGen/LoongArch/linker-relaxation.ll b/llvm/test/CodeGen/LoongArch/linker-relaxation.ll
new file mode 100644
index 00000000000000..2827a95547903b
--- /dev/null
+++ b/llvm/test/CodeGen/LoongArch/linker-relaxation.ll
@@ -0,0 +1,102 @@
+; RUN: llc --mtriple=loongarch64 --filetype=obj -mattr=-relax \
+; RUN:     --relocation-model=pic --code-model=medium < %s \
+; RUN:     | llvm-readobj -r - | FileCheck --check-prefixes=CHECK-RELOC,PCALA-RELOC %s
+; RUN: llc --mtriple=loongarch64 --filetype=obj -mattr=+relax \
+; RUN:     --relocation-model=pic --code-model=medium < %s \
+; RUN:     | llvm-readobj -r - | FileCheck --check-prefixes=CHECK-RELOC,RELAX %s
+
+; RUN: llc --mtriple=loongarch64 --filetype=obj -mattr=-relax --enable-tlsdesc \
+; RUN:     --relocation-model=pic --code-model=medium < %s \
+; RUN:     | llvm-readobj -r - | FileCheck --check-prefix=DESC-RELOC %s
+; RUN: llc --mtriple=loongarch64 --filetype=obj -mattr=+relax --enable-tlsdesc \
+; RUN:     --relocation-model=pic --code-model=medium < %s \
+; RUN:     | llvm-readobj -r - | FileCheck --check-prefixes=DESC-RELOC,DESC-RELAX %s
+
+;; Check relocations when disable or enable linker relaxation.
+;; This tests are also able to test for removing relax mask flags
+;; after loongarch-merge-base-offset pass because no relax relocs
+;; are emitted after being optimized by it.
+
+@g_e = external global i32
+@g_i = internal global i32 0
+@g_i1 = internal global i32 1
+@t_un = external thread_local global i32
+@t_ld = external thread_local(localdynamic) global i32
+@t_ie = external thread_local(initialexec) global i32
+@t_le = external thread_local(localexec) global i32
+
+declare void @callee1() nounwind
+declare dso_local void @callee2() nounwind
+declare dso_local void @callee3() nounwind
+
+define ptr @caller() nounwind {
+; RELAX:            R_LARCH_ALIGN - 0x1C
+; CHECK-RELOC:      R_LARCH_GOT_PC_HI20 g_e 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_GOT_PC_LO12 g_e 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; PCALA-RELOC:      R_LARCH_PCALA_HI20 .bss 0x0
+; RELAX-NEXT:       R_LARCH_PCALA_HI20 g_i 0x0
+; PCALA-RELOC:      R_LARCH_PCALA_LO12 .bss 0x0
+; RELAX-NEXT:       R_LARCH_PCALA_LO12 g_i 0x0
+; CHECK-RELOC:      R_LARCH_TLS_GD_PC_HI20 t_un 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_GOT_PC_LO12 t_un 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_CALL36 __tls_get_addr 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; DESC-RELOC:       R_LARCH_TLS_DESC_PC_HI20 t_un 0x0
+; DESC-RELAX:       R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_PC_LO12 t_un 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_LD t_un 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_CALL t_un 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_TLS_LD_PC_HI20 t_ld 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_GOT_PC_LO12 t_ld 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; CHECK-RELOC-NEXT: R_LARCH_CALL36 __tls_get_addr 0x0
+; RELAX-NEXT:       R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_PC_HI20 t_ld 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_PC_LO12 t_ld 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_LD t_ld 0x0
+; DESC-RELAX-NEXT:  R_LARCH_RELAX - 0x0
+; DESC-RELOC-NEXT:  R_LARCH_TLS_DESC_CALL t_ld 0x0
+; DESC-RELAX-NEXT:  R_...
[truncated]

@SixWeining SixWeining requested review from heiher and wangleiat January 2, 2025 03:31
Member

@heiher heiher left a comment

LGTM. Thanks.

   Register DestReg = MI.getOperand(0).getReg();
   Register ScratchReg =
       MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass);
   MachineOperand &Symbol = MI.getOperand(1);

   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg)
-      .addDisp(Symbol, 0, FlagsHi);
+      .addDisp(Symbol, 0,
+               EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi);
Member

EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi

->

LoongArchII::encodeFlags(FlagsHi, EnableRelax)

static inline unsigned encodeFlags(unsigned Flags, bool Relax) {
  return Flags | (Relax ? MO_RELAX : 0);
}
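
A quick standalone check of the suggestion above, assuming the flag values from the patch (`0x3f` and `0x40`): the proposed `encodeFlags` gives the same result as the ternary it replaces. `flagsForTLSDesc` is a hypothetical wrapper, added here only to show how the `(EnableRelax && !Large)` condition used in `expandLoadAddressTLSDesc` might collapse into a single call.

```cpp
#include <cassert>

// Values taken from the patch (LoongArchBaseInfo.h).
constexpr unsigned MO_DIRECT_FLAG_MASK = 0x3f;
constexpr unsigned MO_RELAX = 0x40;

// Helper from the patch.
inline unsigned addRelaxFlag(unsigned Flags) { return Flags | MO_RELAX; }

// Helper as proposed in the review comment.
inline unsigned encodeFlags(unsigned Flags, bool Relax) {
  return Flags | (Relax ? MO_RELAX : 0);
}

// Hypothetical wrapper: in the large code model no relax flag is attached.
inline unsigned flagsForTLSDesc(unsigned Flags, bool EnableRelax, bool Large) {
  assert((Flags & ~MO_DIRECT_FLAG_MASK) == 0 && "expected a plain direct flag");
  return encodeFlags(Flags, EnableRelax && !Large);
}
```

The call sites would then pass the condition as an argument instead of repeating the ternary at each `addDisp` call.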
