-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLD] Do not combine cg_profile from obj and ordering file #121325
base: main
Are you sure you want to change the base?
Conversation
Add tests to track section reordering when both cg_profile section and call-graph-ordering-file were given.
cg_profile in object is from CGProfilePass and it is often inaccurate. While call-graph-ordering-file is provided by user. It is weird to aggregate them together especially when call-graph-ordering-file is accurate enough.
Add test in #121324. |
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld-coff Author: Haohai Wen (HaohaiWen) Changescg_profile in object is from CGProfilePass and it is often inaccurate. Full diff: https://github.com/llvm/llvm-project/pull/121325.diff 4 Files Affected:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index be01ee41c9a2f2..4b9c189e21510d 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -2812,10 +2812,10 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
// Handle /call-graph-ordering-file and /call-graph-profile-sort (default on).
if (config->callGraphProfileSort) {
llvm::TimeTraceScope timeScope("Call graph");
- if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
+ if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
parseCallGraphFile(arg->getValue());
- }
- readCallGraphsFromObjectFiles(ctx);
+ else
+ readCallGraphsFromObjectFiles(ctx);
}
// Handle /print-symbol-order.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index f573a8d3e19f3b..e8e99fa874b5d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3215,11 +3215,12 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
// Read the callgraph now that we know what was gced or icfed
if (ctx.arg.callGraphProfileSort != CGProfileSortKind::None) {
- if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
+ if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
if (std::optional<MemoryBufferRef> buffer =
readFile(ctx, arg->getValue()))
readCallGraph(ctx, *buffer);
- readCallGraphsFromObjectFiles<ELFT>(ctx);
+ } else
+ readCallGraphsFromObjectFiles<ELFT>(ctx);
}
// Write the result to the file.
diff --git a/lld/test/COFF/cgprofile-obj.s b/lld/test/COFF/cgprofile-obj.s
index b267850c46382c..da8011724d9859 100644
--- a/lld/test/COFF/cgprofile-obj.s
+++ b/lld/test/COFF/cgprofile-obj.s
@@ -2,9 +2,12 @@
# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-win32 %s -o %t
# RUN: lld-link /subsystem:console /entry:A %t /out:%t2 /debug:symtab
-# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s
+# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s --check-prefix=CG-OBJ
# RUN: lld-link /call-graph-profile-sort:no /subsystem:console /entry:A %t /out:%t3 /debug:symtab
# RUN: llvm-nm --numeric-sort %t3 | FileCheck %s --check-prefix=NO-CG
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: lld-link /subsystem:console /entry:A %t /out:%t4 /debug:symtab /call-graph-ordering-file:%t.call_graph
+# RUN: llvm-nm --numeric-sort %t4 | FileCheck %s --check-prefix=CG-OBJ-OF
.section .text,"ax", one_only, D
D:
@@ -33,13 +36,17 @@ Aa:
.cg_profile B, C, 30
.cg_profile C, D, 90
-# CHECK: 140001000 T A
-# CHECK: 140001001 T B
-# CHECK: 140001002 T C
-# CHECK: 140001003 t D
-
+# CG-OBJ: 140001000 T A
+# CG-OBJ: 140001001 T B
+# CG-OBJ: 140001002 T C
+# CG-OBJ: 140001003 t D
# NO-CG: 140001000 t D
# NO-CG: 140001001 T C
# NO-CG: 140001002 T B
# NO-CG: 140001003 T A
+
+# CG-OBJ-OF: 140001000 t D
+# CG-OBJ-OF: 140001001 T A
+# CG-OBJ-OF: 140001004 T C
+# CG-OBJ-OF: 140001005 T B
diff --git a/lld/test/ELF/cgprofile-obj.s b/lld/test/ELF/cgprofile-obj.s
index 0848adc5e4279a..14016658707af4 100644
--- a/lld/test/ELF/cgprofile-obj.s
+++ b/lld/test/ELF/cgprofile-obj.s
@@ -2,12 +2,15 @@
# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
# RUN: ld.lld -e A %t.o -o %t
-# RUN: llvm-nm --no-sort %t | FileCheck %s
+# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=CG-OBJ
# RUN: ld.lld --call-graph-profile-sort=none -e A %t.o -o %t
# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=NO-CG
## --no-call-graph-profile-sort is an alias for --call-graph-profile-sort=none.
# RUN: ld.lld --no-call-graph-profile-sort -e A %t.o -o %t1
# RUN: cmp %t %t1
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: ld.lld -e A %t.o -call-graph-ordering-file=%t.call_graph -o %t2
+# RUN: llvm-nm --no-sort %t2 | FileCheck %s --check-prefix=CG-OBJ-OF
.section .text.D,"ax",@progbits
D:
@@ -36,12 +39,17 @@ Aa:
.cg_profile B, C, 30
.cg_profile C, D, 90
-# CHECK: 0000000000201123 t D
-# CHECK: 0000000000201122 T C
-# CHECK: 0000000000201121 T B
-# CHECK: 0000000000201120 T A
+# CG-OBJ: 0000000000201123 t D
+# CG-OBJ: 0000000000201122 T C
+# CG-OBJ: 0000000000201121 T B
+# CG-OBJ: 0000000000201120 T A
# NO-CG: 0000000000201120 t D
# NO-CG: 0000000000201121 T C
# NO-CG: 0000000000201122 T B
# NO-CG: 0000000000201123 T A
+
+# CG-OBJ-OF: 0000000000201120 t D
+# CG-OBJ-OF: 0000000000201124 T C
+# CG-OBJ-OF: 0000000000201125 T B
+# CG-OBJ-OF: 0000000000201121 T A
|
@llvm/pr-subscribers-lld Author: Haohai Wen (HaohaiWen) Changescg_profile in object is from CGProfilePass and it is often inaccurate. Full diff: https://github.com/llvm/llvm-project/pull/121325.diff 4 Files Affected:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index be01ee41c9a2f2..4b9c189e21510d 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -2812,10 +2812,10 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
// Handle /call-graph-ordering-file and /call-graph-profile-sort (default on).
if (config->callGraphProfileSort) {
llvm::TimeTraceScope timeScope("Call graph");
- if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
+ if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
parseCallGraphFile(arg->getValue());
- }
- readCallGraphsFromObjectFiles(ctx);
+ else
+ readCallGraphsFromObjectFiles(ctx);
}
// Handle /print-symbol-order.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index f573a8d3e19f3b..e8e99fa874b5d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3215,11 +3215,12 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
// Read the callgraph now that we know what was gced or icfed
if (ctx.arg.callGraphProfileSort != CGProfileSortKind::None) {
- if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
+ if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
if (std::optional<MemoryBufferRef> buffer =
readFile(ctx, arg->getValue()))
readCallGraph(ctx, *buffer);
- readCallGraphsFromObjectFiles<ELFT>(ctx);
+ } else
+ readCallGraphsFromObjectFiles<ELFT>(ctx);
}
// Write the result to the file.
diff --git a/lld/test/COFF/cgprofile-obj.s b/lld/test/COFF/cgprofile-obj.s
index b267850c46382c..da8011724d9859 100644
--- a/lld/test/COFF/cgprofile-obj.s
+++ b/lld/test/COFF/cgprofile-obj.s
@@ -2,9 +2,12 @@
# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-win32 %s -o %t
# RUN: lld-link /subsystem:console /entry:A %t /out:%t2 /debug:symtab
-# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s
+# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s --check-prefix=CG-OBJ
# RUN: lld-link /call-graph-profile-sort:no /subsystem:console /entry:A %t /out:%t3 /debug:symtab
# RUN: llvm-nm --numeric-sort %t3 | FileCheck %s --check-prefix=NO-CG
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: lld-link /subsystem:console /entry:A %t /out:%t4 /debug:symtab /call-graph-ordering-file:%t.call_graph
+# RUN: llvm-nm --numeric-sort %t4 | FileCheck %s --check-prefix=CG-OBJ-OF
.section .text,"ax", one_only, D
D:
@@ -33,13 +36,17 @@ Aa:
.cg_profile B, C, 30
.cg_profile C, D, 90
-# CHECK: 140001000 T A
-# CHECK: 140001001 T B
-# CHECK: 140001002 T C
-# CHECK: 140001003 t D
-
+# CG-OBJ: 140001000 T A
+# CG-OBJ: 140001001 T B
+# CG-OBJ: 140001002 T C
+# CG-OBJ: 140001003 t D
# NO-CG: 140001000 t D
# NO-CG: 140001001 T C
# NO-CG: 140001002 T B
# NO-CG: 140001003 T A
+
+# CG-OBJ-OF: 140001000 t D
+# CG-OBJ-OF: 140001001 T A
+# CG-OBJ-OF: 140001004 T C
+# CG-OBJ-OF: 140001005 T B
diff --git a/lld/test/ELF/cgprofile-obj.s b/lld/test/ELF/cgprofile-obj.s
index 0848adc5e4279a..14016658707af4 100644
--- a/lld/test/ELF/cgprofile-obj.s
+++ b/lld/test/ELF/cgprofile-obj.s
@@ -2,12 +2,15 @@
# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
# RUN: ld.lld -e A %t.o -o %t
-# RUN: llvm-nm --no-sort %t | FileCheck %s
+# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=CG-OBJ
# RUN: ld.lld --call-graph-profile-sort=none -e A %t.o -o %t
# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=NO-CG
## --no-call-graph-profile-sort is an alias for --call-graph-profile-sort=none.
# RUN: ld.lld --no-call-graph-profile-sort -e A %t.o -o %t1
# RUN: cmp %t %t1
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: ld.lld -e A %t.o -call-graph-ordering-file=%t.call_graph -o %t2
+# RUN: llvm-nm --no-sort %t2 | FileCheck %s --check-prefix=CG-OBJ-OF
.section .text.D,"ax",@progbits
D:
@@ -36,12 +39,17 @@ Aa:
.cg_profile B, C, 30
.cg_profile C, D, 90
-# CHECK: 0000000000201123 t D
-# CHECK: 0000000000201122 T C
-# CHECK: 0000000000201121 T B
-# CHECK: 0000000000201120 T A
+# CG-OBJ: 0000000000201123 t D
+# CG-OBJ: 0000000000201122 T C
+# CG-OBJ: 0000000000201121 T B
+# CG-OBJ: 0000000000201120 T A
# NO-CG: 0000000000201120 t D
# NO-CG: 0000000000201121 T C
# NO-CG: 0000000000201122 T B
# NO-CG: 0000000000201123 T A
+
+# CG-OBJ-OF: 0000000000201120 t D
+# CG-OBJ-OF: 0000000000201124 T C
+# CG-OBJ-OF: 0000000000201125 T B
+# CG-OBJ-OF: 0000000000201121 T A
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I believe that --call-graph-ordering-file is under-utilized and the interaction wasn't well studied. Have you tried some examples where not aggregation provides larger performance win?
cg_profile in object is from CGProfilePass and it is often inaccurate.
While call-graph-ordering-file is provided by user. It is weird to
aggregate them together especially when call-graph-ordering-file is
accurate enough.