Pull requests: ROCm/composable_kernel
[CK_TILE] fmha fwd splitkv optimization for decode (seqlen_q=1)
#1789
opened Jan 3, 2025 by
poyenc
4 tasks done
[CK_TILE] Sync fmha fwd splitkv minor optimizations
#1785
opened Jan 1, 2025 by
poyenc
2 tasks done
[CK_TILE] Adjust kBlockSize of reduce example for better perf
#1779
opened Dec 27, 2024 by
ClementLinCF
3 tasks
CK Tile GEMM Block and Block Method Selection with new 2x2 default policy warps layout
#1776
opened Dec 26, 2024 by
ThomasNing
5 of 6 tasks
Grouped convolution backward weight special vector size loads
#1772
opened Dec 23, 2024 by
bartekxk
6 tasks done
device_prop.hpp: move static map to helper function and initialize there
#1763
opened Dec 18, 2024 by
coconutruben
3 of 6 tasks
[Ck tile] Use raw store to improve layernorm performance
#1752
opened Dec 16, 2024 by
rocking5566
enable bias feature that add bias before adding residual (for rtpllm project)
#1741
opened Dec 11, 2024 by
AMD-dteng
disable atomicAdd for C Output Vector Length = 1 with 16bit data type
#1737
opened Dec 10, 2024 by
zjing14
Apply universal gemm to bwd_weight_cshuffle operator
#1658
opened Nov 12, 2024 by
mozga-amd
[do not review] int4 scale based on jzhang's pre work
noCI
Disable testing on supported CI systems: math libraries CI has this feature enabled.
[DO NOT REVIEW] Add int4 dequant with scale only supports based on ZhangJing's PR #1572
noCI