[refactor] Generalized SwiGLU python code #1160
base: main
Dear @danthe3rd, have you had a chance to look at the PR?
Created base classes and generalized functions in common_glu.py for reuse by SwiGLU and other GLU-like activation functions that may be implemented in the future.
Hi @lw, if it's tedious to check all these changes at once, I can close this PR and split it into several trivial PRs that are easier to review. What do you think?
Hi,
Hello @danthe3rd,
What do you mean by "MR"?
MR: Merge Request, i.e. Pull Request (PR)
So this PR is not implementing anything new, but if you want to add support for other activation functions, we should probably discuss it first. I believe this SwiGLU implementation has some poor design choices, and it might be better to start from scratch for another implementation.
You can find me on the PyTorch Slack by first and last name, or by nickname (same as here).
Created base classes and generalized functions from the SwiGLU implementation to be reused by SwiGLU and other GLU-like activation functions which may be implemented in the future.
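To make the idea of a shared base class concrete, here is a minimal, framework-free sketch. It is not the contents of common_glu.py: the names `GLUBase`, `SwiGLU`, `ReGLU`, and `matvec` are hypothetical, bias terms are omitted, and plain Python lists stand in for tensors. It assumes the usual GLU form `W3 @ (act(W1 @ x) * (W2 @ x))`, where only the activation differs between variants.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def matvec(w: Matrix, x: Sequence[float]) -> Vector:
    """Multiply matrix w (rows x cols) by vector x (length cols)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


class GLUBase(ABC):
    """Hypothetical shared base: out = W3 @ (act(W1 @ x) * (W2 @ x))."""

    def __init__(self, w1: Matrix, w2: Matrix, w3: Matrix) -> None:
        self.w1, self.w2, self.w3 = w1, w2, w3

    @abstractmethod
    def activation(self, v: float) -> float:
        """Elementwise gate activation; the only part subclasses define."""

    def forward(self, x: Sequence[float]) -> Vector:
        gate = [self.activation(v) for v in matvec(self.w1, x)]
        value = matvec(self.w2, x)
        hidden = [g * v for g, v in zip(gate, value)]  # elementwise gating
        return matvec(self.w3, hidden)


class SwiGLU(GLUBase):
    def activation(self, v: float) -> float:
        return v / (1.0 + math.exp(-v))  # SiLU (Swish with beta = 1)


class ReGLU(GLUBase):
    def activation(self, v: float) -> float:
        return max(0.0, v)  # ReLU gate
```

With this layout, a new GLU-like variant only overrides `activation`; the projection and gating logic stays in one place. A fused or tensor-based implementation would follow the same split but is outside the scope of this sketch.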
What does this PR do?
Fixes #1158.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.