[Hackathon 7th] Fix the initialization arguments of the Embedding layer #3962

Merged
merged 1 commit into from
Jan 6, 2025

megemini
Contributor

@megemini megemini commented Jan 3, 2025

PR types

Bug fixes

PR changes

APIs

Describe

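Fix the initialization arguments of the Embedding layer.

The initialization parameters of Embedding changed in 3.0.0, which causes the problem described below.

The following code was used for testing:
PaddleSpeech/paddlespeech/s2t/modules/align.py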

class Embedding(nn.Embedding):
    def __init__(self,
                 num_embeddings,
                 embedding_dim,
                 padding_idx=None,
                 sparse=False,
                 weight_attr=None,
                 name=None):
        if weight_attr is None:
            weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
        super(Embedding, self).__init__(num_embeddings, embedding_dim,
                                        padding_idx, sparse, weight_attr, name)

    def forward(self, x):
        # todo(megemini):
        # print('>>> emb in', x)
        
        o = super().forward(x)

        print('>>> emb out', o.sum())

        return o

Conclusion:

  • In 3.0.0, weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal()) has no effect: the output is the same whether or not it is set.
  • In 2.5.1, setting weight_attr produces a different output.

2.5.1

weight_attr is None

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)

weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
2025-01-03 15:00:53.364 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-442.94750977)

3.0.0

weight_attr is None
2025-01-03 14:57:53.095 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)

weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
2025-01-03 15:05:05.818 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)
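
The logs above are consistent with weight_attr never reaching the base class in 3.0.0. A minimal, Paddle-free sketch of how that can happen when arguments are forwarded positionally is given below. BaseOld, BaseNew, and make_wrapper are hypothetical stand-ins, and the extra parameter names in BaseNew are an assumption for illustration only; this PR states only that the parameters changed. The string "normal-init" is a placeholder for a real paddle.ParamAttr.

```python
class BaseOld:
    """Hypothetical stand-in for the base layer's older signature."""

    def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                 sparse=False, weight_attr=None, name=None):
        self.weight_attr = weight_attr


class BaseNew:
    """Hypothetical stand-in: assumes new parameters were inserted
    ahead of weight_attr (names here are illustrative)."""

    def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                 max_norm=None, norm_type=2.0, sparse=False,
                 scale_grad_by_freq=False, weight_attr=None, name=None):
        self.norm_type = norm_type
        self.weight_attr = weight_attr


def make_wrapper(base, use_keywords):
    """Build a subclass that forwards its arguments to `base`
    either positionally or by keyword."""

    class Wrapper(base):
        def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                     sparse=False, weight_attr=None, name=None):
            if weight_attr is None:
                weight_attr = "normal-init"  # placeholder for a ParamAttr
            if use_keywords:
                # Keyword forwarding is immune to reordered parameters.
                super().__init__(num_embeddings, embedding_dim,
                                 padding_idx=padding_idx, sparse=sparse,
                                 weight_attr=weight_attr, name=name)
            else:
                # Positional forwarding depends on the base's parameter order.
                super().__init__(num_embeddings, embedding_dim,
                                 padding_idx, sparse, weight_attr, name)

    return Wrapper


old_pos = make_wrapper(BaseOld, use_keywords=False)(10, 4)
new_pos = make_wrapper(BaseNew, use_keywords=False)(10, 4)
new_kw = make_wrapper(BaseNew, use_keywords=True)(10, 4)

print(old_pos.weight_attr)  # normal-init
print(new_pos.weight_attr)  # None: the value landed in another parameter
print(new_pos.norm_type)    # normal-init
print(new_kw.weight_attr)   # normal-init
```

Under these assumptions, forwarding by keyword keeps the wrapper correct against both signatures, which is why it is the safer choice for a subclass that has to work across library versions.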

@zxcd @Liyulingyue @GreatV @enkilee @yinfan98

@mergify mergify bot added the S2T asr/st label Jan 3, 2025
Collaborator

@zxcd zxcd left a comment

There should be a few other align.py files with the same issue; let's fix them all together.

Collaborator

@zxcd zxcd left a comment

LGTM

@zxcd zxcd merged commit 553a9db into PaddlePaddle:develop Jan 6, 2025
5 checks passed

2 participants