[Hackathon 7th] Fix the initialization parameters of the Embedding layer #3962

megemini · 2025-01-03T07:58:04Z

PR types

Bug fixes

PR changes

APIs

Describe

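Fix the initialization parameters of the Embedding layer.

The initialization parameters of Embedding changed in 3.0.0, which causes the problem described below.

The following code was used for testing: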
PaddleSpeech/paddlespeech/s2t/modules/align.py

class Embedding(nn.Embedding):
    def __init__(self,
                 num_embeddings,
                 embedding_dim,
                 padding_idx=None,
                 sparse=False,
                 weight_attr=None,
                 name=None):
        if weight_attr is None:
            weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
        super(Embedding, self).__init__(num_embeddings, embedding_dim,
                                        padding_idx, sparse, weight_attr, name)

    def forward(self, x):
        # todo(megemini):
        # print('>>> emb in', x)
        
        o = super().forward(x)

        print('>>> emb out', o.sum())

        return o

Conclusion:

With 3.0.0, weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal()) has no effect: the output is the same whether or not it is set.
With 2.5.1, setting weight_attr produces a different output.

2.5.1

weight_attr is None

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)

weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
2025-01-03 15:00:53.364 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-442.94750977)

3.0.0

weight_attr is None
2025-01-03 14:57:53.095 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)

weight_attr = paddle.ParamAttr(initializer=nn.initializer.Normal())
2025-01-03 15:05:05.818 | INFO | paddlespeech.s2t.exps.u2.model:do_train:184 - Train Total Examples: 3754

Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
1681211)
emb out Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=False,
-52.55189896)
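
One plausible reading of these logs is that the wrapper passes `sparse`, `weight_attr` and `name` to the parent positionally, so once the parent gains parameters ahead of them, the values land in the wrong slots. The sketch below reproduces that mechanism without Paddle. `OldBase` and `NewBase` are hypothetical stand-ins, not Paddle classes, and the parameter order in `NewBase` (`max_norm`, `norm_type`, `scale_grad_by_freq` added) is my understanding of the 3.0.0 signature, not something this PR states, so check it against the Paddle documentation. The string `"Normal()"` is only a placeholder for the real `paddle.ParamAttr`.

```python
# Hypothetical stand-ins for illustration only; these are not Paddle classes.
class OldBase:
    """Mimics the older parameter order the wrapper was written against."""

    def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                 sparse=False, weight_attr=None, name=None):
        self.sparse = sparse
        self.weight_attr = weight_attr
        self.name = name


class NewBase:
    """Assumed newer order: extra parameters sit before `weight_attr`."""

    def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                 max_norm=None, norm_type=2.0, sparse=False,
                 scale_grad_by_freq=False, weight_attr=None, name=None):
        self.max_norm = max_norm
        self.norm_type = norm_type
        self.sparse = sparse
        self.weight_attr = weight_attr
        self.name = name


def make_embedding(base, use_keywords):
    """Build a wrapper like the one in align.py on top of `base`."""

    class Embedding(base):
        def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                     sparse=False, weight_attr=None, name=None):
            if weight_attr is None:
                weight_attr = "Normal()"  # placeholder for the real ParamAttr
            if use_keywords:
                # Keyword arguments survive a reordered parent signature.
                super().__init__(num_embeddings, embedding_dim,
                                 padding_idx=padding_idx, sparse=sparse,
                                 weight_attr=weight_attr, name=name)
            else:
                # Positional arguments depend on the parent's exact order.
                super().__init__(num_embeddings, embedding_dim,
                                 padding_idx, sparse, weight_attr, name)

    return Embedding


old_pos = make_embedding(OldBase, use_keywords=False)(10, 4)
new_pos = make_embedding(NewBase, use_keywords=False)(10, 4)
new_kw = make_embedding(NewBase, use_keywords=True)(10, 4)

print(old_pos.weight_attr)                  # Normal()
print(new_pos.weight_attr, new_pos.norm_type)  # None Normal()
print(new_kw.weight_attr, new_kw.norm_type)    # Normal() 2.0
```

In the positional case on `NewBase`, the initializer ends up in `norm_type` and `weight_attr` stays `None`, which would match the 3.0.0 runs producing the same output whether or not `weight_attr` is set. Passing the arguments by keyword keeps the wrapper independent of the parent's parameter order.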

@zxcd @Liyulingyue @GreatV @enkilee @yinfan98

zxcd

There should be a few other align.py files with the same code; please fix them all together.
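
A rough way to locate those other copies is a small search script. This is only a sketch: the helper name `find_embedding_wrappers` and the choice to match on the text `class Embedding(` are assumptions made for illustration and are not part of this PR.

```python
from pathlib import Path


def find_embedding_wrappers(root, filename="align.py",
                            marker="class Embedding("):
    """Return sorted paths under `root` named `filename` that contain `marker`."""
    hits = []
    for path in Path(root).rglob(filename):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        if marker in text:
            hits.append(path)
    return sorted(hits)
```

Calling `find_embedding_wrappers(".")` from the repository root would list every align.py that defines an `Embedding` wrapper, and each hit can then be checked by hand for positional arguments in its `super().__init__` call.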

zxcd

LGTM

[Fix] emb init

f6405ed

mergify bot added the S2T asr/st label Jan 3, 2025

zxcd reviewed Jan 3, 2025

paddle-bot bot added the contributor label Jan 3, 2025

zxcd approved these changes Jan 6, 2025

zxcd merged commit 553a9db into PaddlePaddle:develop Jan 6, 2025
5 checks passed
