add XLM-RoBERTa in paddlenlp #9720
base: develop
Conversation
Examples:

```python
>>> from ppdiffusers.transformers import XLMRobertaConfig, XLMRobertaModel
```
Please update the documentation here.
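Presumably the example should import from paddlenlp rather than ppdiffusers; a sketch of how the corrected docstring snippet might read (the extra doctest lines follow the usual config-docstring pattern and are assumptions):

```python
>>> from paddlenlp.transformers import XLMRobertaConfig, XLMRobertaModel

>>> # Initializing an XLM-RoBERTa configuration
>>> configuration = XLMRobertaConfig()

>>> # Initializing a model (with random weights) from that configuration
>>> model = XLMRobertaModel(configuration)
```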
    classifier_dropout=None,
    **kwargs,
):
    kwargs["return_dict"] = kwargs.pop("return_dict", True)
Here I followed the same logic as transformers and defaulted return_dict to True, but almost all PaddleNLP models default it to False, so we need to decide which way to go.
Let's change it to False.
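Concretely, only the fallback value in the constructor above needs to flip (a minimal sketch of the agreed change):

```python
# inside XLMRobertaConfig.__init__(..., **kwargs):
# default flipped from True to False to match the rest of PaddleNLP
kwargs["return_dict"] = kwargs.pop("return_dict", False)
```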
if self.gradient_checkpointing and not hidden_states.stop_gradient:
    layer_outputs = self._gradient_checkpointing_func(
Rename gradient_checkpointing to recompute, following the existing PaddleNLP convention.
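A rough sketch of what the renamed branch could look like, assuming PaddleNLP's usual recompute helper from paddle.distributed.fleet.utils; the exact argument list should follow whatever the surrounding layer loop already passes:

```python
from paddle.distributed.fleet.utils import recompute

# inside XLMRobertaEncoder.forward, per layer in the loop
if self.enable_recompute and not hidden_states.stop_gradient:

    def create_custom_forward(module):
        # closes over the non-tensor loop variables so recompute only sees tensors
        def custom_forward(*inputs):
            return module(*inputs, past_key_value, output_attentions)

        return custom_forward

    layer_outputs = recompute(
        create_custom_forward(layer_module),
        hidden_states,
        attention_mask,
        layer_head_mask,
        encoder_hidden_states,
        encoder_attention_mask,
    )
else:
    layer_outputs = layer_module(
        hidden_states,
        attention_mask,
        layer_head_mask,
        encoder_hidden_states,
        encoder_attention_mask,
        past_key_value,
        output_attentions,
    )
```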
all_self_attentions = () if output_attentions else None
all_cross_attentions = () if output_attentions and self.config.add_cross_attention else None

if self.gradient_checkpointing and self.training:
Same here.
super().__init__()
self.config = config
self.layer = nn.LayerList([XLMRobertaLayer(config) for _ in range(config.num_hidden_layers)])
self.gradient_checkpointing = False
Please change this one as well.
Change it to self.enable_recompute = False.
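That is, the encoder constructor shown above would become (sketch):

```python
super().__init__()
self.config = config
self.layer = nn.LayerList([XLMRobertaLayer(config) for _ in range(config.num_hidden_layers)])
# renamed from gradient_checkpointing to match PaddleNLP naming
self.enable_recompute = False
```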
_deprecated_dict = {
    "key": ".self_attn.q_proj.",
    "name_mapping": {
        # common
        "encoder.layers.": "encoder.layer.",
        # embeddings
        "embeddings.layer_norm.": "embeddings.LayerNorm.",
        # transformer
        ".self_attn.q_proj.": ".attention.self.query.",
        ".self_attn.k_proj.": ".attention.self.key.",
        ".self_attn.v_proj.": ".attention.self.value.",
        ".self_attn.out_proj.": ".attention.output.dense.",
        ".norm1.": ".attention.output.LayerNorm.",
        ".linear1.": ".intermediate.dense.",
        ".linear2.": ".output.dense.",
        ".norm2.": ".output.LayerNorm.",
    },
}
Delete this; it is unused.
from paddlenlp.transformers.tokenizer_utils import AddedToken
from paddlenlp.transformers.tokenizer_utils import (
    PretrainedTokenizer as PPNLPPretrainedTokenizer,
No need for the as alias here; just use PretrainedTokenizer directly.
Change this to a relative import.
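Taken together with the comment above, the tokenizer imports would presumably become something like the following (a sketch; the number of leading dots depends on where the file sits under paddlenlp/transformers):

```python
# relative import, no alias on PretrainedTokenizer
from ..tokenizer_utils import AddedToken, PretrainedTokenizer
```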
__all__ = ["XLMRobertaTokenizer"]


class XLMRobertaTokenizer(PPNLPPretrainedTokenizer):
Change this here as well.
class ModuleUtilsMixin:
    """
    A few utilities for `nn.Layer`, to be used as a mixin.
    """

    # @property
    # def device(self):
    #     """
    #     `paddle.place`: The device on which the module is (assuming that all the module parameters are on the same
    #     device).
    #     """
    #     try:
    #         return next(self.named_parameters())[1].place
    #     except StopIteration:
    #         try:
    #             return next(self.named_buffers())[1].place
    #         except StopIteration:
    #             return paddle.get_device()
Adding this code could affect many existing models; it needs a careful review.
@@ -0,0 +1,133 @@
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.
The Paddle copyright notice is missing here.
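PaddleNLP source files usually carry a PaddlePaddle copyright line next to the upstream one; a sketch of the intended header (the year is an assumption):

```python
# coding=utf-8
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
```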
    classifier_dropout=None,
    **kwargs,
):
    kwargs["return_dict"] = kwargs.pop("return_dict", True)
Change it to False.
@@ -0,0 +1,1517 @@
# coding=utf-8
Add the Paddle copyright notice.
from paddle import nn
from paddle.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss

from paddlenlp.transformers.activations import ACT2FN
Change all of these from-paddlenlp imports to relative imports.
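For the import shown above, that would look roughly like this (a sketch; the relative depth depends on the module's location in the package):

```python
# relative import instead of the absolute paddlenlp path
from ..activations import ACT2FN
```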
super().__init__()
self.config = config
self.layer = nn.LayerList([XLMRobertaLayer(config) for _ in range(config.num_hidden_layers)])
self.gradient_checkpointing = False
Change it to self.enable_recompute = False.
Example:

```python
>>> from ppdiffusers.transformers import AutoTokenizer, XLMRobertaForCausalLM, AutoConfig
```
Same as above: update the documentation.
from paddlenlp.transformers.tokenizer_utils import AddedToken
from paddlenlp.transformers.tokenizer_utils import (
    PretrainedTokenizer as PPNLPPretrainedTokenizer,
Change this to a relative import.
Add the corresponding model and tokenizer mappings in PaddleNLP/paddlenlp/transformers/auto.
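The registration would presumably amount to new entries in the auto mapping tables; the table names below (MAPPING_NAMES in auto/modeling.py and TOKENIZER_MAPPING_NAMES in auto/tokenizer.py) are assumptions about the current layout and should be checked against those files:

```python
from collections import OrderedDict

# paddlenlp/transformers/auto/modeling.py (sketch)
MAPPING_NAMES = OrderedDict(
    [
        # ... existing entries ...
        ("XLMRoberta", "xlm_roberta"),
    ]
)

# paddlenlp/transformers/auto/tokenizer.py (sketch)
TOKENIZER_MAPPING_NAMES = OrderedDict(
    [
        # ... existing entries ...
        ("XLMRobertaTokenizer", "xlm_roberta"),
    ]
)
```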
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #9720      +/-   ##
===========================================
- Coverage    53.20%   52.55%    -0.66%
===========================================
  Files          719      722        +3
  Lines       115583   113254     -2329
===========================================
- Hits         61493    59515     -1978
+ Misses       54090    53739      -351

☔ View full report in Codecov by Sentry.
Add two unit tests covering model initialization and tokenizer loading.
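A minimal sketch of what those two tests could look like; the class names come from this PR, while the xlm-roberta-base checkpoint name and the tiny-config field values are assumptions (a real test would follow the existing templates under tests/transformers):

```python
import unittest

import paddle

from paddlenlp.transformers import XLMRobertaConfig, XLMRobertaModel, XLMRobertaTokenizer


class XLMRobertaSmokeTest(unittest.TestCase):
    def test_model_init(self):
        # tiny config so the test stays cheap; field names follow the HF-style config
        config = XLMRobertaConfig(
            vocab_size=1000,
            hidden_size=32,
            num_hidden_layers=2,
            num_attention_heads=2,
            intermediate_size=64,
            max_position_embeddings=64,
        )
        model = XLMRobertaModel(config)
        model.eval()

        input_ids = paddle.randint(low=5, high=1000, shape=[2, 8])
        with paddle.no_grad():
            outputs = model(input_ids=input_ids)
        # last hidden state should be [batch, seq_len, hidden_size]
        self.assertEqual(list(outputs[0].shape), [2, 8, 32])

    def test_tokenizer_load(self):
        # "xlm-roberta-base" is an assumed built-in checkpoint name
        tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
        encoded = tokenizer("PaddleNLP is great")
        self.assertGreater(len(encoded["input_ids"]), 0)


if __name__ == "__main__":
    unittest.main()
```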
Added the corresponding unit test scripts.
PR types
New features
PR changes
Models
Description
Add support for the XLM-RoBERTa model in PaddleNLP.