Deepseek_lora custom keys in input_data fails #2259

Open
4 tasks
Zhaoyi-Yan opened this issue Dec 5, 2024 · 4 comments

@Zhaoyi-Yan

System Info

For some models, e.g. DeepSeek, we need to specify target_modules manually, as described in #2128. That issue points to the mapping TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING. Qwen2, which is listed in that mapping, supports custom keys in input_data, and qwen2_lora works fine. For DeepSeek, however, deepseek_lora fails. The error is:

[rank0]: File "/zju_0038/yzy/miniconda3/envs/torch2.3_py3.10/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 197, in forward
[rank0]:     return self.model.forward(*args, **kwargs)
[rank0]: TypeError: DeepseekV2ForCausalLM.forward() got an unexpected keyword argument 'data_index'

With qwen_lora, however, this code path does not seem to raise the error. Or is the failure perhaps because DeepseekV2ForCausalLM.forward() does not accept custom keys?

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

Expected behavior

@BenjaminBossan
Member

I could not fully understand the issue. What I understand is that for Qwen we have default target_modules, but we don't have them for Deepseek. This is correct: you need to specify the target modules yourself (you can try target_modules="all-linear" as a start).

Regarding the error you get, you would need to provide some minimal code so that I can try to reproduce the error, otherwise I can't help you.

@TVR28

TVR28 commented Dec 5, 2024

Adding to @BenjaminBossan's answer, the TypeError occurs because the forward method of DeepseekV2ForCausalLM doesn't accept a data_index parameter.

Remove the data_index from your input kwargs when calling the model. If you need to pass custom data, only use the parameters supported by the model's forward method.
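
One way to do this without hard-coding the key is to filter the batch against the signature of the forward method. The following is only a sketch: StrictModel is a hypothetical stand-in for a model whose forward has a fixed parameter list, and filter_forward_kwargs is an illustrative helper, not part of PEFT or Transformers.

```python
import inspect


def filter_forward_kwargs(forward_fn, batch):
    """Return only the entries of `batch` that `forward_fn` can accept."""
    params = inspect.signature(forward_fn).parameters
    # A forward that declares **kwargs accepts any key, so keep everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(batch)
    return {k: v for k, v in batch.items() if k in params}


class StrictModel:
    """Hypothetical stand-in for a model with a fixed forward signature."""

    def forward(self, input_ids=None, attention_mask=None, labels=None):
        return {"n_inputs": len(input_ids)}


batch = {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1], "data_index": 7}
model = StrictModel()

# Keep the custom value around for your own bookkeeping.
data_index = batch.get("data_index")

# Pass only the supported keys to the model.
filtered = filter_forward_kwargs(model.forward, batch)
outputs = model.forward(**filtered)
```

The VAR_KEYWORD check matters because a forward that declares **kwargs takes extra keys without complaint, which is one possible reason a different model would not raise the same error.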

@BenjaminBossan
Member

Thanks for the additional insight @TVR28.

github-actions bot commented Jan 4, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
