
How to pass in an attention_mask that is one dimension more than input_ids #2301

Open

Chinesehou97 opened this issue Dec 31, 2024 · 1 comment

@Chinesehou97

System Info

Hello, how can I pass in an attention_mask that has one more dimension than input_ids? For example:

output = peft_model.generate(input_ids, attention_mask=attention_mask, max_new_tokens=100)

Here input_ids has shape [batch_size, N] and attention_mask has shape [batch_size, N, N]. Under this condition, running the line above raises the following error:

File "/root/anaconda3/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 179, in _expand_mask
    bsz, src_len = mask.size()
ValueError: too many values to unpack (expected 2)
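
The unpacking failure is easy to see in isolation; a minimal sketch with dummy tensors of the shapes from the report (no model involved):

```python
import torch

batch_size, n = 2, 16

mask_2d = torch.ones(batch_size, n)      # [batch_size, N] padding mask that generate() expects
bsz, src_len = mask_2d.size()            # unpacks fine

mask_3d = torch.ones(batch_size, n, n)   # [batch_size, N, N] mask from the report
bsz, src_len = mask_3d.size()            # ValueError: too many values to unpack (expected 2)
```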

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

```python
input_ids = torch.cat([
    (torch.ones(input_ids.shape[0], 1) * uni_prompting.sptids_dict['<|mmu|>']).to(device),
    (torch.ones(input_ids.shape[0], 1) * uni_prompting.sptids_dict['<|soi|>']).to(device),
    image_tokens,
    (torch.ones(input_ids.shape[0], 1) * uni_prompting.sptids_dict['<|eoi|>']).to(device),
    (torch.ones(input_ids.shape[0], 1) * uni_prompting.sptids_dict['<|sot|>']).to(device),
    input_ids
], dim=1).long()

attention_mask = create_attention_mask_for_mmu(input_ids.to(device),
                                               eoi_id=int(uni_prompting.sptids_dict['<|eoi|>']))
cont_toks_list = peft_model.generate(input_ids, attention_mask=attention_mask, max_new_tokens=100)
```
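
For context, a minimal self-contained sketch of the same call pattern; the tiny checkpoint, the default LoRA config, and the ad-hoc all-ones 3-D mask below are stand-ins for the actual multimodal setup, and whether it reproduces the exact traceback depends on the transformers version and model architecture:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Stand-in checkpoint: any small causal LM is enough for the shape experiment.
model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(model_name)

# Untrained LoRA adapter, only so that generate() goes through a PeftModel.
peft_model = get_peft_model(base_model, LoraConfig(task_type=TaskType.CAUSAL_LM))

input_ids = tokenizer("a short prompt", return_tensors="pt").input_ids  # [batch_size, N]
n = input_ids.shape[1]

# 2-D mask: the shape generate() expects; this call succeeds.
attention_mask_2d = torch.ones_like(input_ids)
peft_model.generate(input_ids, attention_mask=attention_mask_2d, max_new_tokens=5)

# 3-D [batch_size, N, N] mask as in the report; this raises a shape error
# (the exact message varies with the transformers version).
attention_mask_3d = torch.ones(input_ids.shape[0], n, n, dtype=torch.long)
peft_model.generate(input_ids, attention_mask=attention_mask_3d, max_new_tokens=5)
```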

Expected behavior

Load the model for fine-tuning and run inference (generation).

@BenjaminBossan
Member

Would it be possible for you to provide complete code to reproduce the error? The given snippet is not enough. The model doesn't need to be trained; just ensure you configure the same PEFT method as in your initial problem.
