
A question about input_ids and attention_mask after prefix-tuning #2304

Open
4 tasks done
MaTengSYSU opened this issue Jan 6, 2025 · 3 comments

Comments

@MaTengSYSU

System Info

Here is the error report:

tensor([[ 1, 319, 13563, 1546, 263, 12758, 5199, 322, 385, 23116,
21082, 20255, 29889, 450, 20255, 4076, 8444, 29892, 13173, 29892,
322, 1248, 568, 6089, 304, 278, 5199, 29915, 29879, 5155,
29889, 3148, 1001, 29901, 29871, -200, 29871, 13, 4002, 29581,
278, 1967, 29889, 319, 1799, 9047, 13566, 29901]],
device='cuda:0')
input_ids shape: torch.Size([1, 48])
attention_mask shape: torch.Size([1, 68])
tensor([[True, True, True, True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True]], device='cuda:0')
Traceback (most recent call last):
File "/mnt/sda1/mateng/PEFT-MLLM/test.py", line 44, in
print(description)
File "/mnt/sda1/mateng/PEFT-MLLM/llava/eval/run_llava.py", line 113, in eval_model
output_ids = model.generate(
File "/mnt/sda1/mateng/PEFT-MLLM/peft/src/peft/peft_model.py", line 1130, in generate
outputs = self.base_model.generate(**kwargs)
File "/home/mateng/anaconda3/envs/peft-mllm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/mateng/anaconda3/envs/peft-mllm/lib/python3.10/site-packages/transformers/generation/utils.py", line 1602, in generate
return self.greedy_search(
File "/home/mateng/anaconda3/envs/peft-mllm/lib/python3.10/site-packages/transformers/generation/utils.py", line 2450, in greedy_search
outputs = self(
File "/home/mateng/anaconda3/envs/peft-mllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/mateng/anaconda3/envs/peft-mllm/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/mnt/sda1/mateng/PEFT-MLLM/llava/model/language_model/llava_llama.py", line 84, in forward
) = self.prepare_inputs_labels_for_multimodal(
File "/mnt/sda1/mateng/PEFT-MLLM/llava/model/llava_arch.py", line 151, in prepare_inputs_labels_for_multimodal
input_ids = [cur_input_ids[cur_attention_mask] for cur_input_ids, cur_attention_mask in zip(input_ids, attention_mask)]
File "/mnt/sda1/mateng/PEFT-MLLM/llava/model/llava_arch.py", line 151, in
input_ids = [cur_input_ids[cur_attention_mask] for cur_input_ids, cur_attention_mask in zip(input_ids, attention_mask)]
IndexError: The shape of the mask [68] at index 0 does not match the shape of the indexed tensor [48] at index 0
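For what it's worth, the gap between the two shapes above is constant, which points at prefix-tuning's virtual tokens: PEFT extends the attention mask by num_virtual_tokens before generation, while LLaVA then indexes the un-extended input_ids with the extended mask. A minimal sketch of the arithmetic (the value 20 is inferred from the reported shapes, not confirmed from the adapter config):

```python
# Shapes taken from the error report above.
input_len = 48   # input_ids.shape[1]
mask_len = 68    # attention_mask.shape[1]

# The extra mask entries match the number of virtual tokens a
# prefix-tuning adapter prepends to every sequence.
inferred_virtual_tokens = mask_len - input_len
print(inferred_virtual_tokens)  # → 20
```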

Who can help?

@BenjaminBossan @sayakpaul @stev

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

This is my test.py:

from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from peft import PeftModel
from PIL import Image
import torch

# Load the LLaVA model and processor
model_name = get_model_name_from_path("liuhaotian/llava-v1.5-7b")  # replace with your LLaVA model name
print(f"Loading LLaVA model: {model_name}")
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path="liuhaotian/llava-v1.5-7b",  # model path
    model_base=None,  # specify a base model (e.g. Vicuna) if there is one
    model_name=model_name,
    load_8bit=False,  # whether to load the 8-bit quantized model
    load_4bit=False,  # whether to load the 4-bit quantized model
    device_map="auto"  # assign devices automatically
)



peft_model_path = "/mnt/sda1/mateng/PEFT-MLLM/checkpoints/llava/sqa/llava-sqa-prefix"  # replace with your fine-tuned adapter path
model = PeftModel.from_pretrained(model, peft_model_path)
from llava.mm_utils import tokenizer_image_token
from llava.eval.run_llava import eval_model


args = type('Args', (), {
    "model_path": "liuhaotian/llava-v1.5-7b",
    "model_base": None,
    "model_name": 'llava-v1.5-7b',
    "query": "Describe the image.",
    "conv_mode": None,
    "image_file": "/mnt/sda1/mateng/PEFT-MLLM/images/main_fig.jpg",
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512
})()

description = eval_model(args, model, tokenizer, image_processor, context_len)
print(description)

Expected behavior

I am a novice in this field. How can I fix the input_ids/attention_mask shape mismatch introduced by prefix-tuning? I want to use the fine-tuned model without affecting its performance.
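Not an answer, but as a stopgap diagnostic one could trim the mask back to the prompt length before LLaVA's boolean indexing. A pure-Python sketch of the idea (toy lists stand in for the CUDA tensors from the report; this sidesteps rather than fixes the underlying PEFT/LLaVA interaction, so the patched packages remain the proper route):

```python
# Toy stand-ins for the tensors in the report: 48 prompt tokens,
# 68 mask entries (20 extra from the prefix adapter).
input_ids = list(range(48))
attention_mask = [True] * 68

# Keep only the mask entries that correspond to real prompt tokens.
trimmed_mask = attention_mask[-len(input_ids):]

# This mirrors the indexing that raised the IndexError; with the
# trimmed mask, the two lengths agree again.
filtered = [tok for tok, keep in zip(input_ids, trimmed_mask) if keep]
print(len(trimmed_mask), len(filtered))  # → 48 48
```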

@BenjaminBossan
Member

Is it possible for you to share the code that you used for training the model?

@MaTengSYSU
Author

> Is it possible for you to share the code that you used for training the model?

The training code is from this link: https://github.com/alenai97/PEFT-MLLM

The fine-tuning checkpoint was generated by:

bash scripts/llava/peft_prefix.sh

@BenjaminBossan
Member

Thanks for the reference. I checked the repo and, IIUC, it ships its own patched versions of PEFT and transformers. Did you use this patched code, or the stock PEFT and transformers packages? If you used the stock packages, please switch to the patched versions. If the problem still persists, open an issue on their repo (you can ping me there).
