-
Hi @anwai98
First, quantize your Vision Transformer in 4-bit precision with accelerate & bitsandbytes: https://huggingface.co/docs/accelerate/v0.29.3/en/usage_guides/quantization#bitsandbytes-integration. You might leave the classification head un-quantized (that's what we do for LLMs).
Hope this helps!
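For reference, here is a minimal, untested sketch of 4-bit loading through the transformers bitsandbytes integration (the checkpoint name and the `classifier` module name are placeholders, and keeping the head un-quantized via `llm_int8_skip_modules` is an assumption about your transformers version):

```python
import torch
from transformers import BitsAndBytesConfig, ViTForImageClassification

# 4-bit NF4 quantization config. llm_int8_skip_modules keeps the listed modules
# (here, the classification head) in full precision; whether it is honored in
# 4-bit mode depends on the transformers version, so treat it as an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_skip_modules=["classifier"],
)

# Placeholder checkpoint; swap in your own pretrained ViT.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    quantization_config=bnb_config,
    device_map="auto",
)
```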
-
Hi @younesbelkada,
As always, things take longer than expected (one could argue this took way longer for me, apologies for coming back so late). I think I have a workflow prepared; however, I am encountering an issue with exploding gradients right after the first backprop (no success in debugging it so far). Maybe you (or someone in the community) can help me with this (I'll outline my workflow below, in case we spot the obvious issues already):
The first iteration's forward pass works as expected (the loss is computed, with realistic float values), and then right after backprop for that first iteration I get NaN values for my loss. Any help would be appreciated. Thanks in advance!
PS. I would be happy to share more details (here's the WIP branch: computational-cell-analytics/micro-sam@dev...qlora)
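(Not part of the original workflow, just a generic first-checks sketch for a NaN-after-backward issue: `model`, `batch`, and `optimizer` are assumed placeholders, and `model(**batch).loss` assumes a Hugging Face-style output. One frequent culprit in k-bit setups is skipping `peft.prepare_model_for_kbit_training`, which re-casts the norm layers to fp32 before the LoRA adapters are attached.)

```python
import torch


def debug_first_step(model, batch, optimizer):
    """Run one training step with anomaly detection and gradient clipping."""
    # Report the operation that first produces NaN/inf during the backward pass.
    with torch.autograd.detect_anomaly():
        loss = model(**batch).loss  # assumes a HF-style output with a .loss field
        loss.backward()

    # Clip gradients before stepping, in case a single layer explodes.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    optimizer.zero_grad()
    return loss.detach()
```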
-
Hi community,
Thanks to the huggingface team for this awesome repo 😉
I have a question: I would like to try out QLoRA on vision transformers. I found this gem, which gives me a rough idea of how to integrate LoRA into computer vision models (I haven't tested it yet), and I see some hints here on quantization. Could someone guide me on how to integrate the two (basically, to achieve QLoRA-based finetuning for pretrained vision transformers)?
(My best guess is that the two configs (one for LoRA and one from the quantization library) come together in one place, but I am a bit unsure of the right way to go ahead with this; a rough sketch of what I mean is below.)
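A rough, untested sketch of how the two configs might come together with transformers + peft + bitsandbytes (the checkpoint name and the `target_modules` names are assumptions based on the Hugging Face ViT implementation):

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import BitsAndBytesConfig, ViTForImageClassification

# Quantization side: load the backbone in 4-bit.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA side: inject trainable low-rank adapters into the attention projections.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["query", "value"],  # ViT attention projection names (assumption)
    lora_dropout=0.1,
    modules_to_save=["classifier"],     # keep the classification head trainable
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```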
Thanks!