Is LoRA actually working? #11599

Answered by jeejeelee
ftatp asked this question in Q&A
Dec 29, 2024 · 1 comments · 1 reply

IIUC, your confusion mainly stems from CUDA graphs. I suggest you first read up on how CUDA graphs work.

Alternatively, you can try changing the code as shown below and then debug it.

  from vllm import EngineArgs

  engine_args = EngineArgs(model="meta-llama/Llama-2-7b-hf",
                           enable_lora=True,
                           max_loras=1,
                           max_lora_rank=8,
                           max_cpu_loras=2,
                           max_num_seqs=256,
                           enforce_eager=True)  # add `enforce_eager` to disable CUDA graphs
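
To give a rough idea of why `enforce_eager=True` can make debugging easier, here is a toy model in plain Python. It is not vLLM or CUDA code, and `ToyGraph`, `forward`, `scale`, and `add_lora_delta` are hypothetical names made up for this sketch. The assumption it illustrates is that with a captured graph the Python body of a layer runs only once at capture time, while later steps just replay the recorded kernels, so a `print` or breakpoint in the LoRA path may never fire again even though the LoRA computation still happens.

```python
class ToyGraph:
    """Toy stand-in for a captured graph: records kernel launches, replays them later."""

    def __init__(self):
        self.kernels = []

    def launch(self, kernel, *args):
        # During capture the launch is only recorded, not executed.
        self.kernels.append((kernel, args))

    def replay(self, buf):
        # Replay runs the recorded kernels without touching the Python layer code.
        for kernel, args in self.kernels:
            kernel(buf, *args)


def scale(buf, factor):
    # Hypothetical "base model" kernel.
    for i in range(len(buf)):
        buf[i] *= factor


def add_lora_delta(buf, delta):
    # Hypothetical "LoRA" kernel.
    for i in range(len(buf)):
        buf[i] += delta


def forward(buf, launch, trace):
    # Python-side effect, standing in for a print() or a breakpoint.
    trace.append("forward called")
    launch(scale, 2.0)
    launch(add_lora_delta, 0.5)


def run_eager(buf, steps):
    trace = []
    for _ in range(steps):
        # Eager mode: every launch executes immediately, every step runs Python.
        forward(buf, lambda kernel, *args: kernel(buf, *args), trace)
    return trace


def run_graphed(buf, steps):
    trace = []
    graph = ToyGraph()
    forward(buf, graph.launch, trace)  # capture: Python body runs exactly once
    for _ in range(steps):
        graph.replay(buf)              # replay: kernels run, Python body does not
    return trace


eager_buf, graphed_buf = [1.0], [1.0]
eager_trace = run_eager(eager_buf, steps=3)
graphed_trace = run_graphed(graphed_buf, steps=3)
print(len(eager_trace), len(graphed_trace), eager_buf == graphed_buf)
```

In this sketch both modes produce the same numbers, but the Python-side trace fires on every step only in eager mode, which is why debugging with eager execution is usually less surprising.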

Answer selected by ftatp