
Possible issue with AWQ library when using AWQ models with Ctranslate2 #1821

Open

BBC-Esq opened this issue Nov 25, 2024 · 2 comments
BBC-Esq commented Nov 25, 2024

Just FYI, I think the autoawq library only supports up to a certain version of the torch library, per the message below, which I received after (1) installing autoawq and then (2) installing torch 2.4.0.

[screenshot: autoawq warning about the maximum supported torch version]

The reason I installed torch 2.4.0 is that it supports CUDA 12.4, which, as I understand it, is the version that officially supports cuDNN 9+, which ctranslate2 4.5.0 in turn requires.

Correct me if I'm mistaken...
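For anyone reproducing this, here's a quick way to confirm which CUDA and cuDNN builds your installed torch wheel actually carries (a minimal sketch using standard torch introspection):

```python
import torch

print(torch.__version__)               # e.g. 2.4.0
print(torch.version.cuda)              # CUDA toolkit the wheel was built against, e.g. 12.4
print(torch.backends.cudnn.version())  # cuDNN version as an int, e.g. 90100 for 9.1.0
```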

It's a little confusing because the autoawq repository has this portion in their readme.md:

[screenshot: the relevant section of the autoawq readme.md]

Furthermore, when I pip install autoawq, it installs autoawq-kernels by default. This somewhat conflicts with the readme.md, which states that triton is used by default...

The repository for the "kernels" is here: https://github.com/casper-hansen/AutoAWQ_kernels

It states that you can build that library for a specific torch version and CUDA compute level, so perhaps it's just a matter of which wheels are available by default (and whether they're compatible with ctranslate2) versus building the library yourself, which goes beyond the prebuilt wheels...
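As a sanity check, you can test whether the prebuilt kernel extension even loads against your current torch build (a sketch; awq_ext is the extension module name that shows up in the AutoAWQ warnings later in this thread):

```python
# Try importing the compiled AWQ kernel extension; a DLL/ABI mismatch with
# the installed torch build typically surfaces here as an ImportError.
try:
    import awq_ext  # from the autoawq-kernels package
    print("awq_ext loaded OK")
except Exception as e:
    print("awq_ext failed to load:", e)
```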

Anyhow, just FYI, because if the AWQ library requires a specific version of torch and that version doesn't overlap with ctranslate2's requirements, it could create a serious conflict.
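One way to check exactly which torch range autoawq pins (a minimal sketch, assuming autoawq is installed in the active environment) is to read its declared dependencies:

```python
from importlib.metadata import requires

# List autoawq's declared dependency specifiers that mention torch;
# the pinned upper bound, if any, should show up here.
print([r for r in requires("autoawq") if "torch" in r])
```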


BBC-Esq commented Nov 25, 2024

Following up...

I tried converting an AWQ model here:

https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-AWQ

The command I used and the error I received are as follows... I'm guessing it fails because of some kind of torch conflict?

Starting conversion for awq with command:
ct2-transformers-converter --model "D:/Scripts/bench_chat/models/Qwen--Qwen2.5-3B-Instruct-AWQ" --output_dir "D:/Scripts/bench_chat/models\Qwen--Qwen2.5-3B-Instruct-AWQ-ct2-awq" --low_cpu_mem_usage --trust_remote_code --copy_files "generation_config.json" "LICENSE" "merges.txt" "tokenizer.json" "tokenizer_config.json" "vocab.json"
Command failed with return code 1: The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
You have loaded an AWQ model on CPU and have a CUDA device available, make sure to set your model on a GPU device in order to run your model.
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\exllama.py:12: UserWarning: AutoAWQ could not load ExLlama kernels extension. Details: DLL load failed while importing exl_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load ExLlama kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\exllamav2.py:13: UserWarning: AutoAWQ could not load ExLlamaV2 kernels extension. Details: DLL load failed while importing exlv2_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load ExLlamaV2 kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemm.py:14: UserWarning: AutoAWQ could not load GEMM kernels extension. Details: DLL load failed while importing awq_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMM kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemv.py:11: UserWarning: AutoAWQ could not load GEMV kernels extension. Details: DLL load failed while importing awq_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMV kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemv_fast.py:10: UserWarning: AutoAWQ could not load GEMVFast kernels extension. Details: DLL load failed while importing awq_v2_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMVFast kernels extension. Details: {ex}")
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\Scripts\bench_chat\Scripts\ct2-transformers-converter.exe\__main__.py", line 8, in <module>
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2715, in main
    converter.convert_from_args(args)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\converter.py", line 50, in convert_from_args
    return self.convert(
           ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\converter.py", line 89, in convert
    model_spec = self._load()
                 ^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 150, in _load
    spec = loader(model, tokenizer)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 202, in __call__
    spec = self.get_model_spec(model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2004, in get_model_spec
    self.set_decoder(spec.decoder, model.model)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2048, in set_decoder
    self.set_linear(split_layers[0], layer.self_attn.q_proj)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 230, in set_linear
    spec.weight = module.weight
                  ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'WQLinear_GEMM' object has no attribute 'weight'. Did you mean: 'qweight'?
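
For what it's worth, the final AttributeError makes sense: AWQ-quantized linear layers like WQLinear_GEMM store packed tensors (qweight, qzeros, scales) instead of a dense weight, so the converter's module.weight access has nothing to read. A sketch of the kind of guard the converter would need (get_dense_weight is a hypothetical helper, not actual ctranslate2 code):

```python
# Hypothetical helper (not ctranslate2 API): return a module's dense weight,
# or fail with a clearer message when the module is AWQ-quantized and only
# exposes packed tensors (qweight/qzeros/scales).
def get_dense_weight(module):
    weight = getattr(module, "weight", None)
    if weight is not None:
        return weight
    packed = [n for n in ("qweight", "qzeros", "scales") if hasattr(module, n)]
    raise TypeError(
        f"{type(module).__name__} has no dense 'weight'; "
        f"found packed AWQ tensors instead: {packed}"
    )
```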

minhthuc2502 (Collaborator) commented Nov 27, 2024

We do not yet support AWQ for Qwen2. A new pull request is coming soon.
