
Possible issue with AWQ library when using AWQ models with Ctranslate2 #1821

Open

BBC-Esq opened this issue Nov 25, 2024 · 2 comments
BBC-Esq commented Nov 25, 2024

Just FYI, I think the autoawq library only supports up to a certain version of the torch library, per the message below, which I received after (1) installing autoawq and then (2) installing torch 2.4.0.

[screenshot: autoawq warning about the maximum supported torch version]

The reason I installed torch 2.4.0 is that it supports CUDA 12.4, which, as I understand it, is the version that officially supports cuDNN 9+, which ctranslate2 4.5.0 in turn requires.

Correct me if I'm mistaken...
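For anyone reproducing this, here's a quick way to confirm which CUDA and cuDNN builds your installed torch wheel actually carries (a minimal sketch using standard torch introspection):

```python
import torch

print(torch.__version__)               # e.g. 2.4.0
print(torch.version.cuda)              # CUDA toolkit the wheel was built against, e.g. 12.4
print(torch.backends.cudnn.version())  # cuDNN version as an int, e.g. 90100 for 9.1.0
```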

It's a little confusing because the autoawq repository has this portion in their readme.md:

[screenshot: the relevant section of the autoawq readme.md]

Furthermore, when I pip install autoawq, it installs autoawq-kernels by default. This somewhat conflicts with the readme.md, which states that triton is used by default...

The repository for the "kernels" is here: https://github.com/casper-hansen/AutoAWQ_kernels

It states that you can build that library for a specific torch version and CUDA compute level, so perhaps it's just a matter of which wheels are available by default (and whether they're compatible with ctranslate2) versus building the library yourself, which goes beyond the prebuilt wheels...
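As a sanity check, you can test whether the prebuilt kernel extension even loads against your current torch build (a sketch; awq_ext is the extension module name that shows up in the AutoAWQ warnings later in this thread):

```python
# Try importing the compiled AWQ kernel extension; a DLL/ABI mismatch with
# the installed torch build typically surfaces here as an ImportError.
try:
    import awq_ext  # from the autoawq-kernels package
    print("awq_ext loaded OK")
except Exception as e:
    print("awq_ext failed to load:", e)
```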

Anyhow, just FYI, because if the AWQ library requires a specific version of torch and that version doesn't overlap with ctranslate2's requirements, it could create a serious conflict.
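One way to check exactly which torch range autoawq pins (a minimal sketch, assuming autoawq is installed in the active environment) is to read its declared dependencies:

```python
from importlib.metadata import requires

# List autoawq's declared dependency specifiers that mention torch;
# the pinned upper bound, if any, should show up here.
print([r for r in requires("autoawq") if "torch" in r])
```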


BBC-Esq commented Nov 25, 2024

Following up...

I tried converting an AWQ model here:

https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-AWQ

The command I used and the error I received are as follows... I'm guessing it fails because of some kind of torch conflict?

Starting conversion for awq with command:
ct2-transformers-converter --model "D:/Scripts/bench_chat/models/Qwen--Qwen2.5-3B-Instruct-AWQ" --output_dir "D:/Scripts/bench_chat/models\Qwen--Qwen2.5-3B-Instruct-AWQ-ct2-awq" --low_cpu_mem_usage --trust_remote_code --copy_files "generation_config.json" "LICENSE" "merges.txt" "tokenizer.json" "tokenizer_config.json" "vocab.json"
Command failed with return code 1: The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
You have loaded an AWQ model on CPU and have a CUDA device available, make sure to set your model on a GPU device in order to run your model.
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\exllama.py:12: UserWarning: AutoAWQ could not load ExLlama kernels extension. Details: DLL load failed while importing exl_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load ExLlama kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\exllamav2.py:13: UserWarning: AutoAWQ could not load ExLlamaV2 kernels extension. Details: DLL load failed while importing exlv2_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load ExLlamaV2 kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemm.py:14: UserWarning: AutoAWQ could not load GEMM kernels extension. Details: DLL load failed while importing awq_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMM kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemv.py:11: UserWarning: AutoAWQ could not load GEMV kernels extension. Details: DLL load failed while importing awq_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMV kernels extension. Details: {ex}")
D:\Scripts\bench_chat\Lib\site-packages\awq\modules\linear\gemv_fast.py:10: UserWarning: AutoAWQ could not load GEMVFast kernels extension. Details: DLL load failed while importing awq_v2_ext: The specified procedure could not be found.
  warnings.warn(f"AutoAWQ could not load GEMVFast kernels extension. Details: {ex}")
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\Scripts\bench_chat\Scripts\ct2-transformers-converter.exe\__main__.py", line 8, in <module>
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2715, in main
    converter.convert_from_args(args)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\converter.py", line 50, in convert_from_args
    return self.convert(
           ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\converter.py", line 89, in convert
    model_spec = self._load()
                 ^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 150, in _load
    spec = loader(model, tokenizer)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 202, in __call__
    spec = self.get_model_spec(model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2004, in get_model_spec
    self.set_decoder(spec.decoder, model.model)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 2048, in set_decoder
    self.set_linear(split_layers[0], layer.self_attn.q_proj)
  File "D:\Scripts\bench_chat\Lib\site-packages\ctranslate2\converters\transformers.py", line 230, in set_linear
    spec.weight = module.weight
                  ^^^^^^^^^^^^^
  File "D:\Scripts\bench_chat\Lib\site-packages\torch\nn\modules\module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'WQLinear_GEMM' object has no attribute 'weight'. Did you mean: 'qweight'?
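
For what it's worth, the final AttributeError makes sense: AWQ-quantized linear layers like WQLinear_GEMM store packed tensors (qweight, qzeros, scales) instead of a dense weight, so the converter's module.weight access has nothing to read. A sketch of the kind of guard the converter would need (get_dense_weight is a hypothetical helper, not actual ctranslate2 code):

```python
# Hypothetical helper (not ctranslate2 API): return a module's dense weight,
# or fail with a clearer message when the module is AWQ-quantized and only
# exposes packed tensors (qweight/qzeros/scales).
def get_dense_weight(module):
    weight = getattr(module, "weight", None)
    if weight is not None:
        return weight
    packed = [n for n in ("qweight", "qzeros", "scales") if hasattr(module, n)]
    raise TypeError(
        f"{type(module).__name__} has no dense 'weight'; "
        f"found packed AWQ tensors instead: {packed}"
    )
```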

minhthuc2502 (Collaborator) commented Nov 27, 2024

We do not yet support AWQ for Qwen2. A new pull request is coming soon.
