bug: [0.5.12] Cannot disable GPU offloading #4369

Open
1 of 3 tasks
pguyennet opened this issue Dec 30, 2024 · 8 comments
Labels: category: hardware, type: bug (Something isn't working)

@pguyennet

Jan version

0.5.12

Describe the Bug

Can you help me disable GPU offloading? I am talking about this setting:

screenshot-2024-12-30-21-19-12

I want to set it to 0.

In settings, GPU is disabled:

screenshot-2024-12-30-21-21-04

Thanks!

Steps to Reproduce

1. Load a model from GGUF
2. Locate the ngl slider (bottom right, in the model tab)
3. Try to disable it or set it to 0

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows
  • Linux
@pguyennet pguyennet added the type: bug Something isn't working label Dec 30, 2024
@github-project-automation github-project-automation bot moved this to Investigating in Jan & Cortex Dec 30, 2024
@imtuyethan
Contributor

@pguyennet Thanks for reporting this! Let me explain what's happening:
The NGL (Number of GPU Layers) slider depends on GPU acceleration being enabled, which is why you can't set it to 0 when GPU is disabled. If you want to run fully on CPU, you only need to disable GPU acceleration; the NGL setting has no effect in that case.

This makes sense technically, but I totally agree the UX could be clearer!

@pguyennet
Author

Thanks for your answer, @imtuyethan! Can you tell me how to truly disable GPU acceleration, then? In my screenshot it is disabled, but the option is still there.

If you mean that this option does nothing when GPU acceleration is disabled, can you tell me why it affects my inference speed? Here are the figures I am talking about:

ngl = 1 -> 17.8 tok/s
ngl = 50 -> 13.7 tok/s
ngl = 100 -> refuses to answer

Note: With Ollama's AVX-512 CPU runner (ngl = 0!), I get 32 tok/s. Thanks!
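
To put these figures side by side, here is a small sketch that expresses each Jan measurement as a fraction of the Ollama figure. The numbers are the ones reported in this comment; the helper name `relative_speed` is only illustrative, and the comparison assumes both runs used the same model and prompt, which the thread has not yet confirmed.

```python
# Throughput figures reported in this thread (tokens per second).
OLLAMA_BASELINE = 32.0
JAN_RESULTS = {1: 17.8, 50: 13.7}  # ngl value -> tok/s (ngl = 100 gave no answer)


def relative_speed(tok_per_s: float, baseline: float) -> float:
    """Return throughput as a fraction of the baseline."""
    return tok_per_s / baseline


for ngl, speed in JAN_RESULTS.items():
    share = relative_speed(speed, OLLAMA_BASELINE)
    print(f"ngl={ngl}: {share:.0%} of the Ollama figure")
```

With the reported values this works out to roughly 56% (ngl = 1) and 43% (ngl = 50) of the Ollama throughput.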

@louis-jan
Contributor

@pguyennet Could you please share the log files and the settings.json file located in the app data folder? We'll investigate then.

CleanShot 2025-01-02 at 09 02 58@2x

@pguyennet
Author

Hey, sure, here are the files as requested:

settings.json
app.log

Thanks again! I love your work; the only thing that prevents me from switching is the lower inference speed compared to Ollama.

@louis-jan
Contributor

Hi @pguyennet, there's another log file named cortex.log in ~/.config/Jan/data/logs. Could you please upload it too?
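
Before uploading, it may help to check what the log says about GPU layers. The following is a minimal sketch that filters a log file for lines mentioning given keywords. The path is the one given above; the keywords and the helper name `find_lines` are assumptions, since the log format is not shown in this thread.

```python
from pathlib import Path


def find_lines(text: str, keywords: tuple[str, ...]) -> list[str]:
    """Return the lines that contain any of the keywords (case-insensitive)."""
    lowered = tuple(k.lower() for k in keywords)
    return [
        line for line in text.splitlines()
        if any(k in line.lower() for k in lowered)
    ]


if __name__ == "__main__":
    log_path = Path.home() / ".config" / "Jan" / "data" / "logs" / "cortex.log"
    if log_path.exists():
        # The keywords are a guess at what might be relevant, not a known log schema.
        for line in find_lines(log_path.read_text(errors="replace"), ("ngl", "gpu")):
            print(line)
    else:
        print(f"{log_path} not found")
```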

@imtuyethan imtuyethan added this to the v0.5.14 milestone Jan 3, 2025
@imtuyethan imtuyethan changed the title from "bug: Cannot disable GPU offloading" to "bug: [0.5.12] Cannot disable GPU offloading" Jan 3, 2025
@pguyennet
Author

Hi @louis-jan, here you go:

cortex.log
Thanks!

@louis-jan
Contributor

louis-jan commented Jan 4, 2025

@pguyennet Could you find the model's yml file in the app data folder (models/source/author/repo..) and remove the ngl: line? Then create a new thread to check whether the NGL setting is gone.
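
The edit described above can also be scripted. This is only a sketch: it drops any line whose key is `ngl` from YAML text and leaves everything else untouched. The sample contents and the other field names are hypothetical, since the actual model yml is not shown in this thread, and the real file path has to be filled in by hand.

```python
import re


def remove_ngl_line(yaml_text: str) -> str:
    """Drop every line whose key is `ngl`, keeping all other lines as they are."""
    kept = [
        line for line in yaml_text.splitlines(keepends=True)
        if not re.match(r"\s*ngl\s*:", line)
    ]
    return "".join(kept)


# Hypothetical file contents, for illustration only.
sample = "name: example-model\nngl: 33\nctx_len: 4096\n"
print(remove_ngl_line(sample))
```

Keeping a backup copy of the yml file before overwriting it is a sensible precaution.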

Which quantized version of the model are you using, and what are the inference parameters (such as context_length and cpu_threads) on both sides? It seems your CPU supports AVX2 but not AVX-512 (though yes, it's backward compatible), correct me if I'm wrong.

avx-512 cpu runner (ngl = 0 !) I've got 32 tok/s
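
One way to confirm which instruction sets the CPU reports, sketched under the assumption of an x86 Linux machine: read the `flags` line of `/proc/cpuinfo` and look for `avx2` and `avx512f` (the AVX-512 foundation flag). The helper name `cpu_flags` is illustrative; other operating systems expose this information differently.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    flags = cpu_flags(path.read_text()) if path.exists() else set()
    print("AVX2:", "avx2" in flags)
    print("AVX-512 (foundation):", "avx512f" in flags)
```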

@louis-jan
Contributor

Hi @louis-jan here you go :

cortex.log Thanks !

Thanks!
