Amazing project! What about Groq and fast inference on open source models? #1
Congratulations on your achievements, @austin-starks! I see huge potential in this project!
I was wondering if you could implement support for Groq and fast open-source models such as Llama 3.1 8B. Imagine improving prompts for such a fast model, running at 1,200 tokens/second, and cheaply. Or even locally, for those who need complete privacy...
Do you think this is feasible?
Thanks!
Comments
Hi @agnoldo, I completely agree with you. This approach could be a first workaround to get Promptimizer to work with local models. Thanks for bringing this up!
Hey @agnoldo! Sorry for responding late; I never saw your initial message! It should absolutely be possible. I've never used Groq, but if we could get it to work, that would be game-changing. Ollama is unfortunately too slow, at least on my computer. I don't have time to implement this myself, but I'm open to PRs! It should be relatively straightforward to add.
@agnoldo For local inference, Ollama should work, and any other local inference engine that exposes a compatible API should too (LlamaCpp, etc.), right? I haven't tried it myself yet in this project and don't have time to add a PR right now, but the breadcrumbs should all be here:
@austin-starks There's probably a much smarter way to do this, but this looks enough like a nail to my hammer-minded approach...