stream the output #52
Currently, logdetective waits a dozen minutes and then prints the result en bloc. I wanted to turn on streaming, but with streaming on, the model keeps answering and never finishes.
I made streaming the default, but added a --no-stream option to work around broken models like llama3. Instead of adding heuristics for when to turn streaming off, I chose to document the broken model in the README. Ready for re-review.
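For readers following along, here is a minimal sketch of the "stream by default, opt out with --no-stream" behaviour described above. It is not the code from this PR: `fake_model`, `print_answer`, and `main` are hypothetical names, and the model call is replaced by a stand-in that yields a few text pieces so the example runs without any model.

```python
import argparse
import sys
from typing import Iterable, List, TextIO, Union


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="logdetective")
    parser.add_argument("url", help="URL of the log file to analyze")
    # Streaming is the default; --no-stream is the opt-out for broken models.
    parser.add_argument("--no-stream", action="store_true",
                        help="print the whole answer at once")
    return parser


def fake_model(stream: bool) -> Union[str, Iterable[str]]:
    """Hypothetical stand-in for the real model call."""
    pieces = ["The build ", "failed because ", "a dependency is missing."]
    # With streaming on, hand the pieces back one at a time.
    return iter(pieces) if stream else "".join(pieces)


def print_answer(response: Union[str, Iterable[str]], stream: bool,
                 out: TextIO = sys.stdout) -> str:
    """Print the answer, piece by piece when streaming, and return the full text."""
    if not stream:
        out.write(f"{response}\n")
        return str(response)
    collected: List[str] = []
    for piece in response:
        out.write(piece)
        out.flush()  # show each piece as soon as it arrives
        collected.append(piece)
    out.write("\n")
    return "".join(collected)


def main(argv: List[str], out: TextIO = sys.stdout) -> str:
    args = build_parser().parse_args(argv)
    stream = not args.no_stream
    return print_answer(fake_model(stream), stream, out)
```

Calling `main(["https://example.com/logs.txt"])` prints the pieces as they arrive, while adding `"--no-stream"` prints the same text in one go. Keeping the flag negative means the documented workaround is the only thing a user of a broken model has to type.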
Tested locally and works really well!
However, there is a big gap (~30 seconds) between the 'Explanation:' text being printed and the output starting to appear. We could investigate what's happening there (is the model being loaded into memory and initialized?) and print some kind of "progress bar".
LGTM, very nice!
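One possible shape for the "progress bar" idea, as a rough sketch only: print dots to stderr from a background thread until the first streamed piece arrives. `stream_with_waiting_dots` and `slow_fake_model` are hypothetical names, and the slow model is simulated with a sleep; nothing here is taken from the logdetective code.

```python
import sys
import threading
import time
from typing import Iterable, Iterator, List, TextIO


def stream_with_waiting_dots(chunks: Iterable[str],
                             out: TextIO = sys.stdout,
                             status: TextIO = sys.stderr,
                             interval: float = 0.5) -> str:
    """Print dots on `status` until the first chunk arrives, then stream to `out`."""
    first_chunk_seen = threading.Event()

    def show_dots() -> None:
        # Event.wait returns False on timeout and True once the event is set.
        while not first_chunk_seen.wait(interval):
            status.write(".")
            status.flush()

    ticker = threading.Thread(target=show_dots, daemon=True)
    ticker.start()
    collected: List[str] = []
    try:
        for chunk in chunks:
            if not first_chunk_seen.is_set():
                # Stop the dots before any real output is written.
                first_chunk_seen.set()
                ticker.join()
                status.write("\n")
            out.write(chunk)
            out.flush()
            collected.append(chunk)
    finally:
        # Also stops the ticker if the stream is empty or raises.
        first_chunk_seen.set()
        ticker.join()
    out.write("\n")
    return "".join(collected)


def slow_fake_model(delay: float = 0.2) -> Iterator[str]:
    """Hypothetical model that pauses before producing its first piece."""
    time.sleep(delay)
    yield "Explanation: "
    yield "missing dependency."
```

Writing the dots to stderr keeps them out of the answer when stdout is redirected to a file.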
README.md
@@ -55,6 +55,10 @@ Example you want to use a different model:
logdetective https://example.com/logs.txt --model https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q5_K_S.gguf?download=true
logdetective https://example.com/logs.txt --model QuantFactory/Meta-Llama-3-8B-Instruct-GGUF

Note that streaming with some models (notably Meta-Llama-3 is broken) is brokend and can be workarounded by `no-stream` option:
s/brokend/broken/
Fixed.
so it is more interactive
addressing: logdetective/logdetective.py:12: R0915[too-many-statements]: main: Too many statements (55/50)
so it is more interactive