BaseEmbeddings.embed should be treated as callable #92
Comments
Hey @ascendant512! Thanks for opening an issue! 😊 Chonkie had that line as a If you can do one check, that would be great! Could you see by adding the
def get_tokenizer_or_token_counter(self):
    return self.count_tokens
Since this returns a function that counts tokens, it should bypass the issue. Please let me know the results! Thanks! 😊
Hi, thanks for the quick reply! I haven't used the other tokenizers yet, so I didn't know how they worked. However, I can confirm that your suggestion also didn't work. I get a different error:
Hey @ascendant512! Sorry for the delay and thanks for trying it out; Thanks~ 😊
[FIX] Support class methods as `token_counter` objects for `CustomEmbeddings` (#92)
[Fix] Add fix for #92: Support `class.method` as a Tokenizer for `CustomEmbedding` +. minor changes
Hey @ascendant512! The patch has been merged into main and can be tested with an install from source. It will also be available in Chonkie from the next release onwards. Hopefully it resolves the errors you were seeing! Closing the issue for now; please re-open it if you are still seeing the problem~ Thanks! 😊
I tried to write a BaseEmbeddings implementation and it crashed due to this line:
chonkie/src/chonkie/chunker/base.py
Line 34 in 9904935
It should be
elif callable(tokenizer_or_token_counter):
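A minimal sketch of why the two checks behave differently, using only the standard library. `ToyEmbeddings` and `plain_counter` are hypothetical names made up for this illustration; they are not part of Chonkie and are not the class from this issue.

```python
import inspect


class ToyEmbeddings:
    """Hypothetical stand-in for an embeddings class (not Chonkie's API)."""

    def count_tokens(self, text: str) -> int:
        # Naive whitespace token count, for illustration only.
        return len(text.split())


def plain_counter(text: str) -> int:
    # Module-level function doing the same naive count.
    return len(text.split())


emb = ToyEmbeddings()

# A module-level function passes both checks.
print(inspect.isfunction(plain_counter), callable(plain_counter))

# A method accessed on an instance is a bound-method object:
# inspect.isfunction rejects it, inspect.ismethod and callable accept it.
print(
    inspect.isfunction(emb.count_tokens),
    inspect.ismethod(emb.count_tokens),
    callable(emb.count_tokens),
)
```

A check based on `callable(...)` accepts plain functions, bound methods, and objects defining `__call__`, which is why it is the more permissive test for a token counter.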
The test fails because BaseEmbeddings.embed is a method, which `inspect.isfunction` does not consider to be a function. The symptom of the crash was in the line `Tokenizer backend {str(type(self.tokenizer))} not supported`, where it referred to my class and its method `embed`.
FYI, here is my class. I don't really know yet if it's correct, but I know that my change allowed it to run: