"Every mighty hippo's journey begins with a single CHONK!"
- the wisest person to ever live, probably
Hey Chonkers! 🦛
Your favorite hippo has some big plans for Q1 2025! After swimming through your feedback and munching on some feature requests, we've got a CHONK-tastic roadmap for you.
For Q1 2025, we want to focus on the following eight themes. Each theme is discussed in detail in its own section.
🚀 Add new core features!
🪓 Add more Chunkers!
✨ Add more Refineries
📄 Support for more document formats
🤝 Add more integrations
⚡ Improve Chonkie efficiency
🌐 Chonkie as a service
👥 OSS Community
We'd love your feedback on this roadmap! If you've got ideas or want to contribute, drop them in the comments below! Remember, even tiny hippos can make big splashes!
🚀 New Core Features
Chonkie plans to add a few core features to its API that allow for seamless chunking~
Add initial support for Chomp (Chonkie’s Multi-step Pipeline)
Add initial support for Pre-chunkers (Document ingestion support)
Add initial support for Genie (Generative model Interaction Engine)
Add initial support for Porters (Ease of exporting chunks)
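To make the idea of a multi-step pipeline concrete, here is a minimal sketch of how the four planned pieces could chain together. Everything in it is an assumption made for illustration: `pre_chunk`, `chunk`, `refine`, `port`, and `run_pipeline` are hypothetical plain functions, not Chonkie's actual or planned API.

```python
import json
from typing import Any, Callable, Dict, List


def pre_chunk(raw: str) -> str:
    """Hypothetical pre-chunker: normalise whitespace in the raw document."""
    return " ".join(raw.split())


def chunk(text: str, size: int = 20) -> List[Dict[str, Any]]:
    """Hypothetical chunker: fixed-size character windows."""
    return [{"text": text[i:i + size], "start": i} for i in range(0, len(text), size)]


def refine(chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical refinery: annotate each chunk with its length."""
    return [{**c, "length": len(c["text"])} for c in chunks]


def port(chunks: List[Dict[str, Any]]) -> str:
    """Hypothetical porter: export the chunks as JSON Lines."""
    return "\n".join(json.dumps(c) for c in chunks)


def run_pipeline(data: Any, steps: List[Callable[[Any], Any]]) -> Any:
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


exported = run_pipeline("Hello   hippo,\n time to  CHONK!", [pre_chunk, chunk, refine, port])
```

Each step only needs to accept the previous step's output, so steps can be swapped or dropped without touching the rest of the chain.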
🪓 Add New Chunkers
Chunking is core to Chonkie. In addition to the 7 chunking techniques it already supports, we hope to add a few more based on the latest research
🪆 Recursive Chunking (Done early in v0.4.0!)
🔀 CrossEncoderChunker
🤵 Propositional/Agentic Chunker
🪓 LumberChunker
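For readers new to recursive chunking, the general technique can be sketched as follows: split on the coarsest separator first, and only fall back to finer separators for pieces that are still too large. This is a generic illustration, not Chonkie's implementation; the function name `recursive_chunk`, the character-based size limit, and the separator list are all assumptions.

```python
from typing import List, Sequence


def recursive_chunk(
    text: str,
    max_chars: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split on the coarsest separator first, recursing on oversized pieces."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    chunks: List[str] = []
    buffer = ""
    for i, piece in enumerate(pieces):
        # Re-attach the separator to every piece but the last, so no text is lost.
        part = piece + sep if i < len(pieces) - 1 else piece
        if len(buffer) + len(part) <= max_chars:
            buffer += part  # Greedily merge small neighbours into one chunk.
            continue
        if buffer:
            chunks.append(buffer)
            buffer = ""
        if len(part) <= max_chars:
            buffer = part
        else:
            # Still too big: retry this piece with the finer separators.
            chunks.extend(recursive_chunk(part, max_chars, finer))
    if buffer:
        chunks.append(buffer)
    return chunks
```

Because separators are re-attached, joining the returned chunks reproduces the input text exactly, and no chunk exceeds `max_chars`.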
✨ Add New Refineries
Even more means to refine your chunks further!
Add ContextualRefinery (using Genie)
Introduce an EmbeddingRefinery to generate embeddings for the chunks
📄 Support For More Document Formats
Currently, Chonkie fully supports only .txt and text-like formats. This quarter, we hope to add support for more popular formats. In order of priority, we will be working to extend support to:
Add initial support for Pre-chunkers (Document ingestion)
Support Markdown (.MD and .MDX)
Support PDF
Support HTML
Support JSON
🤝 Chonkie + Your Favorite Service
Integrations with different services help Chonkie use embeddings and, in the future, generative models for chunking easily.
⚡ Enhance Chonkie Performance
In addition to speed and space usage, we also want to optimize Chonkie’s memory usage
Reduce Chonkie’s memory usage during batch chunking
Run memory usage test with Chonkie using all chunking techniques
Add support for stream=True to reduce peak memory usage
Add support for Async embed and chunk operations for APIs
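The last two items can be illustrated with a small standard-library sketch: a generator that yields chunks one at a time (so peak memory is bounded by a single buffer rather than the whole chunk list), and concurrent embedding calls with a concurrency cap. The names `chunk_stream`, `fake_embed`, and `embed_all` are hypothetical, and `fake_embed` merely stands in for a real embedding API call; none of this reflects how Chonkie will expose `stream=True` or async operations.

```python
import asyncio
from typing import Iterable, Iterator, List


def chunk_stream(lines: Iterable[str], max_chars: int = 100) -> Iterator[str]:
    """Yield chunks lazily so only the current buffer is held in memory.

    A single line longer than max_chars is emitted as its own oversized chunk.
    """
    buffer = ""
    for line in lines:
        if buffer and len(buffer) + len(line) > max_chars:
            yield buffer
            buffer = ""
        buffer += line
    if buffer:
        yield buffer


async def fake_embed(chunk: str) -> List[float]:
    """Hypothetical stand-in for a remote embedding API call."""
    await asyncio.sleep(0)  # Hand control back to the event loop, as network I/O would.
    return [float(len(chunk))]


async def embed_all(chunks: Iterable[str], limit: int = 4) -> List[List[float]]:
    """Embed chunks concurrently, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(chunk: str) -> List[float]:
        async with semaphore:
            return await fake_embed(chunk)

    # gather() returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(c) for c in chunks))


streamed = list(chunk_stream(["alpha\n", "beta\n", "gamma\n"], max_chars=11))
vectors = asyncio.run(embed_all(streamed))
```

The generator accepts any iterable of lines, so an open file handle can be passed directly without reading the whole document first.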
🌐 Chonkie As A Service
To enable the use of Chonkie in live ingestion pipelines, we want to provide Chonkie as a service that watches a document source for changes and posts chunks of modified or new documents automatically.
Create watchers for AWS S3 and GCP Bigtable
Add support for posting chunk results to popular vector databases
Enable live chunking through Chonkie running on a hosted environment
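One simple way to picture the "watch, then post only what changed" loop is content hashing: remember a digest per document and re-chunk only when the digest differs. The sketch below is an assumption-heavy illustration, not the planned service: `detect_changes` and `sync_once` are hypothetical names, the document source is a plain dict, and the `sink` callable stands in for a vector database client.

```python
import hashlib
from typing import Callable, Dict, List


def detect_changes(documents: Dict[str, str], seen: Dict[str, str]) -> List[str]:
    """Return ids of new or modified documents and update `seen` in place."""
    changed: List[str] = []
    for doc_id, text in documents.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if seen.get(doc_id) != digest:
            seen[doc_id] = digest
            changed.append(doc_id)
    return changed


def sync_once(
    documents: Dict[str, str],
    seen: Dict[str, str],
    chunker: Callable[[str], List[str]],
    sink: Callable[[str, List[str]], None],
) -> List[str]:
    """Chunk only the changed documents and hand the chunks to `sink`."""
    changed = detect_changes(documents, seen)
    for doc_id in changed:
        sink(doc_id, chunker(documents[doc_id]))
    return changed


# Usage: a list plays the role of the vector database.
posted: List[tuple] = []
state: Dict[str, str] = {}
docs = {"a.txt": "hello hippo", "b.txt": "time to chonk"}
first_run = sync_once(docs, state, str.split, lambda i, c: posted.append((i, c)))
docs["b.txt"] = "time to chonk again"
second_run = sync_once(docs, state, str.split, lambda i, c: posted.append((i, c)))
```

A real watcher would also need to handle deleted documents, which this sketch ignores.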
👥 OSS Community
Add more code examples for the chunkers and core features!
Develop better documentation for onboarding new contributors
Create more “good first issues”
Support new contributors through peer mentorship
Chonkie is a friendly hippo! If you've got ideas not covered in this roadmap and want to contribute, drop them in the comments below! Remember what Mama Hippo always says: "It takes a village to raise a CHONK!"