[FEAT] Add async support to SDPMChunker and to SemanticChunker #39

Open
rodion-m opened this issue Nov 18, 2024 · 7 comments
Labels: enhancement (New feature or request)

@rodion-m

Proper async support would make it possible to properly parallelize remote embedding computation.

rodion-m added the enhancement (New feature or request) label on Nov 18, 2024
@bhavnicksm
Collaborator

Hey @rodion-m!

Thanks for opening the issue 😊

Could you please elaborate on the benefits of adding async and the use case for it? Personally, I haven't used the async chunking support offered by other libraries before, nor am I aware of devs/companies who use it often. Maybe I just haven't been exposed to those use cases, so having some context would help here...

If you think this could be of wider utility, I'd be happy to accept PRs for it.

Thanks!

@rodion-m
Author

Sure, here is my use case:

Proper async support would make it possible to properly parallelize remote embedding computation.

So, we need speed :)

@bhavnicksm
Collaborator

bhavnicksm commented Nov 18, 2024

@rodion-m,

Correct me if I am wrong, but essentially this is about faster embedding computation with APIs like OpenAI, Anthropic, etc., i.e. passing a bunch of chunks and getting their embeddings back asynchronously?

By proper async support, you mean only for the embeddings, right? Or do you mean that chunks should also be generated asynchronously?

I plan to add async/batching support for embedding APIs like OpenAI, because we would need that capability, yes!

I'm not sure about the second one, i.e. async chunk generation, yet.

Sorry for all the questions, just trying to make sure I understand this correctly

Thanks!

@rodion-m
Author

I mean we should be able to call I/O operations (like API calls for embeddings) asynchronously, so that we can speed up embedding generation when we need to embed huge datasets and batching possibilities are exhausted. Here is an example of LiteLLM's implementation: https://docs.litellm.ai/docs/embedding/async_embedding
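
For illustration, a rough sketch of what this could look like with LiteLLM's aembedding (the model name and helper function here are just for the example):

```python
import asyncio
import litellm

async def embed_all(batches: list[list[str]]) -> list:
    # Kick off one embedding request per batch and await them concurrently,
    # instead of blocking on each request in turn.
    tasks = [
        litellm.aembedding(model="text-embedding-3-small", input=batch)
        for batch in batches
    ]
    return await asyncio.gather(*tasks)

# responses = asyncio.run(embed_all(batches))
```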

@bhavnicksm
Collaborator

Hey @rodion-m,

Yes, this makes a lot more sense to me now! Thanks for the link with the example... 😊

I'm thinking about adding this feature to the Embeddings API that will host all these providers (OpenAI, Anthropic, etc.), so from the Chunker's perspective it would call a sync batching method, while the Embeddings class for that API takes care of the embeddings asynchronously to speed things up properly. We could probably call asyncio.run inside the OpenAIEmbeddings class, for example, and also expose the async fns on these classes.
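
Something like this, as a rough sketch (not actual Chonkie code; the class and method names are placeholders, and it assumes the official openai Python client):

```python
import asyncio
from openai import AsyncOpenAI

class OpenAIEmbeddings:
    def __init__(self, model: str = "text-embedding-3-small"):
        self.model = model
        self.client = AsyncOpenAI()

    async def aembed_batch(self, texts: list[str]) -> list[list[float]]:
        # Async path, exposed for users who already run their own event loop.
        response = await self.client.embeddings.create(model=self.model, input=texts)
        return [item.embedding for item in response.data]

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        # Sync facade the chunker would call; asyncio.run keeps the async
        # handling entirely inside the embeddings class.
        return asyncio.run(self.aembed_batch(texts))
```

That way the chunkers stay sync, but the network I/O still happens asynchronously inside the embeddings class.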

What do you think about this?

@rodion-m
Author

@bhavnicksm, thank you for your attention to this topic!

I think it's OK to call sync methods from async classes (let the problem of excessive parallelism fall on the developer). So we can just add new async methods to the base class, prefixed with "a", like achunk.
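
Roughly like this (just a sketch; asyncio.to_thread is one possible default, and chunkers that hit remote embedding APIs could instead await those calls directly):

```python
import asyncio

class BaseChunker:
    def chunk(self, text: str) -> list[str]:
        # Existing sync path.
        raise NotImplementedError

    async def achunk(self, text: str) -> list[str]:
        # Async counterpart, prefixed with "a". As a default it could run the
        # sync path in a worker thread so the event loop is not blocked;
        # subclasses would override it with a truly async implementation.
        return await asyncio.to_thread(self.chunk, text)
```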

Did I get your question right? If not, please provide code examples.

Also, I didn't get why you want to call asyncio.run inside OpenAIEmbeddings; could you please explain the need?

@bhavnicksm
Collaborator

bhavnicksm commented Nov 19, 2024

@rodion-m,

I think I finally understand...

Yes, so currently the semantic chunkers work on a "Scan and Look-back" style algorithm, which is why they require all the sentence embeddings before they begin. This is for one piece of text, which let's say has M sentences.

sync chunk -> sync batched_embed

But as we go from one piece of text to many, we currently use multiprocessing to tackle this for N texts.

Here's what I understand from our discussion:

  1. If we need to support async API calls for OpenAI etc., that support would live there. We can expose those methods and also use them inside a batch_embed fn.
sync batch_embed -> asyncio.run( async embed M times with await ) or asyncio.run( async batch_embed )

(NOTE: I'm assuming there's no separate sync batch_embed for these APIs; correct me if I am wrong here.)

  2. Then we also have the option of an async chunk, or achunk, which would depend on async batch_embed and could be exposed to the user as an option (rough pseudocode for both below).
async chunk -> await (async batch_embed) -> M times x await( async embed )
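
Rough pseudocode for the two options above (aembed_batch, split_sentences, and group_by_similarity are hypothetical helpers, just to show the shape of the call chain):

```python
import asyncio

# Option 1: the chunker stays sync; the async fan-out lives in the embeddings class.
def batch_embed(embeddings, sentences: list[str]) -> list[list[float]]:
    return asyncio.run(embeddings.aembed_batch(sentences))

# Option 2: a fully async achunk, awaiting the batched embedding call.
async def achunk(chunker, embeddings, text: str) -> list[str]:
    sentences = chunker.split_sentences(text)
    vectors = await embeddings.aembed_batch(sentences)
    return chunker.group_by_similarity(sentences, vectors)
```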

Did I understand this correctly?
