How to use AzureOpenAITextEmbeddingGenerationService.UpsertBatchAsync #9846
-
My current confusion is that when I break a document down into many small fragments, calling UpsertAsync for each fragment can open an excessive number of database connections. How can I solve this problem?
Replies: 1 comment 2 replies
-
@ordinaryAndConfident when considering a chunking strategy, it's important to strike a balance on size. If you create chunks that are too small, they may be less likely to be found when doing vector searches. If the chunks are too big, they can increase token usage when passed to the LLM for context. The right size will also depend on the type of data that you want to generate embeddings for, so experimenting with different sizes is important.
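To make the size trade-off concrete, here is a minimal sketch of a fixed-size chunker with overlap. It is only an illustration, not how Semantic Kernel splits text: the function name `chunk_text` and the parameters `max_chars` and `overlap` are hypothetical, and it measures size in characters where a real pipeline would more likely count tokens.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of at most max_chars characters.

    Each window shares `overlap` characters with the previous one, so a
    sentence cut at a boundary still appears whole in one of the chunks.
    Both parameters are illustrative defaults, not recommended values.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in the range [0, max_chars)")

    step = max_chars - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    document = "x" * 500  # stand-in for a real document
    for size in (100, 200):
        pieces = chunk_text(document, max_chars=size, overlap=size // 5)
        print(f"max_chars={size}: {len(pieces)} chunks")
```

Halving `max_chars` roughly doubles the number of chunks, and therefore the number of records to embed and upsert, which is one reason to experiment with sizes rather than defaulting to very small fragments.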