[Obs AI Assistant] Remove custom token count event #205479

Open
sorenlouv opened this issue Jan 3, 2025 · 3 comments
Labels: Team:Obs AI Assistant (Observability AI Assistant)

Comments

@sorenlouv (Member) commented on Jan 3, 2025

Background

Previously, we implemented token counting in the AI Assistant in order to track usage. This is now handled by the inference plugin, so we no longer need to do it ourselves.

One difference to call out: the AI Assistant counts the number of tokens used per conversation and persists it in the conversations index, which would let users track the token count of each conversation. We have never exposed or documented this in any way, and I don't think the unused functionality justifies the added complexity.

Solution

Remove the custom StreamingChatResponseEventType.TokenCount event, as well as the per-conversation token counting.

Technical background

The inference plugin emits the event InferenceChatCompletionEventType.ChatCompletionTokenCount that contains the number of tokens used for the LLM call. The Obs AI Assistant converts this event to StreamingChatResponseEventType.TokenCount:

case InferenceChatCompletionEventType.ChatCompletionTokenCount:
  // Convert to TokenCountEvent
  return {
    type: StreamingChatResponseEventType.TokenCount,
    tokens: {
      completion: event.tokens.completion,
      prompt: event.tokens.prompt,
      total: event.tokens.total,
    },
  } as TokenCountEvent;
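For reference, the shape of the custom event being removed — a minimal sketch inferred from the conversion code above; the actual declaration in the plugin's common types may differ:

// Sketch of the custom event, inferred from the conversion above.
// StreamingChatResponseEventType is the plugin's event type enum.
interface TokenCountEvent {
  type: StreamingChatResponseEventType.TokenCount;
  tokens: {
    completion: number;
    prompt: number;
    total: number;
  };
}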

All the token count events are accumulated into a single result:

mergeOperator(
  nextEvents$,
  title$.pipe(
    // title$ emits both the title string and token count events;
    // keep only the token count events here
    filter((value): value is TokenCountEvent => typeof value !== 'string')
  )
).pipe(extractTokenCount()),

scan(
  (acc, event) => {
    acc.completion += event.tokens.completion;
    acc.prompt += event.tokens.prompt;
    acc.total += event.tokens.total;
    return acc;
  },
  { completion: 0, prompt: 0, total: 0 }
)
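extractTokenCount() itself isn't quoted in this issue; a plausible shape for it, as a hypothetical sketch that combines the filter and scan shown above (the real implementation may differ, e.g. it may also emit a single final value):

import { filter, scan, type OperatorFunction } from 'rxjs';

// Hypothetical sketch: narrow the merged event stream to TokenCountEvents
// and accumulate their counts. TokenCountEvent and the event union come
// from the plugin's common types.
function extractTokenCount(): OperatorFunction<
  StreamingChatResponseEvent,
  TokenCountEvent['tokens']
> {
  return (events$) =>
    events$.pipe(
      filter(
        (event): event is TokenCountEvent =>
          event.type === StreamingChatResponseEventType.TokenCount
      ),
      scan(
        (acc, event) => ({
          completion: acc.completion + event.tokens.completion,
          prompt: acc.prompt + event.tokens.prompt,
          total: acc.total + event.tokens.total,
        }),
        { completion: 0, prompt: 0, total: 0 }
      )
    );
}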

The accumulated token count for all LLM calls within a conversation is then persisted on the conversation:

conversation: {
  title: title || conversation._source?.conversation.title,
  token_count: {
    prompt: persistedTokenCount.prompt + tokenCountResult.prompt,
    completion: persistedTokenCount.completion + tokenCountResult.completion,
    total: persistedTokenCount.total + tokenCountResult.total,
  },
},

The persisted field is reflected in the stored conversation type:

conversation: {
  id: string;
  title: string;
  last_updated: string;
  token_count?: TokenCount;
};

In many cases we have to manually filter out the token count event:
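The call sites aren't quoted here, but the filtering generally looks like this — a minimal sketch assuming standard RxJS and the plugin's event types:

import { filter } from 'rxjs';

// Drop TokenCountEvents so they don't reach the client; every consumer of
// the stream that doesn't care about token counts needs a filter like this.
const withoutTokenCount$ = events$.pipe(
  filter(
    (event): event is Exclude<StreamingChatResponseEvent, TokenCountEvent> =>
      event.type !== StreamingChatResponseEventType.TokenCount
  )
);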

sorenlouv added the Team:Obs AI Assistant label on Jan 3, 2025
@elasticmachine (Contributor) commented:

Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)

@emma-raffenne (Contributor) commented:

@sorenlouv Would that affect any of the BI pipelines and reports?

@sorenlouv (Member, Author) commented:

> @sorenlouv Would that affect any of the BI pipelines and reports?

Not to my knowledge. But to be on the safe side I'll go look at the reports. Can you point me to them?
