This Azure Function acts as a proxy for Azure OpenAI, handling both Server-Sent Events (SSE) streaming and standard JSON responses. It exists to work around an Azure API Management (APIM) limitation: APIM cannot log SSE streaming responses to Event Hub, so this proxy performs that logging itself.
- Azure subscription
- Azure Function App (Python)
- Azure OpenAI service instance
- Azure Event Hubs instance (for logging)
Copy `local.settings.sample.json` to `local.settings.json` and set the following environment variables, or set them directly in your Azure Function App:
```bash
az functionapp config appsettings set \
  --name fnsse \
  --resource-group fnsse \
  --settings \
  "AZURE_OPENAI_KEY=your_key_here" \
  "AZURE_OPENAI_API_VERSION=2024-08-01-preview" \
  "AZURE_OPENAI_BASE_URL=https://your-instance.openai.azure.com/" \
  "AZURE_EVENTHUB_CONN_STR=your_eventhub_connection_string" \
  "AZURE_EVENTHUB_NAME=openai-logs" \
  "PYTHON_ENABLE_INIT_INDEXING=1"
```
Note: `PYTHON_ENABLE_INIT_INDEXING=1` is required for proper Python module initialization in Azure Functions.
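For reference, a `local.settings.json` matching these variables might look like the following sketch. All values are placeholders; the `AzureWebJobsStorage` and `FUNCTIONS_WORKER_RUNTIME` entries are the usual Functions defaults, assumed here rather than copied from the sample file:

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AZURE_OPENAI_KEY": "your_key_here",
    "AZURE_OPENAI_API_VERSION": "2024-08-01-preview",
    "AZURE_OPENAI_BASE_URL": "https://your-instance.openai.azure.com/",
    "AZURE_EVENTHUB_CONN_STR": "your_eventhub_connection_string",
    "AZURE_EVENTHUB_NAME": "openai-logs",
    "PYTHON_ENABLE_INIT_INDEXING": "1"
  }
}
```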
The endpoint matches the Azure OpenAI API signature:
```
POST /openai/deployments/{deployment_name}/chat/completions?api-version=2024-08-01-preview
```
Non-streaming request body:

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Tell me about Singapore in 1 sentence"
    }
  ]
}
```
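As a sketch of a client call (the host name, `x-functions-key` header, and deployment name `gpt-4o` below are illustrative assumptions, not part of the function's contract):

```python
import httpx  # already a project dependency

# Assumed values for illustration; substitute your own function URL and key.
URL = (
    "https://fnsse.azurewebsites.net/openai/deployments/gpt-4o"
    "/chat/completions?api-version=2024-08-01-preview"
)
HEADERS = {"x-functions-key": "your_function_key"}

resp = httpx.post(
    URL,
    headers=HEADERS,
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me about Singapore in 1 sentence"},
        ]
    },
    timeout=60.0,
)
resp.raise_for_status()
body = resp.json()
print(body["choices"][0]["message"]["content"])
print(body["usage"])
```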
Streaming request body (adds `stream` and `stream_options`):

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Tell me about Singapore in 1 sentence"
    }
  ],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
```
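The streaming variant is the same call with `stream` enabled; a minimal sketch (reusing the assumed `URL` and `HEADERS` from the previous example) that prints the raw SSE lines:

```python
import httpx

with httpx.stream(
    "POST",
    URL,  # assumed proxy URL from the non-streaming example
    headers=HEADERS,
    json={
        "messages": [
            {"role": "user", "content": "Tell me about Singapore in 1 sentence"}
        ],
        "stream": True,
        "stream_options": {"include_usage": True},
    },
    timeout=60.0,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(line)  # "data: {...}" chunks, then "data: [DONE]"
```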
Standard Azure OpenAI response format:
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Singapore is a highly developed city-state..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 31,
    "total_tokens": 56
  }
}
```
Server-Sent Events (SSE) format with chunks matching Azure OpenAI's format:
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"choices":[{"index":0,"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"choices":[{"index":0,"delta":{"content":"Singapore"}}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"choices":[{"index":0,"delta":{"content":" is"}}]}
... more chunks ...
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"usage":{"completion_tokens":31,"prompt_tokens":25,"total_tokens":56}}
data: [DONE]
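A small helper along these lines could reassemble the chunks into the full message and pick up `usage` from the final chunk (a sketch for consumers of the proxy, not code from the function itself):

```python
import json

def collect_stream(sse_lines):
    """Accumulate delta content and the final usage block from 'data:' lines."""
    parts, usage = [], None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):  # only present on the final chunk
            usage = chunk["usage"]
    return "".join(parts), usage
```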
The function automatically logs completion details to Azure Event Hub for both streaming and non-streaming responses:
Non-streaming (`completion`) event:

```json
{
  "type": "completion",
  "content": "Singapore is a vibrant city-state...",
  "usage": {
    "completion_tokens": 137,
    "prompt_tokens": 25,
    "total_tokens": 162
  },
  "model": "gpt-4o",
  "prompt": [...],
  "region": "Australia East",
  "latency_ms": 1306,
  "timestamp": "2024-11-13T06:59:30.584946"
}
```
Streaming (`stream_completion`) event:

```json
{
  "type": "stream_completion",
  "content": "Singapore is a vibrant city-state...",
  "model": "gpt-4o",
  "usage": {
    "completion_tokens": 137,
    "prompt_tokens": 25,
    "total_tokens": 162
  },
  "prompt": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Tell me about Singapore in 1 sentence"
    }
  ],
  "region": "Australia East",
  "latency_ms": 2650,
  "time_to_first_chunk_ms": 150,
  "streaming_duration_ms": 2500,
  "timestamp": "2024-11-13T06:59:30.584946"
}
```
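To check these events end to end, a consumer sketch using the `azure-eventhub` package (an extra dependency, not in the requirements below) could read from the hub; the `$Default` consumer group is an assumption:

```python
import json
import os

from azure.eventhub import EventHubConsumerClient

client = EventHubConsumerClient.from_connection_string(
    os.environ["AZURE_EVENTHUB_CONN_STR"],
    consumer_group="$Default",  # assumed; use your own consumer group
    eventhub_name=os.environ.get("AZURE_EVENTHUB_NAME", "openai-logs"),
)

def on_event(partition_context, event):
    record = json.loads(event.body_as_str())
    print(record["type"], record["model"], record["usage"])

with client:
    # Blocks and reads from the start of each partition until interrupted.
    client.receive(on_event=on_event, starting_position="-1")
```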
For streaming responses, three timing metrics are captured:

- `time_to_first_chunk_ms`: time from request start until the first token (includes queue time and model startup)
- `streaming_duration_ms`: duration of token generation (actual model inference time)
- `latency_ms`: total request duration (`time_to_first_chunk_ms` + `streaming_duration_ms`)
These metrics help identify:
- Queue waiting time in different regions
- Model warm-up and startup time
- Token generation speed
- Overall request latency
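Note that `latency_ms = time_to_first_chunk_ms + streaming_duration_ms` (150 + 2500 = 2650 in the example event above). A sketch of how such timings can be measured around a streamed call; `iter_chunks` is a hypothetical generator standing in for the model's token stream:

```python
import time

start = time.perf_counter()
first_chunk_at = None

for chunk in iter_chunks():  # hypothetical: yields streamed chunks
    if first_chunk_at is None:
        first_chunk_at = time.perf_counter()  # first token observed

end = time.perf_counter()
time_to_first_chunk_ms = (first_chunk_at - start) * 1000
streaming_duration_ms = (end - first_chunk_at) * 1000
latency_ms = (end - start) * 1000  # equals the sum of the two above
```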
- Python 3.9+
- `openai>=1.0.0`
- `azure-functions`
- `azurefunctions-extensions-http-fastapi`
- `httpx`
Deploy with Azure Functions Core Tools:

```bash
func azure functionapp publish fnsse
```
View logs using the Azure CLI:

```bash
az webapp log tail --name fnsse --resource-group fnsse
```