External Execution Interface #4616

Draft
wants to merge 16 commits into
base: main
from
72 changes: 72 additions & 0 deletions docs/Execution.md
@@ -109,3 +109,75 @@ graph TD
end
end
```

## Custom Execution

MsQuic also supports scenarios where the application layer creates the threads on which MsQuic executes.
In this mode, the application creates one or more execution contexts that MsQuic uses to run all of its internal logic.
The application is responsible for calling down into MsQuic to allow these execution contexts to run.

To create an execution context, the app must first create an event queue object, which is a platform-specific type:

- Windows: IOCP

Contributor

why do we expose a platform specific queue object rather than having a more abstracted interface?

Member Author

Such as what? We use the platform queue object because it is already a requirement for other (i.e. storage) IO, so it's not expected to be a new requirement. Also, this is all opt-in. Apps can continue to use MsQuic without doing this, and it will simply create the background threads as before.

Contributor

I meant a generic platform-independent interface for queue objects.

So, for the apps to opt in, they still have to create a new IOCP handle exclusively for msquic, right? i.e. they can't just share whatever IOCP handles they use for file IO for msquic unless they change all their IO logic to use the new sqe/cqe pattern.

Member Author

No, they shouldn't create anything specific for MsQuic. If they have an epoll queue (on Linux) for their storage IO, they should reuse that for MsQuic.

- Linux: epoll
- macOS: kqueue
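
For the non-Windows entries in this list, a minimal sketch of creating the queue might look like the following. `APP_EVENTQ`, `AppEventQCreate` and `AppEventQClose` are hypothetical names invented for illustration and are not part of MsQuic; the only real calls are `epoll_create1`, `kqueue` and `close`. An application that already owns an epoll or kqueue descriptor for its other IO would presumably pass that one instead of creating a new one.

```c
#include <unistd.h>

#if defined(__linux__)
#include <sys/epoll.h>
#elif defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#else
#error Unsupported platform for this sketch
#endif

// Hypothetical app-side alias: epoll and kqueue both hand back a plain
// file descriptor, which is also how the POSIX QUIC_EVENTQ is typed.
typedef int APP_EVENTQ;

// Creates the platform event queue. Returns 1 on success, 0 on failure
// (errno is left set by the failing system call).
static int AppEventQCreate(APP_EVENTQ* EventQ) {
#if defined(__linux__)
    *EventQ = epoll_create1(EPOLL_CLOEXEC);
#else
    *EventQ = kqueue();
#endif
    return *EventQ != -1;
}

// Releases the queue once no execution context is using it any more.
static void AppEventQClose(APP_EVENTQ EventQ) {
    close(EventQ);
}
```
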

On Windows, the following types are defined:

```c++
typedef HANDLE QUIC_EVENTQ;

typedef OVERLAPPED_ENTRY QUIC_CQE;

typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
void
(QUIC_EVENT_COMPLETION)(
    _In_ QUIC_CQE* Cqe
    );
typedef QUIC_EVENT_COMPLETION *QUIC_EVENT_COMPLETION_HANDLER;

typedef struct QUIC_SQE {
    OVERLAPPED Overlapped;
    QUIC_EVENT_COMPLETION_HANDLER Completion;
} QUIC_SQE;
```

You will also notice the definition of `QUIC_SQE` (SQE stands for submission queue entry), which defines the format that all completion events must take so that they can be processed generically from the event queue (more on this below).

Once the app has the event queue, it may create the execution context with the `ExecutionCreate` function:

```c++
HANDLE IOCP = CreateIoCompletionPort(INVALID_HANDLE_VALUE, nullptr, 0, 1);
QUIC_EXECUTION_CONTEXT_CONFIG ExecConfig = { 0, &IOCP };

QUIC_EXECUTION_CONTEXT* ExecContext = nullptr;
QUIC_STATUS Status = MsQuic->ExecutionCreate(QUIC_EXECUTION_CONFIG_FLAG_NONE, 0, 1, &ExecConfig, &ExecContext);
```

The above code creates a new IOCP (on Windows), sets up an execution config that specifies an ideal processor of 0 and a pointer to the IOCP, and then calls MsQuic to create one execution context.

Contributor

A processor of 0 has no clear, unambiguous meaning on Win32. A PROCESSOR_NUMBER is the only unambiguous processor identifier in Windows user mode, unless MsQuic provides its own canonical representation to apps.

Contributor @mtfriesen, Jan 6, 2025

One could argue that 0 should always map to Group 0, Number 0 in any sensible CPU numbering scheme, but certainly once you reach the integer 1, the point stands. The NT processor index 1 could be Group 1, Number 0, even if Group 0, Number 1 exists, and vice versa.

Member Author

We can look around to see how other platform abstraction layers (std, boost) solve this problem.

Contributor

According to stackoverflow, boost sidesteps the problem by allowing you to get a native thread handle, and then go off and call platform-specific affinity routines yourself. Personally I think an int should be a reasonable way to identify CPUs, so if QUIC just exposes ConvertProcessorNumberToProcessorIndex and ConvertProcessorIndexToProcessorNumber helper routines on Windows, you'd be set.

Contributor

Alternatively, you could typedef int QUIC_CPU_ID everywhere else and typedef PROCESSOR_NUMBER QUIC_CPU_ID on Windows and let apps deal with the difference if they also compile cross-plat.

An application may expand this code to create multiple execution contexts, depending on its needs.

To drive this execution context, the app will need to periodically call `ExecutionPoll` and use the platform-specific function to drain completion events from the event queue.

```c
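bool AllDone = false; // Set by the app when it is time to stop.
while (!AllDone) {
    // Let MsQuic run its pending work; the return value is the time (in
    // milliseconds) until its next scheduled timer expiration.
    uint32_t WaitTime = MsQuic->ExecutionPoll(ExecContext);

    // Wait for up to WaitTime for completions, then dispatch each one.
    ULONG OverlappedCount = 0;
    OVERLAPPED_ENTRY Overlapped[8];
    if (GetQueuedCompletionStatusEx(IOCP, Overlapped, ARRAYSIZE(Overlapped), &OverlappedCount, WaitTime, FALSE)) {
        for (ULONG i = 0; i < OverlappedCount; ++i) {
            // Recover the submission queue entry that owns this OVERLAPPED.
            QUIC_SQE* Sqe = CONTAINING_RECORD(Overlapped[i].lpOverlapped, QUIC_SQE, Overlapped);
            Sqe->Completion(&Overlapped[i]);
        }
    }
}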
```

Above, you can see a simple loop that properly drives a single execution context on Windows.
`OVERLAPPED_ENTRY` objects received from `GetQueuedCompletionStatusEx` are used to get the submission queue entry and then call its completion handler.
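
On Linux the wait would go through `epoll_wait`, whose timeout is a signed `int` with `-1` meaning "block indefinitely", so the `uint32_t` returned by `ExecutionPoll` cannot be passed through unchanged. The helper below is a sketch of one way to bridge that gap. `PollWaitToEpollTimeout` is a hypothetical name, and treating `UINT32_MAX` as "no timer pending" is an assumption borrowed from the Windows `INFINITE` value; the PR does not state what `ExecutionPoll` returns when nothing is scheduled.

```c
#include <limits.h>
#include <stdint.h>

// Hypothetical helper: converts a wait time in milliseconds (as returned by
// a poll call) into a timeout argument suitable for epoll_wait.
static int PollWaitToEpollTimeout(uint32_t WaitTimeMs) {
    if (WaitTimeMs == UINT32_MAX) {
        return -1; // Assumption: UINT32_MAX means "no timer", so block indefinitely.
    }
    if (WaitTimeMs > (uint32_t)INT_MAX) {
        return INT_MAX; // Clamp values that do not fit in a signed int.
    }
    return (int)WaitTimeMs;
}
```

Clamping instead of failing keeps the loop simple: waking up early only costs one extra `ExecutionPoll` call.
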

In a real application, these completion events may come from both MsQuic and the application itself, so **the application must use the same base format for its own submission entries**.
This is necessary to be able to share the same event queue object.
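
To make the shared base format more concrete, here is a rough Linux sketch of an application-owned entry that embeds the base layout and is dispatched through the same generic path. All `APP_*` and `App*` names are hypothetical stand-ins: `APP_SQE_BASE` only mirrors the shape of the Linux `QUIC_SQE` (an `fd` plus a completion handler). It also assumes the entry pointer travels in `epoll_event.data.ptr`, which this PR does not spell out.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Stand-ins that mirror the shape of the Linux QUIC_CQE / QUIC_SQE types.
typedef struct epoll_event APP_CQE;
typedef void (APP_EVENT_COMPLETION)(APP_CQE* Cqe);
typedef struct APP_SQE_BASE {
    int fd;
    APP_EVENT_COMPLETION* Completion;
} APP_SQE_BASE;

// An application entry: the base comes first, so a pointer to the base is
// also a pointer to the whole entry.
typedef struct APP_WORK_ITEM {
    APP_SQE_BASE Sqe;
    int TimesCompleted;
} APP_WORK_ITEM;

// Completion handler for the app's own entries.
static void AppWorkItemCompleted(APP_CQE* Cqe) {
    APP_WORK_ITEM* Item = (APP_WORK_ITEM*)Cqe->data.ptr;
    uint64_t Counter;
    // Reading the eventfd resets it so the event does not fire again.
    if (read(Item->Sqe.fd, &Counter, sizeof(Counter)) == (ssize_t)sizeof(Counter)) {
        Item->TimesCompleted++;
    }
}

// Backs the entry with an eventfd and adds it to the shared queue.
// Returns 1 on success, 0 on failure.
static int AppWorkItemRegister(int EventQ, APP_WORK_ITEM* Item) {
    Item->Sqe.fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (Item->Sqe.fd == -1) {
        return 0;
    }
    Item->Sqe.Completion = AppWorkItemCompleted;
    Item->TimesCompleted = 0;
    struct epoll_event Event = { .events = EPOLLIN, .data = { .ptr = &Item->Sqe } };
    if (epoll_ctl(EventQ, EPOLL_CTL_ADD, Item->Sqe.fd, &Event) != 0) {
        close(Item->Sqe.fd);
        return 0;
    }
    return 1;
}

// Queues a completion for the entry. Returns 1 on success, 0 on failure.
static int AppWorkItemSignal(APP_WORK_ITEM* Item) {
    uint64_t One = 1;
    return write(Item->Sqe.fd, &One, sizeof(One)) == (ssize_t)sizeof(One);
}

// Drains up to 8 events and dispatches each through the base format,
// without knowing who owns the entry. Returns epoll_wait's result.
static int AppDrainEventQ(int EventQ, int TimeoutMs) {
    struct epoll_event Cqes[8];
    int Count = epoll_wait(EventQ, Cqes, 8, TimeoutMs);
    for (int i = 0; i < Count; ++i) {
        APP_SQE_BASE* Sqe = (APP_SQE_BASE*)Cqes[i].data.ptr;
        Sqe->Completion(&Cqes[i]);
    }
    return Count;
}
```

The drain function never inspects the concrete entry type, which is the property that lets one queue serve two independent producers.
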
25 changes: 25 additions & 0 deletions src/cs/lib/msquic_generated.cs
@@ -236,6 +236,22 @@ internal unsafe partial struct QUIC_EXECUTION_CONFIG
internal fixed ushort ProcessorList[1];
}

    internal unsafe partial struct QUIC_EXECUTION_CONTEXT_CONFIG
    {
        [NativeTypeName("uint32_t")]
        internal uint IdealProcessor;

        [NativeTypeName("uint32_t")]
        internal uint PollingIdleTimeoutUs;

        [NativeTypeName("QUIC_EVENTQ *")]
        internal void** EventQ;
    }

    internal partial struct QUIC_EXECUTION_CONTEXT
    {
    }

internal unsafe partial struct QUIC_REGISTRATION_CONFIG
{
[NativeTypeName("const char *")]
@@ -3255,6 +3271,15 @@ internal unsafe partial struct QUIC_API_TABLE

        [NativeTypeName("QUIC_CONNECTION_COMP_CERT_FN")]
        internal delegate* unmanaged[Cdecl]<QUIC_HANDLE*, byte, QUIC_TLS_ALERT_CODES, int> ConnectionCertificateValidationComplete;

        [NativeTypeName("QUIC_EXECUTION_CREATE_FN")]
        internal delegate* unmanaged[Cdecl]<QUIC_EXECUTION_CONFIG_FLAGS, uint, QUIC_EXECUTION_CONTEXT_CONFIG*, QUIC_EXECUTION_CONTEXT**, int> ExecutionCreate;

        [NativeTypeName("QUIC_EXECUTION_DELETE_FN")]
        internal delegate* unmanaged[Cdecl]<uint, QUIC_EXECUTION_CONTEXT**, void> ExecutionDelete;

        [NativeTypeName("QUIC_EXECUTION_POLL_FN")]
        internal delegate* unmanaged[Cdecl]<QUIC_EXECUTION_CONTEXT*, uint> ExecutionPoll;
    }

internal static unsafe partial class MsQuic
69 changes: 67 additions & 2 deletions src/inc/msquic.h
@@ -265,16 +265,16 @@ typedef enum QUIC_DATAGRAM_SEND_STATE {
#define QUIC_DATAGRAM_SEND_STATE_IS_FINAL(State) \
((State) >= QUIC_DATAGRAM_SEND_LOST_DISCARDED)

#ifdef QUIC_API_ENABLE_PREVIEW_FEATURES

typedef enum QUIC_EXECUTION_CONFIG_FLAGS {
QUIC_EXECUTION_CONFIG_FLAG_NONE = 0x0000,
#ifdef QUIC_API_ENABLE_PREVIEW_FEATURES
QUIC_EXECUTION_CONFIG_FLAG_QTIP = 0x0001,
QUIC_EXECUTION_CONFIG_FLAG_RIO = 0x0002,
QUIC_EXECUTION_CONFIG_FLAG_XDP = 0x0004,
QUIC_EXECUTION_CONFIG_FLAG_NO_IDEAL_PROC = 0x0008,
QUIC_EXECUTION_CONFIG_FLAG_HIGH_PRIORITY = 0x0010,
QUIC_EXECUTION_CONFIG_FLAG_AFFINITIZE = 0x0020,
#endif
} QUIC_EXECUTION_CONFIG_FLAGS;

DEFINE_ENUM_FLAG_OPERATORS(QUIC_EXECUTION_CONFIG_FLAGS)
@@ -295,6 +295,62 @@ typedef struct QUIC_EXECUTION_CONFIG {
#define QUIC_EXECUTION_CONFIG_MIN_SIZE \
(uint32_t)FIELD_OFFSET(QUIC_EXECUTION_CONFIG, ProcessorList)

#ifndef _KERNEL_MODE

//
// Execution Context abstraction, which allows the application layer to
// completely control execution of all MsQuic work.
//

typedef struct QUIC_EXECUTION_CONTEXT_CONFIG {
    uint32_t IdealProcessor;
    QUIC_EVENTQ* EventQ;
} QUIC_EXECUTION_CONTEXT_CONFIG;

typedef struct QUIC_EXECUTION_CONTEXT QUIC_EXECUTION_CONTEXT;

//
// This is called to create the execution contexts.
//
typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
QUIC_STATUS
(QUIC_API * QUIC_EXECUTION_CREATE_FN)(
    _In_ QUIC_EXECUTION_CONFIG_FLAGS Flags, // Used for datapath type
    _In_ uint32_t PollingIdleTimeoutUs,
    _In_ uint32_t Count,
    _In_reads_(Count) QUIC_EXECUTION_CONTEXT_CONFIG* Configs,
    _Out_writes_(Count) QUIC_EXECUTION_CONTEXT** ExecutionContexts
    );

//
// This is called to delete the execution contexts.
//
typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
void
(QUIC_API * QUIC_EXECUTION_DELETE_FN)(
    _In_ uint32_t Count,
    _In_reads_(Count) QUIC_EXECUTION_CONTEXT** ExecutionContexts
    );

//
// This is called to allow MsQuic to process any polling work. It returns the
// number of milliseconds until the next scheduled timer expiration.
//
// TODO: Should it return an indication for if we should yield?
//
typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
uint32_t
(QUIC_API * QUIC_EXECUTION_POLL_FN)(
    _In_ QUIC_EXECUTION_CONTEXT* ExecutionContext
    );

#endif // _KERNEL_MODE

#endif // QUIC_API_ENABLE_PREVIEW_FEATURES

typedef struct QUIC_REGISTRATION_CONFIG { // All fields may be NULL/zero.
const char* AppName;
QUIC_EXECUTION_PROFILE ExecutionProfile;
@@ -861,6 +917,7 @@ void
#endif
#define QUIC_PARAM_GLOBAL_TLS_PROVIDER 0x0100000A // QUIC_TLS_PROVIDER
#define QUIC_PARAM_GLOBAL_STATELESS_RESET_KEY 0x0100000B // uint8_t[] - Array size is QUIC_STATELESS_RESET_KEY_LENGTH

//
// Parameters for Registration.
//
@@ -1630,6 +1687,14 @@ typedef struct QUIC_API_TABLE {
    QUIC_CONNECTION_COMP_RESUMPTION_FN ConnectionResumptionTicketValidationComplete; // Available from v2.2
    QUIC_CONNECTION_COMP_CERT_FN ConnectionCertificateValidationComplete; // Available from v2.2

#ifdef QUIC_API_ENABLE_PREVIEW_FEATURES
#ifndef _KERNEL_MODE
    QUIC_EXECUTION_CREATE_FN ExecutionCreate; // Available from v2.5
    QUIC_EXECUTION_DELETE_FN ExecutionDelete; // Available from v2.5
    QUIC_EXECUTION_POLL_FN ExecutionPoll; // Available from v2.5
#endif
#endif

} QUIC_API_TABLE;

#define QUIC_API_VERSION_1 1 // Not supported any more
52 changes: 52 additions & 0 deletions src/inc/msquic_posix.h
@@ -515,6 +515,58 @@ QuicAddrToString(
return TRUE;
}

//
// Event Queue Abstraction
//

#if __linux__ // epoll

#include <sys/epoll.h>
#include <sys/eventfd.h>

typedef int QUIC_EVENTQ;

typedef struct epoll_event QUIC_CQE;

typedef
void
(QUIC_EVENT_COMPLETION)(
    _In_ QUIC_CQE* Cqe
    );
typedef QUIC_EVENT_COMPLETION *QUIC_EVENT_COMPLETION_HANDLER;

typedef struct QUIC_SQE {
    int fd;
    QUIC_EVENT_COMPLETION_HANDLER Completion;
} QUIC_SQE;

#elif __APPLE__ || __FreeBSD__ // kqueue

#include <sys/event.h>
#include <fcntl.h>

typedef int QUIC_EVENTQ;

typedef struct kevent QUIC_CQE;

typedef
void
(QUIC_EVENT_COMPLETION)(
    _In_ QUIC_CQE* Cqe
    );
typedef QUIC_EVENT_COMPLETION *QUIC_EVENT_COMPLETION_HANDLER;

typedef struct QUIC_SQE {
    uintptr_t Handle;
    QUIC_EVENT_COMPLETION_HANDLER Completion;
} QUIC_SQE;

#else

#error Unsupported Platform

#endif

#if defined(__cplusplus)
}
#endif
24 changes: 24 additions & 0 deletions src/inc/msquic_winuser.h
@@ -373,4 +373,28 @@ QuicAddrToString(

#endif // WINAPI_FAMILY != WINAPI_FAMILY_GAMES

//
// Event Queue Abstraction
//

typedef HANDLE QUIC_EVENTQ;

typedef OVERLAPPED_ENTRY QUIC_CQE;

typedef
_IRQL_requires_max_(PASSIVE_LEVEL)
void
(QUIC_EVENT_COMPLETION)(
    _In_ QUIC_CQE* Cqe
    );
typedef QUIC_EVENT_COMPLETION *QUIC_EVENT_COMPLETION_HANDLER;

typedef struct QUIC_SQE {
    OVERLAPPED Overlapped;
    QUIC_EVENT_COMPLETION_HANDLER Completion;
#if DEBUG
    BOOLEAN IsQueued; // Debug flag to catch double queueing.
#endif
} QUIC_SQE;

#endif // _MSQUIC_WINUSER_
5 changes: 5 additions & 0 deletions src/tools/execution/CMakeLists.txt
@@ -0,0 +1,5 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

add_quic_tool(quicexecution execution_windows.cpp)
quic_tool_warnings(quicexecution)