-
-
Notifications
You must be signed in to change notification settings - Fork 537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uncaught exceptions lead to on_execute
/ on_operation
lifecycle hooks completing before some resolvers
#3414
Comments
on_execute
lifecycle hook completing before some resolverson_execute
lifecycle hook completing before some resolvers
I think that's expected when field aren't optional, if you make the failing field option it should work as expected. Here's a test without the exception: https://play.strawberry.rocks/?gist=05181d8366f2584d390ccdac21d5d000 |
Thanks!
When you say this, do you mean according to the GraphQL spec or Strawberry's design decision? |
@kkom from the spec: https://spec.graphql.org/October2021/#sec-Handling-Field-Errors Here's an example that better shows the behaviour of failures in nested fields: https://play.strawberry.rocks/?gist=b1ca6f1c33325331a76e36fb0d80ebfa |
Ah, I see it now – thanks! Yeah, the short-circuiting behaviour makes perfect sense :) I tested it in anger in a lot of configurations and every behaviour agreed with the GraphQL spec. PS: I only found some small non-determinism in the |
However, we still have the problem of We use a custom Strawberry extension for context management. Basically same as this – it opens and closes a SQLAlchemy session which can be used in the resolvers. If a response is prematurely concluded because of an incoercible field error, the unfinished resolvers will run in the background. However, it's very likely they'll raise more errors – which we do see in our Datadog logs. The errors are varied, but all of them are noise – they don't matter anymore and happen because the context for the field resolvers is just messed up by the lifecycle hooks... Solutions I'm thinking of
What do you think? Any ideas appreciated! 🙏 |
I guess there is theoretically the 4th option of the lifecycle hooks waiting for also the unneeded resolvers to finish. I wonder though if the HTTP response concluding before the |
Oh, I think there is a decent variation on option 4! What if it was possible to configure a schema extension to wait for all resolvers to finish before concluding? Then we could selectively apply it to the extension where it matters (e.g. the one that manages our DB session). Not sure how difficult it would be to implement and how much complexity it would add to the code, but I think it would solve all our problems. 🤞 |
on_execute
lifecycle hook completing before some resolverson_execute
/ on_operation
lifecycle hooks completing before some resolvers
Is there any particular reason why you're doing this on an extension and not using your web framework functionality? In any case I think we could introduce a hook for when the request is done 😊 @nrbnlulu what do you think? |
What an endeavour @kkom 🙃! So what I understand from this issue When a I think this issue does not relate to extensions but for graphql-core here BTW:
|
That's a very good question! The context we prepare is highly dependent on whether the GraphQL operation is a query or a mutation. In queries we instantiate a special DB client optimised for reads – without the overhead of explicit transactions and leveraging separate sessions for each SQL query (to support concurrent reads). In writes we instantiate a DB client that works off a single transaction for the entire mutation. from collections.abc import AsyncGenerator
from copy import copy
from typing import cast, override
from strawberry.extensions import SchemaExtension
from strawberry.types.graphql import OperationType
from backend.api.context import get_read_context, get_write_context
from backend.domain.context import ReadContext, RequestContext, WriteContext
class ContextManagerExtension(SchemaExtension):
"""
Depending on whether it's a mutation or query, replace appropriate keys in the context. At the
end, go back to the original context injected by FastAPI.
Mutating context in this way is suggested in Strawberry docs, see:
https://strawberry.rocks/docs/guides/custom-extensions#execution-context
"""
# We are managing the context in the more narrow `on_execute` hook (which wraps just the
# execution of the query/mutation, after parsing & validation have completed), instead of the
# overarching `on_operation` hook, because we need the knowledge whether the operation is a
# query or a mutation (so that we can prepare a "read" or "write" context).
@override
async def on_execute(self) -> AsyncGenerator[None, None]:
# Initially, we have just RequestContext injected by FastAPI
context = cast(RequestContext, self.execution_context.context)
def mutate_context(new_context: RequestContext | ReadContext | WriteContext) -> None:
# Clear existing keys
keys = list(context.keys())
for k in keys:
del context[k]
# Replace keys with new values
for k, v in new_context.items():
context[k] = v
request_context = copy(context)
# `self.execution_context.operation_type` relies on the parsing step having already run.
# This is why this extension uses the `on_execute` lifecycle hook.
match self.execution_context.operation_type:
case OperationType.MUTATION:
async with get_write_context(context) as write_context:
mutate_context(write_context)
try:
yield
finally:
mutate_context(request_context)
case OperationType.QUERY:
read_context = get_read_context(context)
mutate_context(read_context)
try:
yield
finally:
mutate_context(request_context)
case OperationType.SUBSCRIPTION:
raise NotImplementedError() @asynccontextmanager
async def get_write_context(
request_context: Annotated[RequestContext, Depends(get_request_context)],
) -> AsyncGenerator[WriteContext, None]:
async with request_context["session_maker"]() as session:
db_client = ReadWriteDbClient(session)
actions = build_actions(
db_client,
request_context["authentication_context"],
request_context["clients"]["event_logger"],
)
adhoc_data_loaders = build_adhoc_data_loaders(db_client)
queries = build_queries(
db_client,
request_context["authentication_context"],
request_context["clients"]["blob_store"],
actions,
)
services = build_services(
db_client,
request_context["authentication_context"],
request_context["clients"],
)
yield {
"actions": actions,
"authentication_context": request_context["authentication_context"],
"clients": request_context["clients"],
"adhoc_data_loaders": adhoc_data_loaders,
"queries": queries,
"services": services,
"db_client": db_client,
}
await session.commit()
def get_read_context(
request_context: Annotated[RequestContext, Depends(get_request_context)],
) -> ReadContext:
db_client = ReadOnlyDbClient(request_context["read_only_session_maker"])
actions = build_actions(
db_client,
request_context["authentication_context"],
request_context["clients"]["event_logger"],
)
adhoc_data_loaders = build_adhoc_data_loaders(db_client)
queries = build_queries(
db_client,
request_context["authentication_context"],
request_context["clients"]["blob_store"],
actions,
)
return {
"actions": actions,
"authentication_context": request_context["authentication_context"],
"clients": request_context["clients"],
"adhoc_data_loaders": adhoc_data_loaders,
"queries": queries,
"db_client": db_client,
} That's why we use the But thanks for the suggestion – maybe there is a decent way to leverage the web framework for it! We could use a heuristics on the body of the request to distinguish queries from mutations before Strawberry kicks in. Or always construct both kinds of contexts (though this may be inefficient).
Thanks so much for the pointer! I'll try to report it there – feels like fixing this should be valuable whichever way we go! :) |
Ok, reported this to @patrick91 – I'll play with using FastAPI for it, but I've realised that there may be a problem. If these tasks are somehow abandoned on the event loop, I'm not sure if even FastAPI would be able to wrap around them. Remember that the HTTP response is returned by that time... |
A little update from graphql-core - it was acknowledged as something the maintainer did want to address as well. :) graphql-python/graphql-core#217 (comment) So the issue is definitely real - but not yet sure when it'll be addressed. |
TL;DR
Strawberry short-circuits the HTTP response whenever there is an uncaught exception. This reduces the latency, but leads to:
(i) (a) incomplete and (b) nondeterministic responses(edit: established in the comments that it's expected)(ii) hooks being completed before some resolvers, leading to apparent violation of a contract
I wonder if it would be possible to make Strawberry run all resolves to the end, even if some of them raise uncaught exceptions?
Describe the Bug
errors
, as soon as an (edit: incoercible) exception is raised.on_execute
andon_operation
.This last point can lead to issues – it violates the invariant that
on_execute
/on_operation
lifecycle hooks wrap around all resolver executions.This can be problematic when these hooks do state management, like in the example given in Strawberry's docs. As a result, in addition to seeing the original uncaught exception in our observability suite, we have additional noise from knock-on failures – caused by premature completion of hooks.
Is this natural behaviour given how various async tasks are orchestrated, or is possible to tweak this a little? I'm thinking:
In fact, 2 may have other benefits – making the responses more (a) complete and (b) predictable. Currently, the GraphQL responses (i.e. which fields will return data and which won't) are non-deterministic (albeit a little faster thanks to the uncaught exception short-circuit).(edit: established in the comments that the short-circuiting is expected)Repro code
Schema:
Logging extension:
Example query:
Example response:
Logs demonstrating that the resolvers continue being executed after hooks complete:
System Information
0.220.0
Additional Context
Upvote & Fund
The text was updated successfully, but these errors were encountered: