Prefect automation not being triggered #16545

Open
samlawler-entain opened this issue Dec 30, 2024 · 2 comments
samlawler-entain commented Dec 30, 2024

Bug summary

I have a basic automation with the following trigger:
When any flow run stays in Suspended, Paused, Pending, Running, Retrying for 10 hours
The action changes the flow run state to Failed, which is likely irrelevant to the issue.

When this automation was first created (~4 months ago), it worked a few times. Since then, however, I have discovered that it has not been triggering, despite there being flow runs that met its conditions.

I have attached 3 screenshots (in order):

  • The automation trigger settings
  • An example of a flow run that was in a Running state for almost 2 days, which should have triggered the automation. I manually cancelled it.
  • The events of that automation covering the timeframe of the mentioned example, showing that the automation was not triggered during that timeframe.

Automation Id: 1fa3621d-0371-4e19-862a-dbf9969cce94
Example flow run id: 49beefa3-5e14-4baa-a5ad-1242ddce9724

Therefore, the issue being raised is that the automation is not being triggered, despite there being multiple flow runs that would have triggered it if it were working correctly.

It should also be noted that I have other automations running, which execute as expected. It's only this automation, with these conditions, that I am experiencing this issue with.

image (15)
image (16)
image (17)

Version info

Version:             2.15.0
API version:         0.8.4
Python version:      3.11.9
Git commit:          e90804b8
Built:               Thu, Feb 15, 2024 2:54 PM
OS/Arch:             win32/AMD64
Profile:             default
Server type:         cloud

Additional context

No response

@samlawler-entain samlawler-entain added the bug Something isn't working label Dec 30, 2024
@jeanluciano
Contributor

Hi @samlawler-entain, reproduced the issue. Looking further into it.

@jeanluciano
Contributor

Okay, after looking into it further, it seems the UI could be misleading here. It creates a single event trigger that evaluates all of the selected states against one `within` timeframe. In this case:

  • PENDING event is seen, we open a 10 hour bucket
  • RUNNING event is seen soon after, it counts towards the 10 hour bucket (because both are defined in after)
  • We now close the bucket because we've seen another event (when what we actually want is to NOT see another event, in order to detect a stuck run)
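
The three steps above can be sketched as a small toy model. This is only an illustration of the bucket behaviour as described in this comment, not Prefect's actual evaluation code. The function name `proactive_trigger_fires`, its parameters, and the example timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def proactive_trigger_fires(events, after, within, now):
    """Toy model of one proactive trigger watching a single flow run.

    events: list of (timestamp, state_name) tuples, sorted by timestamp.
    after:  set of state names that open the waiting window (the "bucket").
    within: timedelta for how long the bucket stays open.
    now:    the time at which the trigger is evaluated.
    Any state change counts as the expected event (expect = prefect.flow-run.*).
    """
    opened_at = None
    for timestamp, state in events:
        if opened_at is None:
            if state in after:
                opened_at = timestamp  # bucket opens
        elif timestamp - opened_at <= within:
            opened_at = None  # another event arrived in time: bucket closes
        else:
            return True  # window elapsed with no event: trigger fires
    # A bucket that is still open fires once the window has elapsed.
    return opened_at is not None and now - opened_at > within


# Hypothetical timeline: PENDING, then RUNNING five seconds later, then nothing.
start = datetime(2024, 12, 28, 9, 0, 0)
events = [(start, "Pending"), (start + timedelta(seconds=5), "Running")]
now = start + timedelta(days=2)
ten_hours = timedelta(hours=10)

# One trigger with both states in `after`: RUNNING closes the PENDING bucket.
combined = proactive_trigger_fires(events, {"Pending", "Running"}, ten_hours, now)
# A trigger that only starts watching after RUNNING.
running_only = proactive_trigger_fires(events, {"Running"}, ten_hours, now)
print(combined, running_only)
```

Under these assumptions the combined trigger never fires for the stuck run, while a trigger that only opens its bucket on RUNNING does.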

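If a user needs an automation to trigger when a run is stuck in any one of several states, what is needed is a composite trigger (here, one of type compound) with a separate event trigger per state, e.g.: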

{
    "type": "compound",
    "require": "any",
    "triggers": [
        {
            "type": "event",
            "after": ["prefect.flow-run.Running"],  # Only start watching after RUNNING
            "expect": ["prefect.flow-run.*"],       # Watch for ANY state change
            "within": 36000.0,                      # Look for 10 hours
            "posture": "Proactive",
            "threshold": 1,                         # Fire if no events seen
            "for_each": ["prefect.resource.id"],
            "match": {
                "prefect.resource.id": "prefect.flow-run.*"
            }
        },
        {
            "type": "event",
            "after": ["prefect.flow-run.Pending"],  # Only start watching after PENDING
            "expect": ["prefect.flow-run.*"],       # Watch for ANY state change
            "within": 36000.0,                      # Look for 10 hours
            "posture": "Proactive",
            "threshold": 1,                         # Fire if no events seen
            "for_each": ["prefect.resource.id"],
            "match": {
                "prefect.resource.id": "prefect.flow-run.*"
            }
        },
        {
            "type": "event",
            "after": ["prefect.flow-run.Suspended"],  # Only start watching after SUSPENDED
            "expect": ["prefect.flow-run.*"],       # Watch for ANY state change
            "within": 36000.0,                      # Look for 10 hours
            "posture": "Proactive",
            "threshold": 1,                         # Fire if no events seen
            "for_each": ["prefect.resource.id"],
            "match": {
                "prefect.resource.id": "prefect.flow-run.*"
            }
        },
        {
            "type": "event",
            "after": ["prefect.flow-run.Retrying"],  # Only start watching after RETRYING
            "expect": ["prefect.flow-run.*"],       # Watch for ANY state change
            "within": 36000.0,                      # Look for 10 hours
            "posture": "Proactive",
            "threshold": 1,                         # Fire if no events seen
            "for_each": ["prefect.resource.id"],
            "match": {
                "prefect.resource.id": "prefect.flow-run.*"
            }
        }
    ]
}
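
The four event triggers differ only in their `after` state, so the same structure can be generated rather than written out by hand. The following is a minimal sketch using only the Python standard library. The helper names `stuck_state_trigger` and `stuck_run_compound_trigger` are hypothetical, and the field names and values are copied from the JSON in this comment rather than from a verified schema.

```python
import json

# The four states used in the trigger definition above.
WATCHED_STATES = ["Running", "Pending", "Suspended", "Retrying"]
TEN_HOURS_SECONDS = 10 * 60 * 60  # 36000 seconds


def stuck_state_trigger(state, within_seconds):
    """Build one proactive event trigger that starts watching after `state`."""
    return {
        "type": "event",
        "after": [f"prefect.flow-run.{state}"],  # open the window on this state
        "expect": ["prefect.flow-run.*"],        # any state change counts
        "within": within_seconds,
        "posture": "Proactive",
        "threshold": 1,                          # fire if no events are seen
        "for_each": ["prefect.resource.id"],
        "match": {"prefect.resource.id": "prefect.flow-run.*"},
    }


def stuck_run_compound_trigger(states, within_seconds):
    """Combine one event trigger per state; any one of them may fire."""
    return {
        "type": "compound",
        "require": "any",
        "triggers": [stuck_state_trigger(s, within_seconds) for s in states],
    }


trigger = stuck_run_compound_trigger(WATCHED_STATES, TEN_HOURS_SECONDS)
print(json.dumps(trigger, indent=4))
```

Because `json.dumps` emits strict JSON, the output contains no comments or trailing commas.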
