Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Show reason for Agent / Endpoint uninstallation #197731

Open
Tracked by #484
ycombinator opened this issue Oct 24, 2024 · 13 comments · May be fixed by #205815
Open
Tracked by #484

[Fleet] Show reason for Agent / Endpoint uninstallation #197731

ycombinator opened this issue Oct 24, 2024 · 13 comments · May be fixed by #205815
Assignees
Labels
good first issue low hanging fruit Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@ycombinator
Copy link
Contributor

Describe the feature:

With the work @michel-laterman has done for elastic/elastic-agent#484, Agent will send Fleet Server a reason for uninstalling when it is uninstalled. In the future, Endpoint will do the same. These components do this by calling the POST /api/fleet/agents/:id/audit/unenroll Fleet Server API. Note that this is a best effort API call.

We should surface this reason in the Fleet UI, which would be an improvement over what happens today when an Agent is uninstalled: the Agent leaves "offline" entries in the UI.

Describe a specific use case for the feature:

Clarifying the Agent listing in the Fleet UI by distinguishing between Agents that are active but currently offline with Agents that have been uninstalled.

@ycombinator ycombinator added the Team:Fleet Team label for Observability Data Collection Fleet team label Oct 24, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@juliaElastic
Copy link
Contributor

Should we introduce a new status like "orphaned" or "uninstalled" or include them in "unenrolled"? Leaving the agents offline is misleading, there is a bug logged because those agents can't be unenrolled: #197180

@jlind23
Copy link
Contributor

jlind23 commented Oct 25, 2024

"uninstalled" would definitely make sense in this case but why would we need "orphaned"?

@intxgo
Copy link
Contributor

intxgo commented Oct 25, 2024

We need to add Endpoint orphaned please!

Agent is neither Tamper Protected against admin user, neither can guarantee delivering the audit about uninstallation. However Tamper Protected Endpoint won't give up easily 🙂 so in most cases Agent will be surprisingly lost whilst Endpoint will keep running and protection the machine albeit invisible in Fleet 🙁 so the orphaned status will clearly signal the need to fix Agent on those machines.

PS. 8.16.0 Endpoint is already sending the orphaned audit as agreed with @michel-laterman

@Supplementing
Copy link
Contributor

Im working on this now, using the audit_unenroll_reason passed back to determine what agents have been uninstalled. I have a few questions but not sure who exactly to ask for each. @juliaElastic @jlind23

  1. For clarification, in the UI, do we want the chip to show 'Orphaned' or 'Uninstalled'? I see both mentioned above but not sure which was decided on. Presumably uninstalled as its more clear.
  2. Also to further clarify, we will want the status dropdown as well as the legend to reflect the same correct?
  3. Im assuming we will want some CTA letting the user know that the agent needs to be fixed/investigated/etc. Do we want to use the tooltip for that or have some other mechanism?
  4. As for the color coding, should we use danger or warning (or something else)?

@cmacknz
Copy link
Member

cmacknz commented Jan 6, 2025

For clarification, in the UI, do we want the chip to show 'Orphaned' or 'Uninstalled'? I see both mentioned above but not sure which was decided on. Presumably uninstalled as its more clear.

'Orphaned" and "Uninstalled" are two distinct states:

  • Uninstalled: someone executed elastic-agent uninstall successfully and the agent is offline because it was removed.
  • Orphaned: this is an error state where the elastic-agent service is removed but the endpoint-security service was not. When Elastic Defend is used, the part that provides endpoint protection is a distinct service that can be left behind if things go wrong in the "right" way. This is a consequence of the tamper protection implementation. See https://www.elastic.co/guide/en/security/current/agent-tamper-protection.html.

Im assuming we will want some CTA letting the user know that the agent needs to be fixed/investigated/etc

Only for the orphaned state IMO. An uninstall happening is not necessarily notable if it was on purpose. The orphaned state is always an error condition.

@kpollich
Copy link
Member

kpollich commented Jan 7, 2025

Also to further clarify, we will want the status dropdown as well as the legend to reflect the same correct?

When we say "legend" what do we mean here?

As for the color coding, should we use danger or warning (or something else)?

I think we have a specific set of color mappings for the status badges defined here that we'd want to expand:

export function getColorForAgentStatus(
agentStatus: SimplifiedAgentStatus,
euiTheme: EuiThemeComputed<{}>
): string {
switch (agentStatus) {
case 'healthy':
return euiTheme.colors.backgroundFilledSuccess;
case 'offline':
return euiTheme.colors.lightShade;
case 'inactive':
return euiTheme.colors.darkShade;
case 'unhealthy':
return euiTheme.colors.backgroundFilledWarning;
case 'updating':
return euiTheme.colors.backgroundFilledPrimary;
case 'unenrolled':
return euiTheme.colors.backgroundBaseDisabled;
default:
throw new Error(`Unsupported Agent status ${agentStatus}`);
}
}

Maybe @jillguyonnet or @criamico or @jen-huang would be able to offer an opinion on what we should do for color choices here? I wonder if we should just have uninstalled display with the same color as offline and orphaned display the same as unhealthy.

@Supplementing
Copy link
Contributor

By legend, I meant this part above the table with the statuses and colors: Image

I had the same thought on the colors after I commented, but wanted to hold tight for more insight.

@jillguyonnet
Copy link
Contributor

Semantically, it makes sense to me to use the same colour for uninstalled and offline, and the same for unhealthy and orphaned. The colours should match between the status chip ("legend") and the badge.

FYI, I have filed a followup issue to the EUI visual refresh work for deprecations cleanup, part of which is the colours used for agent status in Fleet (Replace deprecated color tokens section). This is to draw attention to the fact that some of the current tokens are deprecated. offline, in particular, uses the deprecated lightShade token for the status chip, which is intended for borders (+ there is a slight mismatch with the corresponding badge colour).

The current colour use is:

Status Status chip Status badge
Healthy euiTheme.colors.backgroundFilledSuccess success
Updating euiTheme.colors.backgroundFilledPrimary primary
Offline euiTheme.colors.lightShade default
Inactive euiTheme.colors.darkShade euiVars.euiColorDarkShade
Unenrolled euiTheme.colors.backgroundBaseDisabled euiVars.euiColorDisabled
Unhealthy euiTheme.colors.backgroundFilledWarning warning

So offline, inactive and unenrolled have different shades of grey.

I don't think any of this should affect this change, but we might eventually want to revise this.

@Supplementing
Copy link
Contributor

Supplementing commented Jan 7, 2025

Thanks @jillguyonnet! Do we want the status filter to have uninstalled and orphaned as their own filters and also be broken out in the 'legend' as well? Since they're technically all considered 'offline' (based on the status), they're all falling into that bucket and being counted under the 'legend' as just offline which might be misleading. Since uninstalled and orphaned are derived from the new audit_unenroll_reason field rather than status, the current implementation doesn't account for it.

@kpollich
Copy link
Member

kpollich commented Jan 7, 2025

Do we want the status filter to have uninstalled and orphaned as their own filters and also be broken out in the 'legend' as well?

I'm a little worried about screen real estate if we add two new status to this list, but they should probably appear in that list for now so there's not an inconsistent treatment of certain statuses.

@Supplementing
Copy link
Contributor

Dropping this here to get eyes on, this is how it looks with all the statuses added. Not horrible, but could certainly be improved. Im going to work on hiding the labels that have zero matches to clean it up a bit per @kpollich recommendation. Image

@Supplementing Supplementing linked a pull request Jan 7, 2025 that will close this issue
7 tasks
@jillguyonnet
Copy link
Contributor

Since they're technically all considered 'offline' (based on the status), they're all falling into that bucket

I agree it would be misleading to group three status under one, especially if the new status are shown on agent badges (1 offline agent + 1 orphaned agent -> 2 offline agents in the status summary, not good). orphaned, in particular, seems semantically related to unhealthy rather than offline.

I personally like the clarity of all status being explicitly represented and available for filtering. Real estate is definitely a valid concern for the status summary though, hiding status with zero agents would certainly help with that. Perhaps it could be worth involving UX folks on this question (for future improvement), since there is also the question of the deprecated non semantic colour tokens. Some of these shades of grey are a little difficult to tell apart. 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue low hanging fruit Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants