[filebeat] Elasticsearch state storage for httpjson and cel inputs #41446
base: main
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
@belimawr @cmacknz (or whoever wants to be involved or has time for it)
@leehinman I'd appreciate a review here to make sure this can co-exist with Beats receivers in agent, since that would be the long-term way we plan to run agentless inputs.
filebeat/features/features.go
Outdated
// List of input types Elasticsearch state store is enabled for
var esTypesEnabled = map[string]void{
	"httpjson": {},
Can this be configuration instead of in the code, maybe another env var?
Sure can do. Something like this?
AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES=httpjson,cel
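A minimal sketch of how such a comma-separated variable could be turned into the set of enabled input types. Only the variable name comes from the comment above; the helper `parseInputTypes`, the whitespace trimming, and the skipping of empty entries are assumptions for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envInputTypes is the variable name proposed in the review thread.
const envInputTypes = "AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES"

// parseInputTypes (hypothetical helper) converts "httpjson,cel" into a set.
// Surrounding whitespace is trimmed and empty entries are skipped.
func parseInputTypes(raw string) map[string]struct{} {
	types := make(map[string]struct{})
	for _, t := range strings.Split(raw, ",") {
		t = strings.TrimSpace(t)
		if t != "" {
			types[t] = struct{}{}
		}
	}
	return types
}

func main() {
	// Set here only so the example runs standalone.
	os.Setenv(envInputTypes, "httpjson,cel")

	enabled := parseInputTypes(os.Getenv(envInputTypes))
	for _, typ := range []string{"httpjson", "cel", "filestream"} {
		_, ok := enabled[typ]
		fmt.Printf("%s: %v\n", typ, ok)
	}
}
```

With this shape, an unset or empty variable yields an empty set, so no input type would use the Elasticsearch store unless it is listed explicitly.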
/test
type UnsubscribeFunc func()

type Notifier struct {
	mx sync.RWMutex
Suggested change:
- mx sync.RWMutex
+ mx sync.Mutex
Another perhaps overly cautious observation: The only purpose of a `RWMutex` is to allow calling `Notify()` concurrently. However, I don't see the benefit of that: `Notify()` should finish quite fast since it just starts goroutines and, more importantly, concurrent calls of `Notify()` lead to a race condition between the same callback which _could_ mean that the wrong item gets stored due to timing issues. Changing to a simple `Mutex` doesn't fix the race condition but there's no reason for that `RW` there.
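For illustration, a minimal sketch of a notifier guarded by a plain `sync.Mutex`. Only `UnsubscribeFunc`, `Notifier`, the `mx` field, and `Notify()` appear in the diff; `NewNotifier`, `Subscribe`, the listener map, and the `string` payload are assumptions, and the PR's real implementation may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// UnsubscribeFunc removes a previously registered callback.
type UnsubscribeFunc func()

// Notifier fans a value out to subscribers.
// Every field except mx is an assumption made for this sketch.
type Notifier struct {
	mx        sync.Mutex
	nextID    int
	listeners map[int]func(string)
}

// NewNotifier (hypothetical constructor) returns an empty Notifier.
func NewNotifier() *Notifier {
	return &Notifier{listeners: make(map[int]func(string))}
}

// Subscribe registers fn and returns a function that unregisters it.
func (n *Notifier) Subscribe(fn func(string)) UnsubscribeFunc {
	n.mx.Lock()
	defer n.mx.Unlock()
	id := n.nextID
	n.nextID++
	n.listeners[id] = fn
	return func() {
		n.mx.Lock()
		defer n.mx.Unlock()
		delete(n.listeners, id)
	}
}

// Notify starts one goroutine per subscriber and returns without waiting.
// The mutex only protects the listener map; it does not order callbacks
// started by successive Notify calls.
func (n *Notifier) Notify(v string) {
	n.mx.Lock()
	defer n.mx.Unlock()
	for _, fn := range n.listeners {
		go fn(v)
	}
}

func main() {
	n := NewNotifier()

	var wg sync.WaitGroup
	wg.Add(1)
	unsubscribe := n.Subscribe(func(v string) {
		defer wg.Done()
		fmt.Println("received:", v)
	})

	n.Notify("cursor-1")
	wg.Wait()
	unsubscribe()
}
```

Because `Notify()` holds the lock only long enough to launch goroutines, a read lock buys little here. Two callbacks started by back-to-back `Notify()` calls can still finish in either order, which is the race the comment describes.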
This pull request is now in conflicts. Could you fix it? 🙏
/test
/test
Proposed commit message
[filebeat] Elasticsearch state storage for httpjson input
This is a POC for Elasticsearch as a state store backend for Security Integrations in the Agentless solution.
The scope of this change was narrowed down to supporting only `httpjson` inputs in order to support the Okta integration for the initial release. All other integration inputs still use the file storage as before.
This is a short-term solution for state storage in the k8s environment.
This is the first cut and the details can change depending on the feedback.
The feature can currently be enabled with `AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED`; how this will be configurable in k8s is still to be decided.
This change currently contains a hacky approach: the `AGENTLESS_ELASTICSEARCH_APIKEY` overwrite. This allows the user to provide an ApiKey with the elevated permissions required to create/write/read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.
The existing code relied on the inputs' state storage being fully configurable before the main beat manager runs. This change delays the configuration of the `httpjson` input until the actual configuration is received from the Agent.
There is an assumption that the index template for the state storage indices is already in place before the storage is used.
Example of the state storage index content for Okta integration:
The naming convention for all state stores is `agentless-state-<input id>`, since the expectation for agentless is that we would have only one agent per policy and the agents are ephemeral.
Currently, in order to run the agent with Elasticsearch state storage, a couple of environment variables would be required:
where the ApiKey in the
DEPENDENCIES / TODOS:
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Disruptive User Impact
The change should have no impact: without the feature enabled, Filebeat should work as before, using file system storage for the state.