
Change parsing for JSON logs #143

Merged
Merged 2 commits into develop on Sep 18, 2024
Conversation

markdboyd

Changes Proposed

Currently, when Logstash receives an input log that is JSON, it parses the JSON properties into a field called app.

This behavior is convenient because it lets logs be searched by the custom properties from the JSON log (based on the detected field type for each property). However, it means that the first log received for an index determines the type of every field parsed from JSON logs.

In our case, where all customer logs are ingested into a shared daily index, this is problematic: subsequent JSON logs may use different field types for keys that already exist on previously ingested JSON logs. When this occurs, Elasticsearch rejects the incoming JSON log due to the mismatch in field types.
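As a hypothetical illustration (the field name status is made up, not taken from a real customer log), suppose the first JSON log ingested into the day's index is:

```json
{"status": 200}
```

Elasticsearch dynamically maps the status key as a numeric field. A later log from another app such as:

```json
{"status": "OK"}
```

is then rejected, since the string "OK" cannot be indexed into a field already mapped as numeric.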

The proposed fix is to use a flattened field type for dynamic JSON properties, stored in a new custom field. With a flattened field type, the entire object from a custom JSON log is indexed and remains searchable, though only with basic search functionality.
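The index mapping this implies would look roughly like the following (a sketch of an Elasticsearch flattened mapping, not the exact template from this PR):

```json
{
  "mappings": {
    "properties": {
      "custom": {
        "type": "flattened"
      }
    }
  }
}
```

With this mapping, every key under custom is stored inside a single flattened field rather than generating its own entry in the index mappings.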

So for a log like:

{"custom":{"foo":"bar"}}

You could search for the document like so:

custom.foo: "bar"

The only downside of flattened fields is that they can only be queried with basic search functionality, but that should be sufficient for letting users query their documents in Kibana. The big upside is that a flattened field can hold a complex data structure for searching while occupying only one field in the index mappings, rather than one field per custom property in the JSON logs. Flattened fields are also not subject to the field type mismatch issues that are causing some customer logs to fail ingestion.

So this PR:

  • Adds a default mapping for the custom field to be a flattened type
  • Updates Logstash to parse incoming JSON into the custom field
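
On the Logstash side, the change could be sketched with the standard json filter plugin (the JSON-detection condition and field names here are illustrative, not the exact config in this PR):

```
filter {
  # If the message looks like a JSON object, parse it into the
  # "custom" field instead of spreading keys across top-level fields.
  if [message] =~ /\A\{.+\}\z/ {
    json {
      source => "message"
      target => "custom"
    }
  }
}
```

Setting target confines all parsed keys to the custom field, which is what allows the single flattened mapping above to absorb arbitrary customer JSON.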

Security Considerations

There are no changes to security configuration for Elasticsearch here, just changing the destination field type for custom JSON logs.

@markdboyd markdboyd changed the title Change parsing json logs Change parsing for JSON logs Sep 18, 2024
@JasonTheMain JasonTheMain left a comment

This might work, let's give it a shot

@markdboyd markdboyd merged commit 5d706d6 into develop Sep 18, 2024
2 checks passed
@markdboyd markdboyd deleted the change-parsing-json-logs branch September 18, 2024 15:37
markdboyd added a commit that referenced this pull request Sep 19, 2024