Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stdlib): Add main log format to parse_nginx_log function #1202

Merged

Conversation

virtualtam
Copy link
Contributor

@virtualtam virtualtam commented Jan 2, 2025

Summary

This PR adds support for the main log format to parse_nginx_logs, which is defined by the default configuration file for official Nginx Docker images.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

  • New test cases have been added to cover the main log format for Nginx log lines
  • Test input correspond to actual log lines emitted by Nginx when:
    • accessed directly (client -> Nginx)
    • accessed behind a single reverse proxy (client -> proxy -> Nginx)
    • accessed behind a chain of two reverse proxies (client -> proxy1 -> proxy2 -> Nginx)

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on
    our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Our CONTRIBUTING.md is a good starting place.
  • If this PR introduces changes to LICENSE-3rdparty.csv, please
    run dd-rust-license-tool write and commit the changes. More details here.
  • For new VRL functions, please also create a sibling PR in Vector to document the new function.

PR adding documentation: vectordotdev/vector#22119

References

Closes #1201

Notes for reviewers

@virtualtam virtualtam force-pushed the feature/stdlib/nginx-format-main branch from ccbe547 to 6996298 Compare January 2, 2025 18:17
@pront pront self-assigned this Jan 2, 2025
@jszwedko jszwedko added the vrl: stdlib Changes to the standard library label Jan 2, 2025
Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @virtualtam! This mostly looks good. There's a failing test though.

@virtualtam virtualtam force-pushed the feature/stdlib/nginx-format-main branch from 6996298 to 39802b6 Compare January 2, 2025 21:43
@virtualtam
Copy link
Contributor Author

virtualtam commented Jan 2, 2025

Thanks for the feedback @pront 👍

I have fixed the failing test, and the CI shows two unrelated failed tests (that seem to depend on the year the tests are being run?):

./scripts/vrl_tests.sh

# [...]

functions/parse_klog
  valid.......................................................FAILED (expectation)  {
  "file": "klog.go",
  "id": 28133,
  "level": "info",
  "line": 70,
  "message": "hello from klog",
   "timestamp": "2024-05-05T17:59:40.692994Z"
   "timestamp": "2025-05-05T17:59:40.692994Z"
}

functions/parse_linux_authorization
  parse authorization event...................................FAILED (expectation)  {
  "appname": "sshd",
  "hostname": "localhost",
  "message": "Accepted publickey for eng from 10.1.1.1 port 8888 ssh2: RSA SHA256:foobar",
  "procid": 1111,
   "timestamp": "2024-03-23T01:49:58Z"
   "timestamp": "2025-03-23T01:49:58Z"
}

EDIT: The two year-dependent failed tests were previously discussed in #8 and most recently bumped in #1203, so I'll just rebase this PR :)

@virtualtam virtualtam force-pushed the feature/stdlib/nginx-format-main branch from 39802b6 to 022a5cf Compare January 2, 2025 21:53
Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @virtualtam, this looks good. I would also add one more test case for invalid input.

@virtualtam virtualtam force-pushed the feature/stdlib/nginx-format-main branch from 022a5cf to e58d4b8 Compare January 3, 2025 17:01
@virtualtam
Copy link
Contributor Author

virtualtam commented Jan 3, 2025

Thanks for the review @pront 👍

I have updated the PR with the following changes:

  • update the example to feature a log line with all fields set
  • add a test case with all fields set
  • add a test case with an "invalid" log line (attempt to parse an error line as a main line)

Additionally, I have signed the Vector CLA and will proceed to open a PR on the Vector repository to update usage documentation for the parse_nginx_log remap function 📖

EDIT: PR adding documentation: vectordotdev/vector#22119

@pront
Copy link
Member

pront commented Jan 3, 2025

Thanks for the review @pront 👍

I have updated the PR with the following changes:

  • update the example to feature a log line with all fields set
  • add a test case with all fields set
  • add a test case with an "invalid" log line (attempt to parse an error line as a main line)

Additionally, I have signed the Vector CLA and will proceed to open a PR on the Vector repository to update usage documentation for the parse_nginx_log remap function 📖

EDIT: PR adding documentation: vectordotdev/vector#22119

Awesome, thank you!

💭 It would be preferable if you don't force push to the branch in the future so I can do incremental reviews. But all good, I will schedule this PR for merging soon.

@pront pront closed this Jan 3, 2025
auto-merge was automatically disabled January 3, 2025 17:37

Pull request was closed

@pront pront reopened this Jan 3, 2025
@pront pront enabled auto-merge January 3, 2025 17:37
@pront pront added this pull request to the merge queue Jan 3, 2025
@virtualtam
Copy link
Contributor Author

It would be preferable if you don't force push to the branch in the future so I can do incremental reviews.

I'm sorry for that, point taken for future PRs 👍

Maybe the incremental review workflow could be documented as part of the CONTRIBUTING guidelines for VRL/Vector to help new contributors?

@pront
Copy link
Member

pront commented Jan 3, 2025

Maybe the incremental review workflow could be documented as part of the CONTRIBUTING guidelines for VRL/Vector to help new contributors?

Good point, I will add this to the next batch of improvements. Thanks again @virtualtam!

Merged via the queue into vectordotdev:main with commit 4574f8d Jan 3, 2025
29 of 30 checks passed
@virtualtam virtualtam deleted the feature/stdlib/nginx-format-main branch January 3, 2025 18:18
pront pushed a commit that referenced this pull request Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
vrl: stdlib Changes to the standard library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VRL: add main log format for parse_nginx_log
3 participants