k3s crash loop nf_tables failed to verify rule exists "Error: cmp sreg undef" #11493
The crash is in the kube-router netpol controller (https://github.com/cloudnativelabs/kube-router); I would suggest opening an issue there. The error suggests that you have something else creating native nft rules that the nftables wrapper cannot parse when it is trying to modify the existing ruleset. What else are you running on this node?
Thanks. I think you might have pinpointed where my issues are coming from. I'm also running netbird (a WireGuard overlay network) on the nodes, but didn't think about it because I'm not running it in Kubernetes, only on the host for remote access. I'll jump into cloudnativelabs/kube-router#1777 to continue the discussion. Here is the nft ruleset after netbird has started (on a different machine from the kube cluster, just to see what rules netbird may be adding).
@brandond I'm trying to confirm some things for the upstream kube-router issue. It appears k3s is running an embedded kube-router, so I can't easily exec into the kube-router container and check iptables versions. Can you point me to a) a way to check the iptables version and b) a way to override the iptables binary being used, so I can debug things? Thanks
Actually, I think it's just Which means I'm struggling to see if this is upstream related.
Even the latest release of buildroot is only on iptables 1.8.10, although we do have the ability to override that version in k3s-root when setting up the buildroot env. How confident are you that this is fixed in iptables 1.8.11?
I'm not sure anyone is confident yet, but being able to test it would help. Hopefully this week I can find some time to try and build k3s from source and I'll try iptables 1.8.11. Building stuff from source is something I've done before, so hopefully k3s isn't too difficult.
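For anyone comparing versions while testing, here is a rough sketch of a check against the 1.8.11 threshold discussed above. It assumes the usual `iptables --version` output of the form `iptables v1.8.9 (nf_tables)`; the function name `iptables_at_least` is made up for illustration, and 1.8.11 is only the version being discussed here, not a confirmed fix.

```shell
# Hypothetical helper: reads one `iptables --version` line on stdin and
# succeeds (exit 0) if the reported version is >= the minimum given as $1.
# Returns 2 if no version could be parsed from the input.
iptables_at_least() {
  min="$1"
  # Extract the dotted version number that follows "iptables v".
  ver=$(sed -n 's/^iptables v\([0-9][0-9.]*\).*/\1/p')
  [ -n "$ver" ] || return 2
  # Sort the two versions numerically field by field; the first line is the lower one.
  lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -t. -k1,1n -k2,2n -k3,3n | head -n1)
  [ "$lowest" = "$min" ]
}
```

Usage would be something like `iptables --version | iptables_at_least 1.8.11 && echo "new enough"`, run against whichever iptables binary k3s is actually invoking on the node.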
The problem is usually caused by things using different versions of iptables, or things using nft directly and creating rules that iptables-nft cannot parse. You might be able to address your problem just by eliminating use of other versions of iptables, and direct use of nft, so that everything is compatible.
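One way to look for direct nft users on a node is to list the tables and flag any that are not the standard iptables-nft ones. The sketch below is only a heuristic: it assumes `nft list tables` prints one `table <family> <name>` line per table, it treats only the `ip`/`ip6` tables named `filter`, `nat`, `mangle`, `raw` and `security` as iptables-managed, and the function name `non_iptables_tables` is made up. It will not catch a tool that adds incompatible rules inside those standard tables.

```shell
# Hypothetical helper: reads `nft list tables` output on stdin and prints
# "<family> <name>" for every table that iptables-nft would not create itself.
non_iptables_tables() {
  awk '
    $1 == "table" {
      family = $2; name = $3
      # Tables that iptables-nft / ip6tables-nft create for the classic iptables tables.
      standard = (family == "ip" || family == "ip6") &&
                 (name == "filter" || name == "nat" || name == "mangle" ||
                  name == "raw" || name == "security")
      if (!standard) print family, name
    }'
}
```

Usage: `nft list tables | non_iptables_tables`. Anything it prints was likely created by something talking to nftables directly and is worth a closer look with `nft list table <family> <name>`.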
Environmental Info:
K3s Version:
No iptables command is installed on the host; iptables is provided by k3s.
Node(s) CPU architecture, OS, and Version:
Debian 12, x86_64
Linux me5 6.1.0-26-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux
Linux me4 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
Cluster Configuration:
3 nodes, 1 node running server (with workloads), 2 nodes running agents. Longhorn for shared storage, as well as NFS to external NFS server.
Describe the bug:
k3s goes into a crash loop due to an error verifying a firewall rule (nft). The iptables command k3s is attempting to run produces the same error message when run manually. The error can be cleared and k3s restarted by flushing the nft tables, but it returns at random later on.
The main error message is of the form:
Steps To Reproduce:
curl -sfL https://get.k3s.io | sh -s - --disable servicelb
curl -sfL https://get.k3s.io | K3S_URL=https://1.2.3.4:6443 K3S_TOKEN=XXX sh -
Expected behavior:
k3s should not crash when an iptables command fails. While this is probably an iptables bug, I can't find any references online to iptables failing in this way, and k3s uses a bundled iptables. Ideally k3s should handle the error without going into a crash loop.
Actual behavior:
k3s goes into a crash loop. The iptables command in the logs fails even when run manually (with quotes around the comment) until a
nft flush ruleset
is run.
Additional context / logs: