k3s crash loop nf_tables failed to verify rule exists "Error: cmp sreg undef" #11493

Open
timwhite opened this issue Dec 22, 2024 · 7 comments

@timwhite

Environmental Info:
K3s Version:

k3s version v1.31.3+k3s1 (6e6af988)
go version go1.22.8
nftables version:
nft -V
nftables v1.0.6 (Lester Gooch #5)
  cli:		editline
  json:		yes
  minigmp:	no
  libxtables:	yes

No iptables command is installed on the host; iptables is provided by k3s.

Node(s) CPU architecture, OS, and Version:
Debian 12, x86_64
Linux me5 6.1.0-26-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux
Linux me4 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux

Cluster Configuration:
3 nodes: 1 running the server (with workloads), 2 running agents. Longhorn for shared storage, plus NFS mounts to an external NFS server.

Describe the bug:
k3s goes into a crash loop due to an error verifying a firewall rule (nft). The iptables command that k3s is attempting to run produces the same error when run manually. The error can be cleared (and k3s restarted) by flushing the nft tables, but it returns again at random later on.

The main error message is of the form:

Failed to verify rule exists in FORWARD chain due to running [/var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables -t filter -C FORWARD -m comment --comment kube-router netpol - TEMCG2JMHZYE7H7T -j KUBE-ROUTER-FORWARD --wait]: exit status 3: Error: cmp sreg undef

Steps To Reproduce:

  • Installed K3s: curl -sfL https://get.k3s.io | sh -s - --disable servicelb
  • Installed k3s nodes: curl -sfL https://get.k3s.io | K3S_URL=https://1.2.3.4:6443 K3S_TOKEN=XXX
  • Using MetalLB with an L2Advertisement and a range of 9 IP addresses from the LAN subnet. The advertisement is limited to 2 of the nodes, but the issue occurs on all nodes at random (a hedged sketch of this configuration follows the list).
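A hedged sketch of the MetalLB configuration described above; the pool name, address range, and which two nodes announce are illustrative placeholders, not the exact values in use:

# Hypothetical approximation of the MetalLB setup; names and addresses are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lan-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.248    # a range of 9 LAN addresses
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lan-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lan-pool
  nodeSelectors:                      # announce only from 2 of the 3 nodes
    - matchLabels:
        kubernetes.io/hostname: me4
    - matchLabels:
        kubernetes.io/hostname: me5
EOF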

Expected behavior:
k3s should not crash when an iptables command fails. While the underlying problem is probably an iptables bug, I can't find any references online to iptables failing in this way, and k3s uses a bundled iptables. Ideally k3s should handle the error without going into a crash loop.

Actual behavior:
k3s goes into a crash loop. The iptables command from the logs fails even when run manually (with quotes around the comment) until nft flush ruleset is run.
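For anyone else hitting this, the rough recovery sequence is below. The data-dir hash in the path is machine-specific, and nft flush ruleset removes every firewall rule on the host, so treat it as a last resort:

# Re-run the failing check by hand (the comment must be quoted in a shell);
# while the bad ruleset is present this exits with status 3 and "Error: cmp sreg undef".
/var/lib/rancher/k3s/data/<data-dir-hash>/bin/aux/iptables -t filter -C FORWARD \
  -m comment --comment "kube-router netpol - TEMCG2JMHZYE7H7T" \
  -j KUBE-ROUTER-FORWARD --wait

# Clear the broken ruleset and restart k3s (destructive: removes ALL nft rules)
nft flush ruleset
systemctl restart k3s-agent    # or k3s on the server node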

Additional context / logs:

Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.715937  555128 volume_manager.go:289] "Starting Kubelet Volume Manager"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.716098  555128 dynamic_serving_content.go:135] "Starting controller" name="kubelet-server-cert-files::/var/lib/rancher/k3s/agent/serving-kubelet.crt::/var/lib/rancher/k3s/agent/serving-kubelet.key"
Dec 22 10:54:52 me5 k3s[555128]: E1222 10:54:52.719468  555128 kubelet.go:1478] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.723440  555128 factory.go:221] Registration of the systemd container factory successfully
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.724641  555128 desired_state_of_world_populator.go:146] "Desired state populator starts to run"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.726407  555128 factory.go:219] Registration of the crio container factory failed: Get "http://%2Fvar%2Frun%2Fcrio%2Fcrio.sock/info": dial unix /var/run/crio/crio.sock: connect: no such file or directory
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.731858  555128 reconciler.go:26] "Reconciler: start to sync state"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.738881  555128 factory.go:221] Registration of the containerd container factory successfully
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.739103  555128 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv4"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.741138  555128 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv6"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.741171  555128 status_manager.go:217] "Starting to sync pod status with apiserver"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.741193  555128 kubelet.go:2321] "Starting kubelet main sync loop"
Dec 22 10:54:52 me5 k3s[555128]: E1222 10:54:52.741250  555128 kubelet.go:2345] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Dec 22 10:54:52 me5 k3s[555128]: time="2024-12-22T10:54:52+08:00" level=info msg="Starting network policy controller version v2.2.1, built on 2024-12-05T21:29:39Z, go1.22.8"
Dec 22 10:54:52 me5 k3s[555128]: time="2024-12-22T10:54:52+08:00" level=info msg="k3s agent is up and running"
Dec 22 10:54:52 me5 k3s[555128]: I1222 10:54:52.790235  555128 network_policy_controller.go:164] Starting network policy controller
Dec 22 10:54:52 me5 systemd[1]: Started k3s-agent.service - Lightweight Kubernetes.
Dec 22 10:54:52 me5 k3s[555128]: F1222 10:54:52.796106  555128 network_policy_controller.go:400] Failed to verify rule exists in FORWARD chain due to running [/var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables -t filter -C FORWARD -m comment --comment kube-router netpol - TEMCG2JMHZYE7H7T -j KUBE-ROUTER-FORWARD --wait]: exit status 3: Error: cmp sreg undef
Dec 22 10:54:52 me5 k3s[555128]: iptables v1.8.9 (nf_tables): Parsing nftables rule failed
Dec 22 10:54:52 me5 k3s[555128]: Perhaps iptables or your kernel needs to be upgraded.
Dec 22 10:54:52 me5 k3s[555128]: panic: F1222 10:54:52.796106  555128 network_policy_controller.go:400] Failed to verify rule exists in FORWARD chain due to running [/var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables -t filter -C FORWARD -m comment --comment kube-router netpol - TEMCG2JMHZYE7H7T -j KUBE-ROUTER-FORWARD --wait]: exit status 3: Error: cmp sreg undef
Dec 22 10:54:52 me5 k3s[555128]: iptables v1.8.9 (nf_tables): Parsing nftables rule failed
Dec 22 10:54:52 me5 k3s[555128]: Perhaps iptables or your kernel needs to be upgraded.
Dec 22 10:54:52 me5 k3s[555128]: goroutine 686 [running]:
Dec 22 10:54:52 me5 k3s[555128]: k8s.io/klog/v2.(*loggingT).output(0xb3c6f00, 0x3, 0xc0028a97c0, 0xc00109c930, 0x1, {0x8d844fc?, 0x2?}, 0xc00189cff0?, 0x0)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/klog/[email protected]/klog.go:965 +0x73d
Dec 22 10:54:52 me5 k3s[555128]: k8s.io/klog/v2.(*loggingT).printfDepth(0xb3c6f00, 0x3, 0xc0028a97c0, {0x0, 0x0}, 0x1, {0x68b8a66, 0x32}, {0xc0033e1c40, 0x2, ...})
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/klog/[email protected]/klog.go:767 +0x1f0
Dec 22 10:54:52 me5 k3s[555128]: k8s.io/klog/v2.(*loggingT).printf(...)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/klog/[email protected]/klog.go:744
Dec 22 10:54:52 me5 k3s[555128]: k8s.io/klog/v2.Fatalf(...)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/klog/[email protected]/klog.go:1655
Dec 22 10:54:52 me5 k3s[555128]: github.com/cloudnativelabs/kube-router/v2/pkg/controllers/netpol.(*NetworkPolicyController).ensureTopLevelChains.func2({0x779ca60, 0xc001564550}, {0x675dfb2, 0x7}, {0xc001317aa0, 0x6, 0x6}, {0xc0028608c0, 0x10}, 0x1)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/kube-router/[email protected]/pkg/controllers/netpol/network_policy_controller.go:400 +0x1b2
Dec 22 10:54:52 me5 k3s[555128]: github.com/cloudnativelabs/kube-router/v2/pkg/controllers/netpol.(*NetworkPolicyController).ensureTopLevelChains(0xc00332d560)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/kube-router/[email protected]/pkg/controllers/netpol/network_policy_controller.go:467 +0x1be9
Dec 22 10:54:52 me5 k3s[555128]: github.com/cloudnativelabs/kube-router/v2/pkg/controllers/netpol.(*NetworkPolicyController).Run(0xc00332d560, 0xc00287c120, 0xc0000d6360, 0xc002c3afd0)
Dec 22 10:54:52 me5 k3s[555128]:         /go/pkg/mod/github.com/k3s-io/kube-router/[email protected]/pkg/controllers/netpol/network_policy_controller.go:168 +0x171
Dec 22 10:54:52 me5 k3s[555128]: created by github.com/k3s-io/k3s/pkg/agent/netpol.Run in goroutine 1
Dec 22 10:54:52 me5 k3s[555128]:         /go/src/github.com/k3s-io/k3s/pkg/agent/netpol/netpol.go:184 +0xe34
Dec 22 10:54:52 me5 systemd[1]: k3s-agent.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 22 10:54:52 me5 systemd[1]: k3s-agent.service: Failed with result 'exit-code'.
Dec 22 10:54:52 me5 systemd[1]: k3s-agent.service: Unit process 146212 (containerd-shim) remains running after unit stopped.
@brandond
Member

brandond commented Dec 22, 2024

The crash is in the kube-router netpol controller: https://github.com/cloudnativelabs/kube-router

I would suggest opening an issue there.

The error suggests that you have something else creating native nft rules that the nftables wrapper cannot parse when it is trying to modify the existing ruleset. What else are you running on this node?
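As a quick check (just a sketch), you can see which tables exist and whether any of them were created outside the iptables-nft compatibility layer; tables created through iptables-nft are annotated with a warning comment, and anything without it was likely created with nft directly:

# List all nftables tables currently in the kernel
nft list tables

# Tables created via the iptables-nft shim carry a
# "# Warning: table ... is managed by iptables-nft, do not touch!" annotation
nft list ruleset | grep 'managed by iptables-nft'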

@timwhite
Author

timwhite commented Dec 22, 2024

Thanks. I think you might have pinpointed where my issues are coming from. I'm also running netbird (a WireGuard overlay network) on the nodes, but didn't think of it because it isn't running in Kubernetes; it's just there for remote access to the host.
#11415 and cloudnativelabs/kube-router#1777 sound like similar issues. However, according to the nft ruleset output after starting just netbird (before Kubernetes), its table is marked as managed by iptables-nft, the same as k3s, so it shouldn't be causing the kind of issue seen above.

I'll jump into cloudnativelabs/kube-router#1777 to continue the discussion.

Here is the nft ruleset after netbird has started (on a different machine from the kube cluster, just to see what rules netbird may be adding).

# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
	chain NETBIRD-RT-FWD {
		ct state related,established counter packets 0 bytes 0 accept
	}

	chain NETBIRD-ACL-INPUT {
		counter packets 0 bytes 0 accept
	}

	chain NETBIRD-ACL-OUTPUT {
		counter packets 0 bytes 0 accept
	}

	chain FORWARD {
		type filter hook forward priority filter; policy accept;
		oifname "wt0" ct state related,established counter packets 0 bytes 0 accept
		meta mark 0x0001bd01 counter packets 0 bytes 0 jump NETBIRD-ACL-INPUT
		iifname "wt0" counter packets 0 bytes 0 jump NETBIRD-RT-FWD
		iifname "wt0" counter packets 0 bytes 0 drop
	}

	chain INPUT {
		type filter hook input priority filter; policy accept;
		iifname "wt0" ct state related,established counter packets 0 bytes 0 accept
		iifname "wt0" counter packets 0 bytes 0 jump NETBIRD-ACL-INPUT
		iifname "wt0" counter packets 0 bytes 0 drop
	}

	chain OUTPUT {
		type filter hook output priority filter; policy accept;
		oifname "wt0" ct state related,established counter packets 0 bytes 0 accept
		oifname "wt0" ip daddr != 100.77.0.0/16 counter packets 0 bytes 0 accept
		oifname "wt0" counter packets 0 bytes 0 jump NETBIRD-ACL-OUTPUT
		oifname "wt0" counter packets 0 bytes 0 drop
	}
}

@timwhite
Author

@brandond I'm trying to confirm some things for the upstream kube-router issue. It appears k3s runs an embedded kube-router, so I can't simply exec into a kube-router container and check iptables versions. Can you point me to a) a way to check the iptables version, and b) whether there is a way to override the iptables binary being used, for debugging?

Thanks

@timwhite
Author

Actually, I think it's just /var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables on the filesystem.

Which means I'm struggling to see whether this is upstream related. /var/lib/rancher/k3s/data/3b5b4bda92acada40dbf3251e7e040c82cd2a79760260b5cec3e42f7c1cd0a17/bin/aux/iptables is a symlink to xtables-nft-multi. This isn't coming from the kube-router container (since we're running an embedded kube-router) but appears to come from https://github.com/k3s-io/k3s-root. I think if we can build a k3s-root with iptables 1.8.11 we can test whether that fixes the issue. Am I understanding the k3s setup correctly, that k3s-root is the base distro that k3s extracts to provide its userspace tools? If so, what do I need to do to build a k3s with a k3s-root that has the updated iptables?
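In case it helps anyone else checking the same thing, the bundled binary can be inspected directly on the host (the data-dir hash differs per install):

# The bundled iptables is a symlink into the k3s data dir
ls -l /var/lib/rancher/k3s/data/*/bin/aux/iptables
# -> iptables -> xtables-nft-multi

# Version of the bundled binary (iptables v1.8.9 (nf_tables) on my nodes)
/var/lib/rancher/k3s/data/*/bin/aux/iptables -V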

@brandond
Member

brandond commented Dec 28, 2024

Even the latest release of buildroot is only on iptables 1.8.10, although we do have the ability to override that version in k3s-root when setting up the buildroot env.

How confident are you that this is fixed in iptables 1.8.11?

@timwhite
Author

How confident are you that this is fixed in iptables 1.8.11?

I'm not sure anyone is confident yet, but being able to test it would help. Hopefully this week I can find some time to build k3s from source, and I'll try iptables 1.8.11. Building things from source is something I've done before, so hopefully k3s isn't too difficult.
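My rough plan, just a sketch based on the k3s repo's BUILDING.md rather than anything I've verified yet:

# Needs Docker on the build host; see BUILDING.md in the k3s repo for the authoritative steps
git clone https://github.com/k3s-io/k3s.git
cd k3s
make    # default target runs the containerized build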

@brandond
Member

The problem is usually caused by things using different versions of iptables, or things using nft directly and creating rules that iptables-nft cannot parse. You might be able to address your problem just by eliminating use of other versions of iptables, and direct use of nft, so that everything is compatible.
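For example, a quick audit along those lines on a Debian host (sketch only) might look like:

# Is a host iptables installed, and which backend does it report?
command -v iptables && iptables -V      # "(nf_tables)" vs "(legacy)" in the version string
update-alternatives --display iptables 2>/dev/null

# Any rules left behind by the legacy backend?
iptables-legacy-save 2>/dev/null | head

# Anything creating tables directly with nft will show up here
# without the "managed by iptables-nft" warning
nft list ruleset | less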
