Starting EKS Anywhere on VMware, etcd fails to start #9047
Comments
/usr/bin/docker exec -i eksa_1733493551451751668 kubectl get --ignore-not-found -o json --kubeconfig kn01/generated/kn01.kind.kubeconfig Cluster.v1alpha1.anywhere.eks.amazonaws.com --namespace default kn01 { |
Could you share your cluster config? And could you also validate that this |
I do not have a kubeconfig. I am relying on EKS Anywhere to write one. Maybe that was my first mistake? I am a noob with Kubernetes and was hoping that the config would be written for me. I will attempt to figure out how to write one. That directory did not exist, but I have now created it and I am trying the whole procedure again. |
Pre-creating the directory did not work :-( I will try to figure out the kubeconfig. |
@tcooperma - seems like it could be similar to my issue in #9040 . Are any of the nodes spinning up in VMware - can you look at the console and see if it has any errors similar to: "Container kubeadm-bootstrap exited with non-zero status" |
I tried the following kubeconfig file, which did not work. There are no errors on the spun-up VMware machine kn01-etcd-9b7dp. I cannot figure out which IP address it is on so that I can log into it. I could attach the screenshot, but it does not show anything. I am going to try to figure that out.
|
If you don't have any IPs in vSphere, it sounds like the VMs are not being assigned any. Do you have DHCP dishing out IPs to the VMs? It is a requirement for EKS-A. |
I just do not know how to get the ip numbers which I assume you get from eksctl or kubectl |
So if the node is up it should be in the Virtual Machine Details in vSphere - if there is no IP, it sounds like it is not being assigned an IP and maybe the reason why it is failing. |
I am a noob with VMware also. I am fairly sure it's getting an IP address; I just cannot figure out what it is. |
I figured that out and I am on the machine which has an IP number 10.0.0.101 |
In vCenter, if you click on the etcd host, there should be a details page which has the IP address in it. If the node doesn't have an IP, then that is likely the reason why it is failing to join. If it is just the one host, this would make sense, as the other hosts would need that IP to join the quorum. |
What can I look at on Bottlerocket OS for etcd? Is there a log file somewhere? BTW, thanks for all the help. |
https://repost.aws/knowledge-center/eks-anywhere-etcdadm-controller-issues You will need to follow the steps for logging in using the SSH key, and then you should be able to access the logs. It is detailed under "Check VM Logs". |
I followed the page the best I could.
tmcooper@ubuntu-server-2404:~$ kubectl -n etcdadm-bootstrap-provider-system logs etcdadm-bootstrap-provider-controller-8hc97 --kubeconfig kubeconfig.yaml
Please enter Username: Administrator
Please enter Password:
E1209 17:43:35.433035 108342 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://vmcenter.local:8500/api?timeout=32s\": dial tcp: lookup vmcenter.local on 127.0.0.53:53: server misbehaving"
E1209 17:43:35.499930 108342 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://vmcenter.local:8500/api?timeout=32s\": dial tcp: lookup vmcenter.local on 127.0.0.53:53: server misbehaving"
E1209 17:43:35.648916 108342 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://vmcenter.local:8500/api?timeout=32s\": dial tcp: lookup vmcenter.local on 127.0.0.53:53: server misbehaving"
E1209 17:43:35.730851 108342 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://vmcenter.local:8500/api?timeout=32s\": dial tcp: lookup vmcenter.local on 127.0.0.53:53: server misbehaving"
Unable to connect to the server: dial tcp: lookup vmcenter.local on 127.0.0.53:53: server misbehaving
tmcooper@ubuntu-server-2404:~$
The Bottlerocket etcd machine has no usable logs:
[ec2-user@admin]$ ls -l /var/log
total 296
-rw-------. 1 root utmp 0 Oct 9 21:09 btmp
-rw-r--r--. 1 root root 292292 Dec 9 17:38 lastlog
-rw-------. 1 root root 0 Oct 9 21:09 tallylog
-rw-rw-r--. 1 root utmp 1152 Dec 9 17:38 wtmp
-rw-------. 1 root root 2871 Oct 9 21:10 yum.log |
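Editor's note: the kubectl errors in the comment above are name-resolution failures. The local resolver at 127.0.0.53 could not look up vmcenter.local, so the request never reached an API server. A minimal Python sketch for checking from the admin machine whether a hostname resolves; the function name `resolve_host` is hypothetical and not part of any EKS Anywhere tooling.

```python
import socket


def resolve_host(hostname):
    """Return the sorted unique IP addresses a hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Name resolution failed, the same class of error kubectl reported.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host("vmcenter.local")` on the admin machine would return an empty list while the lookup is failing, and the address list once DNS (or an /etc/hosts entry) is in place.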
So it is doubtful you will be able to connect to the cluster using kubectl, as it would need the control plane nodes for the API. As for the node, did you follow the instructions in the post? You need to run Once you are in there I would recommend running |
It looks like etcd is starting OK according to its log. The journal has no errors in it.
Dec 10 18:44:57 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: Waiting for etcd static pods
Dec 10 18:44:57 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: Running etcdadm init health phase
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: Phase command output:
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: --------
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: time="2024-12-10T18:44:57Z" level=info msg="[health] Checking local etcd endpoint health"
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: time="2024-12-10T18:45:06Z" level=info msg="[health] Local etcd endpoint is healthy"
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: --------
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: Bottlerocket bootstrap was successful. Disabled bootstrap container
Dec 10 18:45:06 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: time="2024-12-10T18:45:06Z" level=info msg="container task exited" code=0
Dec 10 18:45:07 kn01-etcd-z5hbt host-containers@kubeadm-bootstrap[1849]: time="2024-12-10T18:45:07Z" level=info msg="received signal: terminated" |
Might be worth trying a `systemctl list-units` to see if any other processes have failed. Have you got any other etcd nodes, or is it just the one? |
I have 3 etcd machines every time this starts up. I will check the logs on all 3, but I suspect they will not be different. I cannot tell what is normal in the list-units output, but it looks like they are all ready for processing:
UNIT LOAD ACTIVE SUB DESCRIPTION
sys-devices-pci0000:00-0000:00:15.0-0000:03:00.0-net-eth0.device loaded active plugged /sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/net/eth0
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p1.device loaded active plugged VMware Virtual NVMe Disk BIOS-BOOT
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p10.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-HASH-B
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p11.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-RESERVED-B
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p12.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-PRIVATE
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p13.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p2.device loaded active plugged VMware Virtual NVMe Disk EFI-SYSTEM
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p3.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-BOOT-A
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p4.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-ROOT-A
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p5.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-HASH-A
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p6.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-RESERVED-A
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p7.device loaded active plugged VMware Virtual NVMe Disk EFI-BACKUP
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p8.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-BOOT-B
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1-nvme0n1p9.device loaded active plugged VMware Virtual NVMe Disk BOTTLEROCKET-ROOT-B
sys-devices-pci0000:00-0000:00:16.0-0000:0b:00.0-nvme-nvme0-nvme0n1.device loaded active plugged VMware Virtual NVMe Disk
sys-devices-platform-serial8250-tty-ttyS0.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS0
sys-devices-platform-serial8250-tty-ttyS1.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS1
sys-devices-platform-serial8250-tty-ttyS2.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS2
sys-devices-platform-serial8250-tty-ttyS3.device loaded active plugged /sys/devices/platform/serial8250/tty/ttyS3
sys-devices-virtual-block-dm\x2d0.device loaded active plugged /sys/devices/virtual/block/dm-0
sys-devices-virtual-block-loop0.device loaded active plugged /sys/devices/virtual/block/loop0
sys-devices-virtual-block-loop1.device loaded active plugged /sys/devices/virtual/block/loop1
sys-module-configfs.device loaded active plugged /sys/module/configfs
sys-module-fuse.device loaded active plugged /sys/module/fuse
sys-subsystem-net-devices-eth0.device loaded active plugged /sys/subsystem/net/devices/eth0
-.mount loaded active mounted Root Mount
boot.mount loaded active mounted /boot
dev-hugepages.mount loaded active mounted Huge Pages File System
dev-mqueue.mount loaded active mounted POSIX Message Queue File System
etc-cni.mount loaded active mounted CNI Configuration Directory (/etc/cni)
etc-containerd.mount loaded active mounted Containerd Configuration Directory (/etc/containerd)
etc-host\x2dcontainers.mount loaded active mounted Host containers Configuration Directory (/etc/host-containers)
etc-kubernetes-pki-private.mount loaded active mounted Kubernetes PKI private directory (/etc/kubernetes/pki/private)
local-mnt.mount loaded active mounted local-mnt.mount
local-opt.mount loaded active mounted local-opt.mount
local-var.mount loaded active mounted local-var.mount
local.mount loaded active mounted Local Directory (/local)
mnt.mount loaded active mounted Mnt Directory (/mnt)
opt-cni.mount loaded active mounted CNI Plugin Directory (/opt/cni)
opt-csi.mount loaded active mounted CSI Helper Directory (/opt/csi)
opt.mount loaded active mounted Opt Directory (/opt)
root-.aws.mount loaded active mounted AWS configuration directory (/root/.aws)
run-containerd-io.containerd.grpc.v1.cri-sandboxes-6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f-shm.mount loaded active mounted /run/containerd/io.containerd.grpc.v1.cri/sandboxes/6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f/shm
run-containerd-io.containerd.runtime.v2.task-k8s.io-6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f-rootfs.mount loaded active mounted /run/containerd/io.containerd.runtime.v2.task/k8s.io/6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f/rootfs
run-containerd-io.containerd.runtime.v2.task-k8s.io-b6a8b1cc23aedb31da386bb23be4f2503a19afdab92822b9d35b935f95c1bb59-rootfs.mount loaded active mounted /run/containerd/io.containerd.runtime.v2.task/k8s.io/b6a8b1cc23aedb31da386bb23be4f2503a19afdab92822b9d35b935f95c1bb59/rootfs
run-credentials-systemd\x2dsysctl.service.mount loaded active mounted /run/credentials/systemd-sysctl.service
run-credentials-systemd\x2dsysusers.service.mount loaded active mounted /run/credentials/systemd-sysusers.service
run-credentials-systemd\x2dtmpfiles\x2dsetup.service.mount loaded active mounted /run/credentials/systemd-tmpfiles-setup.service
run-credentials-systemd\x2dtmpfiles\x2dsetup\x2ddev.service.mount loaded active mounted /run/credentials/systemd-tmpfiles-setup-dev.service
run-host\x2dcontainerd-io.containerd.runtime.v2.task-default-admin-rootfs.mount loaded active mounted /run/host-containerd/io.containerd.runtime.v2.task/default/admin/rootfs
run-netdog.mount loaded active mounted Ephemeral netdog configuration directory
sys-fs-fuse-connections.mount loaded active mounted FUSE Control File System
sys-kernel-config.mount loaded active mounted Kernel Configuration File System
sys-kernel-debug.mount loaded active mounted Kernel Debug File System
sys-kernel-tracing.mount loaded active mounted Kernel Trace File System
tmp.mount loaded active mounted Temporary Directory /tmp
var-lib-bottlerocket.mount loaded active mounted Private Directory (/var/lib/bottlerocket)
var-lib-kernel\x2ddevel-.overlay-lower.mount loaded active mounted Kernel Development Sources (Read-Only)
var.mount loaded active mounted Var Directory (/var)
x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-lib-modules.mount loaded active mounted Kernel Modules (Read-Write)
x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-share-licenses.mount loaded active mounted License files
x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-src-kernels.mount loaded active mounted Kernel Development Sources (Read-Write)
cri-containerd-6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f.scope loaded active running libcontainer container 6fa742b1bfcf08582901f94e49a43cae0f730c83bf27f531f665358a4addb92f
cri-containerd-b6a8b1cc23aedb31da386bb23be4f2503a19afdab92822b9d35b935f95c1bb59.scope loaded active running libcontainer container b6a8b1cc23aedb31da386bb23be4f2503a19afdab92822b9d35b935f95c1bb59
init.scope loaded active running System and Service Manager
acpid.service loaded active running ACPI event daemon
activate-configured.service loaded active exited Isolates configured.target
activate-multi-user.service loaded active exited Isolates multi-user.target
apiserver.service loaded active running Bottlerocket API server
audit-rules.service loaded active exited Load audit rules
bootstrap-commands.service loaded active exited Bootstrap Commands
chronyd.service loaded active running A versatile implementation of the Network Time Protocol
containerd.service loaded active running containerd container runtime
dbus-broker.service loaded active running D-Bus System Message Bus
disable-kexec-load.service loaded active exited Disable kexec load syscalls
disable-udp-offload.service loaded active exited Disables UDP offload
generate-network-config.service loaded active exited Generate network configuration
has-boot-ever-succeeded.service loaded active exited Checks and marks if boot has ever succeeded before
host-containerd.service loaded active running containerd runtime for host containers
[email protected] loaded active running Host container: admin
kmod-static-nodes.service loaded active exited Create List of Static Device Nodes
kubelet.service loaded active running Kubelet
ldconfig.service loaded active exited Rebuild Dynamic Linker Cache
load-crash-kernel.service loaded active exited Load crash kernel
mark-successful-boot.service loaded active exited Call signpost to mark the boot as successful after all required targets are met.
mask-local-mnt.service loaded active exited Mask Local Mnt Directory (/local/mnt)
mask-local-opt.service loaded active exited Mask Local Opt Directory (/local/opt)
mask-local-var.service loaded active exited Mask Local Var Directory (/local/var)
migrator.service loaded active exited Bottlerocket data store migrator
[email protected] loaded active exited Load Kernel Module configfs
[email protected] loaded active exited Load Kernel Module drm
modprobe@efi_pstore.service loaded active exited Load Kernel Module efi_pstore
[email protected] loaded active exited Load Kernel Module fuse
prepare-boot.service loaded active exited Prepare Boot Directory (/boot)
prepare-local-fs.service loaded active exited Prepare Local Filesystem (/local)
prepare-opt.service loaded active exited Prepare Opt Directory (/opt)
prepare-var-lib-containerd.service loaded active exited Prepare Containerd Directory (/var/lib/containerd)
prepare-var-lib-kubelet.service loaded active exited Prepare Kubelet Directory (/var/lib/kubelet)
prepare-var.service loaded active exited Prepare Var Directory (/var)
repart-local.service loaded active exited Resize Data Partition
selinux-policy-files.service loaded active exited Copy SELinux policy files
send-boot-success.service loaded active exited Send boot success
set-hostname.service loaded active exited Sets the hostname
settings-applier.service loaded active exited Applies settings to create config files
storewolf.service loaded active exited Datastore creator
sundog.service loaded active exited User-specified setting generators
systemd-journal-flush.service loaded active exited Flush Journal to Persistent Storage
systemd-journald.service loaded active running Journal Service
systemd-logind.service loaded active running User Login Management
systemd-machine-id-commit.service loaded active exited Commit a transient machine-id on disk
systemd-modules-load.service loaded active exited Load Kernel Modules
systemd-network-generator.service loaded active exited Generate network units from Kernel command line
systemd-networkd-wait-online.service loaded active exited Wait for Network to be Configured
systemd-networkd.service loaded active running Network Configuration
systemd-random-seed.service loaded active exited Load/Save Random Seed
systemd-remount-fs.service loaded active exited Remount Root and Kernel File Systems
systemd-resolved.service loaded active running Network Name Resolution
systemd-sysctl.service loaded active exited Apply Kernel Variables
systemd-sysusers.service loaded active exited Create System Users
systemd-tmpfiles-setup-dev.service loaded active exited Create Static Device Nodes in /dev
systemd-tmpfiles-setup.service loaded active exited Create Volatile Files and Directories
systemd-udev-trigger.service loaded active exited Coldplug All udev Devices
systemd-udevd.service loaded active running Rule-based Manager for Device Events and Files
systemd-update-done.service loaded active exited Update is Completed
vmtoolsd.service loaded active running VMware Tools service
write-network-status.service loaded active exited Write network status
-.slice loaded active active Root Slice
kubepods-besteffort.slice loaded active active libcontainer container kubepods-besteffort.slice
kubepods-burstable-pod685bdd4b55003427ad0393d955f76f2e.slice loaded active active libcontainer container kubepods-burstable-pod685bdd4b55003427ad0393d955f76f2e.slice
kubepods-burstable.slice loaded active active libcontainer container kubepods-burstable.slice
kubepods.slice loaded active active libcontainer container kubepods.slice
runtime.slice loaded active active Kubernetes and container runtime slice
system-host\x2dcontainers.slice loaded active active Slice /system/host-containers
system-modprobe.slice loaded active active Slice /system/modprobe
system.slice loaded active active System Slice
user.slice loaded active active User and Session Slice
dbus.socket loaded active running D-Bus System Message Bus Socket
systemd-journald-audit.socket loaded active running Journal Audit Socket
systemd-journald-dev-log.socket loaded active running Journal Socket (/dev/log)
systemd-journald.socket loaded active running Journal Socket
systemd-networkd.socket loaded active running Network Service Netlink Socket
systemd-udevd-control.socket loaded active running udev Control Socket
systemd-udevd-kernel.socket loaded active running udev Kernel Socket
basic.target loaded active active Basic System
configured.target loaded active active Bottlerocket final configuration complete
first-boot-complete.target loaded active active First Boot Complete
getty.target loaded active active Login Prompts
local-fs-pre.target loaded active active Preparation for Local File Systems
local-fs.target loaded active active Local File Systems
multi-user.target loaded active active Multi-User System
network-online.target loaded active active Network is Online
network-pre.target loaded active active Preparation for Network
network.target loaded active active Network
nss-lookup.target loaded active active Host and Network Name Lookups
paths.target loaded active active Path Units
preconfigured.target loaded active active Bottlerocket initial configuration complete
remote-fs.target loaded active active Remote File Systems
slices.target loaded active active Slice Units
sockets.target loaded active active Socket Units
swap.target loaded active active Swaps
sysinit.target loaded active active System Initialization
timers.target loaded active active Timer Units
metricdog.timer loaded active waiting Scheduled Metricdog Pings
systemd-tmpfiles-clean.timer loaded active waiting Daily Cleanup of Temporary Directories
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
164 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'. |
So I can't see anything wrong there. All the units seem to be OK and none have failed. Can your bootstrap node (assuming it's your laptop) access the nodes on their network? I think the bootstrap node needs to communicate directly with the etcd service to know it is healthy. But this is now getting to the limit of my debugging abilities! |
I have a desktop running Ubuntu 24.04 that I was starting it up from. I also tried a VM running Ubuntu 24.04 on VMware. They both have the same problem starting up the system, and both can SSH into the etcd machine. Is there a port number for the etcd daemon I can try to communicate with to see if it is starting? What do I say to it when I connect, to try a low-level debug? Is there someone with the EKS Anywhere project who wrote the code for VMware or ported the code to VMware? Is there a different configuration I could adapt to my network and test with? Maybe it's my configuration? |
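Editor's note: etcd's default client port is 2379 and its default peer port is 2380. A minimal Python sketch for checking from the admin machine whether those ports accept TCP connections; the helper name `port_open` is hypothetical. A successful connect only shows that something is listening. It says nothing about etcd health, and an etcd endpoint configured for TLS will still reject plaintext requests.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False
```

For example, `port_open("10.0.0.101", 2379)` would probe the client port on the etcd VM address mentioned earlier in this thread; substitute the address of whichever etcd node is currently running.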
Starting classes on kubernetes. Would it make sense to use |
Communication with etcd through etcdctl looks like TLS is not working. Do I need to specify the certificate somehow?
[ec2-user@admin]$ /tmp/etcd-download-test/etcdctl --endpoints=localhost:2379 endpoint status
{"level":"warn","ts":"2024-12-12T17:45:24.635105Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000ae000/localhost:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: EOF\""}
Failed to get the status of endpoint localhost:2379 (context deadline exceeded)
[ec2-user@admin]$ /tmp/etcd-download-test/etcdctl --endpoints=localhost:2379 endpoint status
{"level":"warn","ts":"2024-12-12T17:45:56.815841Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0002e6000/localhost:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: EOF\""}
Failed to get the status of endpoint localhost:2379 (context deadline exceeded)
[ec2-user@admin]$ /tmp/etcd-download-test/etcdctl --endpoints=localhost:2379 endpoint health
{"level":"warn","ts":"2024-12-12T17:46:55.045687Z","logger":"client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0002e0000/localhost:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: EOF\""}
localhost:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
[ec2-user@admin]$
The container log looks like:
2024-12-12T17:46:52.538570028Z stderr F {"level":"warn","ts":"2024-12-12T17:46:52.538280Z","caller":"embed/config_logging.go:170","msg":"rejected connection on client endpoint","remote-addr":"127.0.0.1:58616","server-name":"","error":"tls: first record does not look like a TLS handshake"} |
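Editor's note: the server-side message "tls: first record does not look like a TLS handshake" indicates that etcdctl connected in plaintext to an endpoint that expects TLS. etcdctl accepts `--cacert`, `--cert` and `--key` for this, together with an `https://` endpoint. The Python sketch below only assembles such a command line. The helper name `etcdctl_tls_command` is hypothetical, and the certificate paths are placeholders: where the certificates live on a Bottlerocket etcd node is not shown in this thread.

```python
import shlex


def etcdctl_tls_command(etcdctl, endpoint, cacert, cert, key,
                        subcommand=("endpoint", "health")):
    """Build an etcdctl argument list that talks TLS to the given endpoint."""
    return [
        etcdctl,
        f"--endpoints=https://{endpoint}",  # https:// makes the client start a TLS handshake
        f"--cacert={cacert}",               # CA that signed the etcd server certificate
        f"--cert={cert}",                   # client certificate, if etcd requires client auth
        f"--key={key}",                     # private key for the client certificate
        *subcommand,
    ]


# Placeholder certificate paths (assumptions, not taken from this thread).
cmd = etcdctl_tls_command(
    "/tmp/etcd-download-test/etcdctl",
    "localhost:2379",
    cacert="/path/to/ca.crt",
    cert="/path/to/client.crt",
    key="/path/to/client.key",
)
print(shlex.join(cmd))
```

The printed line can be pasted into the shell on the etcd node once the placeholder paths are replaced with the real certificate locations.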
Could it be etcd is configured to listen on the IPv6 port? Also there must be a log file somewhere on disk? |
What happened:
tmcooper@ubuntu-server-2404:~$ eksctl anywhere create cluster -f eksa-w01-cluster.yaml
Warning: VSphereDatacenterConfig configured in insecure mode
Performing setup and validations
Warning: VSphereDatacenterConfig configured in insecure mode
✅ Connected to server
✅ Authenticated to vSphere
✅ Datacenter validated
✅ Network validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Machine config tags validated
✅ Control plane and Workload templates validated
✅ [email protected] user vSphere privileges validated
✅ Vsphere Provider setup is valid
✅ Validate OS is compatible with registry mirror configuration
✅ Validate certificate for registry mirror
✅ Validate authentication for git provider
✅ Validate cluster's eksaVersion matches EKS-A version
✅ Validate cluster's kubelet configuration for Bottlerocket OS
✅ Validate cluster's worker node kubelet configuration for Bottlerocket OS
Creating new bootstrap cluster
Provider specific pre-capi-install-setup on bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific post-setup
Installing EKS-A custom components on bootstrap cluster
Installing EKS-D components
Installing EKS-A custom components (CRD and controller)
Creating new management cluster
(gets hung)
From the log where it is repeating the problem:
{"T":1733494047021445202,"M":"Sleeping before next retry","time":"1s"}
{"T":1733494048021896267,"M":"Executing command","cmd":"/usr/bin/docker exec -i eksa_1733493551451751668 kubectl get --ignore-not-found -o json --kubeconfig kn01/generated/kn01.kind.kubeconfig Cluster.v1alpha1.anywhere.eks.amazonaws.com --namespace default kn01"}
{"T":1733494048518935993,"M":"Cluster generation and observedGeneration","Generation":1,"ObservedGeneration":1}
{"T":1733494048519028611,"M":"Error happened during retry","error":"cluster condition ControlPlaneReady is False: Etcd is not ready","retries":59}
{"T":1733494048519061258,"M":"Sleeping before next retry","time":"1s"}
Why is etcd not becoming ready and does it have some log?
The file with the YAML configuration is attached.
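Editor's note: the log excerpt above is one JSON object per line, with the message in `M` and, on failures, `error` and `retries` fields. A minimal Python sketch for pulling the most recent error and the highest retry count out of such a log; the function name `summarize_retries` is hypothetical, and the field names are taken only from the lines quoted in this issue.

```python
import json


def summarize_retries(lines):
    """Return (last_error, max_retries) from JSON-per-line log entries."""
    last_error, max_retries = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(entry, dict) and "error" in entry:
            last_error = entry["error"]
            max_retries = max(max_retries, entry.get("retries", 0))
    return last_error, max_retries


# Two lines copied from the log excerpt in this issue.
sample = [
    '{"T":1733494047021445202,"M":"Sleeping before next retry","time":"1s"}',
    '{"T":1733494048519028611,"M":"Error happened during retry","error":"cluster condition ControlPlaneReady is False: Etcd is not ready","retries":59}',
]
print(summarize_retries(sample))
```

Run against the full log file (for example `summarize_retries(open(path))`, with `path` pointing at the saved log), this shows whether the failing condition ever changes or stays at "Etcd is not ready" for the whole run.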
What you expected to happen:
My EKS Kubernetes to be set up
How to reproduce it (as minimally and precisely as possible):
Set up EKS Anywhere and Docker,
set the environment variables with the VMware credentials
export EKSA_VSPHERE_USERNAME=[email protected]
export EKSA_VSPHERE_PASSWORD=
run
eksctl anywhere create cluster -f eksa-w01-cluster.yaml
Anything else we need to know?:
Environment:
Client version: 2.14.0
Client build number: 21993070
ESXi version: 8.0.2
ESXi build number: 22380479
Client Version: v1.31.3
Kustomize Version: v5.4.2