Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: [Service Tags] add public ips with service tags for LBs during cluster creation #3277

Open
wants to merge 56 commits into
base: master
Choose a base branch
from

Conversation

k-routhu
Copy link

@k-routhu k-routhu commented Dec 18, 2024

Reason for Change:

Adding service tags to all public ips. This change adds public ips and attaches these ips as load balancer outbound ips during cluster creation. Branch spathak/add-service-tags is where all the changes were made and tested -- created this branch based off of that one.

Testing:

  • All testing was done on branch spathak/add-service-tags
  • We had a couple of PRs out by Esteban and Shubham to update service tags. However, all of the PRs targeted hack/aks/Makefile, so I went ahead and made all of the updates in this branch.
  • Also added ipv6 public ips to load balancer outbound ips for dualstack clusters to resolve errors

Tested changes in a couple of Cilium pipelines:

CNI Nightly Pipeline - Task
The public ips created have service tags attached to them, and the ips fall within the allowed ranges for the locations they were created

CNI Release Test - Task
All of the public ips created have service tags attached to them, and the ips fall within the allowed ranges for the locations they were created

CNI - LSG Integration Test - Task
The clusters get created here successfully with the correct public ips. The integration tests do fail, but I checked with John Payne -- he mentioned that the integration tests fail frequently because they're heavily dependent on a partner team (Canonical). He suggested running Cilium Private Test pipelines on v1.14 and v1.16 (the next two on the list) and the remaining pipelines with the service tags. The remaining pipelines look good and create public ips as expected, so I believe we should be good to go on that front.

ACN PR Pipeline - Task
All of the public ips created have service tags attached to them, and the ips fall within the allowed ranges for the locations they were created

@k-routhu k-routhu requested a review from a team as a code owner December 18, 2024 20:50
@k-routhu k-routhu self-assigned this Dec 18, 2024
@k-routhu k-routhu changed the title Krouthu/service tag Add public ips with service tags for LBs during cluster creation Dec 18, 2024
@k-routhu k-routhu changed the title Add public ips with service tags for LBs during cluster creation [Service Tags] add public ips with service tags for LBs during cluster creation Dec 18, 2024
Comment on lines 19 to 25
REGION ?= westus2
VM_SIZE ?= Standard_B2s
VM_SIZE_WIN ?= Standard_B2s
IP_TAG ?= FirstPartyUsage=/DelegatedNetworkControllerTest
IP_PREFIX ?= serviceTaggedIp
PUBLIC_IPv4 ?= $(IP_PREFIX)-$(CLUSTER)-v4
PUBLIC_IPv6 ?= $(IP_PREFIX)-$(CLUSTER)-v6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All ?= should be lined up.

Comment on lines 49 to 67
public-ipv4: rg-up
$(AZCLI) network public-ip create --name $(PUBLIC_IPv4) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv4

public-ipv6: rg-up
$(AZCLI) network public-ip create --name $(PUBLIC_IPv6) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate rg-up calls when creating cluster. I would remove the rg-up from the make calls public-ipv4 and public-ipv6

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated this

$(AZCLI) aks create -n $(CLUSTER) -g $(GROUP) -l $(REGION) \
--auto-upgrade-channel $(AUTOUPGRADE) \
--node-os-upgrade-channel $(NODEUPGRADE) \
--kubernetes-version $(K8S_VER) \
--node-count $(NODE_COUNT) \
--node-vm-size $(VM_SIZE) \
--load-balancer-sku standard \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent removal of --load-balancer-sku. Does it need to be standard or can we keep it as basic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be basic, I meant to remove --load-balancer-sku standard for nodesubnet-byocni-nokubeproxy-up since I'm adding --load-balancer-outbound-ips <publicip v4>,<publicip v6>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this implicitly set every cluster to standard for the load balancer sku?

@@ -98,6 +120,7 @@ up: swift-up ## Alias to swift-up


nodesubnet-byocni-nokubeproxy-up: rg-up overlay-net-up ## Brings up an NodeSubnet BYO CNI cluster without kube-proxy
@$(MAKE) public-ipv4

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this rerun the public ip creation? if so, should we have the ip creation as a global step so it provisions just once?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this reruns public ip creation, but since the Cilium pipelines do cluster creation in parallel, we need to provision different public ips

Copy link
Contributor

@rbtr rbtr Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you reinvoking Make for this? unless you are overriding or changing args and options versus the Make invocation you are already in, you could make the IP target a dependency here, like rg-up etc.

@@ -98,6 +120,7 @@ up: swift-up ## Alias to swift-up


nodesubnet-byocni-nokubeproxy-up: rg-up overlay-net-up ## Brings up an NodeSubnet BYO CNI cluster without kube-proxy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these steps run in parallel or in a sequential order?
Public IPs from ST are limited so we should try to either reuse or not create too many public ips at the same time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these steps are run in parallel, but I do believe we have the capacity to create the public ips. The majority of the clusters are created in westus and westus2, and they have 254 and 512 available vips respectively, and we create about 10-12 clusters (and public ips) in each location, and delete them all at the end

--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we request IPV6 ips for our test ST ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline, should be good to go on ipv6

@jpayne3506 jpayne3506 added the ci Infra or tooling. label Dec 23, 2024
@jpayne3506 jpayne3506 changed the title [Service Tags] add public ips with service tags for LBs during cluster creation ci: [Service Tags] add public ips with service tags for LBs during cluster creation Dec 23, 2024
KUBE_PROXY_JSON_PATH ?= ./kube-proxy.json

# overrideable variables
SUB ?= $(AZURE_SUBSCRIPTION)
CLUSTER ?= $(USER)-$(REGION)
GROUP ?= $(CLUSTER)
VNET ?= $(CLUSTER)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Lets keep this line in.

Comment on lines 19 to 25
REGION ?= westus2
VM_SIZE ?= Standard_B2s
VM_SIZE_WIN ?= Standard_B2s
IP_TAG ?= FirstPartyUsage=/DelegatedNetworkControllerTest
IP_PREFIX ?= serviceTaggedIp
PUBLIC_IPv4 ?= $(IP_PREFIX)-$(CLUSTER)-v4
PUBLIC_IPv6 ?= $(IP_PREFIX)-$(CLUSTER)-v6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All ?= should be lined up.

$(AZCLI) aks create -n $(CLUSTER) -g $(GROUP) -l $(REGION) \
--auto-upgrade-channel $(AUTOUPGRADE) \
--node-os-upgrade-channel $(NODEUPGRADE) \
--kubernetes-version $(K8S_VER) \
--node-count $(NODE_COUNT) \
--node-vm-size $(VM_SIZE) \
--load-balancer-sku standard \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this implicitly set every cluster to standard for the load balancer sku?

Comment on lines +49 to +67
public-ipv4:
$(AZCLI) network public-ip create --name $(PUBLIC_IPv4) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv4

public-ipv6:
$(AZCLI) network public-ip create --name $(PUBLIC_IPv6) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv6
Copy link
Contributor

@rbtr rbtr Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these identical except for "v4" and "v6"? consider how you could simplify and parameterize throughout, like this:

Suggested change
public-ipv4:
$(AZCLI) network public-ip create --name $(PUBLIC_IPv4) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv4
public-ipv6:
$(AZCLI) network public-ip create --name $(PUBLIC_IPv6) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IPv6
ip:
$(AZCLI) network public-ip create --name $(IP_PREFIX)-$(CLUSTER)-$(IPVERSION) \
--resource-group $(GROUP) \
--allocation-method Static \
--ip-tags $(IP_TAG) \
--location $(REGION) \
--sku Standard \
--tier Regional \
--version IP$(IPVERSION)
ipv4:
@$(MAKE) ip IPVERSION=v4
ipv6:
@$(MAKE) ip IPVERSION=v6

Comment on lines +22 to +23
IP_TAG ?= FirstPartyUsage=/DelegatedNetworkControllerTest
IP_PREFIX ?= serviceTaggedIp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these strings pre-determined?

@@ -98,6 +120,7 @@ up: swift-up ## Alias to swift-up


nodesubnet-byocni-nokubeproxy-up: rg-up overlay-net-up ## Brings up an NodeSubnet BYO CNI cluster without kube-proxy
@$(MAKE) public-ipv4
Copy link
Contributor

@rbtr rbtr Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you reinvoking Make for this? unless you are overriding or changing args and options versus the Make invocation you are already in, you could make the IP target a dependency here, like rg-up etc.

@timraymond
Copy link
Member

Why are we using make for this? Can't most of this be done more predictably by Bicep/ARM?

@rbtr
Copy link
Contributor

rbtr commented Jan 3, 2025

Why are we using make for this? Can't most of this be done more predictably by Bicep/ARM?

@timraymond yes and no.
AzCLI is basically a wrapper to ARM when used like this and that's what it actually sends, so in a literal sense, yes. Bicep would be even better for declarative infrastructure.

I originally wrote this Makefile because I needed to build infra that required certain subscription flags enabled (which ARM may be able to do, but was documented as AzCLI commands), and a certain preview CLI installed locally and/or a preview AZ AKS plugin installed locally. This made it easy for everyone to deploy infra identically without copy/pasting out of some install steps OneNote every time.
And, more importantly to me at the time, AzCLI (including ARM, Bicep, and anything that depends on it) was and is still completely busted on Fedora. Platforming everything in the azure-cli container makes this totally portable.

I also find it nice to have a central list of capabilities. We could write Bicep for everything here, but I think I would still want the Makefile to exist as the single entrypoint to all of the bicep apply <scenario> so that it's all easy to find (and to wrap in the azcli image so that IT WORKS AT ALL 😤)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Infra or tooling.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants