Upgrade or Resize node pools in AKS
Introduction
In microservices-style containerized applications, it's often necessary to adjust the capacity of your Azure Kubernetes Service (AKS) clusters. This guide walks you through the process of upgrading or resizing a node pool in AKS.
Prerequisites:
Before proceeding, ensure you have the following:
- An Azure Kubernetes Service (AKS) cluster set up.
- The kubectl command-line tool installed.
- The Azure CLI (az) installed and configured.
Objective
In this exercise, we will accomplish and learn how to implement the following:
- Step 1: Check status of existing nodes and pods
- Step 2: Create a new node pool with the desired SKU
- Step 3: Cordon the existing nodes
- Step 4: Drain the existing nodes
- Step 5: Validate new node status
- Step 6: Remove the existing node pool
Login to Azure
Verify that you are logged in to the correct Azure subscription before starting anything in Visual Studio Code.
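For example, a minimal sketch (the subscription name below is a placeholder):
# log in to Azure interactively
az login
# confirm which subscription is active
az account show --output table
# switch subscriptions if needed
az account set --subscription 'subscription-name-or-id'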
Connect to Cluster
Use one of the following commands to connect to your AKS cluster, depending on your role:
# Azure Kubernetes Service Cluster User Role
az aks get-credentials -g "rg-aks-dev" -n "aks-cluster1-dev"
# Azure Kubernetes Service Cluster Admin Role
az aks get-credentials -g "rg-aks-dev" -n "aks-cluster1-dev" --admin
# get nodes
kubectl get no
kubectl get namespaces
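To confirm that kubectl is pointing at the intended cluster before making any changes:
# show the kubectl context currently in use
kubectl config current-context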
Technical Scenario
We'll assume a scenario where you need to resize an existing node pool named nodepool1 from the VM size Standard_B8ms to Standard_D8s_v5. This process involves creating a new node pool nodepool2, moving workloads to it, and then removing the old node pool.
- Existing node pool name: nodepool1
- Existing VM size: Standard_B8ms
- New node pool name: nodepool2
- New VM size: Standard_D8s_v5
Step-1: Check status of existing nodes and pods
Use the following command to check the current status of nodes and pods in your AKS cluster:
kubectl get nodes -o wide
# output
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-nodepool1-25316841-vmss000000 Ready agent 198d v1.23.12 10.64.4.4 <none> Ubuntu 18.04.6 LTS 5.4.0-1101-azure containerd://1.6.15+azure-1
aks-nodepool1-25316841-vmss000001 Ready agent 198d v1.23.12 10.64.4.113 <none> Ubuntu 18.04.6 LTS 5.4.0-1101-azure containerd://1.6.15+azure-1
kubectl get pods -o wide -A
# output
kube-system azure-policy-557988f6df-q5rgf 1/1 Running 0 23d 10.64.4.171 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system azure-policy-webhook-5dfcfc5998-cn7p6 1/1 Running 0 23d 10.64.4.161 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system cloud-node-manager-j8zg4 1/1 Running 1 (23d ago) 72d 10.64.4.4 aks-nodepool1-25316841-vmss000000 <none> <none>
kube-system cloud-node-manager-p8nk5 1/1 Running 0 72d 10.64.4.113 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system coredns-785fcf7bdd-6d84r 1/1 Running 0 82d 10.64.4.214 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system coredns-785fcf7bdd-n8bvt 1/1 Running 0 23d 10.64.4.199 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system coredns-autoscaler-65bb858f95-6lwkf 1/1 Running 0 23d 10.64.4.117 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system csi-azuredisk-node-2n6md 3/3 Running 0 10d 10.64.4.4 aks-nodepool1-25316841-vmss000000 <none> <none>
kube-system csi-azuredisk-node-r7snn 3/3 Running 0 10d 10.64.4.113 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system csi-azurefile-node-7tjxl 3/3 Running 0 3d1h 10.64.4.113 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system csi-azurefile-node-nfntk 3/3 Running 0 3d1h 10.64.4.4 aks-nodepool1-25316841-vmss000000 <none> <none>
kube-system konnectivity-agent-6d4987776d-2wbc2 1/1 Running 0 23d 10.64.4.142 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system konnectivity-agent-6d4987776d-hdgx9 1/1 Running 0 82d 10.64.4.204 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system kube-proxy-dvl5k 1/1 Running 0 44d 10.64.4.113 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system kube-proxy-jx2qw 1/1 Running 1 (23d ago) 44d 10.64.4.4 aks-nodepool1-25316841-vmss000000 <none> <none>
kube-system metrics-server-7757d565cf-rrktr 2/2 Running 0 41d 10.64.4.181 aks-nodepool1-25316841-vmss000001 <none> <none>
kube-system metrics-server-7757d565cf-zz97t 2/2 Running 0 23d 10.64.4.190 aks-nodepool1-25316841-vmss000001 <none> <none>
sample aks-helloworld-one-6965865b8b-dk6w9 1/1 Running 0 192d 10.64.4.116 aks-nodepool1-25316841-vmss000001 <none> <none>
sample aks-helloworld-two-66c5cf894b-vmztv 1/1 Running 0 192d 10.64.4.115 aks-nodepool1-25316841-vmss000001 <none> <none>
sample aspnet-api-79b4cbf4bb-54cng 1/1 Running 0 10d 10.64.4.100 aks-nodepool1-25316841-vmss000000 <none> <none>
Check the status of your node pools
az aks nodepool list --resource-group myResourceGroup --cluster-name myAKSCluster
# example
az aks nodepool list --resource-group 'rg-aks-dev' --cluster-name 'aks-cluster1-dev'
# output
[
{
"availabilityZones": null,
"capacityReservationGroupId": null,
"count": 2,
"creationData": null,
"currentOrchestratorVersion": "1.24.9",
"enableAutoScaling": true,
"enableCustomCaTrust": false,
"enableEncryptionAtHost": false,
"enableFips": false,
"enableNodePublicIp": false,
"enableUltraSsd": false,
"gpuInstanceProfile": null,
"hostGroupId": null,
"id": "/subscriptions/b635d52c-5170-4366-b262-cc12cba2d9be/resourcegroups/rg-aks-dev/providers/Microsoft.ContainerService/managedClusters/aks-cluster1-dev/agentPools/agentpool",
"kubeletConfig": null,
"kubeletDiskType": "OS",
"linuxOsConfig": null,
"maxCount": 5,
"maxPods": 110,
"messageOfTheDay": null,
"minCount": 2,
"mode": "System",
"name": "agentpool",
"networkProfile": null,
"nodeImageVersion": "AKSUbuntu-1804gen2containerd-2023.02.09",
"nodeLabels": null,
"nodePublicIpPrefixId": null,
"nodeTaints": null,
"orchestratorVersion": "1.24.9",
"osDiskSizeGb": 128,
"osDiskType": "Managed",
"osSku": "Ubuntu",
"osType": "Linux",
"podSubnetId": null,
"powerState": {
"code": "Running"
},
"provisioningState": "Failed",
"proximityPlacementGroupId": null,
"resourceGroup": "rg-aks-dev",
"scaleDownMode": "Delete",
"scaleSetEvictionPolicy": null,
"scaleSetPriority": null,
"spotMaxPrice": null,
"tags": null,
"type": "Microsoft.ContainerService/managedClusters/agentPools",
"typePropertiesType": "VirtualMachineScaleSets",
"upgradeSettings": {
"maxSurge": null
},
"vmSize": "Standard_B4ms",
"vnetSubnetId": "/subscriptions/b635d52c-5170-4366-b262-cc12cba2d9be/resourceGroups/rg-vnet1-dev/providers/Microsoft.Network/virtualNetworks/vnet-spoke-dev/subnets/snet-aks1",
"windowsProfile": null,
"workloadRuntime": null
}
]
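The full JSON output is verbose; a compact table view can be easier to scan. Here is a sketch using a JMESPath query built from the field names shown above:
# summarize node pools as a table
az aks nodepool list --resource-group 'rg-aks-dev' --cluster-name 'aks-cluster1-dev' \
  --query "[].{Name:name, Size:vmSize, Count:count, Mode:mode, State:provisioningState}" \
  --output table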
Check which Kubernetes versions are available in your region:
az aks get-versions --location eastus --output table
# or
az aks get-versions --location eastus --query "orchestrators" -o table
# output
OrchestratorType OrchestratorVersion Default
------------------ --------------------- ---------
Kubernetes 1.27.3
Kubernetes 1.27.1
Kubernetes 1.26.6 True
Kubernetes 1.26.3
Kubernetes 1.25.11
Kubernetes 1.25.6
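To compare against the version your cluster is currently running, a quick check (adjust the names to your cluster):
# show the cluster's current Kubernetes version
az aks show --resource-group 'rg-aks-dev' --name 'aks-cluster1-dev' --query kubernetesVersion -o tsv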
Step-2: Create a new node pool with the desired SKU
Use the az aks nodepool add command to create a new node pool called nodepool2 with the required number of nodes using the Standard_D8s_v5 VM SKU:
az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name nodepool2 \
--node-count 3 \
--node-vm-size Standard_D8s_v5 \
--mode System \
--no-wait
# bash example
az aks nodepool add \
--resource-group 'rg-aks-dev' \
--cluster-name 'aks-cluster1-dev' \
--name nodepool2 \
--node-count 3 \
--node-vm-size Standard_D8s_v5 \
--mode System \
--no-wait
# PowerShell example (with the cluster autoscaler enabled)
az aks nodepool add `
--resource-group 'rg-aks-dev' `
--cluster-name 'aks-cluster1-dev' `
--enable-cluster-autoscaler `
--name nodepool2 `
--node-count 2 `
--min-count 1 `
--max-count 3 `
--max-pods 50 `
--node-vm-size Standard_D8s_v5 `
--mode System `
--no-wait
# output
The behavior of this command has been altered by the following extension: aks-preview
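Because --no-wait returns immediately, you can poll the new pool's provisioning state until it reports Succeeded. A sketch:
# check the provisioning state of the new node pool
az aks nodepool show --resource-group 'rg-aks-dev' --cluster-name 'aks-cluster1-dev' --name nodepool2 --query provisioningState -o tsv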
After a few minutes, the new node pool is created:
kubectl get nodes
# output
NAME STATUS ROLES AGE VERSION
aks-nodepool1-20823458-vmss000000 Ready agent 23m v1.21.9
aks-nodepool1-20823458-vmss000001 Ready agent 23m v1.21.9
aks-nodepool2-31721111-vmss000000 Ready agent 10d v1.21.9
aks-nodepool2-31721111-vmss000001 Ready agent 10d v1.21.9
Step-3: Cordon the existing nodes
Cordoning marks the specified nodes as unschedulable, preventing any new pods from being scheduled onto them.
Next, use kubectl cordon <node-names>, specifying the desired nodes in a space-separated list:
kubectl cordon `
aks-nodepool1-23546727-vmss00001r `
aks-nodepool1-23546727-vmss00001s
# output
node/aks-nodepool1-23546727-vmss00001r cordoned
node/aks-nodepool1-23546727-vmss00001s cordoned
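When a pool has many nodes, listing each name is tedious. AKS labels every node with its pool name, so a label selector can cordon the whole pool at once; a sketch, assuming the default agentpool node label:
# cordon every node in nodepool1 via its pool label
kubectl cordon -l agentpool=nodepool1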
Step-4: Drain the existing nodes
Draining nodes will cause pods running on them to be evicted and recreated on the other, schedulable nodes.
To drain the nodes, use kubectl drain <node-names> --ignore-daemonsets --delete-emptydir-data, again with a space-separated list of node names:
kubectl drain `
aks-nodepool1-23546727-vmss00001r `
aks-nodepool1-23546727-vmss00001s `
--ignore-daemonsets --delete-emptydir-data
This ensures that the pods are safely moved to other nodes.
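The same label selector approach works for draining. The --ignore-daemonsets flag is required because DaemonSet pods cannot be evicted, and --delete-emptydir-data acknowledges that any emptyDir contents on the node will be lost. A sketch, assuming the agentpool label as above:
# drain every node in nodepool1 via its pool label
kubectl drain -l agentpool=nodepool1 --ignore-daemonsets --delete-emptydir-data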
Step-5: Validate new node status
Check the status of the new nodes in nodepool2:
kubectl get nodes
#output
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-nodepool1-40415315-vmss000000 Ready,SchedulingDisabled agent 53d v1.28.0 10.65.4.4 <none> Ubuntu 22.04.3 LTS 5.15.0-1041-azure containerd://1.7.5-1
aks-nodepool1-40415315-vmss000001 Ready,SchedulingDisabled agent 53d v1.28.0 10.65.4.113 <none> Ubuntu 22.04.3 LTS 5.15.0-1041-azure containerd://1.7.5-1
aks-nodepool2-44502347-vmss000000 Ready agent 5m49s v1.28.3 10.65.4.222 <none> Ubuntu 22.04.3 LTS 5.15.0-1051-azure containerd://1.7.5-1
aks-nodepool2-44502347-vmss000001 Ready agent 5m51s v1.28.3 10.65.5.22 <none> Ubuntu 22.04.3 LTS 5.15.0-1051-azure containerd://1.7.5-1
Confirm that all nodepool2 nodes are in the Ready state, while the cordoned nodepool1 nodes show SchedulingDisabled. Next, verify that the pods have been rescheduled onto the new nodes:
kubectl get pods -o wide -A
# output
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default sampleapp2-74b4b974ff-676sz 1/1 Running 0 15m 10.244.4.5 aks-nodepool2-20823458-vmss000002 <none> <none>
default sampleapp2-76b6c4c59b-rhmzq 1/1 Running 0 16m 10.244.4.3 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system azure-ip-masq-agent-4n66k 1/1 Running 0 10d 10.240.0.6 aks-nodepool2-31721111-vmss000002 <none> <none>
kube-system azure-ip-masq-agent-9p4c8 1/1 Running 0 10d 10.240.0.4 aks-nodepool2-31721111-vmss000000 <none> <none>
kube-system azure-ip-masq-agent-nb7mx 1/1 Running 0 10d 10.240.0.5 aks-nodepool2-31721111-vmss000001 <none> <none>
kube-system azure-ip-masq-agent-sxn96 1/1 Running 0 49m 10.240.0.9 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system azure-ip-masq-agent-tsq98 1/1 Running 0 49m 10.240.0.8 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system azure-ip-masq-agent-xzrdl 1/1 Running 0 49m 10.240.0.7 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system coredns-845757d86-d2pkc 1/1 Running 0 17m 10.244.3.2 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system coredns-845757d86-f8g9s 1/1 Running 0 17m 10.244.5.2 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system coredns-autoscaler-5f85dc856b-f8xh2 1/1 Running 0 17m 10.244.4.2 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system csi-azuredisk-node-7md2w 3/3 Running 0 49m 10.240.0.7 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system csi-azuredisk-node-9nfzt 3/3 Running 0 10d 10.240.0.4 aks-nodepool2-31721111-vmss000000 <none> <none>
kube-system csi-azuredisk-node-bblsb 3/3 Running 0 10d 10.240.0.5 aks-nodepool2-31721111-vmss000001 <none> <none>
kube-system csi-azuredisk-node-lcmtz 3/3 Running 0 49m 10.240.0.9 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system csi-azuredisk-node-mmncr 3/3 Running 0 49m 10.240.0.8 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system csi-azuredisk-node-tjhj4 3/3 Running 0 10d 10.240.0.6 aks-nodepool2-31721111-vmss000002 <none> <none>
kube-system csi-azurefile-node-29w6z 3/3 Running 0 49m 10.240.0.9 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system csi-azurefile-node-4nrx7 3/3 Running 0 49m 10.240.0.7 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system csi-azurefile-node-9pcr8 3/3 Running 0 3d11h 10.240.0.6 aks-nodepool2-31721111-vmss000002 <none> <none>
kube-system csi-azurefile-node-bh2pc 3/3 Running 0 3d11h 10.240.0.5 aks-nodepool2-31721111-vmss000001 <none> <none>
kube-system csi-azurefile-node-gqqnv 3/3 Running 0 49m 10.240.0.8 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system csi-azurefile-node-h75gq 3/3 Running 0 3d11h 10.240.0.4 aks-nodepool2-31721111-vmss000000 <none> <none>
kube-system konnectivity-agent-6cd55c69cf-2bbp5 1/1 Running 0 17m 10.240.0.7 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system konnectivity-agent-6cd55c69cf-7xzxj 1/1 Running 0 16m 10.240.0.8 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system kube-proxy-4wzx7 1/1 Running 0 10d 10.240.0.4 aks-nodepool2-31721111-vmss000000 <none> <none>
kube-system kube-proxy-7h8r5 1/1 Running 0 49m 10.240.0.7 aks-nodepool2-20823458-vmss000000 <none> <none>
kube-system kube-proxy-g5tvr 1/1 Running 0 10d 10.240.0.6 aks-nodepool2-31721111-vmss000002 <none> <none>
kube-system kube-proxy-mrv54 1/1 Running 0 10d 10.240.0.5 aks-nodepool2-31721111-vmss000001 <none> <none>
kube-system kube-proxy-nqmnj 1/1 Running 0 49m 10.240.0.9 aks-nodepool2-20823458-vmss000002 <none> <none>
kube-system kube-proxy-zn77s 1/1 Running 0 49m 10.240.0.8 aks-nodepool2-20823458-vmss000001 <none> <none>
kube-system metrics-server-774f99dbf4-2x6x8 1/1 Running 0 16m 10.244.4.4 aks-nodepool2-20823458-vmss000002 <none> <none>
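To double-check that nothing except DaemonSet pods remains on a drained node, you can filter pods by node name. A sketch; substitute a node name from your own cluster:
# list pods still scheduled on a specific old node
kubectl get pods -A -o wide --field-selector spec.nodeName=aks-nodepool1-40415315-vmss000000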
Step-6: Remove the existing node pool
To delete the old node pool (nodepool1 in this scenario), use the Azure portal or the az aks nodepool delete command:
az aks nodepool delete `
--resource-group 'rg-aks-dev' `
--cluster-name 'aks-cluster1-dev' `
--name nodepool1
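Once the deletion completes, confirm that only the new pool and its nodes remain:
# verify remaining node pools and nodes
az aks nodepool list --resource-group 'rg-aks-dev' --cluster-name 'aks-cluster1-dev' --output table
kubectl get nodes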
Conclusion
You have successfully upgraded or resized a node pool in Azure Kubernetes Service. This process ensures a smooth transition of workloads to the new nodes, minimizing disruptions to your applications. Adjust the commands and configurations based on your specific requirements and cluster setup.