Check List #
| Prerequisites | |
|---|---|
| Upgrade path | TBC |
| Non Prod Upgrade | TBC |
| OpenShift Release Notes | TBC |
| Operator Release Notes | TBC |
| Proactive Tickets | TBC |
| Get approval (CAB) | TBC |
| Plan for Test team resources | TBC |
| Send Communications | TBC |
| Backup | NA |

| Before Upgrade Checks | | |
|---|---|---|
| Check cluster health | TBC | |
| Check core micro services | TBC | |
| Check and remove duplicates | TBC | |
| Stop Traffic (Front Door) | TBC | |

| Apply the Upgrade | | |
|---|---|---|
| Cluster Upgrade | TBC | |
| Upgrade Operators | TBC | |

| Post Upgrade Checks | | |
|---|---|---|
| Review the status of the Cluster Version Operator | TBC | |
| Review clusteroperators, nodes, workloads | TBC | |
| Verify core microservices | TBC | |
| Tahi Smoke Tests | TBC | |
| Check ArgoCD | TBC | |
| Check 3scale | TBC | |
| Send Communications | TBC | |
| Resume Traffic (Front Door) | TBC | |
| Close Tickets | TBC | |

Cluster Upgrade #
Prerequisites #
Verify Upgrade path #
Log in to the OpenShift cluster:
oc login --token=xxxxxxxxxxxxxxxxx --server=https://xxxxxxxxxxxxxxxxxx
Ensure that the cluster is available:
oc get clusterversion
If the upgrade moves to the next minor release, set the update channel that matches the target version. In this example the current channel is 4.11 and the target is 4.12.
Review the current update channel and check whether it is set to stable-4.12:
oc get clusterversion -o json|jq ".items[0].spec"
If it is not set to stable-4.12, patch the channel:
oc patch clusterversion version --type="merge" -p '{"spec":{"channel":"stable-4.12"}}'
View the available updates and note the version number of the update that we want to apply:
oc adm upgrade
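The channel check and the patch above can be combined into one guarded step. The following is a minimal sketch, not part of the original procedure: the function name `ensure_channel` is hypothetical, and it assumes an `oc` session that is already logged in with permission to patch `clusterversion`.

```shell
# Hypothetical helper: patch the update channel only when it differs from the target.
# Assumes `oc` is already logged in; `ensure_channel` is an illustrative name.
ensure_channel() {
  local target="$1"
  local current
  # Read the channel currently set on the ClusterVersion resource.
  current="$(oc get clusterversion version -o jsonpath='{.spec.channel}')"
  if [ "$current" = "$target" ]; then
    echo "channel already ${target}"
  else
    echo "changing channel ${current} -> ${target}"
    oc patch clusterversion version --type=merge \
      -p "{\"spec\":{\"channel\":\"${target}\"}}"
  fi
}

# Example (not run automatically): ensure_channel stable-4.12
```

Running `oc adm upgrade` afterwards should then list the updates offered by the new channel.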
Non Prod Upgrade #
Make sure the upgrade has been tested in non-prod. Non-prod upgrade change no.:
OpenShift Release Notes #
Analyse the OpenShift release notes and make any necessary changes.
Operator Release Notes #
Check whether any of the installed operators need to be upgraded.
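One way to see what is installed, sketched here under the assumption that the operators are managed by Operator Lifecycle Manager (OLM), is to list each Subscription with its channel, approval mode, and installed version. The function names `list_operator_subscriptions` and `manual_approval_subscriptions` are hypothetical.

```shell
# List OLM Subscriptions: namespace, name, channel, approval mode, installed CSV.
# Assumes `oc` is already logged in and the operators are installed through OLM.
list_operator_subscriptions() {
  oc get subscriptions.operators.coreos.com -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,CHANNEL:.spec.channel,APPROVAL:.spec.installPlanApproval,CSV:.status.installedCSV'
}

# Print only the subscriptions whose upgrades wait for a manual InstallPlan approval.
manual_approval_subscriptions() {
  list_operator_subscriptions | awk '$4 == "Manual" { print $1 "/" $2 }'
}
```

Subscriptions set to Manual approval will not move on their own, so they are the ones to compare against the operator release notes.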
Proactive Tickets #
Raise a proactive support case with Red Hat and record the case number. Red Hat support case no.:
Get approvals (CAB) #
Raise a change request and get approval from CAB.
Change request no.:
Plan for Test team resources #
Inform any external testers and confirm their availability during the change window.
Send Communications #
Send communications to stakeholders about the upgrade.
Backup NA #
Before Upgrade Checks #
Check cluster health #
Confirm the general cluster status: no degraded or progressing operators, all pods running, all nodes ready, etc.:
oc get clusterversion
oc get clusteroperators
oc get nodes -o wide
oc get pods -A | grep -v "Running\|Completed\|Terminated\|Succeeded"
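The operator and node listings above can be reduced to just the entries that need attention. This is a minimal sketch with hypothetical function names; it assumes the default table layout of `oc get clusteroperators` and `oc get nodes`, and that every operator row has a VERSION value (an empty VERSION would shift the columns).

```shell
# Print cluster operators that are not Available=True, Progressing=False, Degraded=False.
# Reads the default `oc get clusteroperators` table on stdin
# (columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE).
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $4 != "False" || $5 != "False") { print $1 }'
}

# Print nodes whose STATUS column is anything other than plain "Ready"
# (for example NotReady or Ready,SchedulingDisabled).
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example (not run automatically):
#   oc get clusteroperators | unhealthy_operators
#   oc get nodes | not_ready_nodes
```

Empty output from both filters means no operator or node was flagged.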
Check cluster utilization
oc adm top node
Check that there is enough spare capacity to drain nodes. For each node, check the CPU and memory requests against allocatable capacity:
oc describe node <nodename> | grep -A10 Allocated
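To avoid repeating the command by hand for every node, the same check can be looped over all nodes. This is a sketch only; `allocated_per_node` is a hypothetical name and it assumes an `oc` session that is already logged in.

```shell
# Print the "Allocated resources" block for every node, one node after another.
# Assumes `oc` is already logged in; `allocated_per_node` is an illustrative name.
allocated_per_node() {
  # `-o name` prints entries in the form node/<nodename>.
  oc get nodes -o name | while read -r node; do
    echo "== ${node}"
    # Redirect stdin so the inner command cannot consume the node list.
    oc describe "$node" < /dev/null | grep -A10 'Allocated resources'
  done
}
```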
Check core micro services #
Check core micro services logs. Make sure services are healthy before doing any upgrade.
Check ArgoCD #
Log in to each Argo CD instance and confirm that all applications are synced.
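As a command-line alternative to checking each Argo CD UI, the sync and health state can be read from the Application custom resources. This sketch assumes the Argo CD instances run on the cluster being upgraded and that the current `oc` login can read `applications.argoproj.io`; `argocd_apps_not_ok` is a hypothetical name.

```shell
# Print Argo CD Applications that are not both Synced and Healthy.
# Assumes the Application custom resources are readable with the current `oc` login.
argocd_apps_not_ok() {
  oc get applications.argoproj.io -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status' |
    awk '$3 != "Synced" || $4 != "Healthy" { print $1 "/" $2 ": " $3 ", " $4 }'
}
```

Empty output means every application reported Synced and Healthy at the time of the query.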
Apply the Upgrade #
Cluster Upgrade #
Apply the upgrade, replacing <target version> with the version noted from oc adm upgrade:
oc adm upgrade --to=<target version>
For example:
oc adm upgrade --to=4.12.36
Monitor the upgrade
watch -n10 "oc get clusterversion && echo && oc get co && echo && oc get nodes -o wide"
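Alongside watching the output, completion can be tested from the update history on the ClusterVersion resource, where the newest entry comes first and its state is either Completed or Partial. A minimal sketch, with `upgrade_completed` as a hypothetical name and an already logged-in `oc` session assumed:

```shell
# Succeed (exit 0) only when the newest update-history entry is the target
# version and its state is Completed.
# Assumes `oc` is already logged in; `upgrade_completed` is an illustrative name.
upgrade_completed() {
  local target="$1"
  local latest
  # history[0] is the most recent entry; state is Completed or Partial.
  latest="$(oc get clusterversion version \
    -o jsonpath='{.status.history[0].version} {.status.history[0].state}')"
  [ "$latest" = "${target} Completed" ]
}

# Example (not run automatically):
#   until upgrade_completed 4.12.36; do sleep 60; done
```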
Upgrade Operators #
Upgrade any operators that require it. Some operators may need to be upgraded before the cluster upgrade; confirm the order during the operator release notes review.
Post Upgrade Checks #
Review the status of the Cluster Version Operator #
Confirm the general cluster status, with no degraded or progressing operators:
oc get clusterversion
oc get clusteroperators
Review nodes, workloads #
Confirm that all pods are running and all nodes are ready:
oc get nodes -o wide
oc get pods -A | grep -v "Running\|Completed\|Terminated\|Succeeded"
Check cluster utilization
oc adm top node
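A quick way to confirm that every node was rolled to the new release is to count the distinct kubelet versions the nodes report. This is a sketch with a hypothetical function name; it assumes the default `oc get nodes` table, where VERSION is the fifth column (this also holds with `-o wide`).

```shell
# Print each distinct kubelet version with the number of nodes reporting it.
# Reads the `oc get nodes` table on stdin (VERSION is the fifth column).
node_versions() {
  awk 'NR > 1 { count[$5]++ } END { for (v in count) print count[v], v }' | sort -k2
}

# Example (not run automatically): oc get nodes | node_versions
```

More than one line of output after the upgrade suggests that some nodes have not yet been updated.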
Verify core micro services #
Check the core micro services logs. Make sure the services are healthy after the upgrade.
Check ArgoCD #
Log in to each Argo CD instance and confirm that all applications are synced.
Send Communications #
Inform stakeholders that the upgrade is complete.
Close Tickets #
Once the cluster upgrade is complete, close the Red Hat support tickets and the change request.
Issues #
Log any issues encountered during or after the upgrade.