ISSU Upgrade of Cisco Catalyst 6880-X VSS Cluster and 6800ia FEX extenders

For a shorter version of this procedure, see the abbreviated article “Cisco Catalyst 6880-X VSS ISSU Upgrade Steps”, a short list of the upgrade steps without extensive explanations.

Intro

Cisco has spoiled us over the years with great, detailed documentation for every technology and hardware component it supports. Still, I managed to find an area where the documentation is not detailed enough to give you a definite sequence of steps to get things done.

While preparing for a software upgrade of a Cisco Catalyst 6880-X VSS cluster, I stumbled on one of the first examples I have seen of an outdated and vague upgrade procedure for a Cisco device. Here is the ISSU (In-Service Software Upgrade) procedure that I completed successfully a few days ago. I hope it will help you avoid sweating and hoping that you typed the right thing on a VSS cluster that should not go down at any point 🙂

I included an Acronym Guide at the bottom of the post to guide you through VSS, ISSU, Cluster, and the other abbreviations mentioned here that are not described in detail.

In my case the environment was a Catalyst 6880-X VSS and four 6800ia Fabric Extenders (FEX). The same procedure is valid with more FEX extenders or with none.

Cisco Catalyst 6880-X VSS

Get the info on which IOS version is supported to be upgraded with ISSU

Not all IOS images can be upgraded to a newer IOS version with the In-Service procedure, which avoids network traffic downtime. To find out what is possible, you need to dig into the Cisco docs and find the ISSU upgrade support matrix. I found one that contains both ISSU and EFSU information: “SX_SY_EFSU_Compatibility_Matrix1”.

cisco.com is constantly changing links to documents and pages so I downloaded this document and made it available from my server directly. @Cisco If you don’t like this let me know and I will remove it. Until then… thanks.

I had c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin and planned to upgrade to the latest suggested IOS. On the Cisco software download site you can tell which release is the suggested one because it has a little yellow star next to it. The labels next to each release matter too. (ED) – Early Deployment should not be installed unless TAC support suggested it, or you are running the device in a lab and want to test new features released in that software version. (MD) – Maintenance Deployment is usually what Cisco suggests if there is no (GD) – General Deployment version available. GD becomes available after extensive testing and bug fixes.

Cisco ISSU Upgrade

So I decided to go from c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin to c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin, one version above the suggested one. Before deciding which version I could use for ISSU, I opened the above-mentioned “SX_SY_EFSU_Compatibility_Matrix1”, which shows that from 151-2.SY4 you can do ISSU up to 151-2.SY7.
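If you want to sanity-check a planned jump before touching the box, the matrix lookup can be written down as a few lines of Python. This is only a sketch under assumptions: the ISSU_MAX_TARGET table holds just the one row quoted in this post (151-2.SY4 up to 151-2.SY7), you have to fill in the rest from the real matrix document, and a lettered rebuild such as SY4a is looked up under its SY4 row, which is how I read the matrix. The function names are mine.

```python
import re

# Hypothetical excerpt of the compatibility matrix: only the single row
# quoted in this post. Fill in further rows from the real document.
ISSU_MAX_TARGET = {"151-2.SY4": "151-2.SY7"}

# Pulls "151-2", "4" and "a" out of "...SPA.151-2.SY4a.bin".
IMAGE_RE = re.compile(r"\.SPA\.(\d+-\d+)\.SY(\d+)([a-z]?)\.bin$")


def parse_release(image_name):
    """Return (train, SY rebuild number, rebuild letter) for an image name."""
    match = IMAGE_RE.search(image_name)
    if match is None:
        raise ValueError(f"unrecognised image name: {image_name}")
    return match.group(1), int(match.group(2)), match.group(3)


def issu_allowed(current_image, target_image):
    """True if the table lists an ISSU path from current to target."""
    train, cur_num, _ = parse_release(current_image)
    target_train, target_num, _ = parse_release(target_image)
    # Assumption: SY4a is looked up under the SY4 row of the matrix.
    max_target = ISSU_MAX_TARGET.get(f"{train}.SY{cur_num}")
    if max_target is None or target_train != train:
        return False
    max_num = int(max_target.rsplit("SY", 1)[1])
    return cur_num < target_num <= max_num


print(issu_allowed(
    "c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin",
    "c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin",
))
```

A missing row returns False on purpose, so an image that is not in the table is treated as "not verified" rather than "allowed".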

Cisco ISSU Upgrade matrix

Get your IOS image ready

The first step is to upload the new image to the device via TFTP, FTP or SCP. You need to upload it to both bootdisk: and slavebootdisk:

copy ftp://admin:[email protected]/c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin bootdisk:

and the same for slavebootdisk:

copy ftp://admin:[email protected]/c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin slavebootdisk:

Check if you have this new image on both chassis:

C-6880X#dir bootdisk:
Directory of bootdisk:/

    1  -rw-    33554432  Aug 18 2015 00:11:24 +02:00  sea_console.dat
    2  -rw-   102771080  Aug 18 2015 00:16:36 +02:00  c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
    3  -rw-    33554432  Aug 18 2015 00:12:02 +02:00  sea_log.dat
    4  -rw-        8577  Oct 13 2015 11:14:10 +02:00  startup-config.converted_vs
    5  -rw-   102843784  Aug 24 2016 13:36:12 +02:00  c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin

1928724480 bytes total (1655980032 bytes free)
C21-03-MER-11-C-6880X#dir slavebootdisk:
Directory of slavebootdisk:/

    1  -rw-    33554432  Aug 17 2015 16:46:40 +02:00  sea_console.dat
    2  -rw-   102771080  Aug 17 2015 16:51:24 +02:00  c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
    3  -rw-    33554432  Aug 17 2015 16:47:18 +02:00  sea_log.dat
    4  -rw-        8577  Oct 13 2015 11:14:40 +02:00  startup-config.converted_vs
    5  -rw-   102843784  Aug 24 2016 13:30:02 +02:00  c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin

1928724480 bytes total (1655980032 bytes free)
C21-03-MER-11-C-6880X#

If you have it on both, as in the example above, you are ready to start the upgrade.
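If you would rather not eyeball two listings, the same check can be scripted. Here is a minimal Python sketch, assuming you have captured the output of both dir commands as text (how you collect it is up to you). The function names are mine and are not part of any Cisco tool. Comparing sizes only catches a truncated copy, it is not a checksum.

```python
import re

# Matches one file entry of `dir` output, e.g.
#     5  -rw-   102843784  Aug 24 2016 13:36:12 +02:00  image.bin
# and captures the size in bytes and the file name.
DIR_LINE = re.compile(r"^\s*\d+\s+\S+\s+(\d+)\s+.*\s(\S+)$")


def image_size(dir_output, image_name):
    """Return the size in bytes of image_name in a `dir` listing, or None."""
    for line in dir_output.splitlines():
        match = DIR_LINE.match(line.rstrip())
        if match and match.group(2) == image_name:
            return int(match.group(1))
    return None


def image_on_both(active_dir, standby_dir, image_name):
    """True if the image is listed on both disks with the same size."""
    active = image_size(active_dir, image_name)
    standby = image_size(standby_dir, image_name)
    return active is not None and active == standby


NEW_IMAGE = "c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin"

# Shortened copies of the two listings shown above.
BOOTDISK = """Directory of bootdisk:/

    5  -rw-   102843784  Aug 24 2016 13:36:12 +02:00  c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin

1928724480 bytes total (1655980032 bytes free)"""
SLAVEBOOTDISK = BOOTDISK.replace("13:36:12", "13:30:02").replace(
    "bootdisk:/", "slavebootdisk:/")

print(image_on_both(BOOTDISK, SLAVEBOOTDISK, NEW_IMAGE))
```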

Start ISSU Upgrade (1. command “issu loadversion”)

“issu loadversion” starts the ISSU upgrade procedure by reloading the standby VSS chassis and booting it back up with the new IOS image.

The command to start the upgrade in my case was:

C-6880X#issu loadversion 1/5 bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin 2/5 slavebootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin

If everything went fine you should see something like this:

C21-03-MER-11-C-6880X#$vebootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
%issu loadversion executed successfully, Standby is being reloaded
C21-03-MER-11-C-6880X#
Sep 6 12:37:12.952: %VSLP-SW1-3-VSLP_LMP_FAIL_REASON: Te1/5/1: Disabled by Peer Reload Request
Sep 6 12:37:12.952: %VSLP-SW1-3-VSLP_LMP_FAIL_REASON: Te1/5/2: Disabled by Peer Reload Request
Sep 6 12:37:12.952: %VSLP-SW1-2-VSL_DOWN: Last VSL interface Te1/5/2 went down

Sep 6 12:37:13.068: %VSLP-SW1-2-VSL_DOWN: All VSL links went down while switch is in ACTIVE role

Sep 6 12:37:13.744: %SATMGR-SW1-3-ERR_DUAL_ACTIVE_DETECT_INCAPABLE: channel group 101 is no longer dual-active detection capable
Sep 6 12:37:13.744: %SATMGR-SW1-3-ERR_DUAL_ACTIVE_DETECT_INCAPABLE: channel group 102 is no longer dual-active detection capable
Sep 6 12:37:13.744: %SATMGR-SW1-3-ERR_DUAL_ACTIVE_DETECT_INCAPABLE: channel group 104 is no longer dual-active detection capable
Sep 6 12:37:13.744: %SATMGR-SW1-3-ERR_DUAL_ACTIVE_DETECT_INCAPABLE: channel group 103 is no longer dual-active detection capable
Sep 6 10:34:05.442: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/0/1, changed state to down (FEX-104)
Sep 6 10:34:20.321: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/0/2, changed state to down (FEX-101)
Sep 6 10:33:14.286: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/0/2, changed state to down (FEX-102)
Sep 6 10:35:02.044: %LINK-3-UPDOWN: Interface TenGigabitEthernet1/0/1, changed state to down (FEX-103)

Please note that I had four 6800ia FEX extenders connected to the C6880-X core, each connected with a MEC EtherChannel to both chassis. That is why you see FEX interfaces going down, among other things.

The standby chassis goes down for a reboot, loads the new IOS image and joins the VSS cluster again. Once it has reloaded and rejoined the VSS, it gets itself ready for an SSO switchover from the active chassis with the old IOS to this upgraded chassis with the new IOS, without traffic loss.

Possible Issue

There is one issue I ran into when starting the ISSU upgrade. By default on the C6880-X there is no boot system image defined in the running configuration. If that is the case on your side too, it can lead to this error message when you start issu loadversion.

C-6880X#issu loadversion 1/5 bootdisk:c6880x-adventerprisek9-mz.$
 % CV [ bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin ] must be named first in BOOT [ bootdisk: ]

You can check the config of boot system image like this:

C-6880X#sh runn | i boot
boot-start-marker
 boot system bootdisk:
boot-end-marker

If you have a similar boot system bootdisk: line without an image name such as c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin in it, delete that line and enter it again with the old (currently running) IOS image named in it:

C-6880X#conf t
Enter configuration commands, one per line. End with CNTL/Z.
C-6880(config)#no boot system bootdisk:
C-6880(config)#boot system bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
C-6880(config)#do wr
Building configuration...
[OK]

And check if it’s fine now:

C-6880(config)#do sh runn | i boot
boot-start-marker
boot system bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
boot-end-marker

The upgrade should now start without issues with the command:

C-6880X#issu loadversion 1/5 bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin 2/5 slavebootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
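The boot variable check is easy to automate if you already collect show output in scripts. Below is a minimal Python sketch, assuming you feed it the text returned by sh runn | i boot. It only checks what the error message above complains about, that the first boot system entry names the currently running image. The function names are my own.

```python
def boot_images(show_run_boot):
    """Return the targets of all `boot system` lines, in config order."""
    images = []
    for line in show_run_boot.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[:2] == ["boot", "system"]:
            images.append(parts[2])
    return images


def boot_names_image_first(show_run_boot, current_image):
    """True if the first `boot system` entry points at current_image."""
    images = boot_images(show_run_boot)
    if not images:
        return False
    # "bootdisk:/file.bin" -> "file.bin"; a bare "bootdisk:" -> ""
    file_name = images[0].split(":", 1)[-1].lstrip("/")
    return file_name == current_image


CURRENT_IMAGE = "c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin"

# The two outputs shown above: before and after the fix.
BEFORE = """boot-start-marker
 boot system bootdisk:
boot-end-marker"""
AFTER = """boot-start-marker
boot system bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
boot-end-marker"""

print(boot_names_image_first(BEFORE, CURRENT_IMAGE))
print(boot_names_image_first(AFTER, CURRENT_IMAGE))
```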

The standby chassis reboot with two line cards takes approximately 12-15 minutes.

After that time you should see the ISSU status showing something like the output below. You can see that Chassis 1 (Slot 1/5) is currently running 151-2.SY4a and Chassis 2 (Slot 2/5) is already upgraded to 151-2.SY7.

C21-03-MER-11-C-6880X#show issu state detail
Slot = 1/5
RP State = Active
ISSU State = Load Version
Boot Variable = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin,12
Operating Mode = sso
Primary Version = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
Secondary Version = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
Current Version = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
Variable Store = PrstVbl

Slot = 2/5
RP State = Standby
ISSU State = Load Version
Boot Variable = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin,12;bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin,12
Operating Mode = sso
Primary Version = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin
Secondary Version = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
Current Version = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
This system is Fex-capable
Fex-ID ISSU Status

101 FEX_UPGRADE_INIT
102 FEX_UPGRADE_INIT
103 FEX_UPGRADE_INIT
104 FEX_UPGRADE_INIT

If both chassis are SSO ready, you can go to the next step in the upgrade procedure.

Use the issu acceptversion command to stop the Rollback Timer. This is necessary because if the timer expires, the upgraded chassis reloads and reverts to the previous software version.
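Before moving on, the show issu state detail output can be checked by a script instead of by eye. The Python sketch below is a rough take under my own assumptions: it treats "Operating Mode = sso" on both slots plus the new image as Current Version on the standby as the sign that it is safe to continue. That reading of the output is mine, not an official Cisco readiness test, and the function names are made up for this post.

```python
def parse_issu_state(output):
    """Split `show issu state detail` output into one dict per slot."""
    slots = []
    for line in output.splitlines():
        if "=" not in line:
            continue  # blank lines and the FEX status table
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "Slot":
            slots.append({})
        if slots:
            slots[-1][key] = value
    return slots


def standby_ready(output, new_image):
    """True if the standby runs new_image and both slots report sso."""
    slots = parse_issu_state(output)
    standby = [s for s in slots if s.get("RP State") == "Standby"]
    return (
        len(slots) == 2
        and len(standby) == 1
        and standby[0].get("Current Version", "").endswith(new_image)
        and all(s.get("Operating Mode") == "sso" for s in slots)
    )


NEW_IMAGE = "c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin"

# Shortened copy of the output shown above.
SAMPLE = """Slot = 1/5
RP State = Active
Operating Mode = sso
Current Version = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin

Slot = 2/5
RP State = Standby
Operating Mode = sso
Current Version = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin
101 FEX_UPGRADE_INIT
"""

print(standby_ready(SAMPLE, NEW_IMAGE))
```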

ISSU Upgrade Accept new IOS on upgraded chassis (2. command “issu acceptversion”)

You need to accept the upgraded IOS version on the upgraded standby chassis because, if you don’t, the chassis will assume that something went wrong and will reboot one more time, loading the old IOS version back.

C-6880X#issu acceptversion
   % Rollback timer stopped. Please issue the 'issu commitversion' command..

The 6880 will tell you to execute “issu commitversion” now, but don’t do that if you have FEX extenders. They need to be upgraded before “issu commitversion”.

Go and upgrade the FEXs…

Continue ISSU Upgrade by upgrading FEX extenders (3. command “issu runversion fex ..”)

Of course, you run this part only if you have FEX extenders connected to your core switch.

The command can upgrade all FEX extenders at once (issu runversion fex all), which brings them all down for about 15-20 minutes. The other option is to upgrade each FEX separately (issu runversion fex 1), which helps if you have something connected to more than one FEX that should not experience a complete network outage.

C-6880X#issu runversion fex all

With the show issu state command you can follow the upgrade and see when all FEX extenders are upgraded. Just wait until FEX_UPGRADE_IN_PROGRESS disappears from all FEX status lines.

C-6880X#show issu state
Slot = 2/5
RP State = Active
ISSU State = Run Version
Boot Variable = bootdisk:c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin,12;bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin,12

Slot = 1/5
RP State = Standby
ISSU State = Run Version
Boot Variable = bootdisk:/c6880x-adventerprisek9-mz.SPA.151-2.SY4a.bin,12

This system is Fex-capable
Fex-ID ISSU Status

101 FEX_UPGRADE_IN_PROGRESS
102 FEX_UPGRADE_IN_PROGRESS
103 FEX_UPGRADE_IN_PROGRESS
104 FEX_UPGRADE_IN_PROGRESS

When you see something like this:

This system is Fex-capable
Fex-ID ISSU Status

101 FEX_UPGRADE_COMPLETE
102 FEX_UPGRADE_COMPLETE
103 FEX_UPGRADE_COMPLETE
104 FEX_UPGRADE_COMPLETE

You can go to the last step, committing the new IOS version on the active chassis.
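If you poll show issu state from a script while the FEX extenders upgrade, the "are we done yet" test comes down to reading the FEX table. Here is a minimal Python sketch, assuming the output format shown above. The helper names are mine, and how you fetch the output from the switch is left out on purpose.

```python
import re

# One row of the FEX table, e.g. "101 FEX_UPGRADE_COMPLETE".
FEX_LINE = re.compile(r"^\s*(\d+)\s+(FEX_\w+)\s*$")


def fex_status(show_issu_state):
    """Map FEX ID -> ISSU status string from `show issu state` output."""
    status = {}
    for line in show_issu_state.splitlines():
        match = FEX_LINE.match(line)
        if match:
            status[int(match.group(1))] = match.group(2)
    return status


def all_fex_upgraded(show_issu_state):
    """True only if at least one FEX is listed and all are complete."""
    status = fex_status(show_issu_state)
    return bool(status) and all(
        state == "FEX_UPGRADE_COMPLETE" for state in status.values())


IN_PROGRESS = """This system is Fex-capable
Fex-ID ISSU Status

101 FEX_UPGRADE_COMPLETE
102 FEX_UPGRADE_IN_PROGRESS
"""
DONE = IN_PROGRESS.replace("IN_PROGRESS", "COMPLETE")

print(fex_status(IN_PROGRESS))
print(all_fex_upgraded(DONE))
```

An empty table returns False, so output that contains no FEX rows at all is never mistaken for a finished upgrade.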

ISSU Upgrade of Active Chassis (4. command “issu commitversion”)

C21-03-MER-11-C-6880X#issu commitversion

issu commitversion will initiate the active chassis reboot and a stateful switchover to the standby chassis, which is already upgraded. It reboots the active chassis and boots it back up with the new IOS image, which finishes our upgrade process with no lost pings, or just a few, during this last switchover.

After about 12-15 minutes you should see that the chassis has booted back up and joined the VSS cluster, which is now running the new IOS on all its members.
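The image name is long and easy to mistype, so I find it safer to generate the command lines once and paste them. The Python sketch below only rebuilds the four commands in the order I used them in this post, on a VSS with FEX extenders attached. The slot numbers 1/5 and 2/5 are the ones from my cluster, so treat them and the function name as assumptions to adapt.

```python
def issu_commands(new_image, active_slot="1/5", standby_slot="2/5"):
    """Return the ISSU commands in the order used in this post."""
    if not new_image.endswith(".bin") or " " in new_image:
        raise ValueError(f"suspicious image name: {new_image!r}")
    return [
        # 1. reload the standby chassis with the new image
        f"issu loadversion {active_slot} bootdisk:{new_image} "
        f"{standby_slot} slavebootdisk:{new_image}",
        # 2. stop the rollback timer
        "issu acceptversion",
        # 3. upgrade the FEX extenders (all at once)
        "issu runversion fex all",
        # 4. commit the new image
        "issu commitversion",
    ]


for command in issu_commands("c6880x-adventerprisek9-mz.SPA.151-2.SY7.bin"):
    print(command)
```

Each command still has to be entered by hand at the right moment, after the checks described in the steps above. The script only removes the typing risk.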

Verify

To verify that you succeeded with the upgrade check the output of:

show issu state detail

show redundancy

show module switch all

They should show both chassis online and joined in the VSS cluster with the same new IOS image.

Acronym Guide

ISSU – If you have two switches in a VSS cluster, ISSU enables you to perform a software upgrade without a network outage by upgrading one chassis at a time and switching over in between. It upgrades the standby chassis by rebooting it with the new IOS, switches over to that upgraded chassis, and then upgrades the second chassis with a reboot.

EFSU – If ISSU is not available for some reason, usually IOS or hardware incompatibility, you have eFSU, the Enhanced Fast Software Upgrade procedure. It causes downtime, but much shorter than a normal complete cluster reboot. It does that by preloading the new module software into memory, which avoids a hard reset.

VSS – A VSS (Virtual Switching System) creates a single network element, or cluster, out of two Catalyst 6880 series switches. In this configuration one of them has the Active Supervisor for all traffic in the cluster and the other one is Hot Standby, with states replicated by SSO. It manages redundant links, which appear externally as a single port channel.

MEC – Multichassis EtherChannel is a normal EtherChannel created on a VSS cluster with one member port on one chassis and the other member port on the other chassis. Whatever disaster happens to one of the cluster nodes, half of the EtherChannel members will stay alive.

SSO – Stateful Switchover syncs features, user sessions, routing information and all other line card states between the active and hot standby chassis, so that the supervisor on the standby chassis can continue to forward traffic with no loss of sessions when an Active to Hot Standby supervisor switchover happens. For example, after a switchover without SSO and NSF we would need to re-establish all routing adjacencies, which would cause a routing flap and a routing table rebuild, and with that a network outage for some time depending on our routing protocol timers. It is similar for STP and other protocols.

NSF – Non-Stop Forwarding gets the state of the routing services running on our Active VSS chassis from SSO. With those states it is always ready to use the current network adjacencies without re-establishing new ones, which prevents a routing flap at the time of switchover.
