After bringing up a Cisco DNA appliance on version 2.1.2.4 and starting an upgrade to 2.2.2.8 (as a step toward 2.2.3.5), I hit a strange issue: the appliance system update went fine, but the switch to 2.2.2.8 stayed disabled until the Application Updates finished.
The real issue was that the Application Update for Cloud Connectivity – Data Hub got stuck at 12% for 4 days without timing out or finishing. I tried several appliance reboots from CIMC, which didn’t help.
Below are the steps that helped sort out the Application Updates issue, where container pods were stuck at the point of pulling images from the cloud and being created on the Cisco DNA appliance. This was a long troubleshooting session, so I list only the maglev CLI commands that were useful.
I SSHed into the appliance and started digging into the maglev CLI tool.
#ssh -l maglev 10.10.10.10 -p 2222
I listed all Application Updates, and it was visible that multiple applications were waiting to be downloaded and installed. My Cloud Connectivity – Data Hub app was also there, stuck at 12% completion:
$ maglev catalog package display
NAME                                   DISPLAY_NAME                             VERSION           STATE    INFO
--------------------------------------------------------------------------------------------------------
access-control-application             Access Control Application               2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
ai-network-analytics                   AI Network Analytics                     2.6.9.455         PARTIAL  Package needs to be pulled/downloaded
app-hosting                            Application Hosting                      1.6.6.2112161504  ERROR    Unable to update ManagedService=app-hosting/database/postgres:1.6.1. LibraryServiceBundle=postgres:1.6.1 does not exist in repository 'None'
application-policy                     Application Policy                       2.1.368.170003    READY
application-registry                   Application Registry                     2.1.368.170003    READY
application-visibility-service         Application Visibility Service           2.1.368.170003    PARTIAL  Package needs to be pulled/downloaded
assurance                              Assurance - Base                         2.2.2.485         PARTIAL  Package needs to be pulled/downloaded
automation-core                        NCP - Services                           2.1.368.60015     READY
base-provision-core                    Automation - Base                        2.1.368.60015     READY
cloud-connectivity-contextual-content  Cloud Connectivity - Contextual Content  1.3.1.364         READY
cloud-connectivity-data-hub            Cloud Connectivity - Data Hub            1.6.0.380         PARTIAL  ==> [ 12%]
cloud-connectivity-tethering           Cloud Connectivity - Tethering           2.12.1.2          READY
cloud-provision-core                   Cloud Device Provisioning Application    2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
command-runner                         Command Runner                           2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
device-onboarding                      Device Onboarding                        2.1.368.60015     READY
disaster-recovery                      Disaster Recovery                        2.1.367.360196    PARTIAL  Package needs to be pulled/downloaded
dna-core-apps                          Network Experience Platform - Core       2.1.368.60015     READY
dnac-platform                          Cisco DNA Center Platform                1.5.1.180         READY
dnac-search                            Cisco DNA Center Global Search           1.5.0.466         PARTIAL  Package needs to be pulled/downloaded
endpoint-analytics                     AI Endpoint Analytics                    1.4.375           PARTIAL  Package needs to be pulled/downloaded
group-based-policy-analytics           Group-Based Policy Analytics             2.2.1.401         PARTIAL  Package needs to be pulled/downloaded
icap-automation                        Automation - Intelligent Capture         2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
image-management                       Image Management                         2.1.368.60015     READY
machine-reasoning                      Machine Reasoning                        2.1.368.210017    PARTIAL  Package needs to be pulled/downloaded
ncp-system                             NCP - Base                               2.1.368.60015     ERROR    Unable to update ManagedService=fusion/database/postgres:1.6.1. LibraryServiceBundle=postgres:1.6.1 does not exist in repository 'None'
ndp-base-analytics                     Network Data Platform - Base Analytics   1.6.1028          PARTIAL  Package needs to be pulled/downloaded
ndp-platform                           Network Data Platform - Core             1.6.596           ERROR    Unable to update ManagedService=ndp/database/elasticsearch:1.7.4. LibraryServiceBundle=elasticsearch:1.7.4 does not exist in repository 'None'
ndp-ui                                 Network Data Platform - Manager          1.6.543           PARTIAL  Package needs to be pulled/downloaded
network-visibility                     Network Controller Platform              2.1.368.60015     READY
path-trace                             Path Trace                               2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
platform-ui                            Cisco DNA Center UI                      1.6.2.446         READY
rbac-extensions                        RBAC Extensions                          2.1.368.1910001   PARTIAL  Package needs to be pulled/downloaded
rogue-management                       Rogue and aWIPS                          2.2.0.51          PARTIAL  Package needs to be pulled/downloaded
sd-access                              SD Access                                2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
sensor-assurance                       Assurance - Sensor                       2.2.2.484         PARTIAL  Package needs to be pulled/downloaded
sensor-automation                      Automation - Sensor                      2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
ssa                                    Stealthwatch Security Analytics          2.1.368.1091226   PARTIAL  Package needs to be pulled/downloaded
system                                 System                                   1.6.594           READY
system-commons                         System Commons                           2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
umbrella                               Cisco Umbrella                           2.1.368.592015    PARTIAL  Package needs to be pulled/downloaded
wide-area-bonjour                      Wide Area Bonjour                        2.4.368.75006     READY
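With this many rows, it helps to filter the listing down to the packages that are not READY. Here is a minimal Python sketch of that idea. It is not part of the maglev tooling, the function names are my own, and it assumes the output of maglev catalog package display has been saved as text with columns separated by two or more spaces.

```python
import re

# Short excerpt in the same layout as the listing above (assumed column spacing).
SAMPLE = """\
NAME                         DISPLAY_NAME                   VERSION         STATE    INFO
------------------------------------------------------------------------------------------
application-policy           Application Policy             2.1.368.170003  READY
cloud-connectivity-data-hub  Cloud Connectivity - Data Hub  1.6.0.380       PARTIAL  ==> [ 12%]
command-runner               Command Runner                 2.1.368.60015   PARTIAL  Package needs to be pulled/downloaded
"""


def parse_package_display(output):
    """Turn the package listing into one dict per package row."""
    rows = []
    for line in output.splitlines():
        line = line.rstrip()
        # Skip blank lines, the header row, and the dashed separator.
        if not line or line.startswith("NAME") or set(line) == {"-"}:
            continue
        # Assumption: columns are separated by two or more spaces.
        fields = re.split(r"\s{2,}", line.strip(), maxsplit=4)
        if len(fields) < 4:
            continue
        rows.append({
            "name": fields[0],
            "display_name": fields[1],
            "version": fields[2],
            "state": fields[3],
            "info": fields[4] if len(fields) > 4 else "",
        })
    return rows


def not_ready(rows):
    """Keep only the packages whose STATE is something other than READY."""
    return [row for row in rows if row["state"] != "READY"]


if __name__ == "__main__":
    for row in not_ready(parse_package_display(SAMPLE)):
        print(row["name"], row["state"], row["info"])
```

Run against the full listing, this would leave only the PARTIAL and ERROR rows, which makes the stuck package and the three postgres/elasticsearch errors easier to spot.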
Digging deeper, I listed the status of the service containers of Cloud Connectivity – Data Hub, and it was visible that not all pods were running. It looked as if some dependencies from other services (the list above) were missing, or something had simply got stuck and never retried creating those container pods.
$ maglev catalog package status cloud-connectivity-data-hub
KIND           RESOURCE                               STATE    MESSAGE
----------------------------------------------------------------------------------------------------------------------
AppStack       dxhub:1.6.0.380                        PARTIAL  One or more child resources are not available yet
Package        cloud-connectivity-data-hub:1.6.0.380  PARTIAL  One or more child resources are not available yet
ServiceBundle  dxhub/dxhub-api-proxy:1.6.0.380        PARTIAL  Pulling artifacts for service dxhub-api-proxy (pulled=0, remaining=1)
ServiceBundle  dxhub/dxhub-registry:1.6.0.380         PARTIAL  Pulling artifacts for service dxhub-registry (pulled=0, remaining=1)
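The MESSAGE column carries the useful detail here: pulled=0, remaining=1 means the artifact download for that service never got going. A small Python sketch like the one below could pull those counters out of saved status output. The regular expression is based only on the message wording shown above, and the function names are hypothetical.

```python
import re

# Message wording taken from the status output above.
PULL_RE = re.compile(
    r"Pulling artifacts for service (\S+) \(pulled=(\d+), remaining=(\d+)\)"
)

SAMPLE_STATUS = """\
ServiceBundle  dxhub/dxhub-api-proxy:1.6.0.380  PARTIAL  Pulling artifacts for service dxhub-api-proxy (pulled=0, remaining=1)
ServiceBundle  dxhub/dxhub-registry:1.6.0.380   PARTIAL  Pulling artifacts for service dxhub-registry (pulled=0, remaining=1)
"""


def pull_counters(status_output):
    """Map each service name to its (pulled, remaining) artifact counters."""
    return {
        match.group(1): (int(match.group(2)), int(match.group(3)))
        for match in PULL_RE.finditer(status_output)
    }


def not_started(counters):
    """Services that still have artifacts remaining but have pulled none."""
    return sorted(
        name
        for name, (pulled, remaining) in counters.items()
        if pulled == 0 and remaining > 0
    )


if __name__ == "__main__":
    print(not_started(pull_counters(SAMPLE_STATUS)))
```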
The next command I found useful was “maglev catalog package edit cloud-connectivity-data-hub”, which turned out to be a way to edit the K8s manifest for the pods running the Cloud Connectivity – Data Hub service.
I searched the manifest listed below and found, near the end, the setting “requiresPull: false”. I didn’t have anything to lose here, so I just changed it to “requiresPull: true”.
$ maglev catalog package edit cloud-connectivity-data-hub
capabilityStatus:
  modified: 1653544689.9558897
_dependsOn:
  capabilities: []
_labels:
  release:
  - dnac:2.1.2.7
  - dnac:2.1.2.8
_manifestCount: 4
_provides:
  capabilities:
  - level: 2
    minLevel: 0
    name: dxhub:registry
  - level: 2
    minLevel: 0
    name: dxhub:api-proxy
_pullStatus:
  artifactPullCompleted: false
  bytesToDownload: 20606255
  manifestPullCompleted: true
  manifestsDownloaded: 2
  memberId: catalogserver-77dc8dcdbc-t4mmk
  modified: 1653544610.1903415
  progress: 12
  timestamps:
    pullStart: 1653544505.7862234
_size: 20606255
appstacks:
- dxhub:1.6.0.380
description: DxHub provides APIs to create streams, subscribe to streams, publish
  messages to streams and consume messages from the streams.
displayName: Cloud Connectivity - Data Hub
kind: Package
labels:
  category: Cisco DNA Center Core
  team: dxhub
manifestVersion: v4
name: cloud-connectivity-data-hub
requiresPull: false
status:
  message: '[dxhub/dxhub-registry:1.6.0.380] Pulling artifacts for service dxhub-registry
    (pulled=0, remaining=1)'
  state: PARTIAL
version: 1.6.0.380
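The pull-status fields in this manifest already show that the download had stalled: progress sits at 12, artifactPullCompleted is false, and pullStart is a Unix timestamp from days earlier. As a rough illustration, the Python sketch below reads those three fields from saved manifest text and reports how long the pull has been running. It uses a simple line-based lookup instead of a YAML parser, so it only works for keys that appear once in the text; the helper names are my own.

```python
import re
from datetime import datetime, timezone

# Excerpt of the manifest above; only the fields this sketch reads.
MANIFEST_EXCERPT = """\
_pullStatus:
  artifactPullCompleted: false
  bytesToDownload: 20606255
  manifestPullCompleted: true
  progress: 12
  timestamps:
    pullStart: 1653544505.7862234
requiresPull: false
"""


def manifest_field(manifest, key):
    """Return the raw text value of the first 'key: value' line, or None."""
    pattern = rf"^[ \t]*{re.escape(key)}:[ \t]*(\S.*)$"
    match = re.search(pattern, manifest, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def pull_summary(manifest, now):
    """Summarise the pull state; 'now' is a Unix timestamp in seconds."""
    start = float(manifest_field(manifest, "pullStart"))
    return {
        "progress": int(manifest_field(manifest, "progress")),
        "artifacts_done": manifest_field(manifest, "artifactPullCompleted") == "true",
        "pull_started": datetime.fromtimestamp(start, tz=timezone.utc),
        "hours_since_start": (now - start) / 3600,
    }


if __name__ == "__main__":
    import time

    summary = pull_summary(MANIFEST_EXCERPT, now=time.time())
    print(summary["pull_started"].isoformat(), summary["progress"], "%")
```

A pull that reports the same low progress many hours after pull_started is a reasonable hint that it will not recover on its own.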
After that, everything started to work again. Multiple other applications started downloading, which was visible in the maglev CLI and also in the web UI.
$ maglev catalog package delete cloud-connectivity-data-hub
NAME                                   DISPLAY_NAME                             VERSION           STATE    INFO
--------------------------------------------------------------------------------------------------------
access-control-application             Access Control Application               2.1.368.60015     READY
ai-network-analytics                   AI Network Analytics                     2.6.9.455         READY
app-hosting                            Application Hosting                      1.6.6.2112161504  READY
application-policy                     Application Policy                       2.1.368.170003    READY
application-registry                   Application Registry                     2.1.368.170003    READY
application-visibility-service         Application Visibility Service           2.1.368.170003    READY
assurance                              Assurance - Base                         2.2.2.485         READY
automation-core                        NCP - Services                           2.1.368.60015     READY
base-provision-core                    Automation - Base                        2.1.368.60015     READY
cloud-connectivity-contextual-content  Cloud Connectivity - Contextual Content  1.3.1.364         READY
cloud-connectivity-data-hub            Cloud Connectivity - Data Hub            1.6.0.380         PARTIAL  Package needs to be pulled/downloaded
cloud-connectivity-tethering           Cloud Connectivity - Tethering           2.12.1.2          READY
cloud-provision-core                   Cloud Device Provisioning Application    2.1.368.60015     READY
command-runner                         Command Runner                           2.1.368.60015     READY
device-onboarding                      Device Onboarding                        2.1.368.60015     READY
disaster-recovery                      Disaster Recovery                        2.1.367.360196    READY
dna-core-apps                          Network Experience Platform - Core       2.1.368.60015     READY
dnac-platform                          Cisco DNA Center Platform                1.5.1.180         READY
dnac-search                            Cisco DNA Center Global Search           1.5.0.466         PARTIAL  Package needs to be pulled/downloaded
endpoint-analytics                     AI Endpoint Analytics                    1.4.375           READY
group-based-policy-analytics           Group-Based Policy Analytics             2.2.1.401         READY
icap-automation                        Automation - Intelligent Capture         2.1.368.60015     READY
image-management                       Image Management                         2.1.368.60015     READY
machine-reasoning                      Machine Reasoning                        2.1.368.210017    READY
ncp-system                             NCP - Base                               2.1.368.60015     READY
ndp-base-analytics                     Network Data Platform - Base Analytics   1.6.1028          PARTIAL  ======> [ 25%]
ndp-platform                           Network Data Platform - Core             1.6.596           PARTIAL  ==========> [ 38%]
ndp-ui                                 Network Data Platform - Manager          1.6.543           PARTIAL  Package needs to be pulled/downloaded
network-visibility                     Network Controller Platform              2.1.368.60015     READY
path-trace                             Path Trace                               2.1.368.60015     PARTIAL  =====================> [ 74%]
platform-ui                            Cisco DNA Center UI                      1.6.2.446         READY
rbac-extensions                        RBAC Extensions                          2.1.368.1910001   PARTIAL  Package needs to be pulled/downloaded
rogue-management                       Rogue and aWIPS                          2.2.0.51          PARTIAL  Package needs to be pulled/downloaded
sd-access                              SD Access                                2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
sensor-assurance                       Assurance - Sensor                       2.2.2.484         PARTIAL  Package needs to be pulled/downloaded
sensor-automation                      Automation - Sensor                      2.1.368.60015     PARTIAL  Package needs to be pulled/downloaded
ssa                                    Stealthwatch Security Analytics          2.1.368.1091226   PARTIAL  Package needs to be pulled/downloaded
system                                 System                                   1.6.594           READY
system-commons                         System Commons                           2.1.368.60015     READY
umbrella                               Cisco Umbrella                           2.1.368.592015    PARTIAL  Package needs to be pulled/downloaded
wide-area-bonjour                      Wide Area Bonjour                        2.4.368.75006     READY
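The progress bars in the INFO column are the quickest way to tell a slow download from a stuck one: take two listings some minutes apart and see which percentages did not move. The Python sketch below shows one way to do that on saved output. The second snapshot in the sample is invented purely for illustration, and the function names are hypothetical.

```python
import re

# Matches the percentage inside a progress bar such as "======> [ 25%]".
BAR_RE = re.compile(r"\[\s*(\d+)%\]")

# First snapshot uses values from the listing above; the second one is
# made up for illustration only.
SNAPSHOT_1 = """\
ndp-base-analytics  Network Data Platform - Base Analytics  1.6.1028       PARTIAL  ======> [ 25%]
path-trace          Path Trace                              2.1.368.60015  PARTIAL  =====================> [ 74%]
system              System                                  1.6.594        READY
"""
SNAPSHOT_2 = """\
ndp-base-analytics  Network Data Platform - Base Analytics  1.6.1028       PARTIAL  ======> [ 25%]
path-trace          Path Trace                              2.1.368.60015  PARTIAL  ========================> [ 90%]
system              System                                  1.6.594        READY
"""


def pull_progress(display_output):
    """Map package name to download percentage for rows showing a progress bar."""
    progress = {}
    for line in display_output.splitlines():
        match = BAR_RE.search(line)
        fields = line.split()
        if match and fields:
            progress[fields[0]] = int(match.group(1))
    return progress


def unchanged(earlier, later):
    """Packages whose percentage is identical in both snapshots."""
    return sorted(name for name, pct in later.items() if earlier.get(name) == pct)


if __name__ == "__main__":
    print(unchanged(pull_progress(SNAPSHOT_1), pull_progress(SNAPSHOT_2)))
```

A package that stays in the unchanged list across several snapshots is a candidate for the manifest check described earlier.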
For the problematic cloud-connectivity-data-hub, the status now looked as if the package had not been downloaded yet and was waiting for the pull from the cloud to begin. It just needed to be initiated:
$ maglev catalog package pull cloud-connectivity-data-hub
NAME                         DISPLAY_NAME                   VERSION    STATE
---------------------------------------------------------------------------------------------
cloud-connectivity-data-hub  Cloud Connectivity - Data Hub  1.6.0.380  PARTIAL
Use maglev catalog package status '[:]'" to monitor the progress of the operation
The output after this command showed that the service started to pull container images and deploy the pods successfully after a few moments:
$ maglev catalog package display status cloud-connectivity-data-hub
KIND           RESOURCE                               STATE    MESSAGE
----------------------------------------------------------------------------------------------------------------------
AppStack       dxhub:1.6.0.380                        PARTIAL  One or more child resources are not available yet
Package        cloud-connectivity-data-hub:1.6.0.380  PARTIAL  One or more child resources are not available yet
ServiceBundle  dxhub/dxhub-api-proxy:1.6.0.380        PARTIAL  Building container image for 'dxhub-api-proxy'
ServiceBundle  dxhub/dxhub-registry:1.6.0.380         PARTIAL  Building container image for 'dxhub-registry'
$ maglev catalog package status cloud-connectivity-data-hub
KIND           RESOURCE                               STATE    MESSAGE
----------------------------------------------------------------------------------------------------------------------
AppStack       dxhub:1.6.0.380                        READY
Package        cloud-connectivity-data-hub:1.6.0.380  READY
ServiceBundle  dxhub/dxhub-api-proxy:1.6.0.380        READY
ServiceBundle  dxhub/dxhub-registry:1.6.0.380         READY
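Instead of re-running the status command by hand until every row says READY, the check can be wrapped in a small polling loop. The Python sketch below is one way to do it, under two assumptions: that data rows start with AppStack, Package, or ServiceBundle as in the output above, and that the script runs somewhere the maglev CLI is on the PATH and already authenticated. The function names, poll interval, and retry limit are my own choices.

```python
import subprocess
import time

KINDS = {"AppStack", "Package", "ServiceBundle"}

PARTIAL_OUT = """\
KIND           RESOURCE                         STATE    MESSAGE
AppStack       dxhub:1.6.0.380                  PARTIAL  One or more child resources are not available yet
ServiceBundle  dxhub/dxhub-registry:1.6.0.380   PARTIAL  Building container image for 'dxhub-registry'
"""
READY_OUT = """\
KIND           RESOURCE                         STATE    MESSAGE
AppStack       dxhub:1.6.0.380                  READY
ServiceBundle  dxhub/dxhub-registry:1.6.0.380   READY
"""


def all_ready(status_output):
    """True when at least one resource row exists and every row is READY."""
    states = []
    for line in status_output.splitlines():
        fields = line.split()
        # KIND, RESOURCE, and STATE contain no spaces, so STATE is the third field.
        if len(fields) >= 3 and fields[0] in KINDS:
            states.append(fields[2])
    return bool(states) and all(state == "READY" for state in states)


def fetch_status(package="cloud-connectivity-data-hub"):
    """Run the status command from the text; assumes maglev is on the PATH."""
    result = subprocess.run(
        ["maglev", "catalog", "package", "status", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_until_ready(fetch, interval_s=30.0, max_polls=120, sleep=time.sleep):
    """Poll fetch() until all_ready() is true; return False if polls run out."""
    for _ in range(max_polls):
        if all_ready(fetch()):
            return True
        sleep(interval_s)
    return False
```

On the appliance this would be called as wait_until_ready(fetch_status). Passing the fetch function in as a parameter keeps the loop testable against saved output.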
After all this, the Cisco DNA appliance UI also showed that all components were downloaded successfully, and the upgrade process could be finished with the Update All option.