Cluster Upgrade From 1.20 to 1.21 Fails Due to ETCD Corruption
Problem
During a cluster upgrade from Kubernetes 1.20 to 1.21 on the PMK-5.5 Management Plane, the ETCD environment variable entries are missing from the Sunpike host object.
The nodelet logs show the following errors:
Nodelet log
--- /opt/pf9/pf9-kube/phases/etcd_configure.sh start at 2023-04-24 17:42:40 ---
[2023-04-24 17:42:40] Ensuring etcd data is stored on host
[2023-04-24 17:42:40] Error: No such object: etcd
[2023-04-24 17:42:40] Skipping; etcd container does not exist
--- /opt/pf9/pf9-kube/phases/etcd_run.sh start at 2023-04-24 17:42:40 ---
[2023-04-24 17:42:41] Node endpoint is 172.20.58.9
[2023-04-24 17:42:41] Deriving local etcd environment
[2023-04-24 17:42:41] Ensuring container 'etcd' is destroyed
[2023-04-24 17:42:41]
[2023-04-24 17:42:56] {"level":"warn","ts":"2023-04-24T15:42:56.941Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-57c1/localhost:4001","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4001: connect: connection refused\""}
[2023-04-24 17:42:56] https://localhost:4001 is unhealthy: failed to commit proposal: context deadline exceeded
[2023-04-24 17:42:56] Error: unhealthy cluster
[2023-04-24 17:42:57] Waiting for healthy etcd cluster.
[2023-04-24 17:43:12] {"level":"warn","ts":"2023-04-24T15:43:12.410Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-444/localhost:4001","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:4001: connect: connection refused\""}
[2023-04-24 17:43:12] https://localhost:4001 is unhealthy: failed to commit proposal: context deadline exceeded
[2023-04-24 17:43:12] Error: unhealthy cluster
Environment
- Platform9 Managed Kubernetes - v5.5.
- Kubernetes version 1.20.
Answer
This is a known issue. The Platform9 Engineering team is working on it, and a fix is expected in the PMK-5.10 release.
Additional Information
To track the progress of the fix, open a support ticket that references Jira ID PMK-5803.
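Before opening the ticket, it may help to confirm that a node shows the same symptoms as the nodelet log in the Problem section. The following is a minimal Python sketch, not a Platform9 tool: it counts log lines that match three messages copied from that excerpt. The function names and the rule that all three messages must appear are assumptions made for illustration.

```python
import re

# Patterns copied from the nodelet log excerpt in the Problem section.
SYMPTOMS = {
    "etcd_container_missing": re.compile(r"Error: No such object: etcd"),
    "connection_refused": re.compile(
        r"dial tcp 127\.0\.0\.1:4001: connect: connection refused"
    ),
    "unhealthy_cluster": re.compile(r"Error: unhealthy cluster"),
}


def scan_nodelet_log(lines):
    """Return a dict mapping each symptom name to the number of matching lines."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in lines:
        for name, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def matches_known_issue(counts):
    """Heuristic (an assumption, not an official check): every symptom
    from the excerpt appears at least once in the scanned log."""
    return all(counts[name] > 0 for name in SYMPTOMS)
```

To use it, read the nodelet log from wherever it is stored on the node (the location is not given in this article), pass the lines to scan_nodelet_log, and pass the result to matches_known_issue. A match only indicates that the log resembles the excerpt above; it does not prove that the ETCD environment variables are missing from the Sunpike host object.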