Platform9 Edge Cloud
Metrics-server Pods Are Continuously Restarting With Probe Failures
Problem
Metrics-server pods restart continuously and log the following errors:
metrics-server pod logs
shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
timeout waiting for SETTINGS frames from 10.69.69.198:22639
writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
wrap.go:54] timeout or abort while handling: GET "/apis/metrics.k8s.io/v1beta1"
status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
api-server logs
controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.\n","stream":"stderr","time":"2022-06-23T12:37:12.03602478Z"}
available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.21.157.51:443/apis/metrics.k8s.io/v1beta1: Get \"https://10.21.157.51:443/apis/metrics.k8s.io/v1beta1\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)\n","stream":"stderr","time":"2022-06-23T12:37:16.052335263Z"}
handler_proxy.go:102] no RequestInfo found in the context\n","stream":"stderr","time":"2022-06-23T12:37:17.053434398Z"}
controller.go:116] loading OpenAPI spec for \"v1beta1.metrics.k8s.io\" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable\n","stream":"stderr","time":"2022-06-23T12:37:17.053516683Z"}
{"log":", Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]\n","stream":"stderr","time":"2022-06-23T12:37:17.053519758Z"}
Environment
- Platform9 Edge Cloud - v5.3 and above
- Metrics-server - v0.5.0
Cause
The api-server logs show repeated "context deadline exceeded" errors when calling the metrics API, which indicates that the metrics-server pods do not have enough CPU.
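To gauge how often the timeout occurs, the matching lines in a saved copy of the api-server log can be counted. This is a minimal sketch; the function name and the log file path are hypothetical and not part of the product.

```shell
#!/bin/sh
# Count the lines in a saved log file that contain the timeout message.
# $1 is the path to the saved api-server log (hypothetical path chosen by you).
count_deadline_errors() {
  grep -c 'context deadline exceeded' "$1"
}

# Usage (the file name is an assumption):
#   count_deadline_errors ./kube-apiserver.log
```

A count that keeps growing while the metrics-server pods restart is consistent with the cause described above.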
Resolution
Use the following steps to increase the requests and limits for the metrics-server container:
- Log in to the DU VM and check the watch status:
Command
# /opt/pf9/qbert/bin/kubectl get clusteraddons <CLUSTERUUID>-metrics-server --kubeconfig='/etc/sunpike/kubeconfig' -o yaml | grep watch
      f:watch: {}
  watch: true
- Edit the clusteraddons object and set watch: false
Command
spec:
  clusterID: 8dcfa1b7-366e-4a6a-aa07-6dcbb773bde9
  override:
    params:
    - name: metricsMemoryLimit
      value: 300Mi
    - name: metricsCpuLimit
      value: 100m
  type: metrics-server
  version: 0.5.0
  watch: false
When the watch is disabled, the watch field no longer appears under spec, because it is shown only when watch is set to true.
- Scale down the metrics-server deployment to 0 replicas:
Command
# kubectl scale deployment --replicas=0 metrics-server-v0.5.0 -n kube-system
- To increase the CPU to 200m, adjust the extra-cpu value:
Example
Command: /pod_nanny
  --cpu=40m
  --extra-cpu=10m   <<------
  --minClusterSize=16
In the above example, extra-cpu is increased to 10m so that the CPU for the metrics-server container becomes 200m. The calculation formula is [cpu + (extra-cpu * minClusterSize)], which here gives 40m + (10m * 16) = 200m.
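The formula above can be checked with a short calculation before editing the pod_nanny flags. This is a minimal sketch that applies the article's formula only; the function names are hypothetical, all values are in millicores, and it assumes the number of nodes does not exceed minClusterSize and that the target is not below the base cpu value.

```shell
#!/bin/sh
# Effective CPU in millicores, per the article's formula:
#   cpu + (extra-cpu * minClusterSize)
# Arguments: base cpu (m), extra-cpu (m), minClusterSize
nanny_cpu() {
  base_m=$1; extra_m=$2; min_cluster_size=$3
  echo $(( base_m + extra_m * min_cluster_size ))
}

# Smallest whole extra-cpu (m) that reaches a target CPU (m).
# Arguments: target cpu (m), base cpu (m), minClusterSize
nanny_extra_cpu_for() {
  target_m=$1; base_m=$2; min_cluster_size=$3
  # Integer ceiling of (target - base) / minClusterSize
  echo $(( (target_m - base_m + min_cluster_size - 1) / min_cluster_size ))
}

nanny_cpu 40 10 16             # prints 200
nanny_extra_cpu_for 200 40 16  # prints 10
```

With the values from the example (cpu=40m, minClusterSize=16), an extra-cpu of 10m is the smallest value that yields 200m.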
- Scale the metrics-server pod replicas back to 1:
Command
# kubectl scale deployment --replicas=1 metrics-server-v0.5.0 -n kube-system
- Verify the metrics-server pod CPU resource:
Example
-----------------------
Limits:
  cpu: 200m
  memory: 104Mi
Requests:
  cpu: 200m
  memory: 104Mi
-----------------------
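The verification can also be scripted by extracting the CPU limit from describe-style output such as the example above. This is a minimal sketch; the function name is hypothetical, it reads text in the Limits/Requests layout from standard input, and it reports only the first container that lists a CPU limit.

```shell
#!/bin/sh
# Print the first CPU limit found in describe-style text read from stdin.
cpu_limit_from_describe() {
  awk '
    /^[[:space:]]*Limits:/    { in_limits = 1; next }  # entering a Limits block
    /^[[:space:]]*Requests:/  { in_limits = 0 }        # leaving it
    in_limits && $1 == "cpu:" { print $2; exit }       # first cpu value under Limits
  '
}

# Usage (the pod name is a placeholder to fill in):
#   kubectl describe pod <metrics-server-pod> -n kube-system | cpu_limit_from_describe
```

If the pod has more than one container, inspect the full describe output to confirm which container the value belongs to.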