Troubleshooting
This page provides information to help troubleshoot issues with your Upwind cluster components installation.
Update Client Credentials
To update the upwind-operator chart with new client credentials, generate a new client ID and client secret in the console, then run the following commands:
helm repo add upwind https://charts.upwind.io/ --force-update
helm repo update
helm upgrade upwind-operator upwind/upwind-operator \
--namespace upwind \
--reset-then-reuse-values \
--set credentials.clientId="${UPWIND_CLIENT_ID}" \
--set credentials.clientSecret="${UPWIND_CLIENT_SECRET}"
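If either variable is unset, the chart receives an empty string for that credential, so it can help to check both before running the upgrade. The following is a minimal sketch; the helper name require_env is hypothetical and not part of any Upwind tooling.

```shell
# Hypothetical helper (not part of Upwind tooling): fail early if any of the
# named variables is unset or empty. Written for POSIX sh; also works in bash.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example usage (commented out so the sketch has no side effects):
# require_env UPWIND_CLIENT_ID UPWIND_CLIENT_SECRET || exit 1
```

Run the helm upgrade command above only after the check passes.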
The upwind-operator pod is not running or is crashing
Check the state of the upwind-operator helm chart
Run the following command from your terminal:
helm status upwind-operator \
--namespace upwind
Confirm that the upwind-operator chart STATUS is "deployed".
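If you want to script this check, one option is to look for the STATUS line in the text output of helm status. This is a sketch only; the function name release_is_deployed is hypothetical. Any other status, such as failed or pending-upgrade, makes the check fail.

```shell
# Hypothetical helper: read `helm status` text output on stdin and succeed
# only if it contains a line of the form "STATUS: deployed".
release_is_deployed() {
  grep -Eq '^STATUS:[[:space:]]*deployed[[:space:]]*$'
}

# Example usage (commented out; requires helm and cluster access):
# helm status upwind-operator --namespace upwind | release_is_deployed \
#   || echo "upwind-operator is not in the deployed state"
```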
Check the logs of the upwind-operator pod
Inspect the logs from the upwind-operator pod.
kubectl -n upwind logs --selector=app.kubernetes.io/name=upwind-operator | grep "ERROR"
You may encounter common issues with the operator such as:
- Errors related to token fetching
- Errors related to the cloud metadata service (IMDSv2)
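Note that when a selector is used, kubectl logs returns only the last 10 lines per pod unless --tail is set, so older errors may be missed. The sketch below counts error lines across the full log; the helper name count_error_lines is hypothetical, and the --previous variant is a suggestion for pods that have restarted.

```shell
# Hypothetical helper: print the number of log lines on stdin that contain "ERROR".
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_error_lines() {
  grep -c 'ERROR' || true
}

# Example usage (commented out; requires kubectl and cluster access):
# Full log of the current container instead of the default last 10 lines:
# kubectl -n upwind logs --selector=app.kubernetes.io/name=upwind-operator --tail=-1 | count_error_lines
# Log of the previous container instance, useful when the pod is crash-looping:
# kubectl -n upwind logs --selector=app.kubernetes.io/name=upwind-operator --previous --tail=-1 | grep "ERROR"
```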
The Kubernetes cluster is not appearing in the Upwind Runtime Map
Confirm that the agent and clusteragent custom resources have been installed
kubectl -n upwind get agent upwind
kubectl -n upwind get clusteragent upwind
The status of both custom resources should be "Installed".
Confirm that the Upwind Sensor DaemonSet is running
Check the status of the upwind-agent DaemonSet and the pods within the DaemonSet.
kubectl -n upwind get daemonset upwind-agent
kubectl -n upwind get pods --selector=app.kubernetes.io/name=agent
kubectl get nodes
Confirm that all pods in the DaemonSet are running, that the number of pods matches the number of nodes in the cluster, and that all containers in the pods are healthy.
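The comparison between pod and node counts can be scripted. The sketch below assumes that every node is expected to run a Sensor pod; if your cluster uses taints or node selectors that exclude some nodes, the desired count can legitimately be lower than the node count. The function name daemonset_covers_nodes is hypothetical.

```shell
# Hypothetical helper: compare DaemonSet pod counts with the node count.
# Arguments: ready pods, desired (scheduled) pods, total nodes.
daemonset_covers_nodes() {
  ready="$1"
  desired="$2"
  nodes="$3"
  if [ "${ready}" -ne "${desired}" ]; then
    echo "only ${ready} of ${desired} scheduled pods are ready"
    return 1
  fi
  if [ "${desired}" -ne "${nodes}" ]; then
    echo "DaemonSet targets ${desired} nodes but the cluster has ${nodes}"
    return 1
  fi
  echo "all ${nodes} nodes have a ready pod"
}

# Example usage (commented out; requires kubectl and cluster access):
# ready=$(kubectl -n upwind get daemonset upwind-agent -o jsonpath='{.status.numberReady}')
# desired=$(kubectl -n upwind get daemonset upwind-agent -o jsonpath='{.status.desiredNumberScheduled}')
# nodes=$(kubectl get nodes --no-headers | wc -l | tr -d ' ')
# daemonset_covers_nodes "${ready}" "${desired}" "${nodes}"
```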
Confirm that the Upwind Cluster Manager is running
Check the status of the upwind-cluster-agent deployment and the pod in the deployment.
kubectl -n upwind get deployment upwind-cluster-agent
kubectl -n upwind get pods --selector=app.kubernetes.io/name=cluster-agent
Upgrade
The Upwind Operator will occasionally need to be upgraded to take advantage of the latest features of the Upwind platform.
helm repo update upwind
helm upgrade upwind-operator upwind/upwind-operator \
--namespace upwind \
--reset-then-reuse-values
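The --reset-then-reuse-values flag was introduced in Helm 3.14.0, so older Helm clients reject the command above. The following sketch checks the client version before upgrading; the function name is hypothetical, and it assumes a sort implementation that supports -V (version sort).

```shell
# Hypothetical helper: succeed if the given Helm version is at least 3.14.0.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
helm_supports_reset_then_reuse() {
  version="${1#v}"          # strip a leading "v": v3.14.2 -> 3.14.2
  version="${version%%+*}"  # strip build metadata: 3.14.2+gabc123 -> 3.14.2
  lowest=$(printf '%s\n%s\n' "3.14.0" "${version}" | sort -V | head -n 1)
  [ "${lowest}" = "3.14.0" ]
}

# Example usage (commented out; requires helm):
# helm_supports_reset_then_reuse "$(helm version --short)" \
#   || echo "Helm is older than 3.14.0; upgrade Helm before running helm upgrade"
```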
Reinstall Troubleshooting
If you run into issues when reinstalling Upwind's Sensor after an initial uninstall, there are several things you can do to troubleshoot.
In general, Upwind's Sensor uninstall steps are safe to follow from any cluster state. If the commands run successfully, or return errors such as Error: uninstall: Release not loaded: upwind-operator: release: not found for the helm uninstall or Error from server (NotFound): agents.components.upwind.io "upwind" not found when deleting the custom resources, you should be able to reinstall without any issues.
If you have completed these steps and are still having trouble reinstalling the Upwind Sensor, you may have failed to delete custom resources before uninstalling. Please make sure to delete any custom resources and then proceed to the reinstall steps.
Uninstall
If you need to uninstall the Upwind cluster components from a cluster, first delete the custom resources (CRs) created by Upwind and then uninstall the Helm chart. This order avoids issues with Kubernetes finalizers and CRs not being removed.
- Delete the custom resources.
kubectl -n upwind delete agent upwind
kubectl -n upwind delete clusteragent upwind
- Confirm the custom resources have been deleted. Neither of these commands should return any resources.
kubectl -n upwind get agents
kubectl -n upwind get clusteragents
- Uninstall the upwind-operator helm chart.
helm uninstall upwind-operator \
--namespace upwind
- Delete the upwind namespace.
kubectl delete namespace upwind
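The steps above can be combined into one script that preserves the required order. This is a sketch under stated assumptions, not an official Upwind script: the function name uninstall_upwind is hypothetical, --ignore-not-found is used so that already-deleted resources are not treated as failures, and a missing Helm release is treated as already uninstalled.

```shell
# Hypothetical wrapper around the uninstall steps. It stops if deleting a
# custom resource fails, so the chart is never removed while CRs still exist.
uninstall_upwind() {
  ns="upwind"
  # --ignore-not-found makes the delete succeed when the resource is already gone.
  kubectl -n "${ns}" delete agent upwind --ignore-not-found || return 1
  kubectl -n "${ns}" delete clusteragent upwind --ignore-not-found || return 1
  # Uninstall the chart only if the release exists.
  if helm status upwind-operator --namespace "${ns}" >/dev/null 2>&1; then
    helm uninstall upwind-operator --namespace "${ns}" || return 1
  fi
  kubectl delete namespace "${ns}" --ignore-not-found
}

# Example usage (commented out; this removes the Upwind components):
# uninstall_upwind
```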
If you experience any issues, please contact us at our 24/7 console chat support for a quick resolution.
Upgrading Kubernetes From 1.24 to 1.25 With the Upwind Sensor Installed
By default, the Upwind Sensor comes with a PodSecurityPolicy named upwind-agent, which enables the Upwind Sensor pods to run as privileged pods. PodSecurityPolicy was removed in Kubernetes 1.25, so before upgrading to Kubernetes 1.25 you must remove the PodSecurityPolicy. In order to avoid disrupting the Upwind Sensor pods, however, you will either need to disable the PodSecurityPolicy admission controller, or have a different PodSecurityPolicy that allows the Upwind Sensor pods to run as privileged pods. Consult the documentation from your cloud provider about migrating away from PodSecurityPolicy.
Removing the PodSecurityPolicy
Run the following Helm command to remove the upwind-agent PodSecurityPolicy:
helm upgrade upwind-operator upwind/upwind-operator \
--namespace upwind \
--reset-then-reuse-values \
--set agent.values.podSecurityPolicy.enabled=false
If you do not have the PodSecurityPolicy admission controller enabled, or you have a different PodSecurityPolicy that will allow the Upwind Sensor pods to run as privileged, you can disable the PodSecurityPolicy by running this command at any time.
EKS
On EKS there is a PodSecurityPolicy named eks.privileged that allows pods to run as privileged by default. If you have not removed this PodSecurityPolicy, you can remove the upwind-agent PodSecurityPolicy by running the above command at any time.
Removing the PodSecurityPolicy from the Helm manifest after upgrading
In the event that a cluster is upgraded to Kubernetes 1.25 before the upwind-agent PodSecurityPolicy is removed, you will encounter errors upgrading the Upwind Sensor. You can resolve these errors by removing the PodSecurityPolicy definition from the Helm manifest history. Run these commands to update the Helm manifest history and remove the PodSecurityPolicy:
helm plugin install https://github.com/helm/helm-mapkubeapis
helm mapkubeapis upwind-agent --namespace upwind
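To decide whether this recovery path applies, you can check whether the cluster is already on Kubernetes 1.25 or later, and preview the plugin's changes with its --dry-run flag before applying them. The sketch below is illustrative: the function name psp_api_removed is hypothetical, it assumes a Kubernetes major version of 1, and the usage example assumes jq is installed.

```shell
# Hypothetical helper: succeed if the given Kubernetes minor version
# (for example "25" or "25+") is 25 or later, meaning PodSecurityPolicy
# is no longer served. Assumes the major version is 1.
psp_api_removed() {
  minor="${1%%[!0-9]*}"  # keep only the leading digits, so "25+" becomes "25"
  [ -n "${minor}" ] && [ "${minor}" -ge 25 ]
}

# Example usage (commented out; requires kubectl, jq, helm, and the plugin):
# minor=$(kubectl version -o json | jq -r '.serverVersion.minor')
# if psp_api_removed "${minor}"; then
#   # Preview the manifest changes without modifying the release history.
#   helm mapkubeapis upwind-agent --namespace upwind --dry-run
# fi
```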
Firewall Configuration
If you are deploying the Sensor, Cluster Manager, or Operator to a cluster or host with a firewall that blocks outbound connections to the public internet, add firewall rules that allow HTTPS communication to the following hosts on port 443:
agent.upwind.io
agentgrpc.upwind.io
auth.upwind.io
charts.upwind.io
get.upwind.io
registry.upwind.io
releases.upwind.io
prod-us-east-1-starport-layer-bucket.s3.us-east-1.amazonaws.com
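After adding the rules, you can verify connectivity from the affected host. The following sketch probes each host with curl; the function names are hypothetical. Because curl exits with status 0 on any HTTP response unless --fail is given, a 403 or 404 still counts as reachable. Endpoints that do not answer plain HTTPS requests (possibly agentgrpc.upwind.io, which is an assumption based on its name) may be reported as failures even when the firewall allows the traffic.

```shell
# Hosts copied from the list above.
UPWIND_HOSTS="agent.upwind.io agentgrpc.upwind.io auth.upwind.io charts.upwind.io get.upwind.io registry.upwind.io releases.upwind.io prod-us-east-1-starport-layer-bucket.s3.us-east-1.amazonaws.com"

# Hypothetical probe: succeed if an HTTPS request to the host gets any response.
probe_https() {
  curl --silent --output /dev/null --connect-timeout 5 --max-time 10 "https://$1/"
}

# Print one line per host and return non-zero if any host is unreachable.
check_upwind_hosts() {
  failed=0
  for host in ${UPWIND_HOSTS}; do
    if probe_https "${host}"; then
      echo "ok   ${host}"
    else
      echo "FAIL ${host}"
      failed=1
    fi
  done
  return "${failed}"
}

# Example usage (commented out; performs real network requests):
# check_upwind_hosts
```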