Troubleshooting

This page provides information to help troubleshoot issues with your Upwind cluster components installation.

Update Client Credentials

To update the upwind-operator chart with new client credentials, generate a new client ID and client secret in the console, and then run the following commands:

helm repo add upwind https://charts.upwind.io/ --force-update && helm repo update
helm upgrade upwind-operator upwind/upwind-operator \
--namespace upwind \
--reuse-values \
--set credentials.clientId="${UPWIND_CLIENT_ID}" \
--set credentials.clientSecret="${UPWIND_CLIENT_SECRET}"
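The upgrade above reads UPWIND_CLIENT_ID and UPWIND_CLIENT_SECRET from the shell environment, so an unset variable would pass an empty credential to the chart. A minimal sketch of a guard follows; require_env is a hypothetical helper name, not part of Helm or the Upwind tooling.

```shell
# Hypothetical helper: fail early if a required variable is unset or empty.
require_env() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required environment variable: $name" >&2
      return 1
    fi
  done
}
```

One way to use it is to run require_env UPWIND_CLIENT_ID UPWIND_CLIENT_SECRET before the helm upgrade command and continue only if it succeeds.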

The upwind-operator pod is not running or is crashing

Check the state of the upwind-operator helm chart

Run the following command from your terminal:

helm -n upwind status upwind-operator

Confirm that the upwind-operator chart STATUS is "deployed".
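For scripted checks, the STATUS line can be pulled out of the text output of helm status. The sketch below assumes the default text output, which contains a line of the form "STATUS: deployed"; release_status is a hypothetical helper name.

```shell
# Print the STATUS field from `helm status` text output read on stdin.
release_status() {
  awk -F': ' '$1 == "STATUS" { print $2; exit }'
}
```

For example, helm -n upwind status upwind-operator | release_status should print the release status on a single line, which can then be compared with "deployed".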

Check the logs of the upwind-operator pod

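Inspect the logs from the upwind-operator pod and filter for errors. When a selector is used, kubectl logs returns only the last 10 lines per pod by default, so pass --tail=-1 to search the full log.

kubectl -n upwind logs --selector=app.kubernetes.io/name=upwind-operator --tail=-1 | grep "ERROR"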

You may encounter common issues with the operator such as:

  1. Errors related to token fetching
  2. Errors related to the cloud metadata service (IMDSv2)
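If the pod is restarting, the current container may not yet have logged the failure. The following is a sketch of how to gather more context using standard kubectl subcommands; operator_diagnostics is a hypothetical function name, and the label selector is the one used above.

```shell
# Sketch: gather extra context for an operator pod that keeps restarting.
operator_diagnostics() {
  selector="app.kubernetes.io/name=upwind-operator"
  # Pod phase and restart counts.
  kubectl -n upwind get pods --selector="$selector"
  # Scheduling, image pull, and probe events.
  kubectl -n upwind describe pods --selector="$selector"
  # Logs from the previously terminated container, if there was one.
  kubectl -n upwind logs --selector="$selector" --previous --tail=-1 || true
}
```

The last command fails when no previous container exists, which is why its exit status is ignored.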

The Kubernetes cluster is not appearing in the Upwind Runtime Map

Confirm that the agent and clusteragent custom resources have been installed

kubectl -n upwind get agent upwind
kubectl -n upwind get clusteragent upwind

The status of both custom resources should be "Installed".

Confirm that the Upwind Sensor DaemonSet is running

Check the status of the upwind-agent DaemonSet and the pods within the DaemonSet.

kubectl -n upwind get daemonset upwind-agent
kubectl -n upwind get pods --selector=app.kubernetes.io/name=agent
kubectl get nodes

Confirm that all pods in the DaemonSet are running, that the number of pods matches the number of nodes in the cluster, and that all containers in each pod are healthy.
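The comparison can also be scripted against the DaemonSet status fields. This is a minimal sketch; agent_daemonset_ready is a hypothetical function name, and it compares the pods Kubernetes intends to schedule with the pods that report ready. The desired count may be lower than the node count if taints or node selectors exclude some nodes.

```shell
# Sketch: succeed only when every scheduled upwind-agent pod reports ready.
agent_daemonset_ready() {
  counts=$(kubectl -n upwind get daemonset upwind-agent \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}') || return 1
  desired=${counts% *}   # text before the space
  ready=${counts#* }     # text after the space
  echo "upwind-agent: ${ready}/${desired} pods ready"
  [ -n "$desired" ] && [ "$desired" = "$ready" ]
}
```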

Confirm that the Upwind Cluster Manager is running

Check the status of the upwind-cluster-agent deployment and the pod in the deployment.

kubectl -n upwind get deployment upwind-cluster-agent
kubectl -n upwind get pods --selector=app.kubernetes.io/name=cluster-agent
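To wait for the deployment instead of polling it by hand, kubectl wait can block until the Available condition is met. A minimal sketch follows; wait_for_cluster_agent is a hypothetical function name and the 60-second default timeout is an arbitrary choice.

```shell
# Sketch: block until the upwind-cluster-agent deployment reports Available,
# or fail once the timeout (first argument, default 60s) expires.
wait_for_cluster_agent() {
  kubectl -n upwind wait deployment/upwind-cluster-agent \
    --for=condition=Available --timeout="${1:-60s}"
}
```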

Upgrade

The Upwind Operator will occasionally need to be upgraded to take advantage of the latest features of the Upwind platform.

helm repo update upwind
helm -n upwind upgrade upwind-operator upwind/upwind-operator

Reinstall Troubleshooting

If you run into issues when reinstalling Upwind’s Sensor after an initial uninstall, there are several things you can do to troubleshoot.

In general, Upwind’s Sensor uninstall steps are safe to follow from any cluster state. You should be able to reinstall without any issues if the commands run successfully, or if they return errors such as the following.

From helm uninstall:

Error: uninstall: Release not loaded: upwind-operator: release: not found

When deleting the custom resources:

Error from server (NotFound): agents.components.upwind.io "upwind" not found

If you have completed these steps and are still having trouble reinstalling the Upwind Sensor, the custom resources may not have been deleted before the helm chart was uninstalled. Delete any remaining custom resources and then proceed to the reinstall steps.

Uninstall

If you need to uninstall the Upwind cluster components from a cluster, first delete the custom resources (CRs) created by Upwind and then uninstall the helm chart. This order avoids issues with Kubernetes finalizers and CRs not being removed.

  1. Delete the custom resources.
kubectl -n upwind delete agent upwind
kubectl -n upwind delete clusteragent upwind
  2. Confirm the custom resources have been deleted. Neither of these commands should return any resources.
kubectl -n upwind get agents
kubectl -n upwind get clusteragents
  3. Uninstall the upwind-operator helm chart.
helm -n upwind uninstall upwind-operator
  4. Delete the upwind namespace.
kubectl delete namespace upwind
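The steps above can be sketched as a single function that preserves the order. uninstall_upwind is a hypothetical name. The --ignore-not-found flag on kubectl delete makes the deletes safe to repeat, and the function stops at the first failure so that the chart is not removed while custom resources remain.

```shell
# Sketch of the uninstall order: custom resources first, then the chart,
# then the namespace.
uninstall_upwind() {
  kubectl -n upwind delete agent upwind --ignore-not-found || return 1
  kubectl -n upwind delete clusteragent upwind --ignore-not-found || return 1
  # Only uninstall the release if Helm still knows about it.
  if helm -n upwind status upwind-operator >/dev/null 2>&1; then
    helm -n upwind uninstall upwind-operator || return 1
  fi
  kubectl delete namespace upwind --ignore-not-found
}
```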

If you experience any issues, contact us through our 24/7 console chat support for a quick resolution.

Upgrading Kubernetes From 1.24 to 1.25 With the Upwind Sensor Installed

By default, the Upwind Sensor comes with a PodSecurityPolicy named upwind-agent, which enables the Upwind Sensor pods to run as privileged pods. PodSecurityPolicy was removed in Kubernetes 1.25, so you must remove this PodSecurityPolicy before upgrading to Kubernetes 1.25. To avoid disrupting the Upwind Sensor pods, however, you must either disable the PodSecurityPolicy admission controller or have a different PodSecurityPolicy that allows the Upwind Sensor pods to run as privileged pods. Consult your cloud provider’s documentation about migrating away from PodSecurityPolicy.
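When scripting an upgrade, it can help to decide from the version numbers whether the PodSecurityPolicy API is still available. The sketch below is one way to express that check; psp_api_removed is a hypothetical function name. Some providers report the minor version with a suffix such as "24+", so non-digit characters are stripped before comparing.

```shell
# Sketch: return success if the given Kubernetes version is 1.25 or later,
# i.e. a version in which the PodSecurityPolicy API no longer exists.
psp_api_removed() {
  major=$(printf '%s' "$1" | tr -cd '0-9')
  minor=$(printf '%s' "$2" | tr -cd '0-9')
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 25 ]; }
}
```

The major and minor values can be read from the serverVersion section of kubectl version -o json.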

Removing the PodSecurityPolicy

Run the following Helm command to remove the upwind-agent PodSecurityPolicy:

helm -n upwind upgrade upwind-operator upwind/upwind-operator --reuse-values --set agent.values.podSecurityPolicy.enabled=false

If you do not have the PodSecurityPolicy admission controller enabled, or you have a different PodSecurityPolicy that will allow the Upwind Sensor pods to run as privileged, you can disable the PodSecurityPolicy by running this command at any time.

EKS

On EKS there is a PodSecurityPolicy named eks.privileged that allows pods to run as privileged by default. If you have not removed this PodSecurityPolicy, you can remove the upwind-agent PodSecurityPolicy by running the above command at any time.
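Before removing the upwind-agent PodSecurityPolicy, you may want to confirm that a fallback policy is still present. A minimal sketch follows; psp_exists is a hypothetical helper name. PodSecurityPolicy is a cluster-scoped resource, so no namespace is passed.

```shell
# Sketch: succeed if a PodSecurityPolicy with the given name exists.
# On Kubernetes 1.25 and later the resource type is gone, so this fails.
psp_exists() {
  kubectl get podsecuritypolicy "$1" >/dev/null 2>&1
}
```

For example, psp_exists eks.privileged succeeds only while that policy is still defined in the cluster.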

Removing the PodSecurityPolicy from the Helm manifest after upgrading

If a cluster is upgraded to Kubernetes 1.25 before the upwind-agent PodSecurityPolicy is removed, you will encounter errors when upgrading the Upwind Sensor. You can resolve these errors by removing the PodSecurityPolicy definition from the Helm manifest history. Run these commands to install the mapkubeapis Helm plugin and use it to update the Helm manifest history:

helm plugin install https://github.com/helm/helm-mapkubeapis
helm mapkubeapis upwind-agent --namespace upwind
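To check the result, you can inspect the stored manifest for a remaining PodSecurityPolicy definition. This is a sketch only; manifest_has_psp is a hypothetical helper name that reads a rendered manifest on stdin, and it assumes the manifest uses the usual top-level "kind:" lines.

```shell
# Sketch: succeed if a manifest read on stdin still defines a PodSecurityPolicy.
manifest_has_psp() {
  grep -Eq '^kind:[[:space:]]*PodSecurityPolicy[[:space:]]*$'
}
```

For example, helm -n upwind get manifest upwind-agent | manifest_has_psp succeeds only if the PodSecurityPolicy is still part of the latest release manifest.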