Configuration

The Sensor and Cluster Manager are shipped with default configurations that should be suitable for most installations, but can be customized by setting values in the Upwind Operator chart.

Using Kubernetes Secret for Upwind Operator clientId and clientSecret

The Upwind Operator requires a clientId and clientSecret to authenticate with the Upwind API. These credentials can be provided as a Kubernetes Secret, which is then referenced in the Upwind Operator configuration.

The Secret convention is as follows:

clientId: $CLIENT_ID
clientSecret: $CLIENT_SECRET

For example, you can create it with the following command:

kubectl create secret generic --namespace upwind upwind-secret \
  --from-literal=clientId="$CLIENT_ID" \
  --from-literal=clientSecret="$CLIENT_SECRET"
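
If you manage resources declaratively, the same Secret can be sketched as a manifest. This is a minimal illustration that assumes the upwind namespace and the upwind-secret name used in the command above; the placeholder values are hypothetical and must be replaced with your own credentials.

```yaml
# Declarative equivalent of the kubectl command above (sketch).
apiVersion: v1
kind: Secret
metadata:
  name: upwind-secret
  namespace: upwind
type: Opaque
stringData:
  # Placeholders only; substitute your real credentials.
  clientId: "<your-client-id>"
  clientSecret: "<your-client-secret>"
```

Using stringData lets you supply the values as plain text; Kubernetes stores them base64-encoded under data.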

To reference the Secret in the Upwind Operator configuration, you must use the following configuration:

credentials:
  create: false
  name: upwind-secret

Enabling Scan Jobs

Helm Values

To enable the Scan Jobs deployment model for the Scanner, upgrade the Upwind Operator chart to disable the scanner as a sidecar container in the Sensor Daemonset and activate management of the jobs from the Cluster Manager. Ensure you have updated to the latest upwind-operator chart before enabling Scan Jobs.

agent:
  values:
    scanAgent:
      enabled: false
clusterAgent:
  values:
    scanJob:
      enabled: true

Refer to the Configuration Library for additional configuration options.

Tuning Scan Jobs

Scan job performance can be tuned using the following configuration options. Set them as environment variables at clusterAgent.values.scanJob.env or as args at clusterAgent.values.scanJob.extraArgs.

| Environment Variable | Arg | Description | Default |
| --- | --- | --- | --- |
| UPWIND_IMAGE_SCAN_JOBS_THRESHOLD | --image-scan-jobs-threshold | Scan jobs will not be scheduled if the number of active jobs exceeds this limit. | 5 |
| UPWIND_IMAGE_SCAN_PENDING_JOBS_THRESHOLD | --image-scan-pending-jobs-threshold | Scan jobs will not be scheduled if the number of pending jobs exceeds this limit, even if active jobs do not exceed image-scan-jobs-threshold. | 5 |
| UPWIND_IMAGE_SCAN_JOB_TAINT_EXCEPTIONS | --image-scan-job-taint-exceptions | Node taints where the scan job should not run, e.g. key=value:NoSchedule. | |
| UPWIND_SCANNER_MEM_INCREASE | --scanner-mem-increase | Scan job memory increase factor for retries in case of OOMKill. | 1.5 |
| UPWIND_SCANNER_MEM_LIMIT | --scanner-mem-limit | Scan job memory limit for retries, as a percentage of node capacity. | 75 |
| UPWIND_SCAN_REPROCESSING_INTERVAL | --scan-reprocessing-interval | Interval, in seconds, to reprocess all pods for scanning, if not scanned already. | 14400 |
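
As a rough sketch, the options above could be passed as args in the upwind-operator chart values as shown below. The numbers are illustrative rather than recommendations, and the --flag=value form is an assumption based on how other flags are written elsewhere on this page.

```yaml
clusterAgent:
  values:
    scanJob:
      enabled: true
      extraArgs:
        # Illustrative values only; tune for your cluster.
        - --image-scan-jobs-threshold=10
        - --image-scan-pending-jobs-threshold=5
        # Retry memory ceiling, as a percentage of node capacity.
        - --scanner-mem-limit=75
```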

Configure with Prometheus

Prometheus is a monitoring system commonly used with Kubernetes, which gathers metrics (e.g. CPU usage) from various sources and stores them. You can then query them with a tool like Grafana, which lets you display a graph of the CPU usage of your pods.

Monitoring of the sensors and cluster manager can be enabled by setting the following chart values in the upwind-operator chart:

agent:
  values:
    agent:
      metrics:
        enabled: true
    podMonitor:
      enabled: true
    scanAgent:
      metrics:
        enabled: true
clusterAgent:
  values:
    serviceMonitor:
      enabled: true

Note that the sensor pod is a host-networked pod and enabling metrics will require that the sensor be able to use ports 59090 (for the agent container metrics) and 59091 (for the scan agent container metrics). The metrics port can be configured via the agent.metrics.port and scanAgent.metrics.port agent chart parameters.
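
If the default ports conflict with other host-networked workloads, the port parameters mentioned above can be overridden. The following is a sketch only: the port numbers 59190 and 59191 are hypothetical, and the nesting under agent.values follows the chart values shown earlier.

```yaml
agent:
  values:
    agent:
      metrics:
        enabled: true
        # Hypothetical replacement for the default 59090.
        port: 59190
    scanAgent:
      metrics:
        enabled: true
        # Hypothetical replacement for the default 59091.
        port: 59191
```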

The Prometheus PodMonitor resource in Kubernetes is defined by the following:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: monitoring-upwind-agents
  labels:
    app.kubernetes.io/name: upwind-agent
spec:
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: agent-metrics
      scheme: http
      scrapeTimeout: 30s
  jobLabel: upwind-agent
  namespaceSelector:
    matchNames:
      - upwind
  selector:
    matchLabels:
      app.kubernetes.io/instance: upwind-agent
      app.kubernetes.io/name: agent

The Prometheus ServiceMonitor resource in Kubernetes is defined by the following:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: upwind-cluster-agent
spec:
  jobLabel: upwind-cluster-agent
  selector:
    matchExpressions:
      - key: app.kubernetes.io/name
        values:
          - cluster-agent
        operator: In
  namespaceSelector:
    matchNames:
      - upwind
  endpoints:
    - port: metrics
      interval: 30s

Purpose of the Cluster Manager and Sensor ClusterRole

Upwind creates and utilizes the upwind-cluster-agent ClusterRole, configured to monitor and audit key Kubernetes resources. This role is tailored for observing and reporting on the state and configuration of various Kubernetes objects, necessary for ensuring a comprehensive security and operational overview.

The upwind-cluster-agent ClusterRole is defined with the following rules:

Core API Group (apiGroups: [""]):
Resources: Services, namespaces, nodes, and pods.
Actions Allowed: ["watch", "list", "get"]

Apps API Group (apiGroups: ["apps"]):
Resources: Deployments, replicasets, daemonsets, and statefulsets.
Actions Allowed: ["watch", "list", "get"]

Networking API Group (apiGroups: ["networking.k8s.io"]):
Resources: Network policies and ingresses.
Actions Allowed: ["watch", "list", "get"]

Batch API Group (apiGroups: ["batch"]):
Resources: Jobs and cronjobs.
Actions Allowed: ["watch", "list", "get"]

CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
Resources: CustomResourceDefinitions.
Actions Allowed: ["watch", "list", "get"]
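
For readers who want to compare against what is installed in their cluster, the rules listed above would correspond to a ClusterRole manifest along these lines. This is a sketch reconstructed from the list, not the manifest rendered by the chart, which may contain additional metadata or rules depending on chart values.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: upwind-cluster-agent
rules:
  # Core API group
  - apiGroups: [""]
    resources: ["services", "namespaces", "nodes", "pods"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies", "ingresses"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["watch", "list", "get"]
```

Every rule grants only the read verbs watch, list, and get, so the role cannot create, update, or delete any resource.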

Additionally, Upwind creates and uses the upwind-agent ClusterRole with the following configuration:

Core API Group (apiGroups: [""]):
Resources: Services, namespaces, nodes, and pods.
Actions Allowed: ["watch", "list", "get"]

Apps API Group (apiGroups: ["apps"]):
Resources: Deployments, replicasets, daemonsets, and statefulsets.
Actions Allowed: ["watch", "list", "get"]

Batch API Group (apiGroups: ["batch"]):
Resources: Jobs and cronjobs.
Actions Allowed: ["watch", "list", "get"]

CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
Resources: CustomResourceDefinitions.
Actions Allowed: ["watch", "list", "get"]

The Upwind Cluster Manager and Sensor ClusterRoles are configured to access specific types of resources within designated API groups, focusing on services, namespaces, various workload controllers, and networking configurations. The verbs watch, list, and get provide read-only access, allowing Upwind to observe and report on resource states without modifying them. Both ClusterRoles adhere to the principle of least privilege by limiting their scope to specific resource types, reducing potential security risks.

Tuning the Sensor and Cluster Manager

Sensor pods use memory roughly in proportion to the level of network and process activity on the node where the Sensor runs; usage does not depend on the overall size of the Kubernetes cluster. Most of the memory the Sensor uses is dedicated to internal caches of data related to active network connections. Depending on your environment, it may be beneficial to adjust the size or expiration time of these caches by modifying agent.values.agent.config.connCache or agent.values.agent.config.connTrackCache in the Sensor DaemonSet configuration. Increasing the cache sizes allows the Sensor to monitor more concurrent network connections. Reducing the cache sizes or the expiration time can reduce memory usage.

The Cluster Manager will use memory roughly in proportion to the size of the cluster it is running in and the amount of network traffic in the cluster. For clusters larger than 50 nodes or with more than 1 GB/s of network traffic, it's recommended to increase the memory requests and limits for the Cluster Manager.

Proxy Configuration

All cluster components (the operator, cluster manager, sensor, and scanner) respect the HTTP_PROXY family of environment variables for egress communication. The Helm charts have a proxy object that can be configured:

proxy:
  enabled: true|false
  httpProxy: ""
  httpsProxy: ""
  noProxy: ""

httpProxy

Used as the proxy URL for HTTP requests unless overridden by noProxy.

httpsProxy

Used as the proxy URL for HTTPS requests unless overridden by noProxy. When not specified, the value from httpProxy is used.

noProxy

Specifies a string of comma-separated values identifying hosts that should be excluded from proxying. Each value is one of:

  • an IP address (1.2.3.4)
  • an IP address prefix in CIDR notation (1.2.3.0/24)
  • a domain name

An IP address prefix and domain name can also include a literal port number (1.2.3.4:80). A domain name matches that name and all subdomains.
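
Putting the three fields together, a filled-in proxy object might look like the sketch below. The proxy host, port, and exclusion entries are hypothetical examples chosen to show each kind of noProxy value; replace them with values that match your network.

```yaml
proxy:
  enabled: true
  # Hypothetical proxy endpoint.
  httpProxy: "http://proxy.example.internal:3128"
  # May be omitted, in which case the httpProxy value is used.
  httpsProxy: "http://proxy.example.internal:3128"
  # A CIDR prefix, a domain name (matches all subdomains),
  # and a domain name with a literal port.
  noProxy: "10.0.0.0/8,cluster.local,registry.example.internal:8443"
```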

Configuring TLS/mTLS

The sensor and cluster manager can be configured to use TLS/mTLS for communications between them.

The sensor and cluster manager pods expect their respective certificates and their associated keys to exist in a secret, and their corresponding CA certificates to exist in a config map. By default the sensor pod expects a secret named upwind-secrets-agent-certs for the sensor certificate and key, and a config map named upwind-config-agent-peer-ca-certs for the CA certificate. These names can be configured in the upwind operator chart. Similarly the cluster manager pod expects a secret named upwind-secrets-cluster-agent-certs for the cluster manager certificate and key, and a config map named upwind-config-cluster-agent-peer-ca-certs for the CA certificate.

tip

It can be convenient to use cert-manager and trust-manager (see the official docs for more) to manage the creation of the required secrets and config maps.


If cert-manager is enabled, a Certificate (a request for a signed certificate from a configured Issuer or ClusterIssuer) is created during installation of the upwind-operator chart, and cert-manager creates the TLS certificate secrets where the sensor and cluster manager pods expect to find them. In addition, trust-manager can be configured to create the CA certificate config maps where the sensor and cluster manager pods expect to find them once the secrets have been created.

note

When installing trust-manager, configure it to read source secrets from the upwind namespace by setting the value app.trust.namespace to upwind, i.e. --set app.trust.namespace=upwind.

Once cert-manager and trust-manager are set up to create the expected secrets and config maps, TLS/mTLS can be enabled by setting the following chart values in the upwind-operator chart. If the secrets and config maps were created manually instead, the certManager field can be omitted.

agent:
  values:
    tls:
      ca:
        enabled: true
    agent:
      extraArgs:
        - --in-cluster-tls=true
        - --ca-cert-enabled=true
clusterAgent:
  values:
    tls:
      certManager:
        enabled: true
        issuerRef:
          name: my-ca-issuer
      certificates:
        enabled: true
    extraArgs:
      - --self-signed=false

caution

While the sensor and cluster manager pods support automatic reloading of TLS certificates, automatic reloading of CA certificates is not supported. Therefore, the relevant component must be restarted to pick up changes to the CA certificate. For example, if the CA that signs the cluster manager certificates is changed, the sensors must be restarted to recognize the new CA before the cluster manager pods start using certificates signed by it.

Enabling response

To enable the response feature, verify that the --responses-enabled flag is set to true.

For example:

{
  "agent": {
    "values": {
      "agent": {
        "extraArgs": [
          ...
          "--responses-enabled=true",
          ...
        ]
      },
      ...
    }
  },
  "clusterAgent": {
    "values": {
      "extraArgs": [
        "--responses-enabled=true"
      ],
      ...
    }
  }
}

EKS Auto Mode

In EKS Auto Mode, the hop limit for IMDS is set to 1 and cannot be updated. This prevents most pods from accessing IMDS unless they are using host networking. In this case, you must provide the AWS account ID via a flag to the Upwind Operator and Upwind Cluster Manager components, since they cannot load the account ID from IMDS. Example values for the upwind-operator chart:

credentials:
  clientId: XXX
  clientSecret: XXX
extraArgs:
  - --cloud-account-id=XXXX
clusterAgent:
  values:
    extraArgs:
      - --cloud-account-id=XXXX

Google Autopilot Clusters

In Google Autopilot Clusters, additional configuration of the sensor is required to allow the pods to launch.

credentials:
  clientId: XXX
  clientSecret: XXX
agent:
  values:
    providers:
      gke:
        autopilot:
          enabled: true

API Catalog Configuration

There are a number of flags to control the data included in the API Catalog.

| Environment Variable | Flag | Description | Default |
| --- | --- | --- | --- |
| UPWIND_API_SEC_CATALOG | --api-sec-catalog=true | Whether to enable the API Catalog. | true |
| UPWIND_API_SEC_REQUESTS_SAMPLE_CNT | --api-sec-requests-sample-cnt=1000 | Maximum number of request samples to include in an API Catalog report. Set to 0 to disable request sampling entirely. | 1000 |
| UPWIND_API_SEC_REQUESTS_SAMPLE_SIZE_MAX | --api-sec-requests-sample-size-max=1024 | Maximum size of request body. Set to 0 to disable including request bodies. | 1024 |
| UPWIND_API_SEC_RESPONSE_SAMPLE_SIZE_MAX | --api-sec-response-sample-size-max=1024 | Maximum size of response body. Set to 0 to disable including response bodies. | 1024 |
| UPWIND_API_SEC_EXCLUDE_SENSITIVE_BODY | --api-sec-exclude-sensitive-body=false | Whether to mask the entire request or response body on a sensitive data match. | false |

These flags or environment variables can be added to the agent.values.extraArgs or agent.values.env lists in the upwind-operator chart.
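
As an illustration, the following values keep the API Catalog enabled while reducing the amount of sampled data it reports. The specific numbers are examples rather than recommendations; the flags and the agent.values.extraArgs location are taken from the table and text above.

```yaml
agent:
  values:
    extraArgs:
      # Report fewer request samples than the default of 1000.
      - --api-sec-requests-sample-cnt=100
      # A size of 0 disables including request and response bodies.
      - --api-sec-requests-sample-size-max=0
      - --api-sec-response-sample-size-max=0
      # Mask the whole body when sensitive data is matched.
      - --api-sec-exclude-sensitive-body=true
```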

gVisor Support

The Upwind Sensor includes support for monitoring applications running in gVisor sandboxes, providing enhanced security isolation for sensitive workloads. gVisor implements a userspace kernel that acts as a security boundary between containerized applications and the host system.

Learn more about gVisor support including configuration, architecture, and deployment options.

Istio Ambient Mesh Support

In order for the Upwind Sensor to accurately report network activity when Istio Ambient mesh is enabled, traffic must be monitored at layer 7 instead of layer 4. Set the --network-flow-sock and --network-flow-sock-all flags on the sensor. Example values for the upwind-operator chart with these flags set:

credentials:
  clientId: XXX
  clientSecret: XXX
agent:
  values:
    extraArgs:
      - --network-flow-sock
      - --network-flow-sock-all

Configuration Library

Below are the supported Helm values for Upwind components. Note that you may need to upgrade to the latest Helm chart release to use some of the values listed below.

# -- `[Optional]` Overrides application name.
nameOverride: ""

# -- `[Optional]` Overrides the full qualified application name.
fullnameOverride: ""

# Image.
# @see: <https://kubernetes.io/docs/concepts/containers/images/>
image:
  # -- `[Optional]` Specifies the image registry/repository.
  repository: public.ecr.aws/upwindsecurity/images/operator
  # -- `[Optional]` Specifies the image pull policy.
  pullPolicy: IfNotPresent
  # -- `[Optional]` Specifies the image tag. Defaults to `.Chart.AppVersion`.
  tag: ""
  # -- `[Optional]` Specifies the image pull secrets.
  # The Upwind Operator image is publicly available, but when self hosting images or using a proxy repository that
  # requires authentication the imagePullSecrets can be set.
  imagePullSecrets: []

# Service account.
# @see: <https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/>
serviceAccount:
  # -- `[Optional]` Specifies whether a Service Account should be created.
  create: true
  # -- `[Optional]` Specifies the annotations for the Service Account.
  annotations: {}
  # -- `[Optional]` Specifies the name of the Service Account. Defaults to fully qualified application name.
  name:

# -- `[Optional]` Specifies the Deployment annotations.
# @see: <https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/>
annotations: {}

# -- `[Optional]` Specifies the Pod annotations.
# @see: <https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/>
podAnnotations: {}

# `[Required]` Compute resources granted to the controller
resources:
  limits:
    cpu: 500m
    memory: 500Mi
  requests:
    cpu: 10m
    memory: 256Mi

# `[Required]` Deployment livenessProbe
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 15
  periodSeconds: 20
  successThreshold: 1
  timeoutSeconds: 1

# `[Required]` Deployment readinessProbe
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /readyz
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 5
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1

# -- `[Optional]` Specifies the region of the Upwind Endpoints. Defaults to `us`
region: # @schema type:[string, null]

# [Static Secret] Upwind Application Credentials (OAuth 2.0 Client Credentials).
credentials:
  # -- `[Optional]` Specifies whether a Secret should be created.
  create: true
  # -- `[Optional]` Specifies the name of the Secret. Defaults to fully qualified application name.
  # When create is set to false this will be used as a reference to an existing Secret to use.
  name: # @schema type:[string, null]
  # -- `[Required]` Specifies the application's Client ID.
  clientId: # @schema type:[string, null]
  # -- `[Required]` Specifies the application's Client Secret.
  clientSecret: # @schema type:[string, null]
  # -- `[Optional]` Specifies annotations for the Credentials Secret
  annotations: {}
  # -- `[Required]` Specifies the name of the Secret to be created.
  registrySecretName: # @schema type:[string, null]
  # -- `[Required]` Specifies the name of the Secret to be created.
  agentSecretName: # @schema type:[string, null]

# [Agent Component] Configuration settings specifically for the Agent Custom Resource
agent:
  # -- `[Optional]` Specifies whether or not this chart will create the Agent
  # component resource. Disable to manage this resource on your own.
  create: true
  # -- `[Required]` The name of the Upwind `Agent` resource to create. Defaults
  # to `upwind`. Final pod names will be `upwind-agent-...`.
  name: upwind
  # -- `[Optional]` Configures the reconciliation frequency of the Agent
  # resource. Default is 1h.
  interval: # @schema type:[string, null]
  # -- `[Optional]` If specified, pins the version of the Agent component to a
  # specific release. If unset, the Upwind Operator will automatically upgrade
  # releases as they become available.
  version: # @schema type:[string, null]
  # -- `[Optional]` if specified, these custom values will be added to the
  # Agent spec, and passed through to the Agent helm chart.
  values: {}

# [Cluster Agent Component] Configuration settings specifically for the
# ClusterAgent Custom Resource
clusterAgent:
  # -- `[Optional]` Specifies whether or not this chart will create the
  # ClusterAgent component resource. Disable to manage this resource on your
  # own.
  create: true
  # -- `[Required]` The name of the Upwind `ClusterAgent` resource to create.
  # Defaults to `upwind`. Final pod names will be `upwind-cluster-agent-...`.
  name: upwind
  # -- `[Optional]` Configures the reconciliation frequency of the ClusterAgent
  # resource. Default is 1h.
  interval: # @schema type:[string, null]
  # -- `[Optional]` If specified, pins the version of the ClusterAgent
  # component to a specific release. If unset, the Upwind Operator will
  # automatically upgrade releases as they become available.
  version: # @schema type:[string, null]
  # -- `[Optional]` if specified, these custom values will be added to the
  # ClusterAgent spec, and passed through to the Cluster Agent helm chart.
  values: {}

# Configuration settings specifically for the Operator itself
operator:
  # -- `[Optional]` Set the log level for the controller.
  logLevel: info
  extraVolumes: []
  extraVolumeMounts: []
  # -- `[Optional]` If specified, this controls the Oauth Endpoint that the
  # Operator will use to generate secret credentials for the components and for
  # image pulling.
  oauthEndpoint: # @schema type:[string, null]
  # -- `[Optional]` If specified, this controls the audience value used by the
  # Operator when generating credential tokens for the Agent and the Cluster
  # Agent.
  agentEndpoint: # @schema type:[string, null]
  # -- `[Optional]` If specified, this controls the audience value used by the
  # Operator when generating credential Image Pull Secrets used to pull the
  # Component images from the private registry.
  registryEndpoint: # @schema type:[string, null]
  # -- `[Optional]` If set to true, the Operator will use Helm to manage the
  # Agent and Cluster Agent components. Defaults to true.
  useHelm: true
  # -- `[Optional]` SecurityContext for the operator container
  securityContext:
    privileged: false
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    runAsUser: 65532
    runAsGroup: 65532
    capabilities:
      drop:
        - ALL

# -- `[Optional]` Deployment progressDeadlineSeconds.
progressDeadlineSeconds: # @schema type:[string, null]

# -- `[Optional]` The number of Upwind Secret Controllers to run.
replicas: # @schema type:[integer, null]

# -- `[Optional]` Deployment revisionHistoryLimit.
revisionHistoryLimit: # @schema type:[integer, null]

# -- `[Optional]` Deployment strategy.
strategy: # @schema type:[string, null]

# -- `[Optional]` Specifies the priorityClass to use. Defaults to unset.
priorityClassName:

# -- `[Optional]` Specifies the node selector.
# @see: <https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/>
nodeSelector: {}

# -- `[Optional]` Specifies the tolerations for nodes that have taints on them.
# @see: <https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/>
tolerations: []

# -- `[Optional]` Pod SecurityContext
securityContext:
  seccompProfile:
    type: RuntimeDefault

# `[Optional]` Allowed profiles that can be set for pod.
# Applies to Red Hat Openshift installation only and should be set according to `.securityContext.seccompProfile.type` value.
#
# @see: https://docs.redhat.com/en/documentation/openshift_container_platform/3.11/html-single/architecture/index#authorization-seccomp
openshift:
  scc:
    seccompProfiles:
      - runtime/default

# -- `[Required]` Pod affinity settings.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64
                - arm64
            - key: kubernetes.io/os
              operator: In
              values:
                - linux

# -- `[Required]` Additional command line arguments for the manager.
extraArgs:
  # Ensure that we are reconciling the custom Component resources
  - --enable-custom-resources=true

# -- `[Optional]` Additional environment variables for the manager.
extraEnv: []

# -- `[Optional]` Any additional labels that should be added to resources
# created by this chart.
extraLabels:
  # This label is defined externally to the helper functions to enable
  # overriding it with a different value in the EKS add-on chart.
  # The common value Release[dot]Service (using [dot] notation to avoid
  # validation issues) is not supported in EKS add-ons managed through
  # the AWS Marketplace. However, in Helm, the value is always Helm,
  # so it may be acceptable to simply use a hard-coded default value instead.
  app.kubernetes.io/managed-by: Helm

identities:
  # -- Names of ConfigMaps the cluster-agent will be allowed to read.
  configMaps:
    - aws-auth

# -- `[Optional]` Parameters configurable by parent charts.
global:
  # [Static Secret] Upwind Application Credentials (OAuth 2.0 Client Credentials).
  credentials:
    # -- `[Optional]` Specifies whether a Secret should be created.
    create: true
    # -- `[Required]` Specifies the application's Client ID.
    clientId: # @schema type:[string, null]
    # -- `[Required]` Specifies the application's Client Secret.
    clientSecret: # @schema type:[string, null]

# -- This adds proxy configuration for outbound http/https traffic from the components
proxy:
  enabled: false
  httpProxy: ""
  httpsProxy: ""
  noProxy: ""

# -- the names of additional agent custom resources managed by the operator.
extraAgentResources: []

clusterRole:
  # -- Enables permissions to read all resources.
  enableCustomResources: true