
Configuration

The Sensor and Cluster Manager are shipped with default configurations that should be suitable for most installations, but can be customized by setting values in the Upwind Operator chart.

Using Kubernetes Secret for Upwind Operator clientId and clientSecret

The Upwind Operator requires a clientId and clientSecret to authenticate with the Upwind API. These credentials can be provided as a Kubernetes Secret, which is then referenced in the Upwind Operator configuration.

The Secret must contain the following keys:

clientId: $CLIENT_ID
clientSecret: $CLIENT_SECRET

For example, you can create it with the following command:

kubectl create secret generic --namespace upwind upwind-secret \
  --from-literal=clientId="$CLIENT_ID" \
  --from-literal=clientSecret="$CLIENT_SECRET"
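
If you manage cluster resources declaratively, the same Secret can be written as a manifest. The following is a minimal sketch that assumes the upwind namespace and the upwind-secret name used in the command above; the two values are placeholders to replace with your own credentials.

```yaml
# Declarative equivalent of the kubectl command above (a sketch).
apiVersion: v1
kind: Secret
metadata:
  name: upwind-secret
  namespace: upwind
type: Opaque
stringData:
  # Placeholders: substitute your own credentials.
  clientId: "<your-client-id>"
  clientSecret: "<your-client-secret>"
```

Using stringData lets you supply plain-text values; Kubernetes stores them base64-encoded under data.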

To reference the Secret in the Upwind Operator configuration, you must use the following configuration:

credentials:
  create: false
  name: upwind-secret

Enabling Scan Jobs

Helm Values

To enable the Scan Jobs deployment model for the Scanner, upgrade the Upwind Operator chart to disable the scanner as a sidecar container in the Sensor Daemonset and activate management of the jobs from the Cluster Manager. Ensure you have updated to the latest upwind-operator chart before enabling Scan Jobs.

agent:
  values:
    scanAgent:
      enabled: false
clusterAgent:
  values:
    scanJob:
      enabled: true

Refer to the Configuration Library for additional configuration options.

Tuning Scan Jobs

Scan job performance can be tuned using the following configuration options. Set them as environment variables at clusterAgent.values.scanJob.env or as arguments at clusterAgent.values.scanJob.extraArgs.

| Environment Variable | Arg | Description | Default |
| --- | --- | --- | --- |
| UPWIND_IMAGE_SCAN_JOBS_THRESHOLD | --image-scan-jobs-threshold | Scan jobs will not be scheduled if the number of active jobs exceeds this limit. | 5 |
| UPWIND_IMAGE_SCAN_PENDING_JOBS_THRESHOLD | --image-scan-pending-jobs-threshold | Scan jobs will not be scheduled if the number of pending jobs exceeds this limit, even if active jobs do not exceed image-scan-jobs-threshold. | 5 |
| UPWIND_IMAGE_SCAN_JOB_TAINT_EXCEPTIONS | --image-scan-job-taint-exceptions | Node taints where the scan job should not run, e.g. key=value:NoSchedule. | |
| UPWIND_SCANNER_MEM_INCREASE | --scanner-mem-increase | Scan job memory increase factor for retries in case of OOMKill. | 1.5 |
| UPWIND_SCANNER_MEM_LIMIT | --scanner-mem-limit | Scan job memory limit for retries, as a percentage of node capacity. | 75 |
| UPWIND_SCAN_REPROCESSING_INTERVAL | --scan-reprocessing-interval | Interval, in seconds, to reprocess all pods for scanning, if not scanned already. | 14400 |
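
As an illustration, the options above could be passed as arguments in the chart values. This is a sketch only: the numbers are arbitrary examples rather than recommendations, and the --flag=value form is an assumption based on how flags are written elsewhere on this page.

```yaml
clusterAgent:
  values:
    scanJob:
      enabled: true
      extraArgs:
        # Allow up to 10 active scan jobs (default 5).
        - --image-scan-jobs-threshold=10
        # Stop scheduling once 3 jobs are pending (default 5).
        - --image-scan-pending-jobs-threshold=3
        # Reprocess unscanned pods every 2 hours (default 14400 seconds).
        - --scan-reprocessing-interval=7200
```

Raising the active-job threshold speeds up initial scanning of a large cluster at the cost of more concurrent resource usage on the nodes.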

Configure with Prometheus

Prometheus is a monitoring system commonly used with Kubernetes, which gathers metrics (e.g. CPU usage) from various sources and stores them. You can then query them with a tool like Grafana, which lets you display a graph of the CPU usage of your pods.

Monitoring of the sensors and cluster manager can be enabled by setting the following chart values in the upwind-operator chart:

agent:
  values:
    agent:
      metrics:
        enabled: true
    podMonitor:
      enabled: true
    scanAgent:
      metrics:
        enabled: true
clusterAgent:
  values:
    serviceMonitor:
      enabled: true

Note that the sensor pod is a host-networked pod and enabling metrics will require that the sensor be able to use ports 59090 (for the agent container metrics) and 59091 (for the scan agent container metrics). The metrics port can be configured via the agent.metrics.port and scanAgent.metrics.port agent chart parameters.
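
If the default ports conflict with another host-networked workload, the ports can be overridden. The sketch below assumes the port parameters sit next to the enabled flags under agent.values; the port numbers are arbitrary examples.

```yaml
agent:
  values:
    agent:
      metrics:
        enabled: true
        port: 59190   # example value; default is 59090
    scanAgent:
      metrics:
        enabled: true
        port: 59191   # example value; default is 59091
```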

The Prometheus PodMonitor resource in Kubernetes is defined by the following:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: monitoring-upwind-agents
  labels:
    app.kubernetes.io/name: upwind-agent
spec:
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: agent-metrics
      scheme: http
      scrapeTimeout: 30s
  jobLabel: upwind-agent
  namespaceSelector:
    matchNames:
      - upwind
  selector:
    matchLabels:
      app.kubernetes.io/instance: upwind-agent
      app.kubernetes.io/name: agent

The Prometheus ServiceMonitor resource in Kubernetes is defined by the following:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: upwind-cluster-agent
spec:
  jobLabel: upwind-cluster-agent
  selector:
    matchExpressions:
      - key: app.kubernetes.io/name
        values:
          - cluster-agent
        operator: In
  namespaceSelector:
    matchNames:
      - upwind
  endpoints:
    - port: metrics
      interval: 30s

Purpose of the Cluster Manager and Sensor ClusterRole

Upwind creates and utilizes the upwind-cluster-agent ClusterRole, configured to monitor and audit key Kubernetes resources. This role is tailored for observing and reporting on the state and configuration of various Kubernetes objects, necessary for ensuring a comprehensive security and operational overview.

The upwind-cluster-agent ClusterRole is defined with the following rules:

Core API Group (apiGroups: [""]):
Resources: Services, namespaces, nodes, and pods.
Actions Allowed: ["watch", "list", "get"]

Apps API Group (apiGroups: ["apps"]):
Resources: Deployments, replicasets, daemonsets, and statefulsets.
Actions Allowed: ["watch", "list", "get"]

Networking API Group (apiGroups: ["networking.k8s.io"]):
Resources: Network policies and ingresses.
Actions Allowed: ["watch", "list", "get"]

Batch API Group (apiGroups: ["batch"]):
Resources: Jobs and cronjobs.
Actions Allowed: ["watch", "list", "get"]

CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
Resources: CustomResourceDefinitions.
Actions Allowed: ["watch", "list", "get"]
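
For reference, the rules listed above correspond roughly to the following ClusterRole manifest. This is an illustrative reconstruction from the list, not a copy of what the chart renders; the actual manifest may carry additional labels or metadata.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: upwind-cluster-agent
rules:
  # Core API group
  - apiGroups: [""]
    resources: ["services", "namespaces", "nodes", "pods"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies", "ingresses"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["watch", "list", "get"]
```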

Additionally, Upwind creates and uses the upwind-agent ClusterRole with the following configuration:

Core API Group (apiGroups: [""]):
Resources: Services, namespaces, nodes, and pods.
Actions Allowed: ["watch", "list", "get"]

Apps API Group (apiGroups: ["apps"]):
Resources: Deployments, replicasets, daemonsets, and statefulsets.
Actions Allowed: ["watch", "list", "get"]

Batch API Group (apiGroups: ["batch"]):
Resources: Jobs and cronjobs.
Actions Allowed: ["watch", "list", "get"]

CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
Resources: CustomResourceDefinitions.
Actions Allowed: ["watch", "list", "get"]

The Upwind Cluster Manager and Sensor ClusterRoles are configured to access specific types of resources within designated API groups, focusing on services, namespaces, various workload controllers, and (for the Cluster Manager) networking configurations. The verbs watch, list, and get provide read-only access, allowing Upwind to observe and report on resource states without modifying them. Each ClusterRole adheres to the principle of least privilege by limiting its scope to specific resource types, reducing potential security risks.

Tuning the Sensor and Cluster Manager

The Sensor pods use memory roughly in proportion to the level of network and process activity on the node the Sensor is running on; memory usage does not depend on the overall size of the Kubernetes cluster. Most of the memory the Sensor uses is dedicated to internal caches of data related to active network connections. Depending on your environment, it may be beneficial to adjust the size or expiration time of these caches by modifying agent.values.agent.config.connCache or agent.values.agent.config.connTrackCache in the Sensor daemonset configuration. Increasing the cache sizes allows the Sensor to monitor more concurrent network connections. Reducing the size of the caches or the expiration time can reduce memory usage.
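
For example, on nodes with modest network activity you might trade connection-tracking capacity for lower memory usage. The keys below come from the Configuration Library; the values are illustrative only and should be tuned for your workload.

```yaml
agent:
  values:
    agent:
      config:
        connCache:
          size: 15000    # default 30000
          expiry: 3600   # seconds; default 7260
        connTrackCache:
          size: 15000    # default 30000
          expiry: 3600   # seconds; default 7260
```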

The Cluster Manager will use memory roughly in proportion to the size of the cluster it is running in and the amount of network traffic in the cluster. For clusters larger than 50 nodes or with more than 1 GB/s of network traffic, it's recommended to increase the memory requests and limits for the Cluster Manager.
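
A sketch of such an override is shown below. The 4Gi figure is an arbitrary example rather than a sizing recommendation; the CPU values are the chart defaults listed in the Configuration Library.

```yaml
clusterAgent:
  values:
    resources:
      requests:
        cpu: 1000m
        memory: 4Gi   # example value; default is 2Gi
      limits:
        cpu: 1500m
        memory: 4Gi   # example value; default is 2Gi
```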

Proxy Configuration

All cluster components (the operator, cluster manager, sensor, and scanner) respect the HTTP_PROXY family of environment variables for egress communication. The Helm charts have a proxy object that can be configured:

proxy:
  enabled: true|false
  httpProxy: ""
  httpsProxy: ""
  noProxy: ""

httpProxy

Used as the proxy URL for HTTP requests unless overridden by noProxy.

httpsProxy

Used as the proxy URL for HTTPS requests unless overridden by noProxy. When not specified, the value from httpProxy is used.

noProxy

A comma-separated list of hosts that should be excluded from proxying. Each value is one of:

  • an IP address (1.2.3.4)
  • an IP address prefix in CIDR notation (1.2.3.0/24)
  • a domain name

An IP address prefix and domain name can also include a literal port number (1.2.3.4:80). A domain name matches that name and all subdomains.
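
Putting the three fields together, a filled-in proxy object might look like the following sketch. The proxy URL and the excluded ranges are hypothetical placeholders; substitute the values for your own network.

```yaml
proxy:
  enabled: true
  # Hypothetical proxy endpoint.
  httpProxy: "http://proxy.example.internal:3128"
  # Omit httpsProxy to fall back to the httpProxy value.
  httpsProxy: "http://proxy.example.internal:3128"
  # Bypass the proxy for a private CIDR range, in-cluster names, and one IP.
  noProxy: "10.0.0.0/8,cluster.local,169.254.169.254"
```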

Configuring TLS/mTLS

The sensor and cluster manager can be configured to use TLS/mTLS for communications between them.

The sensor and cluster manager pods expect their respective certificates and associated keys to exist in a secret, and their corresponding CA certificates to exist in a config map. By default, the sensor pod expects a secret named upwind-secrets-agent-certs for the sensor certificate and key, and a config map named upwind-config-agent-peer-ca-certs for the CA certificate. Similarly, the cluster manager pod expects a secret named upwind-secrets-cluster-agent-certs for the cluster manager certificate and key, and a config map named upwind-config-cluster-agent-peer-ca-certs for the CA certificate. These names can be configured in the upwind-operator chart.

tip

It can be convenient to use cert-manager and trust-manager (see the official docs for more) to manage the creation of the required secrets and config maps.


If cert-manager is enabled, a Certificate resource is created during the installation of the upwind-operator chart. A Certificate represents a request to obtain a signed certificate from a configured Issuer or ClusterIssuer. cert-manager then creates the secrets for the TLS certificates where the sensor and cluster manager pods expect to find them. In addition, trust-manager can be configured to create the config maps for the CA certificates, where the sensor and cluster manager pods expect to find them, once the secrets have been created.

note

When installing trust-manager it should be configured to read source secrets from the upwind namespace by setting the value app.trust.namespace to upwind i.e. --set app.trust.namespace=upwind.
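
The same setting can be kept in a values file for the trust-manager chart instead of being passed with --set. The snippet below is simply the nested YAML form of app.trust.namespace=upwind; how you name and supply the file is up to you.

```yaml
# Values for the trust-manager chart (not the upwind-operator chart).
app:
  trust:
    namespace: upwind
```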

Once cert-manager and trust-manager are set up to create the expected secrets and config maps, TLS/mTLS can be enabled by setting the following chart values in the upwind-operator chart. If the secrets and config maps have been created manually instead, the certManager field can be omitted.

agent:
  values:
    tls:
      ca:
        enabled: true
    agent:
      extraArgs:
        - --in-cluster-tls=true
        - --ca-cert-enabled=true
clusterAgent:
  values:
    tls:
      certManager:
        enabled: true
        issuerRef:
          name: my-ca-issuer
      certificates:
        enabled: true
    extraArgs:
      - --self-signed=false

caution

While the sensor and cluster manager pods support automatic reloading of TLS certificates, automatic reloading of CA certificates is not supported. Therefore, the relevant component must be restarted to pick up changes to the CA certificate. For example, if the CA that signs the cluster manager certificates is changed, the sensors must be restarted to recognize the new CA before the cluster manager pods start using certificates signed by it.

EKS Auto Mode

In EKS Auto Mode, the hop limit for IMDS is set to 1 and cannot be updated. This prevents most pods from accessing IMDS unless they use host networking. In this case, you must provide the AWS account ID via a flag to the Upwind Operator and Upwind Cluster Manager components, since they cannot load the account ID from IMDS. Example values for the upwind-operator chart:

credentials:
  clientId: XXX
  clientSecret: XXX
extraArgs:
  - --cloud-account-id=XXXX
clusterAgent:
  values:
    extraArgs:
      - --cloud-account-id=XXXX

API Catalog Configuration

There are a number of flags to control the data included in the API Catalog.

| Environment Variable | Flag | Description | Default |
| --- | --- | --- | --- |
| UPWIND_API_SEC_CATALOG | --api-sec-catalog=true | Whether to enable the API Catalog. | true |
| UPWIND_API_SEC_REQUESTS_SAMPLE_CNT | --api-sec-requests-sample-cnt=1000 | Maximum number of request samples to include in an API Catalog report. Set to 0 to disable request sampling entirely. | 1000 |
| UPWIND_API_SEC_REQUESTS_SAMPLE_SIZE_MAX | --api-sec-requests-sample-size-max=1024 | Maximum size of request body. Set to 0 to disable including request bodies. | 1024 |
| UPWIND_API_SEC_RESPONSE_SAMPLE_SIZE_MAX | --api-sec-response-sample-size-max=1024 | Maximum size of response body. Set to 0 to disable including response bodies. | 1024 |
| UPWIND_API_SEC_EXCLUDE_SENSITIVE_BODY | --api-sec-exclude-sensitive-body=false | Whether to mask the entire request or response body on a sensitive data match. | false |

These flags or environment values can be added to the agent.values.extraArgs or agent.values.env lists in the upwind-operator chart.
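
As a sketch, the following values would reduce the number of sampled requests and mask bodies that match sensitive data. The flags come from the table above; the chosen values are illustrative.

```yaml
agent:
  values:
    extraArgs:
      # Keep at most 100 request samples per report (default 1000).
      - --api-sec-requests-sample-cnt=100
      # Mask the whole body when sensitive data is detected (default false).
      - --api-sec-exclude-sensitive-body=true
```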

Configuration Library

This is a list of selected values for the upwind-operator chart.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| agent.values | object | {"tolerations":[{"operator":"Exists"}]} | Custom values that will be added to the Sensor. |
| agent.values.env | list | [] | Additional common environment variables added to the Upwind Sensor pod. |
| agent.values.extraArgs | list | [] | Additional command line arguments added to containers in the Upwind Sensor pod. |
| agent.values.agent.config.connCache.expiry | int | 7260 | Expiration duration from connection cache in seconds. |
| agent.values.agent.config.connCache.size | int | 30000 | Maximum connection cache capacity. |
| agent.values.agent.config.connTrackCache.expiry | int | 7260 | Expiration duration from connection cache in seconds. |
| agent.values.agent.config.connTrackCache.size | int | 30000 | Maximum conntrack cache capacity. |
| agent.values.agent.config.logLevel | string | "INFO" | Logging level for the network agent container. |
| agent.values.agent.resources | object | {"limits":{"cpu":"500m","memory":"1000Mi"},"requests":{"cpu":"50m","memory":"100Mi"}} | Specifies the resource requests and limits for the Upwind Sensor network agent container. |
| agent.values.networkAggregation.enabled | bool | true | Enables aggregation of network reports in the Cluster Manager. |
| agent.values.podSecurityPolicy.enabled | bool | true | Enables creation of a PodSecurityPolicy for the sensor. Must be set to false before upgrading a cluster from Kubernetes 1.24 to Kubernetes 1.25. |
| agent.values.scanAgent.config.logLevel | string | "INFO" | Logging level for the scanning agent container. |
| agent.values.scanAgent.enabled | bool | true | Whether to enable the scanning agent container. |
| agent.values.scanAgent.resources | object | {"limits":{"cpu":"500m","memory":"500Mi"},"requests":{"cpu":"50m","memory":"50Mi"}} | Specifies the resource requests and limits for the Upwind Sensor scan agent container. |
| agent.values.scanHost.enabled | bool | false | Whether to enable the CronJob to scan the node filesystem. |
| agent.values.scanHost.resources | object | {"limits":{"cpu":"500m","memory":"1000Mi"},"requests":{"cpu":"50m","memory":"250Mi"}} | Specifies the resource requests and limits for the CronJob. |
| agent.values.scanHost.schedule | string | "38 * * * *" | Schedule for running the CronJob. |
| agent.values.tolerations | list | [{"operator":"Exists"}] | Specifies the tolerations for the Upwind Sensor daemonset. |
| agent.version | string | nil | Pins the version of the Sensor component to a specific release. |
| clusterAgent.values | object | {} | Custom values that will be added to the ClusterAgent spec. |
| clusterAgent.values.env | object | {} | Custom environment variables to add to the Cluster Manager pod. |
| clusterAgent.values.resources | object | {"limits":{"cpu":"1500m","memory":"2Gi"},"requests":{"cpu":"1000m","memory":"2Gi"}} | Specifies the resource requests and limits for the Cluster Manager. |
| clusterAgent.values.scanJob.enabled | bool | false | Whether to enable Scan Jobs. |
| clusterAgent.values.scanJob.scanner.appVersion | string | 0.116.1 | Version of the Scanner to use for Scan Jobs. |
| clusterAgent.values.scanJob.scanner.extraArgs | list | [] | Additional arguments added to the scan job pods. |
| clusterAgent.values.scanJob.scanner.env | list | [] | Additional environment variables added to the scan jobs. |
| clusterAgent.values.scanJob.scanner.resources | object | {"limits":{"cpu":"500m","memory":"500Mi"},"requests":{"cpu":"50m","memory":"50Mi"}} | The resource requests and limits for the scan job container. |
| clusterAgent.values.scanJob.scanner.containerRuntime.name | string | containerd | [Required] Container runtime to use for scanning: containerd, docker, or crio. |
| clusterAgent.values.scanJob.scanner.containerRuntime.containerd | object | {"address":"/run/containerd/containerd.sock","root":"/var/lib/containerd"} | Containerd configuration. |
| clusterAgent.values.scanJob.scanner.containerRuntime.docker.root | string | /var/lib/docker | Root directory for docker metadata. |
| clusterAgent.values.scanJob.scanner.containerRuntime.crio | object | {"address":"/run/crio/crio.sock","root":"/var/lib/containers"} | CRIO configuration. |
| clusterAgent.values.scanJob.scanner.activeDeadlineSeconds | int | 3600 | Scan job duration, in seconds, for which it can be active. |
| clusterAgent.values.scanJob.scanner.ttlSecondsAfterFinished | int | 120 | TTL for finished scan jobs, in seconds. |
| clusterAgent.values.scanJob.scanner.securityContext | object | {"privileged":"true"} | Security context for the scan job. |
| clusterAgent.version | string | nil | Pins the version of the ClusterAgent component. |
| credentials.clientId | string | nil | [Required] Specifies the application's Client ID. |
| credentials.clientSecret | string | nil | [Required] Specifies the application's Client Secret. |
| operator.logLevel | string | "info" | Set the log level for the Upwind Operator. |
| resources.limits.cpu | string | "500m" | Sets CPU limits for the Upwind Operator. |
| resources.limits.memory | string | "256Mi" | Sets memory limits for the Upwind Operator. |
| resources.requests.cpu | string | "10m" | Sets CPU requests for the Upwind Operator. |
| resources.requests.memory | string | "64Mi" | Sets memory requests for the Upwind Operator. |