Configuration
The Sensor and Cluster Manager are shipped with default configurations that should be suitable for most installations, but can be customized by setting values in the Upwind Operator chart.
Using Kubernetes Secret for Upwind Operator clientId and clientSecret
The Upwind Operator requires a clientId and clientSecret to authenticate with the Upwind API. These credentials can be provided as a Kubernetes Secret, which is then referenced in the Upwind Operator configuration.
The Secret must contain the following two keys:

    clientId=$CLIENT_ID
    clientSecret=$CLIENT_SECRET
For example, you can create it with the following command:

    kubectl create secret generic --namespace upwind upwind-secret \
      --from-literal=clientId="$CLIENT_ID" \
      --from-literal=clientSecret="$CLIENT_SECRET"
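If you prefer a declarative workflow, the same Secret could be written as a manifest and applied with kubectl apply. This is a minimal sketch: the name, namespace, and key names come from the command above, while the placeholder values are assumptions you must replace with your own credentials.

```yaml
# Sketch of a Secret equivalent to the kubectl command above.
# stringData accepts plain-text values; Kubernetes stores them base64-encoded.
apiVersion: v1
kind: Secret
metadata:
  name: upwind-secret
  namespace: upwind
type: Opaque
stringData:
  clientId: "<your-client-id>"          # placeholder, not a real value
  clientSecret: "<your-client-secret>"  # placeholder, not a real value
```

Avoid committing a manifest that contains real credentials to version control.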
To reference the Secret in the Upwind Operator configuration, set the following chart values:

    credentials:
      create: false
      name: upwind-secret
Configure with Prometheus
Prometheus is a monitoring system commonly used with Kubernetes, which gathers metrics (e.g. CPU usage) from various sources and stores them. You can then query them with a tool like Grafana, which lets you display a graph of the CPU usage of your pods.
Monitoring of the sensors and cluster manager can be enabled by setting the following chart values in the upwind-operator chart:

    agent:
      values:
        agent:
          metrics:
            enabled: true
        podMonitor:
          enabled: true
        scanAgent:
          metrics:
            enabled: true
    clusterAgent:
      values:
        serviceMonitor:
          enabled: true
Note that the Sensor pod is host-networked, so enabling metrics requires that ports 59090 (agent container metrics) and 59091 (scan agent container metrics) be available to the Sensor on the host. The metrics ports can be changed with the agent.metrics.port and scanAgent.metrics.port agent chart parameters.
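As an illustration, if the default ports are already in use on your nodes, the override might look like the sketch below. It assumes that the agent chart parameters are set under agent.values in the upwind-operator chart, as with the other Sensor settings on this page. The port numbers 59190 and 59191 are hypothetical examples.

```yaml
# Sketch: moving the Sensor metrics endpoints to different host ports.
# Assumes agent.metrics.port and scanAgent.metrics.port nest under agent.values.
agent:
  values:
    agent:
      metrics:
        enabled: true
        port: 59190   # hypothetical; default is 59090
    scanAgent:
      metrics:
        enabled: true
        port: 59191   # hypothetical; default is 59091
```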
The Prometheus PodMonitor resource in Kubernetes is defined by the following:

    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: monitoring-upwind-agents
      labels:
        app.kubernetes.io/name: upwind-agent
    spec:
      podMetricsEndpoints:
        - honorLabels: true
          path: /metrics
          port: agent-metrics
          scheme: http
          scrapeTimeout: 30s
      jobLabel: upwind-agent
      namespaceSelector:
        matchNames:
          - upwind
      selector:
        matchLabels:
          app.kubernetes.io/instance: upwind-agent
          app.kubernetes.io/name: agent
The Prometheus ServiceMonitor resource in Kubernetes is defined by the following:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: upwind-cluster-agent
    spec:
      jobLabel: upwind-cluster-agent
      selector:
        matchExpressions:
          - key: app.kubernetes.io/name
            values:
              - cluster-agent
            operator: In
      namespaceSelector:
        matchNames:
          - upwind
      endpoints:
        - port: metrics
          interval: 30s
Purpose of the Cluster Manager and Sensor ClusterRole
Upwind creates and uses the upwind-cluster-agent ClusterRole to monitor and audit key Kubernetes resources. The role is limited to observing and reporting on the state and configuration of Kubernetes objects, which Upwind needs in order to provide a complete security and operational overview.
The upwind-cluster-agent ClusterRole is defined with the following rules:
- Core API Group (apiGroups: [""]):
  - Resources: services, namespaces, nodes, and pods.
  - Actions Allowed: ["watch", "list", "get"]
- Apps API Group (apiGroups: ["apps"]):
  - Resources: deployments, replicasets, daemonsets, and statefulsets.
  - Actions Allowed: ["watch", "list", "get"]
- Networking API Group (apiGroups: ["networking.k8s.io"]):
  - Resources: network policies and ingresses.
  - Actions Allowed: ["watch", "list", "get"]
- Batch API Group (apiGroups: ["batch"]):
  - Resources: jobs and cronjobs.
  - Actions Allowed: ["watch", "list", "get"]
- CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
  - Resources: CustomResourceDefinitions.
  - Actions Allowed: ["watch", "list", "get"]
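For readers who want to review these permissions in manifest form, the rules listed above for upwind-cluster-agent would correspond to a ClusterRole roughly like the sketch below. It is reconstructed from the list, using the standard lowercase plural resource names, and is not copied from the chart. The manifest that Upwind actually installs may differ in metadata or ordering.

```yaml
# Sketch reconstructed from the rule list above; not the chart's own manifest.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: upwind-cluster-agent
rules:
  - apiGroups: [""]                      # core API group
    resources: ["services", "namespaces", "nodes", "pods"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies", "ingresses"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["watch", "list", "get"]
```

Every rule uses only read verbs, so no rule permits create, update, patch, or delete.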
Additionally, Upwind creates and uses the upwind-agent ClusterRole with the following rules:
- Core API Group (apiGroups: [""]):
  - Resources: services, namespaces, nodes, and pods.
  - Actions Allowed: ["watch", "list", "get"]
- Apps API Group (apiGroups: ["apps"]):
  - Resources: deployments, replicasets, daemonsets, and statefulsets.
  - Actions Allowed: ["watch", "list", "get"]
- Batch API Group (apiGroups: ["batch"]):
  - Resources: jobs and cronjobs.
  - Actions Allowed: ["watch", "list", "get"]
- CustomResourceDefinition API Group (apiGroups: ["apiextensions.k8s.io"]):
  - Resources: CustomResourceDefinitions.
  - Actions Allowed: ["watch", "list", "get"]
The Cluster Manager and Sensor ClusterRoles grant access only to specific resource types within designated API groups: services, namespaces, nodes, pods, workload controllers, and, for the Cluster Manager, networking configurations. The verbs watch, list, and get provide read-only access, so Upwind can observe and report on resource state without modifying it. Both ClusterRoles follow the principle of least privilege by limiting their scope to these resource types, which reduces potential security risk.
Tuning the Sensor and Cluster Manager
Sensor pods use memory roughly in proportion to the level of network and process activity on the node where the Sensor runs. Memory usage does not depend on the overall size of the Kubernetes cluster. Most of the Sensor's memory is dedicated to internal caches of data about active network connections. Depending on your environment, it may be beneficial to adjust the size or expiration time of these caches by setting agent.values.agent.config.connCache or agent.values.agent.config.connTrackCache in the Sensor daemonset configuration. Increasing the cache sizes allows the Sensor to monitor more concurrent network connections. Reducing the cache sizes or the expiration time can reduce memory usage.
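For example, on nodes with few concurrent connections where memory is tight, the caches could be shrunk as in the sketch below. The key names and defaults are taken from the Configuration Library table. The values 15000 and 3600 are arbitrary illustrations, not recommendations.

```yaml
# Sketch: smaller, shorter-lived Sensor connection caches to reduce memory use.
agent:
  values:
    agent:
      config:
        connCache:
          size: 15000     # illustrative; default is 30000
          expiry: 3600    # seconds, illustrative; default is 7260
        connTrackCache:
          size: 15000     # illustrative; default is 30000
          expiry: 3600    # seconds, illustrative; default is 7260
```

The trade-off is that a smaller cache tracks fewer concurrent connections, so it is worth lowering the values gradually and watching the Sensor's behavior.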
The Cluster Manager will use memory roughly in proportion to the size of the cluster it is running in and the amount of network traffic in the cluster. For clusters larger than 50 nodes or with more than 1 GB/s of network traffic, it's recommended to increase the memory requests and limits for the Cluster Manager.
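A larger cluster might therefore override the Cluster Manager resources along the following lines. The clusterAgent.values.resources key and the default of 2Gi come from the Configuration Library table. The 4Gi figure is only an assumed starting point and should be sized from the memory usage you observe.

```yaml
# Sketch: raising Cluster Manager memory for a large or high-traffic cluster.
clusterAgent:
  values:
    resources:
      requests:
        cpu: 1000m
        memory: 4Gi   # illustrative; default is 2Gi
      limits:
        cpu: 1500m
        memory: 4Gi   # illustrative; default is 2Gi
```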
Configuration Library
Key | Type | Default | Description |
---|---|---|---|
agent.values | object | {"tolerations":[{"operator":"Exists"}]} | Custom values that will be added to the Sensor. |
agent.values.env | list | [] | Additional common environment variables added to the Upwind Sensor pod. |
agent.values.agent.config.connCache.expiry | int | 7260 | Expiration time for connection cache entries, in seconds. |
agent.values.agent.config.connCache.size | int | 30000 | Maximum connection cache capacity. |
agent.values.agent.config.connTrackCache.expiry | int | 7260 | Expiration time for conntrack cache entries, in seconds. |
agent.values.agent.config.connTrackCache.size | int | 30000 | Maximum conntrack cache capacity. |
agent.values.agent.config.logLevel | string | "INFO" | Logging level for the network agent container. |
agent.values.agent.resources | object | { "limits": {"cpu":"500m","memory":"1000Mi"}, "requests": {"cpu":"50m","memory":"100Mi"} } | Specifies the resource requests and limits for the Upwind Sensor network agent container. |
agent.values.networkAggregation.enabled | bool | true | Enables aggregation of network reports in the Cluster Manager. |
agent.values.podSecurityPolicy.enabled | bool | true | Enables creation of a PodSecurityPolicy for the Sensor. Must be set to false before upgrading a cluster from Kubernetes 1.24 to Kubernetes 1.25. |
agent.values.scanAgent.config.logLevel | string | "INFO" | Logging level for the scanning agent container. |
agent.values.scanAgent.enabled | bool | true | Whether to enable the scanning agent container. |
agent.values.scanAgent.resources | object | { "limits": {"cpu":"500m","memory":"500Mi"}, "requests": {"cpu":"50m","memory":"50Mi"} } | Specifies the resource requests and limits for the Upwind Sensor scan agent container. |
agent.values.scanHost.enabled | bool | false | Whether to enable the CronJob to scan the node filesystem. |
agent.values.scanHost.resources | object | { "limits": {"cpu":"500m","memory":"1000Mi"}, "requests": {"cpu":"50m","memory":"250Mi"} } | Specifies the resource requests and limits for the CronJob. |
agent.values.scanHost.schedule | string | "38 * * * *" | Schedule for running the CronJob. |
agent.values.tolerations | list | [{"operator":"Exists"}] | Specifies the tolerations for the Upwind Sensor daemonset. |
agent.version | string | nil | Pins the version of the Sensor component to a specific release. |
clusterAgent.values | object | {} | Custom values that will be added to the ClusterAgent spec. |
clusterAgent.values.env | object | {} | Custom environment variables to add to the Cluster Manager pod. |
clusterAgent.values.resources | object | { "limits": {"cpu":"1500m","memory":"2Gi"}, "requests": {"cpu":"1000m","memory":"2Gi"} } | Specifies the resource requests and limits for the Cluster Manager. |
clusterAgent.version | string | nil | Pins the version of the ClusterAgent component to a specific release. |
credentials.clientId | string | nil | [Required] Specifies the application's Client ID. |
credentials.clientSecret | string | nil | [Required] Specifies the application's Client Secret. |
operator.logLevel | string | "info" | Set the log level for the Upwind Operator. |
resources.limits.cpu | string | "500m" | Sets CPU limits for the Upwind Operator. |
resources.limits.memory | string | "256Mi" | Sets memory limits for the Upwind Operator. |
resources.requests.cpu | string | "10m" | Sets CPU requests for the Upwind Operator. |
resources.requests.memory | string | "64Mi" | Sets memory requests for the Upwind Operator. |
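
Several of the keys above can be combined in a single values file. The sketch below is one hypothetical combination: it references an existing Secret, enables the host filesystem scan on a custom schedule, disables PodSecurityPolicy creation ahead of an upgrade to Kubernetes 1.25, and raises the operator log verbosity. The schedule and the "debug" level are assumed values, and the credentials block follows the Secret section earlier on this page.

```yaml
# Sketch of a combined values file; schedule and log level are illustrative.
credentials:
  create: false
  name: upwind-secret
operator:
  logLevel: "debug"            # assumed level name; default is "info"
agent:
  values:
    podSecurityPolicy:
      enabled: false           # required before upgrading from 1.24 to 1.25
    scanHost:
      enabled: true            # default is false
      schedule: "15 */6 * * *" # illustrative cron; default is "38 * * * *"
```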