Kubernetes Environments

A Kubernetes environment represents the namespace in your Kubernetes cluster where StreamSets engines are deployed.

Your Kubernetes administrator must create a namespace in the cluster and allow resources that run in that namespace outbound access to Control Hub. You then create a Kubernetes environment in Control Hub that represents the namespace. When you activate the environment, Control Hub generates a StreamSets Kubernetes agent installation script and YAML file. You can use the script or the YAML file to launch an agent in the namespace.

While the environment is in an active state, the StreamSets Kubernetes agent periodically checks with Control Hub to retrieve requests. When you create and start a deployment for the environment, the agent communicates with Control Hub to provision the Kubernetes resources needed to run engines and to deploy engine instances to those resources.

When you deactivate the environment, the StreamSets Kubernetes agent is deleted from the namespace. When you activate the environment again, you must use the agent installation script or YAML file to relaunch the agent.

Before you create a Kubernetes environment, your Kubernetes administrator must complete several prerequisites.

StreamSets Kubernetes Agent

A StreamSets Kubernetes agent is an application that runs in your Kubernetes cluster. An agent communicates with Control Hub to provision the Kubernetes resources needed to run engines and to deploy engine instances to those resources.

When you activate a Kubernetes environment, Control Hub generates a StreamSets Kubernetes agent installation script and YAML file. You can use the script or the YAML file to launch the agent in the namespace.

The agent runs as a Kubernetes deployment with a single pod. Each Kubernetes environment that you create in Control Hub corresponds to a single StreamSets Kubernetes agent that runs in the specified Kubernetes namespace. StreamSets does not recommend running more than one agent in the same namespace.

The agent uses encrypted REST APIs to communicate with Control Hub, initiating outbound connections to Control Hub over HTTPS. The agent sends requests and information to Control Hub. Control Hub does not directly send requests to the agent. Instead, the agent regularly checks with Control Hub to retrieve requests, such as creating or updating Kubernetes deployments in the namespace.

When you start a Control Hub deployment that belongs to the Kubernetes environment, the agent retrieves that request and creates and monitors a YAML file describing the required resources. The YAML file creates a single Kubernetes deployment and secret in the Kubernetes namespace. The YAML file also creates a horizontal pod autoscaler if the Control Hub deployment is configured to allow autoscaling. The Kubernetes deployment then creates a replica set to ensure that enough pods are created, with each pod running a single engine instance.

When you update a Control Hub deployment that belongs to the Kubernetes environment, the agent retrieves the update request from Control Hub, and then the modified YAML file applies the updates to the engine instances running in the Kubernetes deployment.

Example

The following diagram illustrates how Control Hub objects correspond to resources in your Kubernetes cluster. In this example, Control Hub includes a single Kubernetes environment and two deployments that belong to that environment. One deployment is configured for one engine instance, and the other is configured for two engine instances.

In the Kubernetes cluster, the Kubernetes namespace specified in the Control Hub environment includes three Kubernetes deployments:
  • Deployment 1 corresponds to the Control Hub Kubernetes environment and includes a single pod that runs the StreamSets Kubernetes agent.
  • Deployment 2 corresponds to one child Control Hub deployment and includes a single pod that runs one engine instance.
  • Deployment 3 corresponds to the other child Control Hub deployment and includes two replicated pods, each of which runs an identical engine instance.

StreamSets Kubernetes Agent Versions

When you create a Kubernetes environment, you define the StreamSets Kubernetes agent version to use. As a best practice, use the latest released version.

Version 1.1.2

On April 10, 2024, StreamSets released the StreamSets Kubernetes agent version 1.1.2.

Fixed Issue
  • Version 1.1.2 addresses the following third-party vulnerabilities:
    • CVE-2020-8908 com.google.guava:guava
    • CVE-2021-28168 org.glassfish.jersey.core:jersey-common
    • CVE-2023-2976 com.google.guava:guava
    • CVE-2023-3635 com.squareup.okio:okio-jvm
    • CVE-2023-5072 JSON-Java
    • CVE-2023-31582 org.bitbucket.b_c:jose4j
    • CVE-2023-33201 org.bouncycastle:bcprov-jdk18on
    • CVE-2023-33202 org.bouncycastle:bcpkix-jdk18on
    • CVE-2023-51775 org.bitbucket.b_c:jose4j
    • GHSA-jgvc-jfgh-rjvv org.bitbucket.b_c:jose4j

Version 1.1.1

On October 18, 2023, StreamSets released the StreamSets Kubernetes agent version 1.1.1.

Fixed Issue
  • Version 1.1.1 addresses the following third-party vulnerabilities:
    • CVE-2021-28168 org.glassfish.jersey
    • CVE-2022-48174 busybox
    • CVE-2022-48174 busybox-binsh
    • CVE-2022-48174 ssl_client
    • CVE-2023-2976 com.google.guava:guava
    • CVE-2023-2975 libcrypto3
    • CVE-2023-2975 libssl3
    • CVE-2023-3446 libcrypto3
    • CVE-2023-3446 libssl3
    • CVE-2023-3817 libcrypto3
    • CVE-2023-3817 libssl3
    • GHSA-jgvc-jfgh-rjvv org.bitbucket.b_c:jose4j

Version 1.1.0

On June 29, 2023, StreamSets released the StreamSets Kubernetes agent version 1.1.0.

Version 1.1.0 supports configuring engines to use a proxy server by defining the proxy properties in the Control Hub Kubernetes deployment.

Fixed Issue
  • When upgrading an existing StreamSets Kubernetes agent from version 1.0.0, a significant delay of 5-10 minutes occurs before the upgraded agent reconnects with Control Hub. During that time, the Control Hub Kubernetes environment indicates that the agent has been lost and that any functionality relying on the agent will be delayed until it reconnects. However, existing Control Hub Kubernetes deployments and engines continue to function normally.

Version 1.0.0

On February 10, 2023, StreamSets released the StreamSets Kubernetes agent version 1.0.0.

Known Issue
  • Version 1.0.0 does not support configuring engines to use a proxy server by defining the proxy properties in the Control Hub Kubernetes deployment.
    Workaround: Upgrade to StreamSets Kubernetes agent version 1.1.0 or later. Alternatively, complete the following steps to configure engines to use a proxy server:
    1. Create a YAML file that defines a ConfigMap for the engine proxy properties, where <namespace-name> is the Kubernetes namespace where the engines are deployed.
      For example:
      apiVersion: v1
      data:
        http.nonProxyHosts: <pipe-separated no proxy hosts>
        no_proxy: <comma-separated no proxy hosts>
        http.proxyHost: <proxy host>
        http.proxyPassword: <password>
        http.proxyPort: "<port>"
        http.proxyUser: <proxy user>
        http_proxy: http://<proxy user>:<password>@<proxy host>:<port>
        https.proxyHost: <proxy host>
        https.proxyPassword: <password>
        https.proxyPort: "<port>"
        https.proxyUser: <proxy user>
        https_proxy: http://<proxy user>:<password>@<proxy host>:<port>
      kind: ConfigMap
      metadata:
        name: <config-map-name>
        namespace: <namespace-name>
    2. Run the following command to apply the YAML and create the ConfigMap in the Kubernetes cluster:
      kubectl -n <namespace-name> apply -f <config-map-name>.yaml
    3. In Control Hub, edit the Kubernetes deployment. Use advanced mode to edit the YAML file, adding the following envFrom field to the deployment container, after the existing - env field:
      envFrom:
        - configMapRef:
            name: <config-map-name>

Resources Created for the Agent

When you launch the StreamSets Kubernetes agent, the following Kubernetes resources are created in the namespace:
  • Deployment with a single pod that runs the agent
  • Secret that contains the authentication token used by the agent to communicate with the StreamSets platform
  • Service account
  • Role
  • Role binding

When you use Control Hub to deactivate the Kubernetes environment, the agent deployment and secret are deleted. The service account, role, and role binding are not deleted.

When you activate the environment again, you must use the agent installation script or YAML file to relaunch the agent. The script or YAML file recreates the deployment and secret for the agent and then reuses the existing service account, role, and role binding. If those resources have been manually deleted, the script or YAML file creates the resources again.
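
To see which of these resources currently exist, for example after deactivating the environment, you can list them with kubectl. The following is a minimal sketch, not part of the generated installation script. The function name list_agent_resources is a hypothetical helper, and the namespace that you pass is your own.

```shell
# List the resource types that the agent launch creates in a namespace.
# list_agent_resources is a hypothetical helper name.
list_agent_resources() {
  kubectl -n "$1" get deployments,secrets,serviceaccounts,roles,rolebindings
}

# Example (replace <namespace_name> with your namespace):
# list_agent_resources <namespace_name>
```

The output also includes any other resources of these types that exist in the namespace, such as those provisioned for Control Hub deployments.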

Permissions Granted to the Agent

When you launch the StreamSets Kubernetes agent, the agent is granted role-based access control (RBAC) within the namespace that the agent runs in. The role sets permissions on the following Kubernetes resources:
Resource                     Permissions
Deployments                  Get, list, create, patch, delete
Replica sets                 Get, list, create, patch, delete
Pods                         Get, list, create, patch, delete
Secrets                      Get, list, create, patch, delete
Horizontal pod autoscalers   Get, list, create, patch, delete

Note: Horizontal pod autoscalers are used when a Control Hub deployment is configured to enable autoscaling.

Feature Versions

At this time, Kubernetes environments offer only the initial feature version, KUBERNETES_2023_02_10, which includes all available features for Kubernetes environments and deployments.

Prerequisites

Before you create a Kubernetes environment, your Kubernetes administrator must complete the following prerequisites:
  1. Verify that your existing Kubernetes cluster meets the StreamSets requirements.
  2. Create a namespace for the exclusive use of the Kubernetes environment.
  3. Allow the required inbound and outbound traffic to the StreamSets resources that run in the namespace.
  4. Configure the Kubernetes command-line tool so that it can run commands against the Kubernetes cluster.

Verify Cluster Requirements

Verify that your existing Kubernetes cluster meets the following StreamSets requirements:
  • The Kubernetes server must be version 1.22 or later.
  • The cluster can access the internet.
  • Each node in the cluster is set up with the minimum requirements for the deployed engine type.

    The StreamSets Kubernetes agent deploys engine instances as Docker images. As such, the requirements for a Kubernetes deployment are the same as for a self-managed deployment for a Docker image installation. For the list of minimum requirements for each engine type, see the Data Collector documentation or the Transformer documentation.

  • Kubernetes secrets are safely secured in the cluster, which might require additional cluster configuration as described in the Kubernetes documentation.
  • The Kubernetes cluster has a running metrics server or an equivalent server if you plan to configure a Control Hub Kubernetes deployment to use a horizontal pod autoscaler to automatically scale the number of engine instances.
    Important: Ensure that the metrics server is correctly configured to report accurate and timely metrics. If the server is slow to report metrics, the horizontal pod autoscaler is also slow to adjust to changes in CPU usage.

    For more information about metrics servers, see the Kubernetes documentation.
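
As a rough aid for the version requirement, the following sketch compares a Kubernetes server version string against the 1.22 minimum. The function name meets_min_version is a hypothetical helper. It assumes that you pass the server version that kubectl version reports for your cluster, in a form such as 1.27 or v1.27.3.

```shell
# Succeed when a Kubernetes version string is 1.22 or later.
# meets_min_version is a hypothetical helper name.
meets_min_version() {
  version="${1#v}"          # drop an optional leading "v"
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # text after the first dot
  minor="${rest%%.*}"       # minor version, without any patch level
  minor="${minor%+}"        # drop a trailing "+" that some distributions add
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 22 ]; }
}

# Example:
# meets_min_version 1.27 && echo "cluster version is supported"
#
# To check for a metrics server, you can run the following command, which
# returns node metrics only when a metrics server is available:
# kubectl top nodes
```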

Create a Namespace

Create a namespace in your Kubernetes cluster.

You can use an existing namespace. However, StreamSets recommends creating a new namespace for the exclusive use of each Control Hub Kubernetes environment.

Do not create deployments or pods within the namespace. After you create and activate the Kubernetes environment and then launch the StreamSets Kubernetes agent in the namespace, the agent communicates with Control Hub to provision the required Kubernetes resources in the namespace.

For instructions on creating a namespace in Kubernetes, see the Kubernetes documentation.
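
As a minimal sketch of this step, the following shell function creates a namespace only when it does not already exist. The function name ensure_namespace and the example namespace name are illustrative assumptions. Your administrator might use a different method, such as a YAML manifest.

```shell
# Create the namespace only if it does not already exist.
# ensure_namespace is a hypothetical helper name.
ensure_namespace() {
  ns="$1"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}

# Example (assumed namespace name):
# ensure_namespace streamsets-env
```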

Allow Inbound and Outbound Traffic

If the Kubernetes cluster resides behind a firewall, allow the required inbound and outbound traffic to the Kubernetes resources provisioned for the environment. Or if you use Kubernetes network policies, create a network policy that defines the required ingress and egress rules to the resources.

Allow traffic as required by the following resources:
  • The StreamSets Kubernetes agent requires outbound connections to StreamSets Control Hub. For the complete list of Control Hub DNS names to allow, see Outbound Connections.
  • The StreamSets engines require all inbound and outbound connections as listed in Firewall Configuration Overview.
For more information about network policies, see the Kubernetes documentation.
Note: If the Kubernetes cluster also uses a proxy server, you must configure the StreamSets Kubernetes agent and StreamSets engines to use the same proxy server.

Configure kubectl

Configure the Kubernetes command-line tool, kubectl, so that it can communicate with your Kubernetes cluster.

For details about accessing Kubernetes clusters, see the Kubernetes documentation.

The user who runs the command that launches the StreamSets Kubernetes agent in the cluster must have the following minimum permissions within the namespace that the agent runs in:

Resource                     Permissions
Deployments                  Get, list, create, patch, delete
Pods                         Get, list, create, patch, delete
Replica sets                 Get, list, create, patch, delete
Roles                        Get, list, create, patch, delete
Role bindings                Get, list, create, patch, delete
Secrets                      Get, list, create, patch, delete
Service accounts             Get, list, create, patch, delete
Horizontal pod autoscalers   Get, list, create, patch, delete

Note: Horizontal pod autoscalers are used when a Control Hub deployment is configured to enable autoscaling.
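
One way to check these permissions before launching the agent is the kubectl auth can-i command. The following sketch loops over the resources and permissions listed above and reports any that the current kubectl user lacks. The function name check_launch_permissions is a hypothetical helper. The sketch assumes a recent kubectl version in which auth can-i returns a non-zero exit status when the answer is no.

```shell
# Report permissions that the current kubectl user lacks in a namespace.
# check_launch_permissions is a hypothetical helper name.
check_launch_permissions() {
  ns="$1"
  missing=0
  for resource in deployments pods replicasets roles rolebindings secrets \
                  serviceaccounts horizontalpodautoscalers; do
    for verb in get list create patch delete; do
      # Assumes auth can-i exits with a non-zero status when the answer is "no".
      if ! kubectl auth can-i "$verb" "$resource" -n "$ns" >/dev/null 2>&1; then
        echo "missing: $verb $resource"
        missing=1
      fi
    done
  done
  return "$missing"
}

# Example (replace <namespace_name> with your namespace):
# check_launch_permissions <namespace_name> && echo "all permissions present"
```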

Configuring a Kubernetes Environment

Configure a Kubernetes environment to define where to deploy StreamSets engines in your Kubernetes cluster.

Important: Before configuring an environment, your Kubernetes administrator must complete the required prerequisites.

To create a new environment, click Set Up > Environments in the Navigation panel, and then click the Create Environment icon.

To edit an existing environment, click Set Up > Environments in the Navigation panel, click the environment name, and then click Edit.

Define the Environment

Define the environment essentials, including the environment name and type, and optional tags to identify similar environments.

  1. Configure the following properties:
    • Environment Name - Name of the environment.

      Use a brief name that informs your team of the environment use case.

    • Environment Type - Select Kubernetes.

      Once saved, you cannot change the environment type.

    • Environment Tags - Optional tags that identify similar environments within Control Hub. Use environment tags to easily search and filter environments.

      Enter nested tags using the following format:

      <tag1>/<tag2>/<tag3>

    • Feature Version - Feature version to use for the environment and all deployments created for the environment.

      Each feature version typically requires different permissions.

      When creating a new environment, StreamSets recommends using the latest feature version. When a new feature version is available, StreamSets recommends changing your existing environments to use the new feature version as soon as possible.

  2. Optionally, click Show Advanced Options and configure the following advanced property:
    • Allow Nightly Builds - Allows deployments for this environment to use nightly engine builds in addition to released engine versions. Also allows Kubernetes environments to use nightly StreamSets Kubernetes agent builds.

      Nightly builds are for testing features under development and should not be used in production systems.

      The version number of a nightly build includes a -SNAPSHOT suffix and the build number. For example, 5.2.0-SNAPSHOT (Build 1013).

  3. If creating the environment, click one of the following buttons:
    • Cancel - Cancels creating the environment and exits the wizard.
    • Save & Next - Saves the environment and continues.
    • Save & Exit - Saves the environment and exits the wizard, displaying the incomplete environment in the Environments view.

Configure the Environment

Specify the version of the StreamSets Kubernetes agent to deploy, enter the name of the Kubernetes namespace created as a prerequisite by your Kubernetes administrator, and optionally define Kubernetes labels to apply to provisioned Kubernetes resources.

  1. Configure the following properties:
    • Agent Version - Version of the StreamSets Kubernetes agent to deploy. As a best practice, use the latest released version.

    • Kubernetes Namespace - Name of the Kubernetes namespace where you plan to deploy StreamSets engines.

      Enter the name of the namespace created as a prerequisite by your Kubernetes administrator.

      You cannot change the namespace name while the environment is active.

    • Agent Java Options - Custom Java options used by the agent. Enter any options supported by the JVM.

    • Kubernetes Labels - Kubernetes labels to apply to all Kubernetes resources provisioned for this environment.

      Enter the labels as key-value pairs. For label naming requirements, see the Kubernetes documentation.

      Important: StreamSets reserves app as a label key for its own use. As a result, you cannot define app as a label key.

      You can define the labels using simple or bulk edit mode. In simple edit mode, click Add Another to define additional labels. In bulk edit mode, configure labels in JSON format.

      Note: These labels are applied to Kubernetes resources, not to Control Hub environments.
  2. If creating the environment, click one of the following buttons:
    • Back - Returns to the previous step in the wizard.
    • Save & Next - Saves the environment and continues.
    • Save & Exit - Saves the environment and exits the wizard, displaying the incomplete environment in the Environments view.

Share the Environment

By default, the environment can only be seen by you. Share the environment with other users and groups to grant them access to it.

  1. In the Select Users and Groups field, type a user email address or a group name.
  2. Select users or groups from the list, and then click Add.

    The added users and groups display in the User / Group table.

  3. Modify permissions as needed. By default, each added user or group is granted the following permissions:
    • Read - View the details of the environment. Create and edit a deployment for the environment.
    • Write - Edit, activate, deactivate, and delete the environment.

    For more information, see Environment Permissions.

  4. Click one of the following buttons:
    • Back - Returns to the previous step in the wizard.
    • Save & Next - Saves the environment and continues.
    • Save & Exit - Saves the environment and exits the wizard, displaying the incomplete environment in the Environments view.

Review and Activate the Environment

You've successfully finished creating the environment.

  1. Click Activate & Generate Install Script to activate the environment and generate an installation script and YAML file for the StreamSets Kubernetes agent.
    Note: If you click Exit, Control Hub saves the environment and exits the wizard, displaying the deactivated environment in the Environments view. You can activate the environment and retrieve the installation script or YAML file at a later time.
  2. To launch the agent from the installation script, click the Copy to Clipboard icon to copy the generated installation script.
  3. To launch the agent from the YAML file, complete the following steps:
    1. Click View Kubernetes YAML.
    2. Click the Copy to Clipboard icon to copy the YAML content, and then click Close.
    3. Save the YAML to your local file system.
  4. Optionally, click Check Agent Status after Install to display the Agent Status window, where you can view the agent status after you launch the agent.
  5. Use the copied installation script or YAML file to launch the StreamSets Kubernetes agent.

Launching the Agent

After activating a Kubernetes environment, you launch the StreamSets Kubernetes agent so that it runs in the Kubernetes namespace specified in the environment.

Important: Your Kubernetes administrator also must allow the required inbound and outbound traffic to the agent and to the StreamSets engines that the agent deploys, as described in Allow Inbound and Outbound Traffic.
Launch the agent from the installation script or from the YAML file, as described in the following sections.

Launching the Agent from the Installation Script

You can launch the StreamSets Kubernetes agent using the installation script that Control Hub generates. The installation script automatically retrieves the agent YAML file from Control Hub, and then uses the Kubernetes command-line tool, kubectl, to create a deployment for the agent based on the YAML file.

  1. Verify that you have completed the kubectl prerequisites.
  2. Open a command prompt.
  3. To verify that kubectl can access your Kubernetes cluster, run the following command:
    kubectl cluster-info
  4. Paste and then run the installation script command that you copied from the Control Hub Kubernetes environment.
  5. If you chose to check the agent status, view the status in the Control Hub UI.

Launching the Agent from the YAML File

You can launch the StreamSets Kubernetes agent using the YAML file that Control Hub generates. Use the Kubernetes command-line tool, kubectl, to create a deployment for the agent based on the YAML file, as you do with other Kubernetes applications.

  1. Verify that you have completed the kubectl prerequisites.
  2. Open a command prompt.
  3. To verify that kubectl can access your Kubernetes cluster, run the following command:
    kubectl cluster-info
  4. Run the following command to create a deployment from the YAML file that you copied from the Control Hub Kubernetes environment:
    kubectl apply -f ./<file_name>.yaml
    For example, if you saved the YAML to a file named streamsets-kubernetes-agent.yaml, run the following command:
    kubectl apply -f ./streamsets-kubernetes-agent.yaml
    Note: If needed, you can retrieve the generated YAML file.
  5. If you chose to check the agent status, view the status in the Control Hub UI.
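
The apply step and a quick check of the resulting pods can be combined, as in the following sketch. The function name launch_agent_from_yaml is a hypothetical helper. The sketch assumes that you saved the generated YAML locally and that you pass the namespace specified in the environment.

```shell
# Apply the saved agent YAML, then list the pods in the namespace so that
# you can confirm that the agent pod starts.
# launch_agent_from_yaml is a hypothetical helper name.
launch_agent_from_yaml() {
  yaml_file="$1"
  ns="$2"
  if [ ! -f "$yaml_file" ]; then
    echo "file not found: $yaml_file" >&2
    return 1
  fi
  kubectl apply -f "$yaml_file" && kubectl -n "$ns" get pods
}

# Example:
# launch_agent_from_yaml ./streamsets-kubernetes-agent.yaml <namespace_name>
```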

Retrieving the Installation Script

You can retrieve the installation script generated for a Kubernetes environment.

Note: Each environment generates a unique installation script for the StreamSets Kubernetes agent as defined for that specific environment. Be sure that you retrieve the script for the correct environment.
  1. In the Control Hub Navigation panel, click Set Up > Environments.
  2. Locate the Kubernetes environment that you want to launch an agent for.
  3. In the Actions column, click the More icon and then click Get Install Script.
  4. Click the Copy to Clipboard icon, and then click Close.

Retrieving the YAML File

You can retrieve the YAML file generated for a Kubernetes environment.

Note: Each environment generates a unique YAML file for the StreamSets Kubernetes agent as defined for that specific environment. Be sure that you retrieve the file for the correct environment.
  1. In the Control Hub Navigation panel, click Set Up > Environments.
  2. Locate the Kubernetes environment that you want to launch an agent for.
  3. In the Actions column, click the More icon and then click View Generated YAML.
  4. Click the Copy to Clipboard icon, and then click Close.

Using a Proxy Server for Kubernetes

If the Kubernetes cluster uses a proxy server, you must configure the following Kubernetes resources provisioned for the environment to use the same proxy server:
  • StreamSets Kubernetes agent - Configure the Kubernetes agent to use a proxy server when you configure the Kubernetes environment.
  • StreamSets engines - Configure the engines to use a proxy server when you configure the Control Hub Kubernetes deployments that belong to the Kubernetes environment.
Important: You must use the StreamSets Kubernetes agent version 1.1.0 or later to define proxy server properties in the Control Hub Kubernetes deployment.

Configure the Agent to Use a Proxy Server

To configure a Kubernetes agent to use a proxy server for an existing Kubernetes environment, retrieve the YAML file generated for a Kubernetes environment and edit the YAML file to define the proxy properties. Then use the Kubernetes command-line tool, kubectl, to apply the changes to the running StreamSets Kubernetes agent.

  1. In the Control Hub Navigation panel, click Set Up > Environments.
  2. In the Actions column of the Kubernetes environment, click the More icon and then click View Generated YAML.
  3. Click the Copy to Clipboard icon, and then click Close.
  4. Save the YAML to your local file system.
  5. Edit the YAML to add the proxy properties as Java options to the STREAMSETS_KUBERNETES_AGENT_JAVA_OPTS environment variable:
    - name: STREAMSETS_KUBERNETES_AGENT_JAVA_OPTS
      value: -Dhttps.proxyHost=<proxy host> -Dhttps.proxyPort=<port> -Dhttp.proxyHost=<proxy host> -Dhttp.proxyPort=<port> -Dhttp.nonProxyHosts=<pipe-separated no proxy hosts> -Dhttp.proxyUser=<proxy user> -Dhttp.proxyPassword=<password> -Dhttps.proxyUser=<proxy user> -Dhttps.proxyPassword=<password> -Djdk.http.auth.tunneling.disabledSchemes=

    For the -Dhttp.nonProxyHosts option, include the ClusterIP of the Kubernetes API server and any other hosts that the agent can connect to without going through the proxy server. Separate each entry with the pipe character (|).

    To retrieve the ClusterIP of the Kubernetes API server, run the following command:
    kubectl get svc/kubernetes --namespace=default
    Note: Define the user and password properties only when the proxy server requires authentication.

    The properties are the same proxy properties that you configure for StreamSets engines. For a description of each property, see the Data Collector documentation or the Transformer documentation.

  6. Run the following command to apply the changes to the running StreamSets Kubernetes agent:
    kubectl apply -f ./<file_name>.yaml
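
To build the value for the -Dhttp.nonProxyHosts option described in step 5, you can join the host entries with the pipe character. The following is a minimal sketch. The function name join_non_proxy_hosts is a hypothetical helper, and the hosts in the example are placeholders.

```shell
# Join host entries with the pipe character for -Dhttp.nonProxyHosts.
# join_non_proxy_hosts is a hypothetical helper name.
join_non_proxy_hosts() {
  result=""
  for host in "$@"; do
    if [ -z "$result" ]; then
      result="$host"
    else
      result="$result|$host"
    fi
  done
  printf '%s\n' "$result"
}

# Example: combine the API server ClusterIP with other hosts (placeholder values).
# api_ip="$(kubectl get svc/kubernetes --namespace=default -o jsonpath='{.spec.clusterIP}')"
# join_non_proxy_hosts "$api_ip" localhost "*.example.internal"
```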

Configure Engines to Use a Proxy Server

To configure StreamSets engines to use a proxy server for an existing Control Hub Kubernetes deployment, you must first stop the deployment.

After editing the deactivated deployment to modify the proxy properties, you start the deployment. The StreamSets Kubernetes agent communicates with Control Hub to provision the Kubernetes resources again, deploying and launching new engine instances that use the updated proxy properties.

  1. In the Control Hub Navigation panel, click Set Up > Deployments.
  2. Select the active deployment that belongs to the Kubernetes environment, and then click the Stop icon.
  3. Click OK to confirm, and then click Close.
  4. In the Actions column of the deactivated deployment, click the More icon and then click Edit.
  5. In the Edit Deployment dialog box, expand the Configure Deployment section.
  6. Click "Click here to configure" next to the Advanced Configuration property.
  7. In the Proxy tab, define the engine proxy properties as needed.

    For a description of each proxy property, see the Data Collector documentation or the Transformer documentation.

  8. Click Save.
  9. In the Edit Deployment dialog box, click Save.
  10. Start the deployment to launch new engine instances that use the updated proxy properties.

    For detailed instructions, see Starting Deployments.

Monitoring the Agent

When you view the details of a Control Hub Kubernetes environment, you can monitor the following information about the StreamSets Kubernetes agent launched for that environment:
  • Agent status
  • Event logs

To view environment details, click an environment name in the Environments view.

When you need additional monitoring information about the agent, use the Kubernetes command-line tool, kubectl, to access Kubernetes log files for the pod where the StreamSets Kubernetes agent runs.

Agent Status

You can view the status of a StreamSets Kubernetes agent from the list of environments in the Environments view or from the environment details.

The following table describes each agent status:

Agent Status    Description
Agent-Online    Agent is running in the Kubernetes cluster and is communicating regularly with Control Hub.
Agent-Error     Agent is running in the Kubernetes cluster, but has encountered an error. View the agent event logs for additional information.
Agent-Offline   Agent has been gracefully shut down or has not yet been launched.
Agent-Lost      Agent has not communicated with Control Hub for some time. The agent has either lost its connection with Control Hub or has unexpectedly shut down.

Event Logs

When you view the details of a Kubernetes environment, you can view event logs for the StreamSets Kubernetes agent. The event logs list all events related to the agent launched for the environment, including errors that might occur when launching the agent.

Event logs display messages only about the current activation of the environment. Each time you deactivate and then activate the environment, the previous event logs are removed.

To view agent event logs, click a Kubernetes environment name in the Environments view and then locate the Agent Event Logs section in the environment details.

The following image shows sample event logs for a StreamSets Kubernetes agent:

Accessing Kubernetes Log Files

You can use the Kubernetes command-line tool, kubectl, to access Kubernetes log files for the pod where the StreamSets Kubernetes agent runs.

First, run the following kubectl command to retrieve the name of the pod, where <namespace_name> is the name of the Kubernetes namespace created for the environment:

kubectl [-n <namespace_name>] get pods

Control Hub uses the following format to name the pod for the agent:

streamsets-agent-dep-<Control_Hub_deployment_ID><pod_UID>

After retrieving the appropriate pod name, run the following command:
kubectl [-n <namespace_name>] logs pod/<pod_name>
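
The two commands above can be combined, as in the following sketch. The function names agent_pod_name and agent_logs are hypothetical helpers. The sketch assumes that only the agent pod in the namespace has a name that starts with streamsets-agent-dep-, based on the naming format described above.

```shell
# Print the first pod whose name starts with the agent prefix.
# agent_pod_name and agent_logs are hypothetical helper names.
agent_pod_name() {
  kubectl -n "$1" get pods -o name | grep '^pod/streamsets-agent-dep-' | head -n 1
}

# Show the logs for the agent pod in the given namespace.
agent_logs() {
  pod="$(agent_pod_name "$1")"
  if [ -z "$pod" ]; then
    echo "no agent pod found in namespace $1" >&2
    return 1
  fi
  kubectl -n "$1" logs "$pod"
}

# Example (replace <namespace_name> with your namespace):
# agent_logs <namespace_name>
```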

Modifying the Agent Log Level

If the agent log file does not provide enough troubleshooting information, you can modify the log level to display messages at another severity level.

By default, the agent logs StreamSets messages at the DEBUG severity level, and all other libraries at the WARN severity level. You can configure the following log levels:
  • TRACE
  • DEBUG
  • INFO
  • WARN
  • ERROR
  • FATAL

To modify the log level, retrieve the YAML file generated for a Kubernetes environment and edit the YAML file to add a custom log configuration. Then use the Kubernetes command-line tool, kubectl, to apply the changes to the running StreamSets Kubernetes agent.

  1. In the Control Hub Navigation panel, click Set Up > Environments.
  2. In the Actions column of the Kubernetes environment, click the More icon and then click View Generated YAML.
  3. Click the Copy to Clipboard icon, and then click Close.
  4. Save the YAML to your local file system.
  5. Add a ConfigMap section to the end of the YAML file, setting the log level that you want to use.

    For example, to set all messages to the DEBUG log level, add the following ConfigMap:

    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: custom-agent-log4j
    data:
      custom_log4j_file_name: "custom-agent-log4j.properties"
      custom-agent-log4j.properties: |
        name = streamsets-dpm-kubernetes-agent
        appender.stdout.type = Console
        appender.stdout.name = stdout
        appender.stdout.layout.type = PatternLayout
        appender.stdout.layout.pattern = %d{ISO8601} [thread:%t] %-5p %c{1} - %m%n
    
        appender.streamsets.type = RollingFile
        appender.streamsets.name = streamsets
        appender.streamsets.fileName = ${sys:streamsets.kubernetes.agent.log.dir}/agent.log
        appender.streamsets.filePattern = ${sys:streamsets.kubernetes.agent.log.dir}/agent.%i.log
        appender.streamsets.layout.type = PatternLayout
        appender.streamsets.layout.pattern = %d{ISO8601} [thread:%t] %-5p %c{1} - %m%n
        appender.streamsets.append = true
        appender.streamsets.policies.type = Policies
        appender.streamsets.policies.size.type = SizeBasedTriggeringPolicy
        appender.streamsets.policies.size.size = 256MB
        appender.streamsets.strategy.type = DefaultRolloverStrategy
        appender.streamsets.strategy.max = 10
    
        rootLogger.level = DEBUG
        rootLogger.appenderRef.stdout.ref = stdout
    
        logger.l1.name = com.streamsets
        logger.l1.level = DEBUG
        logger.l1.appenderRef.streamsets.ref = streamsets
  6. To define a volume for the ConfigMap and mount it in the agent container, add the volumeMounts and volumes sections, marked with comments below, to the Deployment section of the YAML file:
    apiVersion: apps/v1
    kind: Deployment
    ...
        spec:
          containers:
            - name: streamsets-agent
              image: streamsets/kubernetes-agent:1.1.2
              ...
              resources:
                requests:
                  memory: 1Gi
                  cpu: 0.5
              # Added: mount the ConfigMap volume in the agent container
              volumeMounts:
              - name: config
                mountPath: "/opt/streamsets-kubernetes-agent/etc/"
                readOnly: true
          dnsPolicy: Default
          serviceAccountName: streamsets-agent-sa-f00bb553-59d8-48c5-b681-5bec6eb4f430
          # Added: define a volume backed by the ConfigMap
          volumes:
            - name: config
              configMap:
                name: custom-agent-log4j
                items:
                - key: "custom-agent-log4j.properties"
                  path: "custom-agent-log4j.properties"
  7. Add the path to the log configuration file as a Java option to the STREAMSETS_KUBERNETES_AGENT_JAVA_OPTS environment variable:
    - name: STREAMSETS_KUBERNETES_AGENT_JAVA_OPTS
      value: -Dlog4j2.configurationFile=/opt/streamsets-kubernetes-agent/etc/custom-agent-log4j.properties
  8. Run the following command to apply the changes to the running StreamSets Kubernetes agent:
    kubectl apply -f ./<file_name>.yaml
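
Because these edits are sensitive to YAML indentation, you might validate the file before applying it. The following sketch runs a client-side dry run first and applies the file only when the dry run succeeds. The function name apply_checked is a hypothetical helper. The sketch assumes a kubectl version that supports the --dry-run=client option.

```shell
# Validate the edited YAML with a client-side dry run, then apply it.
# apply_checked is a hypothetical helper name.
apply_checked() {
  yaml_file="$1"
  # The dry run parses the manifest without changing the cluster.
  if kubectl apply --dry-run=client -f "$yaml_file" >/dev/null; then
    kubectl apply -f "$yaml_file"
  else
    echo "validation failed; $yaml_file was not applied" >&2
    return 1
  fi
}

# Example:
# apply_checked ./<file_name>.yaml
```

A client-side dry run catches YAML syntax errors, but it does not guarantee that the cluster accepts the change.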

Upgrading the Agent

You can upgrade an existing Kubernetes environment to use a later StreamSets Kubernetes agent version. As a best practice, use the latest released version.

  1. In the Control Hub Navigation panel, click Set Up > Environments.
  2. In the Actions column of the Kubernetes environment, click the More icon and then click Edit.
  3. In the Edit Environment dialog box, expand the Configure Environment section.
  4. Select a later StreamSets Kubernetes agent version.
  5. Click Save.
  6. To apply the changes to your Kubernetes cluster, use the agent installation script or YAML file to launch the StreamSets Kubernetes agent again.
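
After relaunching the agent, you might confirm which agent version is running by inspecting the container image of the agent deployment. The following sketch extracts the tag from an image reference such as streamsets/kubernetes-agent:1.1.2. The function name image_tag is a hypothetical helper, and <deployment_name> in the example is a placeholder for the name of the agent deployment in your namespace.

```shell
# Print the tag of a container image reference, which is the text after
# the last colon. image_tag is a hypothetical helper name.
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Example: read the image configured for the agent deployment, then print its tag.
# image="$(kubectl -n <namespace_name> get deployment <deployment_name> \
#   -o jsonpath='{.spec.template.spec.containers[0].image}')"
# image_tag "$image"
```

If the image reference contains no colon, the function prints the reference unchanged.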