Deployment and Horizontal Pod Autoscaler Sample

Define a deployment and a Horizontal Pod Autoscaler in the YAML specification file when you create a deployment for one or more execution Data Collectors that must scale automatically during periods of peak load.

The following sample YAML specification file defines a deployment associated with a Kubernetes Horizontal Pod Autoscaler:
apiVersion: v1
kind: List
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: datacollector-deployment
    namespace: <agentNamespace>
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: <deploymentLabel>
    template:
      metadata:
        labels:
          app: <deploymentLabel>
          kerberosEnabled: true
          krbPrincipal: <KerberosUser>
      spec:
        containers:
        - name: datacollector
          image: <privateImage>
          ports:
          - containerPort: 18630
          volumeMounts:
          - name: krb5conf
            mountPath: /etc/krb5.conf
            subPath: krb5.conf
            readOnly: true
          env:
          - name: HOST
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: PORT0
            value: "18630"
        imagePullSecrets:
        - name: <imagePullSecrets>
        volumes:
        - name: krb5conf
          secret:
            secretName: krb5conf
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    name: datacollector-hpa
    namespace: <agentNamespace>
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: <deploymentLabel>
    minReplicas: 1
    maxReplicas: 10
    targetCPUUtilizationPercentage: 50
If you are not enabling Kerberos authentication, remove the following Kerberos attributes from the sample file:
...
          kerberosEnabled: true
          krbPrincipal: <KerberosUser>
...
          volumeMounts:
          - name: krb5conf
            mountPath: /etc/krb5.conf
            subPath: krb5.conf
            readOnly: true
...
        volumes:
        - name: krb5conf
          secret:
            secretName: krb5conf
Replace the following variables in the sample file with the appropriate attribute values:
Variable            Description

agentNamespace      Namespace used for the Provisioning Agent that manages this deployment.

deploymentLabel     Label for this deployment. Must be unique across all deployments managed by the Provisioning Agent.

KerberosUser        User for the Kerberos principal when enabling Kerberos authentication.

                    This attribute is optional. If you remove this attribute, the Provisioning Agent uses sdc as the Kerberos user.

                    The Provisioning Agent creates a unique Kerberos principal for each deployed Data Collector container using the following format: <KerberosUser>/<host>@<realm>. The agent determines the host and realm to use, creates the Kerberos principal, and generates the keytab for that principal.

                    For example, if you define the KerberosUser attribute as marketing and the Provisioning Agent deploys two Data Collector containers, the agent creates the following Kerberos principals:
                    marketing/10.60.1.25@EXAMPLE.COM
                    marketing/10.60.1.26@EXAMPLE.COM

privateImage        Path to your private Data Collector Docker image stored in your private repository.

                    If you use the public StreamSets Data Collector Docker image instead, modify the attribute as follows:
                    image: streamsets/datacollector:<version>

                    Where <version> is the Data Collector version. For example:
                    image: streamsets/datacollector:4.1.0

imagePullSecrets    Pull secrets required for the private image stored in your private repository.

                    If you use the public StreamSets Data Collector Docker image, remove the imagePullSecrets lines from the sample file.
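
To illustrate the substitutions described above, the following sketch shows what the specification file might look like with Kerberos disabled and the public Data Collector image in place of a private one. The namespace streamsets and the label sdc-marketing are hypothetical values chosen for illustration only. The container port is set to 18630, the value that the sample passes in PORT0, and the scaleTargetRef uses apps/v1 to match the apiVersion of the Deployment. Treat this as a starting point to adapt, not as a verified configuration.

```yaml
# Sketch only: namespace and label values below are hypothetical.
apiVersion: v1
kind: List
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: datacollector-deployment
    namespace: streamsets              # hypothetical <agentNamespace>
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: sdc-marketing             # hypothetical <deploymentLabel>
    template:
      metadata:
        labels:
          app: sdc-marketing           # Kerberos labels removed
      spec:
        containers:
        - name: datacollector
          image: streamsets/datacollector:4.1.0   # public image, no pull secrets
          ports:
          - containerPort: 18630
          env:
          - name: HOST
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: PORT0
            value: "18630"
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    name: datacollector-hpa
    namespace: streamsets
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: sdc-marketing              # follows the sample: <deploymentLabel>
    minReplicas: 1
    maxReplicas: 10
    targetCPUUtilizationPercentage: 50
```

Compared with the original sample, the krb5conf volume, its volume mount, the two Kerberos labels, and the imagePullSecrets entry are all gone; everything else is unchanged apart from the substituted values.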

When a specification file defines both a deployment and a Horizontal Pod Autoscaler, the Horizontal Pod Autoscaler must be associated with the deployment defined in the same file. In the sample above, the Horizontal Pod Autoscaler is associated with the defined deployment through the following attributes:
kind: Deployment
name: <deploymentLabel>

In the Horizontal Pod Autoscaler definition, you might also want to modify the minimum and maximum replica values and the target CPU utilization percentage. For more information on these values, see the Kubernetes Horizontal Pod Autoscaler documentation.
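
As a rough sketch of how these values might be tuned: Kubernetes measures CPU utilization as a percentage of the CPU that each container requests, so the autoscaler generally cannot act on a CPU target unless the containers in the deployment declare a CPU request. The sample above does not set one. The fragments below show one way to add a request and adjust the autoscaler bounds. All numeric values here are illustrative assumptions, not recommendations from this document.

```yaml
# Fragment of spec.template.spec in the Deployment (values are illustrative)
containers:
- name: datacollector
  image: <privateImage>
  resources:
    requests:
      cpu: 500m        # hypothetical; utilization is measured against this request
---
# Fragment of spec in the HorizontalPodAutoscaler (values are illustrative)
minReplicas: 2
maxReplicas: 6
targetCPUUtilizationPercentage: 70
```

According to the Kubernetes documentation, the autoscaler computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization), bounded by the minimum and maximum replica values. With the illustrative settings above, two replicas averaging 120% of their requested CPU would yield ceil(2 × 120 / 70) = ceil(3.43) = 4 replicas.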