Troubleshooting

Use the following tips for help with deployment management:

My cloud service provider deployment is active, but no engine instances have launched for the deployment. How do I troubleshoot?

To troubleshoot an active cloud service provider deployment, open the tracking URL for your cloud service provider account. Use the URL to view additional information about the cloud resources automatically provisioned for the deployment. The information displayed at the tracking URL depends on the deployment type.

Self-managed and Kubernetes deployments do not have a tracking URL.

My deployment is active, but no engine instances have launched for the deployment. How can I check logs for the engines?

When an engine fails to launch, shuts down unexpectedly, or cannot communicate with Control Hub, you can access the log files directly on the machine. For details, see Accessing Engine Log Files.

An Amazon EC2 deployment fails to activate with the following error:

User <cross-account role> is not authorized to perform: iam:PassRole on resource <instance profile role>

The parent AWS environment has AWS credentials that are incorrectly configured. The IAM policy used by the cross-account role does not grant the permission to pass an IAM role to Control Hub. Verify that the IAM policy used by the cross-account role has the required permissions.

No engine instances launched for an active Amazon EC2 deployment. When I use SSH to connect to a provisioned EC2 instance and view the system output generated by the engine installation script, I see the following error:

Connect timeout on endpoint URL: https://ssm.<region>.amazonaws.com

The provisioned EC2 instance cannot reach AWS Systems Manager because the security group configured for the parent AWS environment does not allow outbound traffic to AWS Systems Manager. Ask your AWS administrator to modify the security group assigned to the Amazon VPC.

For more information, see Security Group.
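
To confirm the diagnosis before changing the security group, you can test outbound connectivity from the provisioned EC2 instance. The following Python sketch is illustrative only: the function names are hypothetical, and the region is an assumption that you supply on the command line. It attempts a plain TCP connection to the Systems Manager endpoint on port 443, so it shows only whether outbound HTTPS traffic can leave the instance. It does not validate IAM permissions.

```python
import socket
import sys


def ssm_endpoint(region: str) -> str:
    """Build the Systems Manager endpoint host name for a region."""
    return f"ssm.{region}.amazonaws.com"


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


if __name__ == "__main__":
    if len(sys.argv) < 2:
        # Assumption: the region of the deployment is passed as the first argument.
        print("usage: python check_ssm.py <region>")
    else:
        host = ssm_endpoint(sys.argv[1])
        result = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:443 is {result}")
```

If the check reports that the endpoint is not reachable, the result is consistent with a security group that blocks outbound traffic, though DNS or routing problems can produce the same result.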

No engine instances launched for an active Azure VM deployment. When I use SSH to connect to a provisioned VM instance and view the engine log file, I see the following error:

ERROR com.streamsets.datacollector.main.RuntimeInfo - Was unable to detect machine hostname to potentially add to http.nonProxyHosts JVM system property
java.lang.RuntimeException: Could not load runtime configuration, 'java.net.UnknownHostException: System error'

The IBM StreamSets engine cannot detect the hostname of the provisioned VM instance because the Azure VNet created for the parent Azure environment uses custom DNS servers. When the VNet uses custom DNS servers, you must include a specific init script for all Azure VM deployments created for the Azure environment.

A GCE deployment fails to activate with the following error:

An Exception occurred during an entity operation: io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Constraint constraints/gcp.resourceLocations violated for [orgpolicy:projects/<project_ID>] attempting to create a secret in [global].

Your Google Cloud organization has used a resource location organization policy to disable global resource creation. However, the GCE deployment is configured to use an automatic replication policy for GCP Secret Manager secrets.

When your Google Cloud organization has disabled global resource creation, you must configure the GCE deployment to use a user managed replication policy.

A Kubernetes deployment remains in an Activating or Deactivating state indefinitely.

In some situations, Control Hub might start activating or deactivating a Kubernetes deployment but not be able to finish. For example, if the Kubernetes cluster has insufficient resources, the deployment can remain in an Activating state.

You can force the Kubernetes deployment to stop, and then manually delete any abandoned resources in the Kubernetes cluster.

For more information, see Forcing a Kubernetes Deployment to Stop.

Even though the number of engine instances in my organization has not reached the system limit, a new engine instance fails to launch with the following error:

Adding more <engine types> is currently not allowed. Please contact support.

Control Hub determines the system limit for engines by counting the number of generated authentication tokens, not by counting the number of successfully launched engine instances.

In rare cases, running the engine installation script for a self-managed deployment can successfully generate an authentication token for the engine, but fail to launch the engine. As a result, the number of generated authentication tokens can exceed the number of engines.

To resolve this issue, delete the unregistered authentication tokens. In the Navigation panel, click Set Up > Engines and then click an engine type tab. In the toolbar above the engines list, click the More icon and then click Delete Unregistered Auth Tokens.

An engine instance launched for a self-managed deployment fails to start with the following error:

java.io.IOException: Failed to bind to /0.0.0.0:18630

The default engine port, 18630 for Data Collector or 19630 for Transformer, is already being used on the machine. In most cases, this occurs when you launch multiple engine instances on the same machine.

You cannot run multiple engine instances from the same self-managed deployment on the same machine, because all engine instances from the same deployment have an identical configuration.

You can run engine instances from different self-managed deployments on the same machine, as long as each deployment is configured to use a unique engine port number.

To modify the engine port number, edit the deployment. In the Configure Engine section, click Advanced Configuration. Then, click Data Collector Configuration or Transformer Configuration and modify the http.port property.
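
Before you change the port, you can check whether a process on the machine is already listening on a default engine port. The following Python sketch is a minimal illustration: the function name is hypothetical, and it assumes that the check runs on the machine where the engine fails to start. It uses only the two default port numbers named above.

```python
import socket

# Default engine ports named in this section.
DEFAULT_PORTS = {"Data Collector": 18630, "Transformer": 19630}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for engine, port in DEFAULT_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{engine} default port {port}: {state}")
```

A port reported as in use suggests that another engine instance, or an unrelated process, already holds it. The sketch does not identify which process owns the port.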

An engine instance launched for a self-managed deployment fails to start with the following errors:

Step 2 of 4: Waiting up to 5 minutes for engine to respond on http://<host name>:<port>
Step 2 of 4 failed: Timed out while waiting for engine to respond on http://<host name>:<port>

By default, the installation script waits a maximum of five minutes for the engine to start. In most situations, the default timeout is sufficient. However, in some situations, the engine might take longer to start.

When you encounter this error, run the installation script again using the STREAMSETS_ENGINE_TIMEOUT_MINS environment variable to increase the engine timeout value.

For more information, see Increasing the Engine Timeout for the Installation Script.
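
As a rough illustration of the approach, the following Python sketch reruns an installation command with STREAMSETS_ENGINE_TIMEOUT_MINS set in the environment of the child process. The function names and the script file name are hypothetical. The sketch assumes that you pass the installation command for your deployment unchanged on the command line, and that the variable takes a whole number of minutes.

```python
import os
import subprocess
import sys


def build_install_env(timeout_mins: int, base_env=None) -> dict:
    """Return a copy of the environment with the engine timeout set."""
    if timeout_mins <= 0:
        raise ValueError("timeout_mins must be a positive number of minutes")
    env = dict(os.environ if base_env is None else base_env)
    env["STREAMSETS_ENGINE_TIMEOUT_MINS"] = str(timeout_mins)
    return env


def run_install_script(command: list, timeout_mins: int) -> int:
    """Run the installation command with the timeout set; return its exit code."""
    completed = subprocess.run(command, env=build_install_env(timeout_mins))
    return completed.returncode


if __name__ == "__main__":
    if len(sys.argv) < 3:
        # Hypothetical file name; the installation command comes from your deployment.
        print("usage: python rerun_install.py <minutes> <installation command ...>")
    else:
        sys.exit(run_install_script(sys.argv[2:], int(sys.argv[1])))
```

Setting the variable only for the child process leaves the environment of the current shell unchanged. This keeps the longer timeout from carrying over to later, unrelated runs.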