Install Data Collector on Amazon Web Services

You can install the full Data Collector on Amazon Web Services (AWS).

Data Collector is installed as an RPM package on an Amazon Linux 2 machine hosted on EC2. Data Collector is available as a service on the instance after the deployment is complete.
  1. In the AWS Marketplace, search for StreamSets Data Collector.
  2. Subscribe to the StreamSets Data Collector offer, accept the terms and conditions, and then click Continue to Configuration.
  3. Select the appropriate AWS fulfillment options, and then click Continue to Launch.
  4. To launch Data Collector from the AWS Marketplace website, choose Launch from Website and then complete the following steps:
    1. Select the recommended EC2 instance type or choose another instance type based on your expected workload.

      See the Data Collector installation requirements for details.

    2. Select the appropriate VPC, subnet, and key pair settings.
    3. For the security group settings, click Create New Based on Seller Settings, enter a name for the new security group, and then configure the range of IP addresses that can access the Data Collector web-based UI on port 18630.
      Important: The default range of 0.0.0.0/0 gives all IP addresses access to Data Collector. Be sure to modify the default value to restrict access to known IP addresses only.
    4. Click Launch.
  5. To launch Data Collector from the AWS EC2 console, choose Launch through EC2 and then complete the following steps:
    1. Click Launch.
    2. Select the recommended EC2 instance type or choose another instance type based on your expected workload.

      See the Data Collector installation requirements for details.

    3. Configure the instance details, storage, and tags as needed.
    4. When configuring the security group for the instance, configure the range of IP addresses that can access the Data Collector web-based UI on port 18630.
      Important: The default range of 0.0.0.0/0 gives all IP addresses access to Data Collector. Be sure to modify the default value to restrict access to known IP addresses only.
    5. After reviewing the details, click Launch.
  6. When launching the instance, note the instance ID on the Launch Status page.

    The instance ID is the password for the Data Collector admin user.

    AWS might require a few minutes to launch an instance.

  7. To access Data Collector, enter the following URL in the address bar of your browser:
    http://<Public DNS of EC2 instance>:18630

    For example, if the public DNS of your instance is ec2-12-345-678-999.compute-1.amazonaws.com, enter:

    http://ec2-12-345-678-999.compute-1.amazonaws.com:18630
  8. To log in, enter admin as the user name and the EC2 instance ID as the password.

    For information on administering Data Collector, such as viewing logs and restarting Data Collector, see Administration.

    Tip: If you are new to Data Collector, consider starting with the Tutorial.
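
The security group check and the URL format described in the steps above can be sketched in a few lines of Python. This is only an illustration that uses the standard library: the helper names is_restricted_range and data_collector_url are hypothetical and are not part of Data Collector or any AWS tool, and the 203.0.113.0/24 range is a documentation placeholder rather than a recommended value.

```python
import ipaddress

# Port for the Data Collector web-based UI, as given in the steps above.
DATA_COLLECTOR_PORT = 18630


def is_restricted_range(cidr: str) -> bool:
    """Return False for ranges such as 0.0.0.0/0 that admit every address.

    Hypothetical helper: it only inspects the prefix length and does not
    query or modify any AWS security group.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    return network.prefixlen > 0


def data_collector_url(public_dns: str, port: int = DATA_COLLECTOR_PORT) -> str:
    """Build the browser URL from the public DNS name of the EC2 instance."""
    return f"http://{public_dns}:{port}"


if __name__ == "__main__":
    # The default range from the seller settings is open to all addresses.
    print(is_restricted_range("0.0.0.0/0"))
    # A placeholder range standing in for a known set of addresses.
    print(is_restricted_range("203.0.113.0/24"))
    # The example DNS name used in the steps above.
    print(data_collector_url("ec2-12-345-678-999.compute-1.amazonaws.com"))
```

A check like this could run before you enter a range in the security group settings. It does not replace reviewing the inbound rules in the EC2 console.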