Getting Started with SDC Edge

Data Collector Edge (SDC Edge) includes several sample pipelines that make it easy to get started.

Step 1. Import a Sample SDC Edge Pipeline

The Data Collector Edge GitHub repository includes sample edge pipelines. To use the sample pipelines, you first import them into Data Collector. You can edit the sample pipelines as needed and then deploy the edited versions to SDC Edge.

The sample edge pipelines use runtime parameters so that you can specify the values for pipeline properties when you start the pipeline.

In the following steps, we'll use the Directory Spooler to HTTP sample pipeline as an example. This sample edge pipeline uses a Directory origin to read a local text file on the edge device and an HTTP Client destination to write the data in JSON format. To import the sample pipeline:

  1. In the Data Collector Edge GitHub repository, view the list of sample pipelines.
  2. Under Sample Pipelines, click the name of a sample pipeline.
    For our example, click Directory Spooler to HTTP.
  3. Click Try Now.
    • If Data Collector is running on the local machine, log in to Data Collector as prompted.
    • If Data Collector is running on a remote machine, modify the URL in the address bar of the browser to use the correct host name and port number, press Enter, and then log in to Data Collector as prompted.
    The Import Pipeline from HTTP URL dialog box displays the pipeline title and HTTP URL.
  4. Optionally modify the pipeline title and description.
  5. Click Import.

Step 2. Create and Start a Data Collector Receiving Pipeline

Edge sending pipelines work in tandem with Data Collector pipelines. After you choose the sample edge pipeline that you want to use, create and start the corresponding Data Collector receiving pipeline. The receiving pipeline must be running before you start the edge sending pipeline.
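If you script the startup order, a retry loop can hold back the edge pipeline until the receiving side answers. The following is a minimal sketch, not part of SDC Edge: the function names wait_until and receiver_is_up are hypothetical, and the host and port (localhost:9999) are assumed example values. An open port shows only that something is listening, not that the pipeline itself is running.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical probe: succeeds once something answers HTTP on the port.
# Without -f, curl exits 0 on any HTTP response, including an error status.
receiver_is_up() {
  curl -s -o /dev/null "http://localhost:9999"
}
```

For example, `wait_until 30 2 receiver_is_up` tries up to 30 times, two seconds apart, and returns a nonzero status if the receiver never responds.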

  1. In Data Collector, create a new pipeline.
  2. On the General tab, select Standalone for the Execution Mode.
  3. On the Error Records tab, select Discard, Write to File, or Write to MQTT for the error record handling.
    If you write the error records to a file or to MQTT, click the appropriate error records tab and configure the required properties.
  4. Configure the remaining pipeline properties as needed.
  5. Add the corresponding origin to read from the destination in the sample edge pipeline.
    To receive data from the Directory Spooler to HTTP sample pipeline, which uses an HTTP Client destination, add an HTTP Server origin.
  6. Configure the origin properties used to receive the data.
    To receive data from the Directory Spooler to HTTP sample pipeline, on the HTTP tab of the HTTP Server origin, enter a unique port number for the HTTP Listening Port and enter a unique application ID for the Application ID.

    When you start the sample edge pipeline, you'll use runtime parameters to specify the same values for the corresponding destination.

  7. On the Data Format tab, select JSON or Text based on the sample edge pipeline.
    For the Directory Spooler to HTTP sample pipeline, select JSON.
  8. Add and configure any number of processors, executors, and destinations.
  9. Validate the pipeline, and then click the Start icon to start the pipeline.
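Before you start the edge pipeline, it can help to confirm that the receiving pipeline accepts a record. The following is a minimal sketch under stated assumptions: the host, port, and application ID are example values that must match what you configured on the HTTP Server origin, the function names are hypothetical, and the application ID is assumed to travel in the X-SDC-APPLICATION-ID request header. Verify the header against the HTTP Server origin documentation for your Data Collector version.

```shell
#!/bin/sh
# Assumed values: must match the HTTP Listening Port and Application ID
# configured on the HTTP Server origin.
SDC_HOST="${SDC_HOST:-localhost}"
SDC_PORT="${SDC_PORT:-9999}"
SDC_APP_ID="${SDC_APP_ID:-sde}"

# Build the URL of the HTTP Server origin from a host and a port.
origin_url() {
  printf 'http://%s:%s' "$1" "$2"
}

# Post one JSON record and print the HTTP status code that comes back.
# Assumption: the origin reads the application ID from this header.
send_test_record() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X POST "$(origin_url "$SDC_HOST" "$SDC_PORT")" \
    -H "X-SDC-APPLICATION-ID: $SDC_APP_ID" \
    -H 'Content-Type: application/json' \
    --data-binary '{"text":"smoke test"}'
}
```

Run `send_test_record` after the receiving pipeline starts. A 2xx status suggests that the origin accepted the record. Any other status, or no output, points to a mismatch in port or application ID, or to a pipeline that is not running.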

Step 3. Download and Install SDC Edge

Download the SDC Edge executable from the StreamSets Support portal. Install SDC Edge on the edge device where you want to run edge pipelines.

For instructions, see Downloading SDC Edge.

Step 4. Deploy the Pipeline to SDC Edge

Deploy the sample edge pipeline to SDC Edge.

To deploy edge pipelines when SDC Edge is not running, export the pipelines and then move them to SDC Edge. For instructions, see Export Pipelines.

Step 5. Start SDC Edge and the Edge Pipeline

Run a single command to manually start SDC Edge and a sample edge pipeline at the same time. You can then run additional commands to start additional pipelines and to manage running pipelines after SDC Edge is running.

  1. On the edge device, run the following command from the SDC Edge home directory to start SDC Edge and the sample pipeline:
    bin/edge -start=<sample_pipeline_name> -runtimeParameters='{"<parameter_name1>":"<parameter_value1>","<parameter_name2>":"<parameter_value2>"}'

    See the Data Collector Edge GitHub repository for a list of parameters used by each sample pipeline.

    For example, to run the Directory Spooler to HTTP sample pipeline, use the following command, ensuring that you use the same port number and application ID that you configured for the corresponding HTTP Server origin in the Data Collector receiving pipeline:

    bin/edge -start=directoryToHttp -runtimeParameters='{"directoryPath":"/tmp/out/dir","httpUrl":"http://localhost:9999","sdcAppId":"sde"}'
  2. To optionally run another sample pipeline when SDC Edge is already running, use the following command:
    curl -X POST http://<SDCEdge_hostname>:18633/rest/v1/pipeline/<sample_pipeline_name>/start -H 'Content-Type: application/json;charset=UTF-8' --data-binary '{"<parameter_name1>":"<parameter_value1>","<parameter_name2>":"<parameter_value2>"}'
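Both commands take the runtime parameters as one JSON object, so building that object once keeps the port number and application ID consistent. The following is a minimal sketch that assumes the Directory Spooler to HTTP parameter names from the example above (directoryPath, httpUrl, sdcAppId). The build_params helper and the EDGE_HOST variable are hypothetical, and the script prints the commands for review rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: build the runtime parameters JSON once.
# No JSON escaping is done, so values must not contain double quotes
# or backslashes.
# $1 = directory path, $2 = HTTP URL, $3 = application ID
build_params() {
  printf '{"directoryPath":"%s","httpUrl":"%s","sdcAppId":"%s"}' "$1" "$2" "$3"
}

EDGE_HOST="${EDGE_HOST:-localhost}"   # assumed SDC Edge host name
PARAMS=$(build_params "/tmp/out/dir" "http://localhost:9999" "sde")

# Print both commands for review instead of running them.
echo "bin/edge -start=directoryToHttp -runtimeParameters='$PARAMS'"
echo "curl -X POST http://${EDGE_HOST}:18633/rest/v1/pipeline/directoryToHttp/start -H 'Content-Type: application/json;charset=UTF-8' --data-binary '$PARAMS'"
```

Use the first printed command when SDC Edge is not yet running, and the second when it is already running.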