Enabling External JMX Tools
Data Collector uses JMX metrics to generate the graphical display of the status of a running pipeline. You can provide the same JMX metrics to external tools if desired.
Information provided by JMX metrics includes pipeline details like a histogram for the number of error records per batch or the amount of memory the pipeline uses. Stage-related details are also provided, such as the number of output records or stage errors. Some stages have stage-related custom metrics.
To make JMX metrics available to external tools, set the following Java system properties:
- com.sun.management.jmxremote
- com.sun.management.jmxremote.port=<port_number>
- com.sun.management.jmxremote.local.only=<true | false>
- com.sun.management.jmxremote.authenticate=<true | false>
- com.sun.management.jmxremote.ssl=<true | false>
You can pass these properties on the command line as part of the SDC_JAVA_OPTS environment variable. Alternatively, you can add them as Java configuration options in the deployment associated with the engine, as described in Java Configuration Options.
For example, the following command exposes JMX metrics on port 3333:
export SDC_JAVA_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=3333 \
-Dcom.sun.management.jmxremote.local.only=false \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false"
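When a deployment script generates the option set, the same value can be assembled programmatically. The following Python sketch builds an SDC_JAVA_OPTS value from the properties listed above. The function name `build_jmx_opts` and its parameters are hypothetical and are not part of Data Collector.

```python
def build_jmx_opts(port, local_only=False, authenticate=False, ssl=False):
    """Return a value for SDC_JAVA_OPTS that enables remote JMX on `port`."""

    def flag(value):
        # The JVM properties take lowercase true/false values.
        return "true" if value else "false"

    options = [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={int(port)}",
        f"-Dcom.sun.management.jmxremote.local.only={flag(local_only)}",
        f"-Dcom.sun.management.jmxremote.authenticate={flag(authenticate)}",
        f"-Dcom.sun.management.jmxremote.ssl={flag(ssl)}",
    ]
    return " ".join(options)


if __name__ == "__main__":
    # Emits the same option set as the example above, on a single line.
    print(f'export SDC_JAVA_OPTS="{build_jmx_opts(3333)}"')
```

Keeping the flags in one list makes it easy to switch authentication or SSL on for a given environment without editing the string by hand.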
Viewing JMX Metrics in External Tools
You can view the Data Collector JMX metrics in external tools. The Data Collector JMX metric names all begin with "sdc.pipeline."
Data Collector JMX metrics use the following naming pattern:
sdc.pipeline.<truncated pipeline name>__<Job ID>__<Organization ID>.<pipeline revision>.<category: pipeline|stage|custom>.\
[<stage library>_<library revision>].<metric name>.<metric type>
Where <truncated pipeline name> is the first 10 characters of the pipeline name, with any non-alphanumeric characters removed.
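The truncation rule can be approximated with a short Python sketch. The helper name `truncate_pipeline_name` is hypothetical. The sketch assumes that non-alphanumeric characters are removed before the name is cut to 10 characters, and that "alphanumeric" means ASCII letters and digits. The text above does not state either detail, so verify them against the names your tool actually shows.

```python
def truncate_pipeline_name(name, length=10):
    """Approximate the <truncated pipeline name> segment of a metric name."""
    # Assumption: strip first, then truncate, and treat only ASCII
    # letters and digits as alphanumeric.
    cleaned = "".join(ch for ch in name if ch.isascii() and ch.isalnum())
    return cleaned[:length]


if __name__ == "__main__":
    print(truncate_pipeline_name("WriteToKafka"))
```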
For example, the following is a batch count meter for the first revision of a pipeline named WriteToKafka:
sdc.pipeline.WriteToKaf__92a9klbb-b19e-4f30-8b7u-a5t48de34753__a7f82a90-b7e3-33eb-b93h-cdd2kq1f34c4.0.pipeline.batchCount.meter
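Similarly, the following is a memory consumed counter for a File Tail origin stage in the same pipeline:
sdc.pipeline.WriteToKaf__92a9klbb-b19e-4f30-8b7u-a5t48de34753__a7f82a90-b7e3-33eb-b93h-cdd2kq1f34c4.0.stage.\
com_streamsets_pipeline_stage_origin_logtail_FileTailDSource_1.memoryConsumed.counter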
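When post-processing metric names exported by an external tool, it can help to split a name back into the parts of the naming pattern. The following Python sketch does that for names in the full form shown above, with job and organization IDs. The `SdcMetric` type and `parse_metric_name` function are hypothetical helpers, not a Data Collector API.

```python
from typing import NamedTuple, Optional

PREFIX = "sdc.pipeline."


class SdcMetric(NamedTuple):
    pipeline_name: str
    job_id: str
    organization_id: str
    revision: str
    category: str
    stage: Optional[str]
    metric_name: str
    metric_type: str


def parse_metric_name(full_name):
    """Split a metric name into the components of the naming pattern."""
    if not full_name.startswith(PREFIX):
        raise ValueError(f"not a Data Collector metric: {full_name!r}")
    rest = full_name[len(PREFIX):]
    # <name>__<job>__<org> . <revision> . <category> . <remainder>
    identity, revision, category, tail = rest.split(".", 3)
    pipeline_name, job_id, organization_id = identity.split("__")
    stage = None
    if category in ("stage", "custom"):
        # The stage segment uses underscores, so it contains no dots.
        stage, tail = tail.split(".", 1)
    # The metric type is always the last dot-separated token.
    metric_name, metric_type = tail.rsplit(".", 1)
    return SdcMetric(pipeline_name, job_id, organization_id, revision,
                     category, stage, metric_name, metric_type)


if __name__ == "__main__":
    example = ("sdc.pipeline.WriteToKaf__92a9klbb-b19e-4f30-8b7u-a5t48de34753"
               "__a7f82a90-b7e3-33eb-b93h-cdd2kq1f34c4.0.pipeline.batchCount.meter")
    print(parse_metric_name(example))
```

Splitting the metric type from the right keeps metric names that contain dots in one piece.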
Custom Metrics
Data Collector provides custom metrics for some stages. When a pipeline includes the stages below, you can view custom metrics for the stages in the Realtime Summary tab as you monitor the job in Control Hub or when you view JMX metrics using an external tool:
- File Tail origin
- In addition to the standard metrics available for origins, File Tail
provides the following custom metrics:
- Offset Lag - The amount of data remaining in the file being read. This metric displays in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_origin_logtail_FileTailDSource_\
<library version>.offsets.lag.<file path>.counter
- Pending Files - The number of files in the directory that still need to be read. This metric displays in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_origin_logtail_FileTailDSource_\
<library version>.pending.files.<file path>.counter
- Amazon S3 destination
- In addition to the standard metrics available for destinations, Amazon S3 provides the following custom metrics:
- Hadoop FS destination
- In addition to the standard metrics available for destinations, Hadoop FS provides the following custom metrics:
- Late Records meter and counter - The number of late records written to HDFS. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_HdfsTarget_\
HDFSDTarget_<library version>.lateRecords.<counter | meter>
- To HDFS Records meter and counter - The number of records written to HDFS. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_HdfsTarget_\
HDFSDTarget_<library version>.hdfsRecords.<counter | meter>
- Transfer Rate KB meter - The transfer rate in KB. Appears when the destination writes whole files to the destination system with the whole file data format. The meter displays in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_HdfsTarget_HDFSDTarget_\
<library version>.transferRateKb.meter
- Local FS destination
- In addition to the standard metrics available for destinations, Local FS provides the following custom metrics:
- Late Records meter and counter - The number of late records written to the local file system. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_localfilesystem_\
LocalFileSystemDTarget_<library version>.lateRecords.\
<counter | meter>
- To HDFS Records meter and counter - The number of records written to the local file system. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_localfilesystem_\
LocalFileSystemDTarget_<library version>.hdfsRecords.\
<counter | meter>
- Transfer Rate KB meter - The transfer rate in KB. Appears when the destination writes whole files to the destination system with the whole file data format. The meter displays in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_localfilesystem_\
LocalFileSystemDTarget_<library version>.transferRateKb.meter
- MapR FS destination
- In addition to the standard metrics available for destinations, MapR FS provides the following custom metrics:
- Late Records meter and counter - The number of late records written to MapR FS. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_marpfs_\
MaprFSDTarget_<library version>.lateRecords.<counter | meter>
- To HDFS Records meter and counter - The number of records written to MapR FS. These metrics display in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_marpfs_\
MaprFSDTarget_<library version>.hdfsRecords.<counter | meter>
- Transfer Rate KB meter - The transfer rate in KB. Appears when the destination writes whole files to the destination system with the whole file data format. The meter displays in external tools as follows:
sdc.pipeline.<pipeline name>.<pipeline revision>.custom.\
com_streamsets_pipeline_stage_destination_marpfs_MaprFSDTarget_\
<library version>.transferRateKb.meter
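Because every custom metric name contains the `custom` category followed by the stage segment, a list of exported metric names can be filtered down to the custom metrics of a single stage. The following Python sketch shows one way to do that. The function `custom_metrics_for_stage` is a hypothetical helper, and the sample names in the usage section are made up for illustration, following the patterns above.

```python
def custom_metrics_for_stage(metric_names, stage_fragment):
    """Return custom metric names whose stage segment contains `stage_fragment`."""
    selected = []
    for name in metric_names:
        if not name.startswith("sdc.pipeline.") or ".custom." not in name:
            continue
        # The stage segment is the dot-free token right after ".custom.".
        stage_segment = name.split(".custom.", 1)[1].split(".", 1)[0]
        if stage_fragment in stage_segment:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Illustrative names only; real names depend on your pipeline and libraries.
    names = [
        "sdc.pipeline.TailLogs.0.pipeline.batchCount.meter",
        "sdc.pipeline.TailLogs.0.custom."
        "com_streamsets_pipeline_stage_origin_logtail_FileTailDSource_1."
        "offsets.lag./var/log/app.log.counter",
        "sdc.pipeline.TailLogs.0.custom."
        "com_streamsets_pipeline_stage_origin_logtail_FileTailDSource_1."
        "pending.files./var/log/app.log.counter",
    ]
    for match in custom_metrics_for_stage(names, "FileTailDSource"):
        print(match)
```

Matching on a fragment of the stage segment avoids hard-coding the library version suffix, which varies by installation.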