MapR DB JSON

Data Collector

The MapR DB JSON origin reads JSON documents from MapR DB JSON tables. The origin converts each document into a record. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.

MapR DB JSON tables are tables in which every row is a JSON document. Each JSON document has a unique identifier stored in the _id field. The _id field also serves as the row key that uniquely identifies each row in the table.
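
The relationship between documents and row keys can be pictured with a small in-memory model. This is only an illustrative sketch in Python, not MapR or Data Collector code: the function name build_table and the sample documents are hypothetical, and a real table is stored and accessed through MapR itself.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def build_table(documents: List[Document]) -> Dict[Any, Document]:
    """Model a JSON table as a mapping from row key (_id) to document.

    Hypothetical helper for illustration only; it is not a MapR API.
    """
    table: Dict[Any, Document] = {}
    for doc in documents:
        if "_id" not in doc:
            raise ValueError("every document needs an _id field")
        row_key = doc["_id"]
        if row_key in table:
            # _id values must be unique because they act as row keys.
            raise ValueError(f"duplicate _id: {row_key!r}")
        table[row_key] = doc
    return table


# Sample documents (made up for this sketch).
docs = [
    {"_id": "user001", "name": "Ada", "tags": ["admin"]},
    {"_id": "user002", "name": "Grace", "address": {"city": "Arlington"}},
]
table = build_table(docs)
```

In this model, looking up table["user001"] returns the whole document for that row, which mirrors how the _id value identifies exactly one row.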

When you configure the origin, you define the JSON table to read from. The origin uses the _id field in each JSON document as the offset field. You can optionally define the initial offset value to start reading from.

When the pipeline stops, the MapR DB JSON origin notes where it stopped reading. By default, when the pipeline starts again, the origin continues processing from where it stopped. To process all available data again, you can reset the origin.
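
The offset behavior described above can be sketched as a simple simulation. This is a hedged illustration in Python, not the origin's actual implementation. It assumes documents are read in ascending _id order and that a resumed read skips the document at the stored offset. Whether the initial offset itself is included in the first read is not stated here, so the inclusive flag is a hypothetical knob, as are the names read_batch and batch_size.

```python
from typing import Any, Dict, List, Optional, Tuple

Document = Dict[str, Any]


def read_batch(
    table: List[Document],
    offset: Optional[str],
    batch_size: int,
    inclusive: bool = False,
) -> Tuple[List[Document], Optional[str]]:
    """Return the next batch of documents and the new offset.

    Assumptions (not confirmed by the documentation):
    - documents are processed in ascending _id order;
    - offset=None means "read from the beginning", as after a reset;
    - inclusive=True models an initial offset that is itself read.
    """
    ordered = sorted(table, key=lambda doc: doc["_id"])
    if offset is None:
        pending = ordered
    elif inclusive:
        pending = [doc for doc in ordered if doc["_id"] >= offset]
    else:
        pending = [doc for doc in ordered if doc["_id"] > offset]
    batch = pending[:batch_size]
    # The _id of the last document read becomes the stored offset.
    new_offset = batch[-1]["_id"] if batch else offset
    return batch, new_offset


rows = [{"_id": "003"}, {"_id": "001"}, {"_id": "002"}]

# First run: no stored offset, so reading starts at the beginning.
first, stored = read_batch(rows, None, batch_size=2)

# Pipeline restart: reading continues after the stored offset.
second, stored = read_batch(rows, stored, batch_size=2)

# Resetting the origin corresponds to discarding the stored offset.
after_reset, _ = read_batch(rows, None, batch_size=10)
```

In this sketch the first call reads the documents with _id 001 and 002, the second call reads only 003, and the call after the reset reads all three documents again.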

Tip: Data Collector provides several MapR origins to address different needs. For a quick comparison chart to help you choose the right one, see Comparing MapR Origins.

Before you use any MapR stage in a pipeline, you must perform additional steps to enable Data Collector to process MapR data. For more information, see MapR Prerequisites in the Data Collector documentation.