Data Formats by Stage

Data Format Support

This appendix lists the data formats supported by origin, processor, and destination stages.

Origins

The following table lists the data formats supported by each origin.

Origin | Avro | Binary | Datagram | Delimited | Excel | JSON | Log | Parquet | Protobuf | SDC Record | Text | Whole File | XML
Amazon S3
Amazon SQS Consumer
Aurora PostgreSQL CDC Client * * * Not Applicable * * *
Azure Blob Storage
Azure Data Lake Storage Gen1
Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen2 (Legacy)
Azure IoT/Event Hub Consumer
CoAP Server
CONNX * * * Not Applicable * * *
CONNX CDC * * * Not Applicable * * *
Couchbase * * * Not Applicable * * *
Cron Scheduler * * * Not Applicable * * *
Directory
Elasticsearch * * * Not Applicable * * *
File Tail
Google BigQuery * * * Not Applicable * * *
Google Cloud Storage
Google Pub/Sub Subscriber
Groovy Scripting * * * Not Applicable * * *
gRPC Client
Hadoop FS
Hadoop FS Standalone
HTTP Client
HTTP Server
JavaScript Scripting * * * Not Applicable * * *
JDBC Multitable Consumer * * * Not Applicable * * *
JDBC Query Consumer * * * Not Applicable * * *
JMS Consumer
Jython Scripting * * * Not Applicable * * *
Kafka Consumer
Kafka Multitopic Consumer
Kinesis Consumer
MapR DB CDC * * * Not Applicable * * *
MapR DB JSON * * * Not Applicable * * *
MapR FS
MapR FS Standalone
MapR Multitopic Streams Consumer
MapR Streams Consumer
MongoDB * * * Not Applicable * * *
MongoDB Atlas * * * Not Applicable * * *
MongoDB Atlas CDC * * * Not Applicable * * *
MongoDB Oplog * * * Not Applicable * * *
MQTT Subscriber
MySQL Binary Log * * * Not Applicable * * *
NiFi HTTP Server
Omniture * * * Not Applicable * * *
OPC UA Client * * * Not Applicable * * *
Oracle Bulkload * * * Not Applicable * * *
Oracle CDC * * * Not Applicable * * *
Oracle CDC Client * * * Not Applicable * * *
PostgreSQL CDC Client * * * Not Applicable * * *
Pulsar Consumer
Pulsar Consumer (Legacy)
RabbitMQ Consumer
Redis Consumer
REST Service
Salesforce * * * Not Applicable * * *
Salesforce Bulk API 2.0 * * * Not Applicable * * *
SAP HANA Query Consumer * * * Not Applicable * * *
SDC RPC
SFTP/FTP/FTPS Client
Snowflake Bulk * * * Not Applicable * * *
SQL Server 2019 BDC Multitable Consumer * * * Not Applicable * * *
SQL Server CDC Client * * * Not Applicable * * *
SQL Server Change Tracking * * * Not Applicable * * *
Start Jobs * * * Not Applicable * * *
Start Pipelines * * * Not Applicable * * *
System Metrics * * * Not Applicable * * *
TCP Server
Teradata Consumer * * * Not Applicable * * *
UDP Multithreaded Source * * * Not Applicable * * *
UDP Source * * * Not Applicable * * *
Web Client
WebSocket Client
WebSocket Server
Windows Event Log * * * Not Applicable * * *
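
In a plain-text copy of this table, rows fall into two groups: stages marked `* * * Not Applicable * * *`, which do not use a data format, and stages that have per-format entries. The following Python sketch separates the two groups. The function name `split_rows` and the sample rows are illustrative assumptions chosen for this example; they are not part of any product API.

```python
# Marker that the table uses for stages without a configurable data format.
NOT_APPLICABLE = "* * * Not Applicable * * *"


def split_rows(rows):
    """Split table rows into (format_stages, not_applicable_stages).

    Each row is one line of the table: a stage name, optionally followed
    by the "Not Applicable" marker.
    """
    format_stages = []
    not_applicable = []
    for row in rows:
        row = row.strip()
        if not row:
            continue  # skip blank lines
        if row.endswith(NOT_APPLICABLE):
            # Keep only the stage name, dropping the marker.
            not_applicable.append(row[: -len(NOT_APPLICABLE)].strip())
        else:
            format_stages.append(row)
    return format_stages, not_applicable


# A few rows copied from the origins table above.
sample_rows = [
    "Amazon S3",
    "CONNX CDC * * * Not Applicable * * *",
    "Directory",
    "Elasticsearch * * * Not Applicable * * *",
]

with_formats, without_formats = split_rows(sample_rows)
print(with_formats)     # ['Amazon S3', 'Directory']
print(without_formats)  # ['CONNX CDC', 'Elasticsearch']
```

The same function can be applied to the rows of the destinations table, since that table uses the same marker.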

Processors

The following table lists the processors that read data of each listed data format:

Processor | Avro | Binary | Datagram | Delimited | JSON | Log | Netflow | Protobuf | SDC Record | Syslog | Text | XML
Data Parser
HTTP Client
JSON Parser
Kaitai Struct Parser
Log Parser
Web Client
XML Parser

The following table lists the processors that write data of the specified data format to a field:

Processor | Avro | Binary | Delimited | JSON | Log | Protobuf | SDC Record | Text | XML
Data Generator
JSON Generator

Destinations

The following table lists the data formats supported by each destination.

Destination | Avro | Binary | Delimited | JSON | Protobuf | Parquet | SDC Record | Text | Whole File | XML
Aerospike * * * Not Applicable * * *
Aerospike Client * * * Not Applicable * * *
Amazon S3
Azure Blob Storage
Azure Data Lake Storage (Legacy)
Azure Data Lake Storage Gen1
Azure Data Lake Storage Gen2
Azure Event Hub Producer
Azure IoT Hub Producer
Azure Synapse SQL * * * Not Applicable * * *
Cassandra * * * Not Applicable * * *
CoAP Client
Couchbase
Databricks Delta Lake * * * Not Applicable * * *
Elasticsearch * * * Not Applicable * * *
Flume
Google BigQuery (Legacy) * * * Not Applicable * * *
Google BigQuery * * * Not Applicable * * *
Google Bigtable * * * Not Applicable * * *
Google Cloud Storage
Google Pub/Sub Publisher
GPSS Producer * * * Not Applicable * * *
Hadoop FS
HBase * * * Not Applicable * * *
Hive Metastore
Hive Streaming * * * Not Applicable * * *
HTTP Client
InfluxDB * * * Not Applicable * * *
InfluxDB 2.x * * * Not Applicable * * *
JDBC Producer * * * Not Applicable * * *
JMS Producer
Kafka Producer
Kinesis Firehose
Kinesis Producer
KineticaDB * * * Not Applicable * * *
Kudu * * * Not Applicable * * *
Local FS
MapR DB * * * Not Applicable * * *
MapR DB JSON * * * Not Applicable * * *
MapR FS
MapR Streams Producer
MemSQL Fast Loader * * * Not Applicable * * *
MongoDB * * * Not Applicable * * *
MQTT Publisher
Named Pipe
Pulsar Producer
RabbitMQ Producer
Redis
Salesforce * * * Not Applicable * * *
Salesforce Bulk API 2.0 * * * Not Applicable * * *
SDC RPC
Send Response to Origin
SFTP/FTP/FTPS Client
SingleStore * * * Not Applicable * * *
Snowflake * * * Not Applicable * * *
Snowflake File Uploader
Solr * * * Not Applicable * * *
Splunk * * * Not Applicable * * *
SQL Server 2019 BDC Bulk Loader * * * Not Applicable * * *
Syslog
Tableau CRM * * * Not Applicable * * *
Teradata * * * Not Applicable * * *
To Error * * * Not Applicable * * *
Trash * * * Not Applicable * * *
Web Client
WebSocket Client
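
The origin and destination tables do not have the same format columns. As a rough illustration, the following Python sketch compares the two column lists, copied from the table headers in this appendix, and reports which formats appear for only one stage type. The helper name `only_in` is an assumption made for this example.

```python
# Format columns copied from the origins table header.
ORIGIN_FORMATS = [
    "Avro", "Binary", "Datagram", "Delimited", "Excel", "JSON", "Log",
    "Parquet", "Protobuf", "SDC Record", "Text", "Whole File", "XML",
]

# Format columns copied from the destinations table header.
DESTINATION_FORMATS = [
    "Avro", "Binary", "Delimited", "JSON", "Protobuf", "Parquet",
    "SDC Record", "Text", "Whole File", "XML",
]


def only_in(first, second):
    """Return the names in `first` that are missing from `second`, in order."""
    second_set = set(second)
    return [name for name in first if name not in second_set]


origin_only = only_in(ORIGIN_FORMATS, DESTINATION_FORMATS)
destination_only = only_in(DESTINATION_FORMATS, ORIGIN_FORMATS)

print(origin_only)       # ['Datagram', 'Excel', 'Log']
print(destination_only)  # []
```

This compares column headers only. Whether a particular stage supports a given format still depends on that stage's row in the table.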