Resetting the Origin
You can reset the origin when you want the Data Collector to process all available data instead of resuming from the last-saved offset. Reset the origin only when the pipeline is not running. You can reset the origin for the following origin stages:
- Amazon S3
- Azure Data Lake Storage Gen1
- Azure Data Lake Storage Gen2
- Directory
- Elasticsearch
- File Tail
- Google Cloud Storage
- Groovy Scripting
- Hadoop FS Standalone
- HTTP Client
- JavaScript Scripting
- JDBC Multitable Consumer
- JDBC Query Consumer
- Jython Scripting
- Kinesis Consumer
- MapR DB JSON
- MapR FS Standalone
- MongoDB
- MongoDB Oplog
- MySQL Binary Log
- Salesforce
- SAP HANA Query Consumer
- SFTP/FTP/FTPS Client
- SQL Server 2019 BDC Multitable Consumer
- SQL Server CDC Client
- SQL Server Change Tracking
- Teradata Consumer
- Windows Event Log
For these origins, when you stop the pipeline, the Data Collector records the offset where it stopped processing data. When you restart the pipeline, it continues from that offset by default. To have the Data Collector process all available data instead, reset the origin before you restart the pipeline. For details specific to the Kinesis Consumer origin, see Resetting the Kinesis Consumer Origin.
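The offset behavior described above can be sketched with a toy model. This is only an illustration of the stop, resume, and reset semantics: the `ToyPipeline` class, its methods, and the integer offset are hypothetical and do not reflect how the Data Collector actually stores or tracks offsets.

```python
class ToyPipeline:
    """Toy model of offset tracking; NOT the Data Collector implementation."""

    def __init__(self, records):
        self.records = list(records)
        self.offset = 0        # last-saved offset: index of the next unread record
        self.running = False

    def run(self, max_records=None):
        """Process records starting at the saved offset, then save the new offset."""
        self.running = True
        if max_records is None:
            end = len(self.records)
        else:
            end = min(len(self.records), self.offset + max_records)
        processed = self.records[self.offset:end]
        self.offset = end      # remember where processing stopped
        self.running = False
        return processed

    def reset_origin(self):
        """Discard the saved offset so the next run starts from the beginning."""
        if self.running:
            raise RuntimeError("stop the pipeline before resetting the origin")
        self.offset = 0


pipeline = ToyPipeline(["a", "b", "c", "d"])
first = pipeline.run(max_records=2)   # stops after two records
resumed = pipeline.run()              # continues from the saved offset
pipeline.reset_origin()
replayed = pipeline.run()             # processes all available data again
```

In this sketch, `resumed` holds only the records that were not yet processed, while `replayed` holds every record, which mirrors the difference between restarting a pipeline and restarting it after a reset.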
You can configure the Kafka and MapR Streams Consumer origins to process all available data by specifying an additional Kafka configuration property. You can reset the Azure IoT/Event Hub Consumer origin by deleting its offset details in the Microsoft Azure portal. The remaining origin stages process transient data, so resetting the origin has no effect on them.
You can reset the origin for multiple pipelines at the same time from the Home page, or for a single pipeline from the pipeline canvas.
To reset the origin:
- Select multiple pipelines from the Home page, or view a single pipeline in the pipeline canvas.
- Click the More icon, and then click Reset Origin.
- In the Reset Origin Confirmation dialog box, click Yes to reset the origin.
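If you prefer to script the reset instead of using the UI, a request along the following lines may work. This is a minimal sketch, not a confirmed procedure: the endpoint path `/rest/v1/pipeline/<pipelineId>/resetOffset`, the `X-Requested-By` header, the base URL `http://localhost:18630`, and the pipeline ID `MyPipeline01` are all assumptions to check against the RESTful API documentation of your own Data Collector instance. The helper name `build_reset_origin_request` is hypothetical. The sketch only builds the request, and leaves authentication and sending to you.

```python
import urllib.parse
import urllib.request


def build_reset_origin_request(base_url, pipeline_id):
    """Build (but do not send) a POST request that asks for an origin reset.

    Assumption: the instance exposes a resetOffset endpoint at this path.
    """
    quoted_id = urllib.parse.quote(pipeline_id, safe="")
    path = "/rest/v1/pipeline/{}/resetOffset".format(quoted_id)
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="POST",
        # Assumption: the server expects this header on modifying requests.
        headers={"X-Requested-By": "sdc"},
    )


# Hypothetical base URL and pipeline ID, for illustration only.
request = build_reset_origin_request("http://localhost:18630", "MyPipeline01")
print(request.get_method(), request.full_url)
```

To send the request, you would add credentials and pass it to `urllib.request.urlopen`. As with the UI, stop the pipeline before you reset its origin.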