Pipeline Processing on Spark
Transformer functions as a Spark client that launches distributed Spark applications.
Batch Case Study
Transformer can run pipelines in batch mode. A batch pipeline processes all available data in a single batch, and then stops.
Streaming Case Study
Transformer can run pipelines in streaming mode. A streaming pipeline maintains connections to origin systems and processes data at user-defined intervals. The pipeline runs continuously until you manually stop it.
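Transformer pipelines are built in the UI rather than written in code, but the difference between the two execution modes can be sketched in plain Python. All names below are illustrative stand-ins, not Transformer APIs: batch mode drains the available data once and stops, while streaming mode polls the origin at a fixed interval until stopped.

```python
import time

def process_batch(records):
    """Stand-in for the pipeline's transformations on one batch."""
    return [r.upper() for r in records]

def run_batch_pipeline(source):
    """Batch mode: process all available data in a single batch, then stop."""
    return process_batch(source)

def run_streaming_pipeline(poll_source, interval_secs, max_iterations):
    """Streaming mode: poll the origin at a user-defined interval.
    max_iterations stands in for the manual stop of a real pipeline."""
    results = []
    for _ in range(max_iterations):
        batch = poll_source()          # maintain the origin connection
        if batch:
            results.extend(process_batch(batch))
        time.sleep(interval_secs)      # wait for the next interval
    return results
```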
Tutorials and Sample Pipelines
StreamSets provides tutorials and sample pipelines to help you learn about using Transformer.
What is a Transformer Pipeline?
A Transformer pipeline describes the flow of data from origin systems to destination systems and defines how to transform the data along the way.
Sample Pipelines
Transformer provides sample pipelines that you can use to learn about Transformer features or as a template for building your own pipelines.
Local Pipelines
Typically, you run a Transformer pipeline on a cluster. You can also run a pipeline on a Spark installation on the Transformer machine. This is known as a local pipeline.
Spark Executors
A Transformer pipeline runs on one or more Spark executors.
Partitioning
When you start a pipeline, StreamSets Transformer launches a Spark application. Spark runs the application just as it runs any other application, splitting the pipeline data into partitions and performing operations on the partitions in parallel.
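The split-and-parallelize behavior described above can be sketched in plain Python (this is a conceptual illustration of partitioned processing, not Spark or Transformer code; all names are made up for the example):

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, num_partitions):
    """Split a dataset into partitions, round-robin style, the way a
    cluster engine divides data before scheduling parallel work."""
    return [data[i::num_partitions] for i in range(num_partitions)]

def process_partitions(data, num_partitions, op):
    """Apply op to every element, one worker per partition in parallel,
    then combine the partition results."""
    parts = partition(data, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(lambda p: [op(x) for x in p], parts)
    return [x for part in results for x in part]
```

Note that, as in a real partitioned system, the combined output is not guaranteed to preserve the original element order.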
Batch Header Attributes
Batch header attributes are attributes in batch headers that you can use in pipeline logic.
Delivery Guarantee
Transformer's offset handling ensures that a pipeline does not lose data when a sudden failure occurs: data is processed at least once. After such a failure, up to one batch of data may be reprocessed. This is an at-least-once delivery guarantee.
Caching Data
You can configure most origins and processors to cache data. You might enable caching when a stage passes data to more than one downstream stage.
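Why caching matters when a stage fans out to multiple downstream stages can be shown with a small sketch. This is plain Python illustrating the general recompute-versus-cache trade-off, not Transformer configuration; the call counter makes the cost visible:

```python
def expensive_transform(records, call_counter):
    """A costly stage whose output feeds two downstream stages."""
    call_counter[0] += 1
    return [r * 2 for r in records]

def run_without_cache(records, counter):
    # Each downstream stage pulls from upstream, recomputing the transform.
    dest_a = sum(expensive_transform(records, counter))
    dest_b = max(expensive_transform(records, counter))
    return dest_a, dest_b

def run_with_cache(records, counter):
    # With caching, the upstream result is computed once and reused.
    cached = expensive_transform(records, counter)
    return sum(cached), max(cached)
```

Without caching the upstream stage runs once per downstream consumer; with caching it runs once in total, at the cost of holding the intermediate data.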