Spark SQL Query
The Spark SQL Query processor runs a Spark SQL query to transform batches of data. To perform record-level calculations using Spark SQL expressions, use the Spark SQL Expression processor.
For each batch of data, the processor receives a single Spark DataFrame as input and registers it as a temporary table in Spark. The processor then runs the Spark SQL query against the temporary table and returns the result as a new DataFrame.
When you configure the processor, you define the Spark SQL query that the processor runs. The Spark SQL query can include Spark SQL and a subset of the functions provided with the StreamSets expression language.
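The per-batch flow described above can be sketched in PySpark. This is a minimal illustration and not the processor's actual implementation: the function name transform_batch, the view name input_batch, and the sample query are assumptions made for the example. The name the processor itself gives the temporary table is not stated here.

```python
def transform_batch(spark, batch_df, query, view_name="input_batch"):
    """Register one batch as a temporary view, run the query, return the result.

    `view_name` is a hypothetical name chosen for this sketch.
    """
    # Expose the batch DataFrame to Spark SQL under a temporary name.
    batch_df.createOrReplaceTempView(view_name)
    # spark.sql returns a new DataFrame; the input DataFrame is not modified.
    return spark.sql(query)


def demo():
    """Run the sketch on a tiny in-memory batch (requires pyspark)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("sql-query-sketch")
        .getOrCreate()
    )
    # Made-up sample rows standing in for one batch of pipeline data.
    batch = spark.createDataFrame(
        [("a", 3), ("b", 7), ("a", 5)],
        ["key", "amount"],
    )
    result = transform_batch(
        spark,
        batch,
        "SELECT key, SUM(amount) AS total FROM input_batch GROUP BY key",
    )
    result.show()
    spark.stop()
```

Calling demo() on a machine with PySpark installed runs the aggregation over the sample batch. The query operates on the whole batch at once, which is why it suits batch-level transformations such as aggregations rather than record-level calculations.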
To use rank in the query, you'd first want to use a Window processor before the Spark SQL Query processor.