Using the Bulk API with PK Chunking

You can use PK Chunking with the Bulk API to process large volumes of Salesforce data. PK Chunking uses the Id field as the offset field and returns the data in chunks, where each chunk covers a user-defined range of Id values. For more information about PK Chunking, see the Salesforce documentation or this informative blog post.

When performing PK Chunking, the origin cannot process deleted records.

Follow these guidelines when using the Bulk API with PK Chunking to process existing data:

SOQL query

Use the following query guidelines:

Include the Id field in the SELECT statement.
Optionally include a WHERE clause, but do not use the Id field in the WHERE clause.
Do not include an ORDER BY clause.

The complete SOQL query for PK Chunking should use the following syntax:

SELECT Id, <field1>, <field2>, ... FROM <object> [WHERE <condition without the Id field>]
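
The query guidelines above can be checked mechanically before a pipeline runs. The following is a minimal sketch, not part of the origin: the function name check_pk_chunking_query is hypothetical, and the regular expressions are deliberately naive (for example, they do not understand quoted string literals or subqueries).

```python
import re


def check_pk_chunking_query(soql: str) -> list[str]:
    """Return a list of guideline violations; an empty list means none found.

    Hypothetical helper for illustration only. It assumes a simple
    single-object query of the form SELECT ... FROM <object> [...].
    """
    text = " ".join(soql.split())  # collapse whitespace and newlines
    match = re.match(r"^SELECT (.+?) FROM (\w+)(.*)$", text, re.IGNORECASE)
    if not match:
        return ["query does not match SELECT ... FROM <object>"]

    fields = [f.strip().lower() for f in match.group(1).split(",")]
    rest = match.group(3)
    problems = []

    # SELECT * is accepted because the expanded field list includes Id.
    if "id" not in fields and "*" not in fields:
        problems.append("SELECT must include the Id field")

    if re.search(r"\bORDER BY\b", rest, re.IGNORECASE):
        problems.append("ORDER BY is not allowed")

    # Naive check: any standalone word "Id" after WHERE counts as a reference.
    where = re.search(r"\bWHERE\b(.*)", rest, re.IGNORECASE)
    if where and re.search(r"\bId\b", where.group(1), re.IGNORECASE):
        problems.append("WHERE must not reference the Id field")

    return problems
```

For instance, a query such as SELECT Name FROM Account would be reported as missing the Id field, while SELECT Id, Name FROM Account WHERE Name != 'x' would pass.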

If you specify SELECT * FROM <object> in the SOQL query, the origin expands * to all fields in the Salesforce object that are accessible to the configured user. Note that the origin adds components of compound fields to the query, rather than adding the compound fields themselves. For example, the origin adds BillingStreet, BillingCity, etc., rather than adding BillingAddress. Similarly, it adds Location__Latitude__s and Location__Longitude__s rather than Location__c.
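
To illustrate the expansion described above, the sketch below builds a field list from simplified object metadata and skips compound fields so that only their components remain. It is an illustration under stated assumptions rather than the origin's implementation: the metadata shape (a list of dictionaries with name and type keys), the function name expand_star, and the treatment of the address and location types as the compound types are all assumptions made for this example.

```python
# Assumed set of compound field types; components of these fields
# appear in the metadata as ordinary fields of their own.
COMPOUND_TYPES = {"address", "location"}


def expand_star(object_name: str, fields: list[dict]) -> str:
    """Expand SELECT * into an explicit field list (hypothetical helper).

    `fields` is assumed to contain only fields the configured user can
    access, each described as {"name": ..., "type": ...}.
    """
    names = [f["name"] for f in fields if f["type"] not in COMPOUND_TYPES]
    return f"SELECT {', '.join(names)} FROM {object_name}"


# Illustrative metadata, not taken from a real org.
account_fields = [
    {"name": "Id", "type": "id"},
    {"name": "BillingAddress", "type": "address"},
    {"name": "BillingStreet", "type": "textarea"},
    {"name": "BillingCity", "type": "string"},
    {"name": "Location__c", "type": "location"},
    {"name": "Location__Latitude__s", "type": "double"},
    {"name": "Location__Longitude__s", "type": "double"},
]

expanded = expand_star("Account", account_fields)
```

With this sample metadata, the expanded query lists BillingStreet, BillingCity, Location__Latitude__s, and Location__Longitude__s, and omits BillingAddress and Location__c.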

Additional properties

Configure the following additional properties on the Query tab:

Offset Field - The field to use for chunking. Must use the default Id field.
Chunk Size - The range of values in the Id field to be queried at one time. The default is 100,000 and the maximum size is 250,000.
Start ID - An optional lower boundary for the first chunk. When omitted, the origin begins processing with the first record in the object.

For example, with a chunk size of 250,000 and a start ID of 001300000000000, the first query returns the records whose Id values fall within the 250,000-value range that begins at 001300000000000. The second query returns the records in the next range of 250,000 Id values.
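
The arithmetic behind this example can be sketched as follows. This is one plausible reading of "range of Id values", not the origin's documented algorithm. It assumes that a 15-character Id can be treated as a base-62 number whose digits are ordered 0-9, A-Z, a-z, and that each chunk spans exactly the chunk size in that numbering. The function names are hypothetical.

```python
import string

# Assumed base-62 digit order: digits, then uppercase, then lowercase.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62


def id_to_int(record_id: str) -> int:
    """Interpret an Id string as a base-62 integer."""
    value = 0
    for ch in record_id:
        value = value * BASE + ALPHABET.index(ch)
    return value


def int_to_id(value: int, width: int = 15) -> str:
    """Render an integer as a fixed-width base-62 Id string."""
    chars = []
    for _ in range(width):
        value, digit = divmod(value, BASE)
        chars.append(ALPHABET[digit])
    if value:
        raise ValueError("value does not fit in the requested width")
    return "".join(reversed(chars))


def chunk_bounds(start_id: str, chunk_size: int, chunks: int) -> list[tuple[str, str]]:
    """Return (inclusive lower, exclusive upper) Id bounds for each chunk."""
    start = id_to_int(start_id)
    return [
        (int_to_id(start + i * chunk_size), int_to_id(start + (i + 1) * chunk_size))
        for i in range(chunks)
    ]


bounds = chunk_bounds("001300000000000", 250_000, 2)
```

Under these assumptions, the first chunk runs from 001300000000000 up to, but not including, 00130000000132G, and the second chunk starts at that upper bound. A chunk may return far fewer than 250,000 records, because not every Id value in a range corresponds to an existing record.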

When using PK Chunking, the origin ignores the Initial Offset property and uses the optional Start ID instead.