Encrypt and Decrypt Fields
The Encrypt and Decrypt Fields processor encrypts or decrypts field values.
You can use the processor to encrypt one or more fields in a record. You can also use the processor to decrypt one or more fields that were encrypted by another Encrypt and Decrypt Fields processor. You cannot use the processor to perform encryption and decryption at the same time. Use an additional processor when you want to perform both tasks.
The Encrypt and Decrypt Fields processor uses the Amazon AWS Encryption SDK to encrypt and decrypt fields. When encrypting fields, the processor encrypts the data key and any additional encryption details, and stores the encrypted details along with the encrypted data. When decrypting fields, the processor extracts the encrypted data key and additional details, decrypts the key, and then uses it to decrypt the data.
You can use Amazon AWS Key Management Service (KMS) as a key provider for the processor, or you can supply the data key in the processor configuration properties. When using Amazon AWS KMS, you specify the KMS Key Amazon Resource Name (ARN). You can use an instance profile or AWS access key pairs to connect to Amazon AWS. When using a user-supplied key, you specify a Base64 encoded key and can optionally configure a key ID.
For both key provider types, you specify the cipher suite and frame size to use. When encrypting data, you can optionally define an encryption context and configure data key caching.
For information about the structure of AWS-encrypted data, see the AWS Encryption SDK documentation.
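The flow described above, where an encrypted data key travels with the encrypted data, can be sketched in a few lines. This is only an illustration of the envelope pattern: `toy_cipher` is a hypothetical, insecure stand-in for the AES-GCM encryption that the AWS Encryption SDK performs, and the dictionary layout is an assumption, not the SDK message format.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy stand-in for AES-GCM; NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(master_key: bytes, plaintext: bytes) -> dict:
    data_key = secrets.token_bytes(32)  # fresh data key for this operation
    return {
        # The data key is encrypted under the master key and stored with the data.
        "encrypted_data_key": toy_cipher(master_key, data_key),
        "ciphertext": toy_cipher(data_key, plaintext),
    }


def envelope_decrypt(master_key: bytes, message: dict) -> bytes:
    # Recover the data key first, then use it to decrypt the data.
    data_key = toy_cipher(master_key, message["encrypted_data_key"])
    return toy_cipher(data_key, message["ciphertext"])
```

Because the encrypted data key is stored next to the ciphertext, decryption needs only the master key and the message itself.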
Supported Data Types
When encrypting a field, the Encrypt and Decrypt Fields processor includes the data type of the field in the encrypted data. When decrypting the same field, the processor restores the field to its original data type.
The Encrypt and Decrypt Fields processor operates on string or byte array data. As a result, you can use the processor to encrypt or decrypt any data that can be converted to a string or byte array.
You can use the Encrypt and Decrypt Fields processor to encrypt or decrypt the following data types:
- Boolean
- Byte
- Byte Array
- Character
- Date
- Datetime
- Decimal
- Double
- Float
- Integer
- Long
- Short
- String
- Time
- Zoned Datetime
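As a rough illustration of how a data type can travel with a value and be restored afterwards, the following sketch converts a value to bytes with a type tag and reverses the conversion. The tag names, the `TAG:text` layout, and the `pack` and `unpack` helpers are hypothetical; the format that the processor actually uses is not described here.

```python
from datetime import date
from decimal import Decimal

# Hypothetical codecs: type tag -> (Python type, to text, from text).
_CODECS = {
    "BOOLEAN": (bool, lambda v: "true" if v else "false", lambda s: s == "true"),
    "INTEGER": (int, str, int),
    "DOUBLE": (float, repr, float),
    "DECIMAL": (Decimal, str, Decimal),
    "DATE": (date, date.isoformat, date.fromisoformat),
    "STRING": (str, str, str),
}


def pack(value) -> bytes:
    """Convert a value to bytes, prefixing the text form with its type tag."""
    for tag, (py_type, to_text, _) in _CODECS.items():
        if type(value) is py_type:  # exact match, so bool is not treated as int
            return f"{tag}:{to_text(value)}".encode("utf-8")
    raise TypeError(f"unsupported type: {type(value).__name__}")


def unpack(payload: bytes):
    """Restore the original typed value from the tagged bytes."""
    tag, _, text = payload.decode("utf-8").partition(":")
    return _CODECS[tag][2](text)
```

In this sketch, the bytes returned by `pack` are what would be encrypted, and `unpack` runs after decryption to return the value with its original type.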
Key Provider
When you use the Encrypt and Decrypt Fields processor, you specify the key provider for the stage.
- Amazon AWS KMS - Uses a master key provided by the AWS KMS service.
- User supplied key - Requires specifying a Base64 encoded master key.
AWS Credentials
When you use Amazon AWS KMS as the key provider, Data Collector must pass credentials to AWS.
Use one of the following methods to pass AWS credentials:
- Instance profile - When the execution Data Collector runs on an Amazon EC2 instance that has an associated instance profile, Data Collector uses the instance profile credentials to automatically authenticate with AWS.
- AWS access key pair - When the execution Data Collector does not run on an Amazon EC2 instance or when the EC2 instance doesn’t have an instance profile, you must specify the Access Key ID and Secret Access Key properties in the stage.
Tip: To secure sensitive information such as access key pairs, you can use runtime resources or credential stores. For more information about credential stores, see Credential Stores in the Data Collector documentation.
Cipher Suite
When you use the Encrypt and Decrypt Fields processor, you specify the cipher suite to use. The processor uses the selected cipher suite to encrypt or decrypt the data.
- ALG_AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384 (default)
- ALG_AES_192_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384
- ALG_AES_128_GCM_IV12_TAG16_HKDF_SHA256_ECDSA_P256
- ALG_AES_256_GCM_IV12_TAG16_HKDF_SHA256 (no signature)
- ALG_AES_192_GCM_IV12_TAG16_HKDF_SHA256 (no signature)
- ALG_AES_128_GCM_IV12_TAG16_HKDF_SHA256 (no signature)
- ALG_AES_256_GCM_IV12_TAG16_NO_KDF (not recommended)
- ALG_AES_192_GCM_IV12_TAG16_NO_KDF (not recommended)
- ALG_AES_128_GCM_IV12_TAG16_NO_KDF (not recommended)
For an overview of how the AWS Encryption SDK supports cipher suites, see the AWS Encryption SDK documentation. The documentation also provides additional details about cipher suites.
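The cipher suite names encode their own parameters: AES key size in bits, IV length and authentication tag length in bytes, the key derivation function, and the optional signature algorithm. The following sketch splits a name into those parts. The `CipherSuite` tuple and the `parse_cipher_suite` helper are hypothetical names used for illustration and are not part of the processor or the SDK.

```python
import re
from typing import NamedTuple, Optional


class CipherSuite(NamedTuple):
    key_bits: int             # AES key size
    iv_bytes: int             # initialization vector length
    tag_bytes: int            # GCM authentication tag length
    kdf: Optional[str]        # key derivation function, or None for NO_KDF
    signature: Optional[str]  # signature algorithm, or None when unsigned


_PATTERN = re.compile(
    r"^ALG_AES_(?P<bits>128|192|256)_GCM_IV(?P<iv>\d+)_TAG(?P<tag>\d+)"
    r"_(?:NO_KDF|HKDF_(?P<hash>SHA\d+))(?:_(?P<sig>ECDSA_P\d+))?$"
)


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a cipher suite name from the list above into its components."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized cipher suite name: {name}")
    kdf = f"HKDF_{m['hash']}" if m["hash"] else None
    return CipherSuite(int(m["bits"]), int(m["iv"]), int(m["tag"]), kdf, m["sig"])
```

For example, parsing the default suite yields a 256-bit key, a 12-byte IV, a 16-byte tag, `HKDF_SHA384` as the key derivation function, and `ECDSA_P384` as the signature algorithm.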
Encryption Contexts
You can specify encryption contexts to be included with the encrypted data. Encryption contexts, also known as additional authenticated data (AAD), are non-secret key-value pairs that are cryptographically bound to the encrypted data and stored with it.
Encryption contexts are optional. Use them as an additional safeguard against tampering with encrypted data.
If you specify encryption contexts when encrypting data, you must specify the same encryption contexts to decrypt the data.
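To see why decryption needs the same context that was used for encryption, consider this simplified sketch. It uses HMAC-SHA256 from the Python standard library as a stand-in for the authentication that the selected cipher suite performs. The `make_tag` and `check_tag` helpers and the serialization format are assumptions made for illustration only.

```python
import hashlib
import hmac
import json


def _bind(context: dict, ciphertext: bytes) -> bytes:
    # Serialize the context deterministically and length-prefix it so that
    # context bytes and ciphertext bytes cannot be confused with each other.
    ctx = json.dumps(context, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return len(ctx).to_bytes(4, "big") + ctx + ciphertext


def make_tag(key: bytes, ciphertext: bytes, context: dict) -> bytes:
    """Compute an authentication tag that covers both the context and the data."""
    return hmac.new(key, _bind(context, ciphertext), hashlib.sha256).digest()


def check_tag(key: bytes, ciphertext: bytes, context: dict, tag: bytes) -> bool:
    """Return True only if neither the context nor the data has changed."""
    return hmac.compare_digest(make_tag(key, ciphertext, context), tag)
```

A tag created with one context verifies only against that same context and the unmodified data, regardless of the order in which the key-value pairs are written.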
Data Key Caching
By default, the Encrypt and Decrypt Fields processor generates a new data key for each encryption operation. You can enable caching and reusing data keys to increase pipeline performance when security considerations allow.
Consider the possible security ramifications before enabling data key caching. This AWS blog post describes some of the issues to consider. For details on how data key caching works, see the AWS Encryption SDK documentation.
When you enable data key caching, you can configure the following limits:
- Cache Capacity
- Max Data Key Age
- Records per Data Key
- Bytes per Data Key
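The caching limits listed above can be pictured as conditions that retire a cached data key. The following is a minimal sketch under simplifying assumptions: it caches a single key, so cache capacity is not modeled, and the `DataKeyCache` class and its parameter names are hypothetical rather than the SDK implementation.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class CachedKey:
    key: bytes
    created: float
    records: int = 0
    bytes_used: int = 0


class DataKeyCache:
    """Single-entry cache that retires a data key when any limit is reached."""

    def __init__(self, max_age_seconds, max_records, max_bytes, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.max_records = max_records
        self.max_bytes = max_bytes
        self._clock = clock
        self._entry = None

    def key_for(self, payload_size: int) -> bytes:
        e = self._entry
        if (
            e is None
            or self._clock() - e.created >= self.max_age      # key is too old
            or e.records + 1 > self.max_records               # too many uses
            or e.bytes_used + payload_size > self.max_bytes   # too many bytes
        ):
            # Retire the old key and generate a new one.
            e = self._entry = CachedKey(secrets.token_bytes(32), self._clock())
        e.records += 1
        e.bytes_used += payload_size
        return e.key
```

Lower limits retire keys sooner, which trades some of the performance gain for less exposure per key.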
Encrypt and Decrypt Records
You can use the Encrypt and Decrypt Fields processor to encrypt or decrypt a whole record by serializing the record to a single field before passing it to the processor.
You can use the Data Generator processor to serialize the record to the root field of the record. When you configure the Data Generator processor, you specify the data format to use for the serialized record. Use a text-based format such as JSON, which results in a String field, or a binary format such as Avro, which results in a Byte Array field.
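Outside of a pipeline, the idea of collapsing a record into one field looks roughly like the following. The sketch uses Python dictionaries and the standard `json` module as a stand-in for the Data Generator processor with the JSON data format. The helper names are hypothetical.

```python
import json


def serialize_record(record: dict) -> str:
    """Collapse a whole record into one string value, ready for encryption."""
    return json.dumps(record, sort_keys=True)


def restore_record(text: str) -> dict:
    """Rebuild the record from the single string value after decryption."""
    return json.loads(text)


record = {"id": 17, "name": "Ada", "card": "4111-1111-1111-1111"}
single_field = serialize_record(record)  # one String value holding every field
```

The serialized string is the only value that needs to be encrypted, and parsing it after decryption restores every field of the record.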
Configuring an Encrypt and Decrypt Fields Processor
- In the Properties panel, on the General tab, configure the following properties:

| General Property | Description |
| --- | --- |
| Name | Stage name. |
| Description | Optional description. |
| Required Fields | Fields that must include data for the record to be passed into the stage. Tip: You might include fields that the stage uses. Records that do not include all required fields are processed based on the error handling configured for the pipeline. |
| Preconditions | Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions. Records that do not meet all preconditions are processed based on the error handling configured for the stage. |
| On Record Error | Error record handling for the stage. Discard: discards the record. Send to Error: sends the record to the pipeline for error handling. Stop Pipeline: stops the pipeline; not valid for cluster pipelines. |
- On the Action tab, configure the following properties:

| Action Property | Description |
| --- | --- |
| Mode | The action for the processor to perform: encrypting or decrypting data in the specified fields. |
| Fields | Field paths for the fields to encrypt or decrypt. Tip: To encrypt an entire record, you can use a Data Generator processor earlier in the pipeline to serialize the record to a single field. |

- On the Key Provider tab, configure the following properties:

| Key Provider Property | Description |
| --- | --- |
| Master Key Provider | The data key provider for encrypting or decrypting data. Amazon AWS KMS: uses data keys from the Amazon AWS Key Management Service. User Supplied Key: uses a Base64 encoded user-supplied key. |
| Cipher | The cipher suite to use for encrypting or decrypting data: ALG_AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384 (default), ALG_AES_192_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384, ALG_AES_128_GCM_IV12_TAG16_HKDF_SHA256_ECDSA_P256, ALG_AES_256_GCM_IV12_TAG16_HKDF_SHA256 (no signature), ALG_AES_192_GCM_IV12_TAG16_HKDF_SHA256 (no signature), ALG_AES_128_GCM_IV12_TAG16_HKDF_SHA256 (no signature), ALG_AES_256_GCM_IV12_TAG16_NO_KDF (not recommended), ALG_AES_192_GCM_IV12_TAG16_NO_KDF (not recommended), ALG_AES_128_GCM_IV12_TAG16_NO_KDF (not recommended). |
| Frame Size | The frame size in bytes, used to divide data into multiple frames for encryption. To use a single frame, enter 0. Default is 4096. With this default, data less than 4096 bytes is encrypted in a single frame. |
| Access Key ID | AWS access key ID. Required when using Amazon AWS KMS for the key provider and not using instance profile credentials. For an Amazon AWS KMS key provider only. |
| Secret Access Key | AWS secret access key. Required when using Amazon AWS KMS for the key provider and not using instance profile credentials. For an Amazon AWS KMS key provider only. |
| KMS Key ARN | Amazon Resource Name (ARN) for the KMS key. For information about locating the key ARN, see the AWS KMS documentation. Required when using an Amazon AWS KMS key provider. |
| Base64 Encoded Key | Base64 encoded data key to use when using a user-supplied key. You can use credential functions to access a key from a supported credential store. For more information, see Credential Stores in the Data Collector documentation. You can also use the base64EncodeString() function to encode the string returned by the credential function. The length of the key must match the length expected by the selected cipher suite. For example, when using a 256-bit (32-byte) cipher suite, the key must be 32 bytes in length before Base64 encoding. |
| Key ID | An optional key ID to use in addition to the Base64 encoded key when using a user-supplied key. Use a string value. |
| Encryption Context (AAD) | Key-value pairs to be used as encryption contexts, also known as additional authenticated data. |
| Data Key Caching | Enables caching and reusing data keys. Use to improve performance when security considerations allow. |
| Cache Capacity | The maximum number of keys to cache in memory. |
| Max Data Key Age | The maximum number of seconds that a data key can be used before the data key is retired. |
| Max Records per Data Key | The maximum number of fields that a data key can encrypt before the data key is retired. |
| Max Bytes per Data Key | The maximum number of bytes that a data key can be used to encrypt before the data key is retired. |
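Before entering a user-supplied key, it can help to check that the key matches the selected cipher suite. The sketch below assumes that the length requirement applies to the decoded key bytes, so a 256-bit suite needs 32 bytes after Base64 decoding. The `validate_user_key` helper is hypothetical and is not part of Data Collector.

```python
import base64
import binascii
import re


def validate_user_key(encoded_key: str, cipher_suite: str) -> bytes:
    """Decode a Base64 key and check its length against the cipher suite name."""
    match = re.match(r"ALG_AES_(128|192|256)_", cipher_suite)
    if match is None:
        raise ValueError(f"unrecognized cipher suite name: {cipher_suite}")
    expected_bytes = int(match.group(1)) // 8  # for example, 256 bits -> 32 bytes

    try:
        key = base64.b64decode(encoded_key, validate=True)
    except binascii.Error as exc:
        raise ValueError("key is not valid Base64") from exc

    if len(key) != expected_bytes:
        raise ValueError(
            f"key is {len(key)} bytes, but {cipher_suite} expects {expected_bytes}"
        )
    return key
```

Under that assumption, a key generated from 32 random bytes and then Base64 encoded passes the check for any of the AES-256 suites and fails for the AES-192 and AES-128 suites.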