MongoDB Atlas

The MongoDB Atlas destination writes data to MongoDB Atlas and MongoDB Enterprise Server. For information about supported versions, see Supported Systems and Versions.

The MongoDB Atlas destination can write CDC data if an operation is specified in the CRUD operation header attribute. When not specified, it treats all records as inserts. The destination can also perform upserts for update and replace records. For information about Data Collector change data processing and a list of CDC-enabled origins, see Processing Changed Data.

When you configure the destination, you define connection information, such as the connection string and credentials to use. You can specify SSL/TLS properties for an SSL/TLS-enabled MongoDB Atlas cluster.

You configure the database, collection, and write concern to use. To replace and update records, you must specify a unique key field and can optionally enable an upsert flag. When you do not specify a unique key field, update and replace records are sent to the stage for error handling.

You can optionally configure advanced options that determine how the destination connects to MongoDB Atlas.

Credentials

Based on the authentication used by MongoDB, configure the MongoDB Atlas destination to use no authentication, username/password authentication, or LDAP authentication. By default, no authentication is used.

To use username/password or LDAP authentication, enter the required credentials in one of the following ways:
Authentication method
Specify the authentication to use with the Authentication Method property on the Credentials tab:
  • None
  • Username / Password
  • LDAP
Then, define the username and password for username/password or LDAP authentication.
When using username/password authentication, you also specify the authentication mechanism to use. You can also specify an authentication database.
Connection string
If you prefer, you can specify credentials in the connection string on the Connection tab. However, specifying credentials on the Credentials tab is the recommended method.
To enter credentials for username/password authentication for self-managed clusters, enter the username and password before the host name. Use the following format:
mongodb://username:password@host[:port][/[database][?options]]
To enter credentials for MongoDB Atlas, specify the URL from your Atlas cluster settings.
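
When credentials are embedded in a standard connection string, reserved characters in the username or password, such as @, :, or /, generally need to be percent-encoded so that they are not read as URI delimiters. The following is a minimal sketch of that step, not part of Data Collector. The function name build_connection_string and the sample host, user, and database values are hypothetical.

```python
from urllib.parse import quote_plus


def build_connection_string(username, password, host, port=None, database=""):
    """Build a standard-format connection string with embedded credentials."""
    # Percent-encode reserved characters such as @ : / so they are not
    # mistaken for URI delimiters.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    address = host if port is None else f"{host}:{port}"
    return f"mongodb://{credentials}@{address}/{database}"


# Hypothetical sample values, for illustration only.
print(build_connection_string("appUser", "p@ss:w/rd", "db.example.com", 27017, "sales"))
```

Specifying credentials on the Credentials tab avoids this encoding step entirely.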

Specifying Field Paths

When configuring the MongoDB Atlas destination, you can specify field paths in either of the following ways:
  • Data Collector format - Uses a slash ( / ) as a delimiter. Includes a leading slash.
  • MongoDB format - Uses a period ( . ) as a delimiter.
The following table lists examples of the two field path formats:
Data Collector Format | MongoDB Format
/_id | _id
/orders/address/line1 | orders.address.line1
/orders/lines[1]/quantity | orders.lines[1].quantity
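
The relationship between the two formats can be sketched as a simple string conversion. This is an illustration only, assuming field names that contain no slashes, periods, or quoted segments. The function name to_mongodb_path is hypothetical and is not a Data Collector function.

```python
def to_mongodb_path(field_path):
    """Convert a Data Collector field path to MongoDB dot notation."""
    # Paths without a leading slash are assumed to be in MongoDB format already.
    if not field_path.startswith("/"):
        return field_path
    # Drop the leading slash, then swap the remaining delimiters.
    return field_path[1:].replace("/", ".")


for path in ("/_id", "/orders/address/line1", "/orders/lines[1]/quantity"):
    print(path, "->", to_mongodb_path(path))
```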

Unordered Writes and Stopping the Pipeline

The MongoDB Atlas destination can perform ordered or unordered writes. The write type affects whether the Stop Pipeline error record handling property is honored, as follows:
ordered writes
The MongoDB Atlas destination performs ordered writes by default, including when all operations are inserts. Use this mode for pipelines that handle a range of CRUD operations.
When performing ordered writes, the destination can stop the pipeline when the On Record Error property is set to Stop Pipeline.
Performing ordered writes can slow pipeline performance.
unordered writes
When ordered writes are not required, you can improve pipeline performance by configuring the MongoDB Atlas destination to perform unordered writes. Use this mode when all records to be processed are Inserts or if the order of the writes is not important. Otherwise data consistency is not guaranteed.
When performing unordered writes, the destination cannot stop the pipeline when the On Record Error property is set to Stop Pipeline.

Use the Ordered Write property on the MongoDB tab to specify the type of writes that the destination performs.
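
The difference between the two write types can be illustrated with a small simulation. In MongoDB, an ordered bulk write stops at the first failed operation, while an unordered bulk write attempts every operation and reports the failures afterward. The sketch below mimics that behavior in plain Python. It does not call MongoDB and is not the destination's implementation, and the names bulk_write and make_inserter are hypothetical.

```python
def bulk_write(operations, apply, ordered=True):
    """Apply operations in sequence and return (applied_count, errors)."""
    applied, errors = 0, []
    for index, op in enumerate(operations):
        try:
            apply(op)
            applied += 1
        except ValueError as exc:
            errors.append((index, str(exc)))
            if ordered:
                # Ordered mode: stop at the first failure.
                break
    return applied, errors


def make_inserter():
    """Return an insert function that rejects duplicate _id values."""
    seen = set()

    def insert(doc):
        if doc["_id"] in seen:
            raise ValueError(f"duplicate key: {doc['_id']}")
        seen.add(doc["_id"])

    return insert


docs = [{"_id": 1}, {"_id": 1}, {"_id": 2}]
print(bulk_write(docs, make_inserter(), ordered=True))   # (1, [(1, 'duplicate key: 1')])
print(bulk_write(docs, make_inserter(), ordered=False))  # (2, [(1, 'duplicate key: 1')])
```

In the unordered run, the third document is still written after the second one fails, which is why the order of mixed CRUD operations cannot be relied on in that mode.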

Define the CRUD Operation

To write to MongoDB Atlas using a specific CRUD operation, ensure that the CRUD operation record header attribute is defined for each record earlier in the pipeline. The destination treats records without the attribute as inserts.

To update and replace records, you must specify a unique key field. You can also enable upserts for update and replace records.

Note that when performing a DELETE operation, the destination deletes a maximum of one matching document in MongoDB Atlas. It does not delete all matching documents, as is sometimes possible with MongoDB Atlas.

To write records to MongoDB Atlas, make sure records include the following CRUD operation record header attribute:
sdc.operation.type
When defined, the MongoDB Atlas destination uses the CRUD operation in the sdc.operation.type record header attribute when writing to MongoDB. When not specified, it treats all records as Inserts.
The MongoDB Atlas destination supports the following values for the sdc.operation.type attribute:
  • 1 for INSERT
  • 2 for DELETE
  • 3 for UPDATE
  • 4 for UPSERT
  • 7 for REPLACE
If your pipeline has a CDC-enabled origin that processes changed data, the destination reads the operation type from the sdc.operation.type header attribute that the origin generates. If your pipeline has a non-CDC origin, you can use the Expression Evaluator processor or a scripting processor to define the record header attribute. For more information about Data Collector changed data processing and a list of CDC-enabled origins, see Processing Changed Data.
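
The attribute values listed above can be sketched as a lookup that falls back to an insert when the attribute is missing. This is a plain Python illustration over a dictionary of header attributes, not Data Collector code. The function name resolve_operation is hypothetical, and raising an error for an unlisted value is an assumption made for the example.

```python
# Supported sdc.operation.type values, as listed above.
OPERATIONS = {1: "INSERT", 2: "DELETE", 3: "UPDATE", 4: "UPSERT", 7: "REPLACE"}


def resolve_operation(header_attributes):
    """Return the CRUD operation name for a record's header attributes."""
    raw = header_attributes.get("sdc.operation.type")
    if raw is None:
        # No attribute defined: the record is treated as an insert.
        return "INSERT"
    code = int(raw)  # header attribute values may arrive as strings
    if code not in OPERATIONS:
        # Assumption for this sketch: reject values outside the list.
        raise ValueError(f"unsupported sdc.operation.type: {raw}")
    return OPERATIONS[code]


print(resolve_operation({"sdc.operation.type": "3"}))  # UPDATE
print(resolve_operation({}))                           # INSERT
```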

Performing Upserts

The MongoDB Atlas destination performs upserts when records are flagged for upsert, that is, when the sdc.operation.type record header attribute is set to 4.

The destination also provides an Upsert property that enables upserts for records that are flagged for update or replace. When you enable the Upsert property, the destination inserts records when it does not find existing records to update or replace.

When not enabled, if the destination does not find an existing record for a record flagged for update or replace, it does not write the record to MongoDB Atlas. The Upsert property is not enabled by default.

For more information about MongoDB Atlas operations and the upsert flag, see the MongoDB Atlas documentation.
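
The rules above can be summarized as a small decision function. This is a sketch of the described behavior only, not the destination's code. The function name write_outcome and the outcome labels are hypothetical.

```python
def write_outcome(operation, match_found, upsert_enabled=False):
    """Describe what happens to a record flagged with the given operation."""
    if operation == "UPSERT":
        # Records flagged for upsert are upserted regardless of the property.
        return "modify existing document" if match_found else "insert new document"
    if operation in ("UPDATE", "REPLACE"):
        if match_found:
            return "modify existing document"
        # No match: the Upsert property decides whether the record is inserted.
        return "insert new document" if upsert_enabled else "not written"
    raise ValueError(f"operation not covered by this sketch: {operation}")


print(write_outcome("UPDATE", match_found=False))                       # not written
print(write_outcome("UPDATE", match_found=False, upsert_enabled=True))  # insert new document
print(write_outcome("UPSERT", match_found=False))                       # insert new document
```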

Enabling SSL/TLS

By default, the MongoDB Atlas destination does not use SSL/TLS. If the cluster is enabled to use SSL/TLS, then you can connect using one of the following methods:
  • Atlas/System CA - Connects to a MongoDB Atlas cluster. You can also use this when your certificates or keys have already been specified at the JVM level.
  • Server Validation (1 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and does not need to prove client identity.
  • Server and Client Validation (2 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and the server also validates the client key. This occurs when the cluster is set up to require client certificates.
Note: Server validation and server and client validation require configuring additional properties that provide the required information. Both options require obtaining the certificate file for the cluster in one of the valid formats. Server and client validation also requires generating or obtaining the public certificate and private key file for Data Collector.
You can specify certificates and keys in the following formats:
  • JKS (Java Keystore)
  • PEM (text-based)
  • DER (binary)
  • PKCS #7 / P7B
  • PKCS #12 / P12 / PFX
  • Private keys inside PEM, DER, or PKCS #12 encoded as PKCS#1 or PKCS#8

If the files are in a plain text format such as PEM, you can provide the text in the stage properties. The text should begin and end with marker lines such as -----BEGIN CERTIFICATE----- or -----END PRIVATE KEY-----. Otherwise, you provide a path to the certificate file.
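
Before pasting certificate text into a stage property, it can help to check that the text is a complete PEM block. PEM marker lines use five hyphens on each side of the label, and the BEGIN and END labels must match. The following sketch performs that check in plain Python. The function name looks_like_pem is hypothetical, the sample body is a placeholder rather than a real certificate, and the pattern assumes a single unencrypted block without extra header lines.

```python
import re

# One PEM block: matching BEGIN/END labels around a base64 body.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+[A-Za-z0-9+/=\s]+-----END \1-----"
)


def looks_like_pem(text):
    """Return True if the text is a single, well-delimited PEM block."""
    return PEM_BLOCK.fullmatch(text.strip()) is not None


# Placeholder body, not a real certificate.
sample = "-----BEGIN CERTIFICATE-----\nMIIB\n-----END CERTIFICATE-----\n"
print(looks_like_pem(sample))
print(looks_like_pem("---BEGIN CERTIFICATE---\nMIIB\n---END CERTIFICATE---"))
```

This checks only the delimiters and the character set of the body. It does not validate the certificate itself.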

MongoDB Data Types

When the MongoDB Atlas destination writes to MongoDB, it converts Data Collector data types to the following standard MongoDB data types, by default. When necessary, the destination can convert Data Collector types to MongoDB BSON types. For more information, see Writing BSON Types.

The following table describes how Data Collector data types are converted to standard MongoDB data types:

Data Collector Type | Standard MongoDB Type
Boolean | Boolean
Byte | Binary
ByteArray | Binary
Char | String
Date | Date
Datetime | Date
Decimal | Decimal128
Double | Double
Float | Double
Integer | Int32
List | Array
List-Map | Document
Long | Int64
Map | Document
Short | Int32
String | String
Time | Date
ZonedDateTime | Date. Converted to UTC since MongoDB does not store time zones.
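
The ZonedDateTime conversion can be illustrated with a short sketch. It shows the general idea of normalizing a zoned value to UTC before storing it as a date. It is not the destination's code, and the function name to_utc and the sample value are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(zoned):
    """Convert a timezone-aware datetime to the same instant in UTC."""
    if zoned.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return zoned.astimezone(timezone.utc)


# 09:30 at a fixed UTC-05:00 offset is 14:30 UTC.
local = datetime(2024, 1, 15, 9, 30, tzinfo=timezone(timedelta(hours=-5)))
print(to_utc(local).isoformat())  # 2024-01-15T14:30:00+00:00
```

The instant is preserved, but the original zone or offset is no longer recoverable from the stored value.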

Writing BSON Types

When writing to MongoDB, the MongoDB Atlas destination converts record fields to standard MongoDB data types as described in MongoDB Data Types, by default.

When a default conversion to a standard MongoDB data type is not appropriate, you can enable the destination to convert fields to MongoDB BSON types by placing BSON type information in field attributes.

To enable conversions to BSON types, use an Expression Evaluator processor earlier in the pipeline to set the bsonType field attribute to the BSON type you want to use. You can use a single processor to set the required field attributes for multiple fields. Some conversions require defining additional field attributes.

For example, to write a String field that contains an ObjectId hexadecimal string to a MongoDB DBRef field, you use an Expression Evaluator processor to set the bsonType field attribute to Db_Ref. Because the String to DBRef conversion requires additional attributes, you also set the database and collection field attributes to the appropriate values.

The following table lists supported MongoDB BSON data types, the Data Collector types you can convert from, and related attribute details.

Note: Field attribute names are case sensitive. However, attribute values are not. For example, CODE, Code, and code are all valid bsonType attribute values.
BSON Data Type | Compatible Data Collector Type | bsonType Attribute Value
Binary | Byte, Byte Array, Char, String. Strings are converted using UTF-8. | Binary
BsonDbPointer | List-Map, Map. Should contain the following fields: database, collection, and id (a 24-digit hexadecimal or yyyy-MM-dd HH:mm:ss format). | Bson_Db_Pointer
BsonDbPointer | String | Bson_Db_Pointer. Also specify the following field attributes: database, collection, and id (use a 24-digit hexadecimal or yyyy-MM-dd HH:mm:ss format).
BsonRegularExpression | String | Bson_Regular_Expression
BSONTimestamp | Date, Datetime, String, Time, Zoned Datetime. Strings should be in yyyy-MM-dd HH:mm:ss format. | Bson_Timestamp. Optionally specify an ordinal field attribute.
Code | String | Code
CodeWithScope (scope cannot be set) | String | Code_With_Scope
DBRef | Map. Should contain the following fields: database, collection, and id (a 24-digit hexadecimal or yyyy-MM-dd HH:mm:ss format). | Db_Ref
DBRef | String. Use a 24-digit hexadecimal or yyyy-MM-dd HH:mm:ss format. | Db_Ref. Also specify the following field attributes: database and collection.
Decimal128 | Decimal, Double, Float, Integer, Long, Short, String | Decimal128
ObjectId | String. Formatted as a 24-digit hexadecimal representation of an ObjectId. | Object_Id
Symbol | Char, String | Symbol
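
As an illustration of the String to DBRef case, the sketch below checks that a value is a 24-digit hexadecimal string and assembles the field attributes that the conversion requires. It is plain Python for illustration, not an Expression Evaluator configuration. The function names and the sample database and collection values are hypothetical. The attribute names bsonType, database, and collection come from the table above.

```python
import re

# A 24-digit hexadecimal string, the text form of an ObjectId.
OBJECT_ID_PATTERN = re.compile(r"[0-9a-fA-F]{24}")


def is_object_id_hex(value):
    """Return True if the value is exactly 24 hexadecimal digits."""
    return OBJECT_ID_PATTERN.fullmatch(value) is not None


def db_ref_attributes(database, collection):
    """Field attributes to set on a String field written as a DBRef."""
    return {"bsonType": "Db_Ref", "database": database, "collection": collection}


print(is_object_id_hex("507f1f77bcf86cd799439011"))  # True
print(is_object_id_hex("507f1f77"))                  # False
print(db_ref_attributes("sales", "customers"))
```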

Configuring a MongoDB Atlas Destination

Configure a MongoDB Atlas destination to write to MongoDB Atlas or MongoDB Enterprise Server.

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property Description
    Name Stage name.
    Description Optional description.
    Required Fields Fields that must include data for the record to be passed into the stage.
    Tip: You might include fields that the stage uses.

    Records that do not include all required fields are processed based on the error handling configured for the pipeline.

    Preconditions Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.

    Records that do not meet all preconditions are processed based on the error handling configured for the stage.

    On Record Error Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the pipeline for error handling.
    • Stop Pipeline - Stops the pipeline if Ordered Write is enabled.
  2. On the Connection tab, configure the following properties:
    Connection Property Description
    Connection String
    Connection string for MongoDB. To connect to MongoDB Atlas or Enterprise Server, you can use the following DNS seed list format:
    mongodb+srv://server.example.com/

    To connect to a MongoDB Enterprise Server cluster, use the following standard connection format:

    mongodb://host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]

    When connecting to a cluster, enter additional node information to ensure a connection.

    For more information about MongoDB connection strings, see the MongoDB documentation.

    SSL/TLS Mode Method used to implement SSL/TLS:
    • None - Connects to a MongoDB Enterprise Server cluster that is not enabled to use SSL/TLS.
    • Atlas/System CA - Connects to a MongoDB Atlas cluster. You can also use this when your certificates or keys have already been specified at the JVM level.
    • Server Validation (1 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and does not need to prove client identity.
    • Server and Client Validation (2 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and the server also validates the client key. This occurs when the cluster is set up to require client certificates.
    SSL Invalid Host Name Allowed Specifies whether invalid host names are allowed in SSL/TLS certificates.

    Available when using server validation or server and client validation.

    Certificate Mode Mode to provide the SSL/TLS certificate:
    • File - Use when the certificate is in a file local to Data Collector.
    • Embedded - Use to provide the certificate text directly in stage properties.

    Available when using server validation or server and client validation.

    Certificate Authority MongoDB certificate to use. Define this property based on the configured certificate mode:
    • When using file certificate mode, specify a path to the certificate. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

      ${runtime:resourcesDirPath()}/keystore.jks

    • When using the embedded certificate mode, provide the full text of the certificate to use. The text should start with -----BEGIN CERTIFICATE-----.

    Available when using server validation or server and client validation.

    Certificate Authority Password Password for the certificate. Specify if the certificate file is encrypted.

    Available when using server validation or server and client validation, and when using file certificate mode.

    Client Certificate Client certificate to use. Define this property based on the configured certificate mode:
    • When using file certificate mode, specify a path to the certificate. Enter an absolute path to the file or enter the following expression to define the file stored in the Data Collector resources directory:

      ${runtime:resourcesDirPath()}/keystore.jks

    • When using the embedded certificate mode, provide the full text of the certificate to use. The text should start with -----BEGIN CERTIFICATE-----.

    Available when using server and client validation.

    Client Private Key Path to the key file.

    Available when using server and client validation and file certificate mode.

    Private Key Password Password for the private key. Specify if the private key is encrypted.

    Available when using server and client validation and file certificate mode.

  3. On the Credentials tab, configure the following properties:
    Credentials Property Description
    Authentication Method Authentication method to use:
    • None
    • Username / Password
    • LDAP
    Username User name for the selected authentication method.
    Password Password for the specified user name.
    Tip: To secure sensitive information such as user names and passwords, you can use runtime resources or credential stores.
    Authentication Database Database name associated with the specified user account.

    Available when using username/password authentication.

    Authentication Mechanism Authentication mechanism to use:
    • Default - Data Collector and MongoDB negotiate to choose the authentication mechanism.
    • SCRAM-SHA-1 - Data Collector sends SCRAM-SHA-1 credentials to MongoDB.
    • SCRAM-SHA-256 - Data Collector sends SCRAM-SHA-256 credentials to MongoDB.
  4. On the MongoDB tab, configure the following properties:
    MongoDB Property Description
    Database MongoDB database name. You can use record functions in an expression to define the database name.
    Collection MongoDB collection name. You can use record functions in an expression to define the collection name.
    Write Concern The acknowledgement level requested from the destination system.

    For details about write concern levels, see the MongoDB documentation.

    Unique Keys One or more key fields to use for the write. Specify the following details:
    • Collection Path - Path to the key field in the collection.
    • Incoming Record Path - Path to the corresponding field in the record. When the path matches the collection path, you can leave this field empty.

    Required for update and replace operations. Optional for inserts and deletes.

    For information about specifying paths, see Specifying Field Paths.

    Upsert Inserts records flagged for update or replace when the record does not exist in the database.

    Note that the destination performs upserts using the sdc.operation.type attribute regardless of how this property is set.

    Ordered Write Enables the destination to perform ordered writes and to stop the pipeline when the On Record Error property is set to Stop Pipeline. This can slow performance.

    This property is enabled by default. When disabled, writes are not ordered and the pipeline does not stop when the destination generates an error record.

  5. Optionally, click the Advanced tab to configure how the destination connects to MongoDB.

    The defaults for these properties should work in most cases. If a numeric property is set to 0, then the driver default value is used.

    Advanced Property Description
    Compression Algorithm Compression algorithm to use to communicate with MongoDB:
    • None
    • Snappy
    • ZLib
    • ZStandard

    These compression algorithms are not supported by all MongoDB versions. See the MongoDB documentation for details.

    Default is Snappy.

    Application Name Name to use in MongoDB reporting, such as server logs.
    Maximum Connections Maximum number of open connections allowed in the connection pool.
    Minimum Connections Minimum number of open connections allowed in the connection pool.
    Max Connection Idle Time Maximum idle time in milliseconds before a connection is removed from the connection pool.
    Max Connection Lifetime Maximum lifetime in milliseconds for a connection in the connection pool.
    Max Connection Wait Time Maximum time in milliseconds that a connection waits to connect.
    Socket Connect Timeout Maximum time in milliseconds to wait for a network socket connection.
    Socket Read Timeout Maximum time in milliseconds to wait for a read connection.
    Socket Receive Buffer Size (bytes) Buffer size in bytes for receiving data.
    Socket Send Buffer Size (bytes) Buffer size in bytes for sending data.
    Heartbeat Frequency Milliseconds between Data Collector attempts to determine the current state of each server in the cluster.
    Min Heartbeat Frequency Minimum number of milliseconds between Data Collector checks on the state of each server.
    Server Selection Timeout Maximum time in milliseconds that Data Collector waits for server selection before throwing an exception. If you enter 0, an exception is thrown immediately if no server is available. Use a negative value to wait indefinitely.
    Local Threshold Local threshold in milliseconds. Requests are sent to a server whose ping time is less than or equal to the server with the fastest ping time plus the local threshold value.
    Required Replica Set Name Required replica set name to use for the cluster.
    Enable Single Mode Connects to the first MongoDB server in the connection string.

    Applicable only for MongoDB Enterprise Server clusters.

    Max Number of Retries Maximum number of times to retry the connection when the connection fails.

    Default is 10.

    Retry Interval (ms) Time between retries in milliseconds.

    Default is 10,000.

    Capped Collection Cursor Type Style of cursor to use for a capped collection:
    • Normal
    • Tailable
    • Tailable Await
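
The rule described for the Local Threshold property on the Advanced tab can be sketched as follows. A server is eligible when its ping time is no more than the fastest ping time plus the threshold. This is an illustration of that rule only, not driver code. The function name eligible_servers and the sample ping times are hypothetical.

```python
def eligible_servers(ping_times_ms, local_threshold_ms):
    """Return the servers whose ping time is within the local threshold."""
    if not ping_times_ms:
        return []
    # Eligible when ping <= fastest ping + local threshold.
    cutoff = min(ping_times_ms.values()) + local_threshold_ms
    return sorted(name for name, ping in ping_times_ms.items() if ping <= cutoff)


pings = {"host1": 12, "host2": 20, "host3": 45}
print(eligible_servers(pings, 15))  # ['host1', 'host2']
```

With these sample values the cutoff is 27 milliseconds, so host3 is excluded.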