Hive

Available when using an authoring Data Collector version 5.0.0 or later.

To create a Hive connection, one of the following stage libraries must be installed on the selected authoring Data Collector:

Cloudera CDP, streamsets-datacollector-cdp_<version>-lib
MapR with MEP, streamsets-datacollector-mapr_<version>-mep<version>-lib

For a description of the Hive connection properties, see Hive Connection Properties.

After you create a Hive connection, you can use the connection in the following stages:


Engine	Stages
Data Collector 5.0.0 or later	Hive Metadata processor, Hive Metastore destination, Hive Query executor

Hive Connection Properties

When creating a Hive connection, configure the following properties on the Hive tab:


Hive Property	Description
JDBC URL	JDBC URL for Hive. For details about specifying the URL, see this informative community post. You can optionally include the user name and password in the JDBC URL. If the password contains special characters, you must URL-encode (percent-encode) those characters. Otherwise, errors occur when you validate or run the pipeline. For example, if your JDBC URL looks like this: `jdbc:hive2://sunnyvale:12345/default;user=admin;password=a#b!c$e`, URL-encode the password so that the JDBC URL looks like this: `jdbc:hive2://sunnyvale:12345/default;user=admin;password=a%23b%21c%24e`. Tip: To secure sensitive information, you can use credential stores or runtime resources.
JDBC Driver Name	The fully qualified JDBC driver name. Before using an Impala JDBC driver for the Hive Query executor, install the driver as an external library for the stage library used by the executor. For more information, see Installing the Impala Driver in the Data Collector documentation.
Use Credentials	Enables entering credentials in the Username and Password properties. Use when you do not include credentials in the JDBC URL. Note: To impersonate the current user in connections to Hive, you can edit the Data Collector configuration properties so that Data Collector automatically impersonates the user without credentials specified in the pipeline. See Configuring Data Collector in the Data Collector documentation.
Username	User name for the JDBC connection. The user account must have the correct permissions or privileges in the database.
Password	Password for the JDBC user name. Tip: To secure sensitive information, you can use credential stores or runtime resources.
Additional JDBC Configuration Properties	Additional JDBC configuration properties to pass to the JDBC driver. Using simple or bulk edit mode, click Add to add a property, then define the property name and value. Use the property names and values as expected by the JDBC driver.
Hadoop Configuration Directory	Absolute path to the directory containing the following Hive and Hadoop configuration files: core-site.xml, hdfs-site.xml, and hive-site.xml. Note: Properties in the configuration files are overridden by individual properties defined in the Additional Hadoop Configuration property.
Additional Hadoop Configuration	Additional Hadoop configuration properties to use. Using simple or bulk edit mode, click Add to add a property, then define the property name and value. Use the property names and values as expected by HDFS and Hive.
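
The URL-encoding requirement described for the JDBC URL property can be sketched with a short script. This is a minimal illustration that uses only the Python standard library, not part of Data Collector. The helper names `encode_jdbc_password` and `build_hive_jdbc_url` are hypothetical, and the host, port, database, user, and password are the sample values from the JDBC URL description.

```python
from urllib.parse import quote


def encode_jdbc_password(password: str) -> str:
    """Percent-encode every reserved or special character in a password.

    safe="" makes quote() encode all characters except letters, digits,
    and "_", ".", "-", "~".
    """
    return quote(password, safe="")


def build_hive_jdbc_url(host: str, port: int, database: str,
                        user: str, password: str) -> str:
    """Assemble a Hive JDBC URL that carries credentials in the URL.

    Hypothetical helper for illustration only. It follows the
    jdbc:hive2://host:port/db;user=...;password=... shape shown in the
    JDBC URL property description.
    """
    return (
        f"jdbc:hive2://{host}:{port}/{database}"
        f";user={user};password={encode_jdbc_password(password)}"
    )


if __name__ == "__main__":
    # Sample values taken from the JDBC URL property description.
    print(build_hive_jdbc_url("sunnyvale", 12345, "default", "admin", "a#b!c$e"))
```

With the sample values, the script prints the encoded URL from the property description. Where possible, credential stores or runtime resources are likely a better fit than placing a password in the URL.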