Upgrade an Installation with Cloudera Manager
When you upgrade an installation with Cloudera Manager, the new version uses the same configuration, data, log, and resource directories. As a result, the new version has access to the files created in the previous version.
Step 1. Stop All Pipelines
Step 2. Back Up the Previous Version
Step 3. Install the StreamSets Custom Service Descriptor
Step 4. Manually Install the Parcel and Checksum Files (Optional)
Step 5. Distribute and Activate the New StreamSets Parcel
Step 6. Verify Modified Safety Valves
Step 7. Restart the StreamSets Service
Step 1. Stop All Pipelines
Stop all pipelines running on the Data Collector to be upgraded.
Use one of the following methods to stop all pipelines:
- If the Data Collector is not registered to work with StreamSets Control Hub, stop the pipelines using the Data Collector UI.
From the Data Collector Home page, select all running pipelines in the list and then click the Stop icon.
- If the Data Collector is registered to work with StreamSets Control Hub, stop all jobs running on the Data Collector using the Control Hub UI.
From the Control Hub Jobs page, filter the jobs by engine and by engine label. Select all active jobs in the list and then click the Stop Jobs icon.
Step 2. Back Up the Previous Version
Before you install the new version, create a backup of the files in the previous version by copying and renaming the configuration, data, and resource directories. That way, you can continue to run the previous version if needed.
Copy and rename the following directories on every Cloudera Manager node that runs Data Collector:
- SDC_CONF - The Data Collector directory for configuration files.
- SDC_DATA - The Data Collector directory for pipeline state and configuration information.
- SDC_RESOURCES - The Data Collector directory for runtime resource files.
- SDC_EXTERNAL_RESOURCES - The Data Collector directory for external resources.
- STREAMSETS_LIBRARIES_EXTRA_DIR - The Data Collector directory for external libraries.
- USER_LIBRARIES_DIR - The Data Collector directory for custom stages.
For example, if you are upgrading version 3.0.0.0, copy the Data Collector configuration directory and rename the copy as follows:
/etc/sdc3000
For more information about these directories, including the default values, see Data Collector Directories.
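The copy-and-rename step can be scripted. The following is a minimal sketch, not an official StreamSets procedure: `backup_dir` is a hypothetical helper that copies a directory to a sibling path with a version suffix, and the commented usage assumes the directory variables are set in your shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a directory to "<dir><suffix>", preserving
# ownership, permissions, and timestamps where possible (cp -a).
backup_dir() {
  local src="${1%/}"   # strip a trailing slash so the suffix attaches to the name
  local suffix="$2"
  local dest="${src}${suffix}"

  if [ ! -d "$src" ]; then
    echo "skip: $src is not a directory" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "skip: $dest already exists" >&2
    return 1
  fi
  cp -a "$src" "$dest"
}

# Example usage (assumes these variables point at the directories listed above):
# for dir in "$SDC_DATA" "$SDC_RESOURCES"; do backup_dir "$dir" 3000; done
```

The helper refuses to overwrite an existing backup, so running it twice does not clobber an earlier copy.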
Step 3. Install the StreamSets Custom Service Descriptor
Install the new StreamSets custom service descriptor file (CSD), and then restart Cloudera Manager.
- Download the CSD from one of the following locations:
- StreamSets Support portal if you have an enterprise account.
- StreamSets archives page if you do not have an enterprise account.
Or, you can use the GNU Wget program to download the CSD from the command line by running the following commands:
export VERSION="5.11.0"
wget https://archives.streamsets.com/datacollector/$VERSION/csd/STREAMSETS-$VERSION.jar
- Remove the previous StreamSets CSD file from Cloudera Manager.
For example:
rm -f /opt/cloudera/csd/STREAMSETS*.jar
- Copy the Data Collector CSD file to the Local Descriptor Repository Path. By default, the path is /opt/cloudera/csd.
To verify the path to use, in the Cloudera Manager settings, select the Custom Service Descriptors category in the navigation panel. Place the CSD file in the path configured for Local Descriptor Repository Path.
- Set the file ownership to cloudera-scm:cloudera-scm with permission 644.
For example:
chown cloudera-scm:cloudera-scm /opt/cloudera/csd/STREAMSETS*.jar
chmod 644 /opt/cloudera/csd/STREAMSETS*.jar
- Use one of the following commands to restart Cloudera Manager Server:
For Ubuntu 14.04, CentOS 6, Red Hat Enterprise Linux 6, or Oracle Linux 6:
service cloudera-scm-server restart
For Ubuntu 16.04, CentOS 7, Red Hat Enterprise Linux 7, or Oracle Linux 7:
systemctl restart cloudera-scm-server
- In Cloudera Manager, restart the Cloudera Management Service. To the right of Cloudera Management Service, click the Menu icon and select Restart.
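Before restarting, it can help to confirm that the CSD file ended up with the expected ownership and permissions. The sketch below is illustrative only: `check_owner_mode` is a hypothetical helper, and it assumes GNU `stat` (the `-c` format option), which is standard on the Linux distributions named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE has the expected owner:group and mode.
# Relies on GNU stat (-c); BSD/macOS stat uses different flags.
check_owner_mode() {
  local file="$1" want_owner="$2" want_mode="$3"
  local actual
  actual="$(stat -c '%U:%G %a' "$file")" || return 1
  if [ "$actual" != "$want_owner $want_mode" ]; then
    echo "mismatch for $file: found '$actual', expected '$want_owner $want_mode'" >&2
    return 1
  fi
}

# Example usage with the default path and values from the steps above:
# for f in /opt/cloudera/csd/STREAMSETS*.jar; do
#   check_owner_mode "$f" cloudera-scm:cloudera-scm 644
# done
```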
Step 4. Manually Install the Parcel and Checksum Files (Optional)
You can manually install the StreamSets parcel and related checksum files. Manually install the files when the Cloudera Manager Server does not have internet access.
When working with multiple clusters, perform the following steps for each cluster.
- Download the StreamSets parcel and related checksum file for the Cloudera Manager Server operating system.
- Copy the StreamSets parcel and checksum file to the Cloudera Manager Local Parcel Repository Path. By default, the path is /opt/cloudera/parcel-repo.
To verify the path to use, in the Cloudera Manager settings, select the Parcels category in the navigation panel. Place the StreamSets parcel file in the path configured for Local Parcel Repository Path.
- Change ownership on the parcel and checksum file to the user that runs the Cloudera Manager process.
For example, if the Cloudera Manager process runs as the cloudera-scm user, use the following command to change ownership to cloudera-scm:
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/STREAMSETS_DATACOLLECTOR*
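When copying files by hand, you may want to confirm that the parcel matches its checksum file before Cloudera Manager reads it. The following sketch rests on an assumption: that the checksum file holds a hexadecimal SHA-1 digest as its first whitespace-separated field. If your checksum file uses another algorithm, substitute the matching tool (for example, `sha256sum`). `verify_checksum` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a file's SHA-1 digest with the digest stored in
# a checksum file. Assumes the digest is the first field on the first line.
verify_checksum() {
  local parcel="$1" checksum_file="$2"
  local expected actual
  expected="$(awk 'NR==1 {print tolower($1)}' "$checksum_file")" || return 1
  actual="$(sha1sum "$parcel" | awk '{print tolower($1)}')" || return 1
  # Fail on an empty expected digest as well as on a mismatch.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example usage (paths are placeholders):
# verify_checksum /path/to/parcel-file /path/to/checksum-file && echo "checksum OK"
```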
Step 5. Distribute and Activate the New StreamSets Parcel
After you add the StreamSets repository to Cloudera Manager, you can download and distribute the new StreamSets parcel across the cluster. Stop the StreamSets service and deactivate the previous parcel before you activate the new parcel.
- To view the list of available parcels, in the menu bar, click the Parcels icon.
The new StreamSets parcel displays in the list of available parcels. If it doesn't display, click Check for New Parcels.
- To download the new StreamSets parcel to the local repository, click Download.
After the parcel is downloaded, the Download button becomes the Distribute button.
- To distribute the new StreamSets parcel to the cluster, click Distribute.
- Stop the StreamSets service.
- Click the Parcels icon to return to the Parcels page.
- To deactivate the previous StreamSets parcel, choose the appropriate cluster in the Location selector, and then click Deactivate for the parcel.
- To activate the new StreamSets parcel, choose the appropriate cluster in the Location selector, and then click Activate for the parcel.
Step 6. Verify Modified Safety Valves
When you upgrade, Cloudera Manager updates the Data Collector configuration properties for you. However, if you modified any of the Advanced Configuration Snippet (Safety Valve) properties in Cloudera Manager for the previous Data Collector version, those values override any property settings in the new configuration files.
You must compare the new configuration files shipped with the parcel in /opt/cloudera/parcels/STREAMSETS with your modified safety valves and update the safety valves as needed to include any new properties.
For example, if you used the Data Collector Advanced Configuration Snippet (Safety Valve) for sdc.properties to override the system.stagelibs.blacklist property, you must add any new stage libraries listed in the blacklist property in the new sdc.properties file to the overridden property in the safety valve.
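The blacklist comparison in the example above can be sketched in shell. This is an illustration under stated assumptions: you have saved the safety valve text from Cloudera Manager to a local file, the property value sits on a single line (no backslash continuations), and the helper names `list_property_entries` and `missing_entries` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the comma-separated value of KEY in FILE as one
# entry per line, trimmed, sorted, and de-duplicated. Only the first line that
# starts with "KEY=" is used; single-line values only.
list_property_entries() {
  local file="$1" key="$2"
  awk -v key="$key" '
    index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }
  ' "$file" \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -v '^$' \
    | sort -u
}

# Hypothetical helper: print entries listed under KEY in NEW_FILE that are
# absent from the same key in OVERRIDE_FILE.
missing_entries() {
  local new_file="$1" override_file="$2" key="$3"
  local tmp
  tmp="$(mktemp)" || return 1
  list_property_entries "$override_file" "$key" > "$tmp"
  # comm -23 keeps lines that appear only in the first (new) list.
  list_property_entries "$new_file" "$key" | comm -23 - "$tmp"
  rm -f "$tmp"
}

# Example usage (both paths are placeholders):
# missing_entries /path/to/new/sdc.properties /path/to/saved-safety-valve.txt \
#   system.stagelibs.blacklist
```

Any entries printed are candidates to add to the overridden property in the safety valve.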
Step 7. Restart the StreamSets Service
When you restart the StreamSets service, Cloudera Manager updates the Data Collector configuration properties for you. Cloudera Manager retains any customized values that you added in the previous Data Collector version. It also adds any new properties included in the new Data Collector version.
To restart the StreamSets service, use the restart action for the service in Cloudera Manager.