Cloudera Administration
Important Notice
© 2010-2018 Cloudera, Inc. All rights reserved.
Cloudera, the Cloudera logo, and any other product or service names or slogans contained
in this document are trademarks of Cloudera and its suppliers or licensors, and may not
be copied, imitated or used, in whole or in part, without the prior written permission
of Cloudera or the applicable trademark holder.
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software
Foundation. All other trademarks, registered trademarks, product names and company
names or logos mentioned in this document are the property of their respective owners.
Reference to any products, services, processes or other information, by trade name,
trademark, manufacturer, supplier or otherwise does not constitute or imply
endorsement, sponsorship or recommendation thereof by us.
Complying with all applicable copyright laws is the responsibility of the user. Without
limiting the rights under copyright, no part of this document may be reproduced, stored
in or introduced into a retrieval system, or transmitted in any form or by any means
(electronic, mechanical, photocopying, recording, or otherwise), or for any purpose,
without the express written permission of Cloudera.
The information in this document is subject to change without notice. Cloudera shall
not be liable for any damages resulting from technical errors or omissions which may
be present in this document, or from use of this document.
Cloudera, Inc.
395 Page Mill Road
Palo Alto, CA 94306
[email protected]
US: 1-888-789-1488
Intl: 1-650-362-0488
www.cloudera.com
Release Information
Resource Management........................................................................................236
Schedulers........................................................................................................................................................236
Cloudera Manager Resource Management.....................................................................................................236
Linux Control Groups (cgroups)........................................................................................................................238
Resource Management with Control Groups.....................................................................................................................240
Configuring Resource Parameters......................................................................................................................................241
Static Service Pools...........................................................................................................................................242
Dynamic Resource Pools..................................................................................................................................243
Managing Dynamic Resource Pools...................................................................................................................................244
YARN Pool Status and Configuration Options....................................................................................................................246
Assigning Applications and Queries to Resource Pools......................................................................................................247
Configuration Sets.............................................................................................................................................................249
Scheduling Rules................................................................................................................................................................250
Managing Impala Admission Control...............................................................................................................251
Managing the Impala Llama ApplicationMaster..............................................................................................253
Enabling Integrated Resource Management Using Cloudera Manager............................................................................254
Disabling Integrated Resource Management Using Cloudera Manager...........................................................................255
Configuring Llama Using Cloudera Manager.....................................................................................................................255
Impala Resource Management........................................................................................................................255
Admission Control and Query Queuing..............................................................................................................................255
Integrated Resource Management with YARN...................................................................................................................263
Performance Management...................................................................................265
Optimizing Performance in CDH.......................................................................................................................265
Choosing a Data Compression Format.............................................................................................................268
Tuning the Solr Server......................................................................................................................................269
Tuning to Complete During Setup......................................................................................................................................269
General Tuning...................................................................................................................................................................269
Other Resources.................................................................................................................................................................275
Tuning Spark Applications................................................................................................................................275
Tuning YARN.....................................................................................................................................................282
Overview............................................................................................................................................................................282
Cluster Configuration.........................................................................................................................................................286
YARN Configuration...........................................................................................................................................................287
MapReduce Configuration.................................................................................................................................................288
Step 7: MapReduce Configuration.....................................................................................................................................288
Step 7A: MapReduce Sanity Checking................................................................................................................................289
Configuring Your Cluster In Cloudera Manager.................................................................................................................289
High Availability...................................................................................................291
HDFS High Availability......................................................................................................................................291
Introduction to HDFS High Availability...............................................................................................................................291
Configuring Hardware for HDFS HA...................................................................................................................................292
Enabling HDFS HA..............................................................................................................................................................293
Disabling and Redeploying HDFS HA..................................................................................................................................305
Configuring Other CDH Components to Use HDFS HA.......................................................................................................306
Administering an HDFS High Availability Cluster...............................................................................................................309
Changing a Nameservice Name for Highly Available HDFS Using Cloudera Manager......................................................313
MapReduce (MRv1) and YARN (MRv2) High Availability..................................................................................313
YARN (MRv2) ResourceManager High Availability.............................................................................................................314
Work Preserving Recovery for YARN Components.............................................................................................................321
MapReduce (MRv1) JobTracker High Availability..............................................................................................................323
Cloudera Navigator Key Trustee Server High Availability.................................................................................335
Configuring Key Trustee Server High Availability Using Cloudera Manager......................................................................335
Configuring Key Trustee Server High Availability Using the Command Line......................................................................336
Recovering a Key Trustee Server........................................................................................................................................338
Key Trustee KMS High Availability....................................................................................................................338
High Availability for Other CDH Components...................................................................................................339
HBase High Availability......................................................................................................................................................339
Hive Metastore High Availability.......................................................................................................................................344
Hue High Availability .........................................................................................................................................................346
Llama High Availability......................................................................................................................................................349
Configuring Oozie for High Availability..............................................................................................................................350
Search High Availability.....................................................................................................................................................351
Configuring Cloudera Manager for High Availability With a Load Balancer.....................................................353
Introduction to Cloudera Manager Deployment Architecture...........................................................................................353
Prerequisites for Setting up Cloudera Manager High Availability......................................................................................354
High-Level Steps to Configure Cloudera Manager High Availability .................................................................................355
Database High Availability Configuration..........................................................................................................................381
TLS and Kerberos Configuration for Cloudera Manager High Availability.........................................................................382
Managing CDH and Managed Services
Configuration Overview
When Cloudera Manager configures a service, it allocates roles that are required for that service to the hosts in your
cluster. The role determines which service daemons run on a host.
For example, for an HDFS service instance, Cloudera Manager configures:
• One host to run the NameNode role.
• One host to run the SecondaryNameNode role.
• One host to run the Balancer role.
• The remaining hosts to run DataNode roles.
A role group is a set of configuration properties for a role type, as well as a list of role instances associated with that
group. Cloudera Manager automatically creates a default role group named Role Type Default Group for each role
type.
When you run the installation or upgrade wizard, Cloudera Manager configures the default role groups it adds, and
adds any other required role groups for a given role type. For example, a DataNode role on the same host as the
NameNode might require a different configuration than DataNode roles running on other hosts. Cloudera Manager
creates a separate role group for the DataNode role running on the NameNode host and uses the default configuration
for DataNode roles running on other hosts.
Cloudera Manager wizards autoconfigure role group properties based on the resources available on the hosts. For
properties that are not dependent on host resources, Cloudera Manager default values typically align with CDH default
values for that configuration. Cloudera Manager deviates when the CDH default is not a recommended configuration
or when the default values are illegal.
New layout pages contain controls that allow you to filter configuration properties based on configuration status, category, and group. For example, to display the JournalNode maximum log size property (JournalNode Max Log Size), apply the CATEGORY > JournalNode and GROUP > Logs filters.
When a configuration property has been set to a value different from the default, a reset to default value icon
displays.
Classic layout pages are organized by role group and categories within the role group. For example, to display the
JournalNode maximum log size property (JournalNode Max Log Size), select JournalNode Default Group > Logs.
When a configuration property has been set to a value different from the default, a Reset to the default value link
displays.
There is no mechanism for resetting to an autoconfigured value. However, you can use the configuration history and
rollback feature to revert any configuration changes.
Note:
This topic discusses how to configure properties using the Cloudera Manager "new layout." The older
layout, called the "classic layout" is still available. For instructions on using the classic layout, see
Modifying Configuration Properties (Classic Layout) on page 15.
To switch between the layouts, click either the Switch to the new layout or Switch to the classic
layout links in the upper-right portion of all configuration pages.
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
When a service is added to Cloudera Manager, either through the installation or upgrade wizard or with the Add
Services workflow, Cloudera Manager automatically sets the configuration properties, based on the needs of the service
and characteristics of the cluster in which it will run. These configuration properties include both service-wide
configuration properties, as well as specific properties for each role type associated with the service, managed through
role groups. A role group is a set of configuration properties for a role type, as well as a list of role instances associated
with that group. Cloudera Manager automatically creates a default role group named Role Type Default Group for
each role type. See Role Groups on page 48.
Changing the Configuration of a Service or Role Instance
1. Go to the service status page. (Cluster > service name)
2. Click the Configuration tab.
3. Locate the property you want to edit. You can type all or part of the property name in the search box, or use the
filters on the left side of the screen:
• The Status section limits the displayed properties by their status. Possible statuses include:
• Error
• Warning
• Edited
• Non-default
• Has Overrides
• The Scope section of the left hand panel organizes the configuration properties by role types; first those that are Service-Wide, followed by the various role types within the service. When you select one of these roles, the set of properties whose values are managed by the default role group for that role displays. Any additional role groups that apply to the property also appear in this panel, and you can modify values for each role group just as you can for the default role group.
• The Category section of the left hand panel allows you to limit the displayed properties by category.
2. Click the ... and 6 others link to display all of the role groups:
3. Click the Show fewer link to collapse the list of role groups.
If you edit the single value for this property, Cloudera Manager applies the value to all role groups. To edit
the values for one or more of these role groups individually, click Edit Individual Values. Individual fields
display where you can edit the values for each role group. For example:
5. Click Save Changes to commit the changes. You can add a note that is included with the change in the Configuration
History. This changes the setting for the role group, and applies to all role instances associated with that role
group. Depending on the change you made, you may need to restart the service or roles associated with the
configuration you just changed. Or, you may need to redeploy your client configuration for the service. You should
see a message to that effect at the top of the Configuration page, and services will display an outdated configuration
(Restart Needed), (Refresh Needed), or outdated client configuration indicator. Click the indicator to display
the Stale Configurations on page 28 page.
Searching for Properties
You can use the Search box to search for properties by name or label. The search also returns properties whose
description matches your search term.
Validation of Configuration Properties
Cloudera Manager validates the values you specify for configuration properties. If you specify a value that is outside
the recommended range of values or is invalid, Cloudera Manager displays a warning at the top of the Configuration
tab and in the text box after you click Save Changes. The warning is yellow if the value is outside the recommended
range of values and red if the value is invalid.
Overriding Configuration Properties
For role types that allow multiple instances, each role instance inherits its configuration properties from its associated
role group. While role groups provide a convenient way to provide alternate configuration properties for selected
groups of role instances, there may be situations where you want to make a one-off configuration change—for example
when a host has malfunctioned and you want to temporarily reconfigure it. In this case, you can override configuration
properties for a specific role instance:
1. Go to the Status page for the service whose role you want to change.
2. Click the Instances tab.
3. Click the role instance you want to change.
4. Click the Configuration tab.
5. Change the configuration values as appropriate.
6. Save your changes.
You will most likely need to restart your service or role to have your configuration changes take effect. See Stale
Configuration Actions on page 29.
Viewing and Editing Overridden Configuration Properties
To see a list of all role instances that have an override value for a particular configuration setting, go to the Status page
for the service and select Status > Has overrides. A list of configuration properties where values have been overridden
displays. The panel for each configuration property displays the values for each role group or instance. You can edit
the value of this property for this instance, or you can click the reset icon. The default value is inserted and the icon changes to an Undo icon.
Explicitly setting a configuration to the same value as its default (inherited value) has the same effect as using the reset icon.
There is no mechanism for resetting to an autoconfigured value. However, you can use the configuration history and
rollback feature to revert any configuration changes.
Viewing and Editing Host Overrides
You can override the properties of individual hosts in your cluster.
1. Click the Hosts tab.
2. Click the Configuration tab.
3. Use the Filters or Search box to locate the property that you want to override.
4. Click the Manage Host Overrides link.
A dialog box opens where you can enter a comment about the suppression.
2. Click Confirm.
You can also suppress warnings from the All Configuration Issues screen:
1. Browse to the Home screen.
2. Click Configurations > Configuration Issues.
3. Locate the validation message in the list and click the Suppress... link.
A dialog box opens where you can enter a comment about the suppression.
4. Click Confirm.
The suppressed validation warning is now hidden.
Note: As of Cloudera Manager version 5.2, a new layout of the pages where you configure Cloudera
Manager system properties was introduced. In Cloudera Manager version 5.4, this new layout displays
by default. This topic discusses how to configure properties using the older layout, called the "Classic
Layout". For instructions on using the new layout, see Modifying Configuration Properties Using
Cloudera Manager on page 10.
To switch between the layouts, click either the Switch to the new layout or Switch to the classic
layout links in the upper-right portion of all configuration pages.
When a service is added to Cloudera Manager, either through the installation or upgrade wizard or with the Add
Services workflow, Cloudera Manager automatically sets the configuration properties, based on the needs of the service
and characteristics of the cluster in which it will run. These configuration properties include both service-wide
configuration properties, as well as specific properties for each role type associated with the service, managed through
role groups. A role group is a set of configuration properties for a role type, as well as a list of role instances associated
with that group. Cloudera Manager automatically creates a default role group named Role Type Default Group for
each role type. See Role Groups on page 48.
Changing the Configuration of a Service or Role Instance (Classic Layout)
1. Go to the service status page.
2. Click the Configuration tab.
3. Under the appropriate role group, select the category for the properties you want to change.
4. To search for a text string (such as "snippet"), in a property, value, or description, enter the text string in the
Search box at the top of the category list.
5. Moving the cursor over the value cell highlights the cell; click anywhere in the highlighted area to enable editing
of the value. Then type the new value in the field provided (or check or uncheck the box, as appropriate).
• To facilitate entering some types of values, you can specify not only the value, but also the units that apply
to the value. For example, to enter a setting that specifies bytes per second, you can choose to enter the
value in bytes (B), KiBs, MiBs, or GiBs—selected from a drop-down menu that appears when you edit the
value.
• If the property allows a list of values, click the icon to the right of the edit field to add an additional field.
An example of this is the HDFS DataNode Data Directory property, which can have a comma-delimited list of
directories as its value. To remove an item from such a list, click the icon to the right of the field you want
to remove.
6. Click Save Changes to commit the changes. You can add a note that will be included with the change in the
Configuration History. This will change the setting for the role group, and will apply to all role instances associated
with that role group. Depending on the change you made, you may need to restart the service or roles associated
with the configuration you just changed. Or, you may need to redeploy your client configuration for the service.
You should see a message to that effect at the top of the Configuration page, and services will display an outdated
configuration (Restart Needed), (Refresh Needed), or outdated client configuration indicator. Click the
indicator to display the Stale Configurations on page 28 page.
To view the override values, and change them if appropriate, click the Edit Overrides link. This opens the Edit Overrides
page, and lists the role instances that have override properties for the selected configuration setting.
1. Follow the instructions in Restarting a Service on page 41 or Starting, Stopping, and Restarting Role Instances on
page 46.
2. If you see a Finished status, the service or role instances have restarted.
3. Go to the Home > Status tab. The service should show a Status of Started for all instances and a health status of
Good.
For further information, see Stale Configurations on page 28.
Autoconfiguration
Cloudera Manager provides several interactive wizards to automate common workflows:
• Installation - used to bootstrap a Cloudera Manager deployment
• Add Cluster - used when adding a new cluster
• Add Service - used when adding a new service
• Upgrade - used when upgrading to a new version of CDH
• Static Service Pools - used when configuring static service pools
• Import MapReduce - used when migrating from MapReduce to YARN
In some of these wizards, Cloudera Manager uses a set of rules to automatically configure certain settings to best suit
the characteristics of the deployment. For example, the number of hosts in the deployment drives the memory
requirements for certain monitoring daemons: the more hosts, the more memory is needed. Additionally, wizards that
are tasked with creating new roles will use a similar set of rules to determine an ideal host placement for those roles.
Scope
The following table shows, for each wizard, the scope of entities it affects during autoconfiguration and role-host
placement.
Certain autoconfiguration rules are unscoped, that is, they configure settings belonging to entities that aren't necessarily
the entities under the wizard's scope. These exceptions are explicitly listed.
Autoconfiguration
Cloudera Manager employs several different rules to drive automatic configuration, with some variation from wizard
to wizard. These rules range from the simple to the complex.
Configuration Scope
One of the points of complexity in autoconfiguration is configuration scope. The configuration hierarchy as it applies
to services is as follows: configurations may be modified at the service level (affecting every role in the service), role
group level (affecting every role instance in the group), or role level (affecting one role instance). A configuration found
in a lower level takes precedence over a configuration found in a higher level.
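The precedence order can be illustrated with a short sketch. The dictionaries and the property name below are hypothetical stand-ins, not Cloudera Manager data structures:

# A minimal sketch of the precedence described above: a value set at a lower
# level (role instance) overrides one set at a higher level (role group),
# which in turn overrides the service-level value.

def effective_value(prop, service_cfg, role_group_cfg, role_cfg):
    """Return the value that takes effect for a single role instance."""
    for level in (role_cfg, role_group_cfg, service_cfg):  # lowest level first
        if prop in level:
            return level[prop]
    return None  # fall back to the default shipped with CDH

# Example: the role group overrides the service-wide value, and one role
# instance overrides its group.
service_cfg    = {"log_dir": "/var/log/hadoop-hdfs"}
role_group_cfg = {"log_dir": "/data/logs/hdfs"}
role_cfg       = {"log_dir": "/tmp/hdfs-debug-logs"}

print(effective_value("log_dir", service_cfg, role_group_cfg, role_cfg))
# -> /tmp/hdfs-debug-logs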
With the exception of the Static Service Pools and Import MapReduce wizards, all Cloudera Manager wizards follow a basic pattern:
1. Every role in scope is moved into its own, new, role group.
2. This role group is the receptacle for the role's "idealized" configuration. Much of this configuration is driven by
properties of the role's host, which can vary from role to role.
3. Once autoconfiguration is complete, new role groups with common configurations are merged.
4. The end result is a smaller set of role groups, each with an "idealized" configuration for some subset of the roles
in scope. A subset can have any number of roles; perhaps all of them, perhaps just one, and so on.
The Static Service Pools and Import MapReduce wizards configure role groups directly and do not perform any merging.
Static Service Pools
Certain rules are only invoked in the context of the Static Service Pools wizard. Additionally, the wizard autoconfigures
cgroup settings for certain kinds of roles:
• HDFS DataNodes
• HBase RegionServers
• MapReduce TaskTrackers
• YARN NodeManagers
• Impala Daemons
• Solr Servers
• Spark Standalone Workers
• Accumulo Tablet Servers
• Add-on services
YARN
yarn.nodemanager.resource.cpu-vcores - For each NodeManager role group, set to number of cores,
including hyperthreads, on one NodeManager member's host * service percentage chosen in
wizard.
All Services
Cgroup cpu.shares - For each role group that supports cpu.shares, set to max(20, (service percentage
chosen in wizard) * 20).
Cgroup blkio.weight - For each role group that supports blkio.weight, set to max(100, (service percentage
chosen in wizard) * 10).
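As an illustration, these formulas can be transcribed as the following sketch. It assumes the service percentage chosen in the wizard is a whole-number percent (0-100); the text above does not state the units, so treat the numbers as illustrative only.

def nodemanager_vcores(cores_including_hyperthreads, service_pct):
    # yarn.nodemanager.resource.cpu-vcores: cores on the host scaled by the
    # share of the host given to this service in the wizard.
    return int(cores_including_hyperthreads * service_pct / 100)

def cgroup_cpu_shares(service_pct):
    # Cgroup cpu.shares, for role groups that support it: max(20, pct * 20).
    return max(20, service_pct * 20)

def cgroup_blkio_weight(service_pct):
    # Cgroup blkio.weight, for role groups that support it: max(100, pct * 10).
    return max(100, service_pct * 10)

# Example: a 16-core (with hyperthreads) NodeManager host, 50% given to YARN.
print(nodemanager_vcores(16, 50))   # 8
print(cgroup_cpu_shares(50))        # 1000
print(cgroup_blkio_weight(50))      # 500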
Data Directories
Several autoconfiguration rules work with data directories, and there's a common sub-rule used by all such rules to
determine, out of all the mountpoints present on a host, which are appropriate for data. The subrule works as follows:
• The initial set of mountpoints for a host includes all those that are disk-backed. Network-backed mountpoints are
excluded.
• Mountpoints beginning with /boot, /cdrom, /usr, /tmp, /home, or /dev are excluded.
• Mountpoints beginning with /media are excluded, unless the backing device's name contains /xvd somewhere
in it.
• Mountpoints beginning with /var are excluded, unless they are /var or /var/lib.
• The largest mount point (in terms of total space, not available space) is determined.
• Other mountpoints with less than 1% total space of the largest are excluded.
• Mountpoints beginning with /var or equal to / are excluded unless they’re the largest mount point.
• Remaining mountpoints are sorted lexicographically and retained for future use.
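A rough sketch of this sub-rule follows. Each mountpoint is a (path, total size in bytes, backing device) tuple and is assumed to already be disk-backed, since network-backed mounts are excluded up front; the example values are made up.

EXCLUDED_PREFIXES = ("/boot", "/cdrom", "/usr", "/tmp", "/home", "/dev")

def data_mountpoints(mounts):
    candidates = []
    for path, total, device in mounts:
        if path.startswith(EXCLUDED_PREFIXES):
            continue
        if path.startswith("/media") and "/xvd" not in device:
            continue
        if path.startswith("/var") and path not in ("/var", "/var/lib"):
            continue
        candidates.append((path, total))

    largest_path, largest_total = max(candidates, key=lambda m: m[1])

    selected = []
    for path, total in candidates:
        if total < largest_total * 0.01:
            continue            # less than 1% of the largest mount point
        if (path.startswith("/var") or path == "/") and path != largest_path:
            continue            # /var and / survive only if they are the largest
        selected.append(path)
    return sorted(selected)     # lexicographic order, as in the last bullet

mounts = [
    ("/",        100 * 2**30, "/dev/sda1"),
    ("/var",      50 * 2**30, "/dev/sda2"),
    ("/data/1",    4 * 2**40, "/dev/sdb"),
    ("/data/2",    4 * 2**40, "/dev/sdc"),
    ("/boot",      1 * 2**30, "/dev/sda3"),
]
print(data_mountpoints(mounts))   # ['/data/1', '/data/2']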
Memory
The rules used to autoconfigure memory reservations are perhaps the most complicated rules employed by Cloudera
Manager. When configuring memory, Cloudera Manager must take into consideration which roles are likely to enjoy
more memory, and must not overcommit hosts if at all possible. To that end, it needs to consider each host as an
entire unit, partitioning its available RAM into segments, one segment for each role. To make matters worse, some
roles have more than one memory segment. For example, a Solr server has two memory segments: a JVM heap used
for most memory allocation, and a JVM direct memory pool used for HDFS block caching. Here is the overall flow during
memory autoconfiguration:
1. The set of participants includes every host under scope as well as every {role, memory segment} pair on those
hosts. Some roles are under scope while others are not.
2. For each {role, segment} pair where the role is under scope, a rule is run to determine four different values for
that pair:
• Minimum memory configuration. Cloudera Manager must satisfy this minimum, possibly over-committing
the host if necessary.
• Minimum memory consumption. Like the above, but possibly scaled to account for inherent overhead. For
example, JVM memory values are multiplied by 1.3 to arrive at their consumption value.
• Ideal memory configuration. If RAM permits, Cloudera Manager will provide the pair with all of this memory.
• Ideal memory consumption. Like the above, but scaled if necessary.
3. For each {role, segment} pair where the role is not under scope, a rule is run to determine that pair's existing
memory consumption. Cloudera Manager will not configure this segment but will take it into consideration by
setting the pair's "minimum" and "ideal" to the memory consumption value.
4. For each host, the following steps are taken:
a. 20% of the host's available RAM is subtracted and reserved for the OS.
b. sum(minimum_consumption) and sum(ideal_consumption) are calculated.
c. An "availability ratio" is built by comparing the two sums against the host's available RAM.
a. If RAM < sum(minimum), ratio = 0.
b. If RAM >= sum(ideal), ratio = 1.
c. Otherwise, ratio = (RAM - sum(minimum)) / (sum(ideal) - sum(minimum)).
5. For each {role, segment} pair where the role is under scope, the segment is configured to be (minimum + ((ideal
- minimum) * (host availability ratio))). The value is rounded down to the nearest megabyte.
6. The {role, segment} pair is set with the value from the previous step. In the Static Service Pools wizard, the role
group is set just once (as opposed to each role).
7. Custom post-configuration rules are run.
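The host-level arithmetic in steps 4 and 5 can be sketched as follows for a single host. The segment names and numbers are illustrative, and the real implementation handles cases this sketch ignores:

MB = 1024 * 1024

def configure_host(host_ram, segments):
    usable = host_ram * 0.8                     # step 4a: reserve 20% for the OS
    min_sum   = sum(s["min_consumption"] for s in segments)
    ideal_sum = sum(s["ideal_consumption"] for s in segments)

    if usable < min_sum:                        # step 4c: availability ratio
        ratio = 0.0
    elif usable >= ideal_sum:
        ratio = 1.0
    else:
        ratio = (usable - min_sum) / (ideal_sum - min_sum)

    configured = {}
    for s in segments:                          # step 5: minimum + (ideal - minimum) * ratio
        value = s["min_config"] + (s["ideal_config"] - s["min_config"]) * ratio
        configured[s["name"]] = int(value // MB) * MB   # rounded down to the nearest MB
    return configured

segments = [
    {"name": "datanode_heap",    "min_config": 256 * MB, "ideal_config": 4096 * MB,
     "min_consumption": 333 * MB, "ideal_consumption": 5325 * MB},
    {"name": "nodemanager_heap", "min_config": 256 * MB, "ideal_config": 2048 * MB,
     "min_consumption": 333 * MB, "ideal_consumption": 2662 * MB},
]
print(configure_host(16 * 1024 * MB, segments))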
Customization rules are applied in steps 2, 3 and 7. In step 2, there's a generic rule for most cases, as well as a series
of custom rules for certain {role, segment} pairs. Likewise, there's a generic rule to calculate memory consumption in
step 3 as well as some custom consumption functions for certain {role, segment} pairs.
Step 2 Generic Rule Excluding Static Service Pools Wizard
For every {role, segment} pair where the segment defines a default value, the pair's minimum is set to the segment's
minimum value (or 0 if undefined), and the ideal is set to the segment's default value.
Step 2 Custom Rules Excluding Static Service Pools Wizard
HDFS
For the NameNode and Secondary NameNode JVM heaps, the minimum is 50 MB and the ideal is max(1 GB,
sum_over_all(DataNode mountpoints’ available space) / 0.000008).
MapReduce
For the JobTracker JVM heap, the minimum is 50 MB and the ideal is max(1 GB, round((1 GB * 2.3717181092
* ln(number of TaskTrackers in MapReduce service)) - 2.6019933306)). If there are <=5 TaskTrackers,
the ideal is 1 GB.
For the mapper JVM heaps, the minimum is 1 and the ideal is (number of cores, including hyperthreads, on the
TaskTracker host). Note that memory consumption is scaled by mapred_child_java_opts_max_heap (the size of
a given task's heap).
For the reducer JVM heaps, the minimum is 1 and the ideal is (number of cores, including hyperthreads, on the
TaskTracker host) / 2. Note that memory consumption is scaled by mapred_child_java_opts_max_heap (the size
of a given task's heap).
YARN
For the memory total allowed for containers, the minimum is 1 GB and the ideal is min(8 GB, (total RAM on
NodeManager host) * 0.8).
Hue
With the exception of the Beeswax Server (present only in CDH 4), Hue roles don’t have memory limits. Therefore,
Cloudera Manager treats them as roles that consume a fixed amount of memory by setting their minimum and ideal
consumption values, but not their configuration values. The two consumption values are set to 256 MB.
Impala
With the exception of the Impala Daemon, Impala roles don’t have memory limits. Therefore Cloudera Manager treats
them as roles that consume a fixed amount of memory by setting their minimum/ideal consumption values, but not
their configuration values. The two consumption values are set to 150 MB for the Catalog Server and 64 MB for the
StateStore.
For the Impala Daemon memory limit, the minimum is 256 MB and the ideal is (total RAM on daemon host) *
0.64.
Solr
For the Solr Server JVM heap, the minimum is 50 MB and the ideal is min(64 GB, (total RAM on Solr Server
host) * 0.64) / 2.6. For the Solr Server JVM direct memory segment, the minimum is 256 MB and the ideal is
min(64 GB, (total RAM on Solr Server host) * 0.64) / 2.
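A few of these formulas can be transcribed as the sketch below. The unit handling is an assumption on my part (the JobTracker result is treated as gigabytes, the others as bytes); the source leaves it implicit, so treat the numbers as illustrative.

import math

GB = 1024 ** 3

def jobtracker_ideal_heap_gb(num_tasktrackers):
    if num_tasktrackers <= 5:
        return 1.0
    return max(1.0, round(2.3717181092 * math.log(num_tasktrackers) - 2.6019933306, 2))

def impala_daemon_ideal_mem_limit(host_ram):
    return host_ram * 0.64

def solr_ideal_heaps(host_ram):
    base = min(64 * GB, host_ram * 0.64)
    return base / 2.6, base / 2      # (JVM heap, JVM direct memory)

print(jobtracker_ideal_heap_gb(50))                            # ~6.68
print(impala_daemon_ideal_mem_limit(128 * GB) / GB)            # 81.92
print([round(x / GB, 2) for x in solr_ideal_heaps(256 * GB)])  # [24.62, 32.0]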
With Segments
The minimum is the min(cgroup.memory_limit_in_bytes_min (if exists) or 0, sum_over_all(segment
minimum consumption)), and the ideal is the sum of all segment ideal consumptions.
Without Segments
The minimum is cgroup.memory_limit_in_bytes_min (if exists) or 0, and the ideal is (total RAM on
role's host * 0.8 * service percentage chosen in wizard).
YARN
For the memory total allowed for containers, the minimum is 1 GB and the ideal is min(8 GB, (total RAM on
NodeManager host) * 0.8 * service percentage chosen in wizard).
Impala
For the Impala Daemon memory limit, the minimum is 256 MB and the ideal is ((total RAM on Daemon host)
* 0.8 * service percentage chosen in wizard).
MapReduce
• Mapper JVM heaps - the minimum is 1 and the ideal is (number of cores, including hyperthreads, on the TaskTracker
host * service percentage chosen in wizard). Note that memory consumption is scaled by
mapred_child_java_opts_max_heap (the size of a given task's heap).
• Reducer JVM heaps - the minimum is 1 and the ideal is (number of cores, including hyperthreads on the TaskTracker
host * service percentage chosen in wizard) / 2. Note that memory consumption is scaled by
mapred_child_java_opts_max_heap (the size of a given task's heap).
Impala
For the Impala Daemon, the memory consumption is 0 if YARN Service for Resource Management is set. If the memory
limit is defined but not -1, its value is used verbatim. If it's defined but -1, the consumption is equal to the total RAM
on the Daemon host. If it is undefined, the consumption is (total RAM * 0.8).
MapReduce
See Step 3 Custom Rules for Static Service Pools Wizard on page 22.
Solr
For the Solr Server JVM direct memory segment, the consumption is equal to the configured value verbatim, provided solr.hdfs.blockcache.enable and solr.hdfs.blockcache.direct.memory.allocation are both true. Otherwise, the consumption is 0.
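A sketch of these two consumption rules follows; the parameter names are illustrative rather than Cloudera Manager property names.

def impala_daemon_consumption(host_ram, mem_limit=None, yarn_managed=False):
    if yarn_managed:
        return 0                    # YARN Service for Resource Management is set
    if mem_limit is None:
        return host_ram * 0.8       # memory limit undefined
    if mem_limit == -1:
        return host_ram             # defined but unlimited
    return mem_limit                # defined: use the value verbatim

def solr_direct_memory_consumption(direct_memory_value,
                                   blockcache_enabled,
                                   blockcache_direct_allocation):
    if blockcache_enabled and blockcache_direct_allocation:
        return direct_memory_value  # taken verbatim when HDFS block caching is on
    return 0

print(impala_daemon_consumption(host_ram=128 * 2**30, mem_limit=-1))  # whole host RAM
print(solr_direct_memory_consumption(8 * 2**30, True, False))         # 0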
Step 7 Custom Rules
HDFS
• NameNode JVM heaps are equalized. For every pair of NameNodes in an HDFS service with different heap sizes,
the larger heap size is reset to the smaller one.
• JournalNode JVM heaps are equalized. For every pair of JournalNodes in an HDFS service with different heap sizes,
the larger heap size is reset to the smaller one.
• NameNode and Secondary NameNode JVM heaps are equalized. For every {NameNode, Secondary NameNode}
pair in an HDFS service with different heap sizes, the larger heap size is reset to the smaller one.
HBase
Master JVM heaps are equalized. For every pair of Masters in an HBase service with different heap sizes, the larger
heap size is reset to the smaller one.
Impala
If an Impala service has YARN Service for Resource Management set, every Impala Daemon memory limit is set to the
value of (yarn.nodemanager.resource.memory-mb * 1 GB) if there's a YARN NodeManager co-located with the
Impala Daemon.
MapReduce
JobTracker JVM heaps are equalized. For every pair of JobTrackers in a MapReduce service with different heap sizes,
the larger heap size is reset to the smaller one.
Oozie
Oozie Server JVM heaps are equalized. For every pair of Oozie Servers in an Oozie service with different heap sizes,
the larger heap size is reset to the smaller one.
YARN
ResourceManager JVM heaps are equalized. For every pair of ResourceManagers in a YARN service with different heap
sizes, the larger heap size is reset to the smaller one.
ZooKeeper
ZooKeeper Server JVM heaps are equalized. For every pair of servers in a ZooKeeper service with different heap sizes,
the larger heap size is reset to the smaller one.
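All of the Step 7 rules above reduce to the same operation: within a set of peer roles, every heap is reset to the smallest configured value. A minimal sketch:

def equalize_heaps(heaps):
    """heaps: dict of role instance name -> configured heap size."""
    smallest = min(heaps.values())
    return {role: smallest for role in heaps}

print(equalize_heaps({"namenode-1": 4096, "namenode-2": 3072}))
# {'namenode-1': 3072, 'namenode-2': 3072}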
General Rules
HBase
• hbase.replication - For each HBase service, set to true if there's a Key-Value Store Indexer service in the
cluster. This rule is unscoped; it can fire even if the HBase service is not under scope.
• replication.replicationsource.implementation - For each HBase service, set to
com.ngdata.sep.impl.SepReplicationSource if there's a Key-Value Store Indexer service in the cluster. This rule
is unscoped; it can fire even if the HBase service is not under scope.
HDFS
• dfs.datanode.du.reserved - For each DataNode, set to min((total space of DataNode host largest
mountpoint) / 10, 10 GB).
• dfs.namenode.name.dir - For each NameNode, set to the first two mountpoints on the NameNode host with
/dfs/nn appended.
• dfs.namenode.checkpoint.dir - For each Secondary NameNode, set to the first mountpoint on the Secondary
NameNode host with /dfs/snn appended.
• dfs.datanode.data.dir - For each DataNode, set to all the mountpoints on the host with /dfs/dn appended.
• dfs.journalnode.edits.dir - For each JournalNode, set to the first mountpoint on the JournalNode host
with /dfs/jn appended.
• dfs.datanode.failed.volumes.tolerated - For each DataNode, set to (number of mountpoints on DataNode
host) / 2.
• dfs.namenode.service.handler.count and dfs.namenode.handler.count - For each NameNode, set
to max(30, ln(number of DataNodes in this HDFS service) * 20).
• dfs.block.local-path-access.user - For each HDFS service, set to impala if there's an Impala service in
the cluster. This rule is unscoped; it can fire even if the HDFS service is not under scope.
• dfs.datanode.hdfs-blocks-metadata.enabled - For each HDFS service, set to true if there's an Impala
service in the cluster. This rule is unscoped; it can fire even if the HDFS service is not under scope.
• dfs.client.read.shortcircuit - For each HDFS service, set to true if there's an Impala service in the cluster.
This rule is unscoped; it can fire even if the HDFS service is not under scope.
• dfs.datanode.data.dir.perm - For each DataNode, set to 755 if there's an Impala service in the cluster and
the cluster isn’t Kerberized. This rule is unscoped; it can fire even if the HDFS service is not under scope.
• fs.trash.interval - For each HDFS service, set to 1.
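Several of these rules can be sketched as follows; the mountpoint list, sizes, and DataNode count are made up, and the mountpoints are those selected by the sub-rule described under Data Directories.

import math

GB = 1024 ** 3

def hdfs_autoconfig(mountpoints, largest_mount_total, num_datanodes):
    return {
        "dfs.datanode.du.reserved": min(largest_mount_total // 10, 10 * GB),
        "dfs.namenode.name.dir": [m + "/dfs/nn" for m in mountpoints[:2]],
        "dfs.namenode.checkpoint.dir": mountpoints[0] + "/dfs/snn",
        "dfs.datanode.data.dir": [m + "/dfs/dn" for m in mountpoints],
        "dfs.journalnode.edits.dir": mountpoints[0] + "/dfs/jn",
        "dfs.datanode.failed.volumes.tolerated": len(mountpoints) // 2,
        "dfs.namenode.handler.count": max(30, int(math.log(num_datanodes) * 20)),
    }

for prop, value in hdfs_autoconfig(["/data/1", "/data/2", "/data/3"],
                                   largest_mount_total=4 * 1024 * GB,
                                   num_datanodes=40).items():
    print(prop, "=", value)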
Hue
• WebHDFS dependency - For each Hue service, set to either the first HttpFS role in the cluster, or, if there are
none, the first NameNode in the cluster.
• HBase Thrift Server dependency - For each Hue service in a CDH 4.4 or higher cluster, set to the first HBase Thrift
Server in the cluster.
Impala
For each Impala service, set Enable Audit Collection and Enable Lineage Collection to true if there's a Cloudera
Management Service with Navigator Audit Server and Navigator Metadata Server roles. This rule is unscoped; it can
fire even if the Impala service is not under scope.
MapReduce
• mapred.local.dir - For each JobTracker, set to the first mountpoint on the JobTracker host with /mapred/jt
appended.
• mapred.local.dir - For each TaskTracker, set to all the mountpoints on the host with /mapred/local
appended.
• mapred.reduce.tasks - For each MapReduce service, set to max(1, sum_over_all(TaskTracker number
of reduce tasks (determined via mapred.tasktracker.reduce.tasks.maximum for that
TaskTracker, which is configured separately)) / 2).
• mapred.job.tracker.handler.count - For each JobTracker, set to max(10, ln(number of TaskTrackers
in this MapReduce service) * 20).
• mapred.submit.replication - If there's an HDFS service in the cluster, for each MapReduce service, set to
max(1, sqrt(number of DataNodes in the HDFS service)).
• mapred.tasktracker.instrumentation - If there's a management service, for each MapReduce service, set
to org.apache.hadoop.mapred.TaskTrackerCmonInst. This rule is unscoped; it can fire even if the MapReduce
service is not under scope.
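A sketch of the service-level formulas above; the TaskTracker slot counts and DataNode count are illustrative.

import math

def mapreduce_autoconfig(tt_reduce_slots, num_tasktrackers, num_datanodes):
    return {
        # Sum of mapred.tasktracker.reduce.tasks.maximum over all TaskTrackers, halved
        "mapred.reduce.tasks": max(1, sum(tt_reduce_slots) // 2),
        "mapred.job.tracker.handler.count": max(10, int(math.log(num_tasktrackers) * 20)),
        "mapred.submit.replication": max(1, int(math.sqrt(num_datanodes))),
    }

# 20 TaskTrackers with 4 reduce slots each, in a cluster of 20 DataNodes.
print(mapreduce_autoconfig([4] * 20, num_tasktrackers=20, num_datanodes=20))
# {'mapred.reduce.tasks': 40, 'mapred.job.tracker.handler.count': 59,
#  'mapred.submit.replication': 4}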
YARN
• yarn.nodemanager.local-dirs - For each NodeManager, set to all the mountpoints on the NodeManager
host with /yarn/nm appended.
• yarn.nodemanager.resource.cpu-vcores - For each NodeManager, set to the number of cores (including
hyperthreads) on the NodeManager host.
• mapred.reduce.tasks - For each YARN service, set to max(1,sum_over_all(NodeManager number of
cores, including hyperthreads) / 2).
• yarn.resourcemanager.nodemanagers.heartbeat-interval-ms - For each NodeManager, set to max(100,
10 * (number of NodeManagers in this YARN service)).
• yarn.scheduler.maximum-allocation-vcores - For each ResourceManager, set to
max_over_all(NodeManager number of vcores (determined via
yarn.nodemanager.resource.cpu-vcores for that NodeManager, which is configured
separately)).
• yarn.scheduler.maximum-allocation-mb - For each ResourceManager, set to max_over_all(NodeManager
amount of RAM (determined via yarn.nodemanager.resource.memory-mb for that NodeManager,
which is configured separately)).
• mapreduce.client.submit.file.replication - If there's an HDFS service in the cluster, for each YARN
service, set to max(1, sqrt(number of DataNodes in the HDFS service)).
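A sketch of the cluster-wide formulas above; the per-NodeManager values are illustrative and assume the per-host yarn.nodemanager.resource.* properties have already been set.

import math

def yarn_autoconfig(nodemanagers, num_datanodes):
    n = len(nodemanagers)
    return {
        "mapred.reduce.tasks": max(1, sum(nm["cores"] for nm in nodemanagers) // 2),
        "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms": max(100, 10 * n),
        "yarn.scheduler.maximum-allocation-vcores": max(nm["cpu_vcores"] for nm in nodemanagers),
        "yarn.scheduler.maximum-allocation-mb": max(nm["memory_mb"] for nm in nodemanagers),
        "mapreduce.client.submit.file.replication": max(1, int(math.sqrt(num_datanodes))),
    }

# Twelve identical NodeManager hosts with 16 cores and 48 GB for containers.
nodemanagers = [{"cores": 16, "cpu_vcores": 16, "memory_mb": 49152}] * 12
print(yarn_autoconfig(nodemanagers, num_datanodes=12))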
All Services
If a service dependency is unset, and a service with the desired type exists in the cluster, set the service dependency
to the first such target service. Applies to all service dependencies except YARN Service for Resource Management.
Applies only to the Installation and Add Cluster wizards.
Role-Host Placement
Cloudera Manager employs the same role-host placement rule regardless of wizard. The set of hosts considered
depends on the scope. If the scope is a cluster, all hosts in the cluster are included. If a service, all hosts in the service's
cluster are included. If the Cloudera Management Service, all hosts in the deployment are included. The rules are as
follows:
1. The hosts are sorted from most to least physical RAM. Ties are broken by sorting on hostname (ascending) followed
by host identifier (ascending).
2. The overall number of hosts is used to determine which arrangement to use. These arrangements are hard-coded,
each dictating for a given "master" role type, what index (or indexes) into the sorted host list in step 1 to use.
3. Master role types are included based on several factors:
• Is this role type part of the service (or services) under scope?
• Does the service already have the right number of instances of this role type?
• Does the cluster's CDH version support this role type?
• Does the installed Cloudera Manager license allow for this role type to exist?
4. Master roles are placed on each host using the indexes and the sorted host list. If a host already has a given master
role, it is skipped.
5. An HDFS DataNode is placed on every host outside of the arrangement described in step 2, provided HDFS is one
of the services under scope.
6. Certain "worker" roles are placed on every host where an HDFS DataNode exists, either because it existed there
prior to the wizard, or because it was added in the previous step. The supported worker role types are:
• MapReduce TaskTrackers
• YARN NodeManagers
• HBase RegionServers
• Impala Daemons
• Spark Workers
7. Hive gateways are placed on every host, provided a Hive service is under scope and a gateway didn’t already exist
on a given host.
8. Spark on YARN gateways are placed on every host, provided a Spark on YARN service is under scope and a gateway
didn’t already exist on a given host.
This rule merely dictates the default placement of roles; you are free to modify it before it is applied by the wizard.
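Only step 1 lends itself to a simple sketch, because the arrangements used in steps 2 through 4 are hard-coded inside Cloudera Manager; the host list below is made up.

def sort_hosts(hosts):
    # hosts: list of (hostname, host_id, ram_bytes) tuples, ordered from most
    # to least physical RAM, ties broken by hostname then host identifier.
    return sorted(hosts, key=lambda h: (-h[2], h[0], h[1]))

hosts = [
    ("node-2.example.com", "id-22", 256 * 2**30),
    ("node-1.example.com", "id-11", 256 * 2**30),
    ("node-3.example.com", "id-33", 128 * 2**30),
]
print([h[0] for h in sort_hosts(hosts)])
# ['node-1.example.com', 'node-2.example.com', 'node-3.example.com']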
Custom Configuration
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
Cloudera Manager exposes properties that allow you to insert custom configuration text into XML configuration,
property, and text files, or into an environment. The naming convention for these properties is: XXX Advanced
Configuration Snippet (Safety Valve) for YYY or XXX YYY Advanced Configuration Snippet (Safety Valve), where XXX
is a service or role and YYY is the target.
The values you enter into a configuration snippet must conform to the syntax of the target. For an XML configuration
file, the configuration snippet must contain valid XML property definitions. For a properties file, the configuration
snippet must contain valid property definitions. Some files simply require a list of host addresses.
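For example, a snippet targeting an XML configuration file consists of one or more property definitions like the one printed by this sketch; the property name and value are placeholders, not real settings.

import xml.etree.ElementTree as ET

prop = ET.Element("property")
ET.SubElement(prop, "name").text = "example.property.name"
ET.SubElement(prop, "value").text = "example-value"
print(ET.tostring(prop, encoding="unicode"))
# <property><name>example.property.name</name><value>example-value</value></property>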
The configuration snippet mechanism is intended for use in cases where a configuration setting is not
exposed as a configuration property in Cloudera Manager. Configuration snippets generally override normal configuration.
Contact Cloudera Support if you are required to use a configuration snippet that is not explicitly documented.
Service-wide configuration snippets apply to all roles in the service; a configuration snippet for a role group applies to
all instances of the role associated with that role group.
There are configuration snippets for servers and client configurations. In general after changing a server configuration
snippet you must restart the server, and after changing a client configuration snippet you must redeploy the client
configuration. Sometimes you can refresh instead of restart. In some cases however, you must restart a dependent
server after changing a client configuration. For example, changing a MapReduce client configuration marks the
dependent Hive server as stale, which must be restarted. The Admin Console displays an indicator when a server must
be restarted. In addition, the All Configuration Issues tab on the Home page lists the actions you must perform to
propagate the changes.
White and black lists - Specify a list of host addresses that are allowed or disallowed from accessing a service. Example: host1.domain1, host2.domain2
• specific cluster
1. On the Home > Status tab, click a cluster name.
2. Select Configuration > Advanced Configuration Snippets.
• all clusters
1. Select Configuration > Advanced Configuration Snippets.
Stale Configurations
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
The Stale Configurations page provides differential views of changes made in a cluster. For any configuration change,
the page contains entries of all affected attributes. For example, the following File entry shows the change to the file
hdfs-site.xml when you update the property controlling how much disk space is reserved for non-HDFS use on
each DataNode:
To display the entities affected by a change, click the Show button at the right of the entry. The following dialog box
shows that three DataNodes were affected by the disk space change:
Attribute Categories
The categories of attributes include:
• Environment - represents environment variables set for the role. For example, the following entry shows the
change to the environment that occurs when you update the heap memory configuration of the
SecondaryNameNode.
Client configuration files are generated automatically by Cloudera Manager based on the services and roles you have
installed and Cloudera Manager deploys these configurations automatically when you install your cluster, add a service
on a host, or add a gateway role on a host. Specifically, for each host that has a service role instance installed, and for
each host that is configured as a gateway role for that service, the deploy function downloads the configuration zip
file, unzips it into the appropriate configuration directory, and uses the Linux alternatives mechanism to set a given,
configurable priority level. If you are installing on a system that happens to have pre-existing alternatives, then it is
possible another alternative may have higher priority and will continue to be used. The alternatives priority of the
Cloudera Manager client configuration is configurable under the Gateway scope of the Configuration tab for the
appropriate service.
You can also manually distribute client configuration files to the clients of a service.
The main circumstance that may require a redeployment of the client configuration files is when you have modified a
configuration. In this case you will typically see a message instructing you to redeploy your client configurations. The
affected services will also display an outdated client configuration indicator. Click the indicator to display the Stale Configurations on page 28 page.
How Client Configurations are Deployed
Client configuration files are deployed on any host that is a client for a service—that is, that has a role for the service
on that host. This includes roles such as DataNodes, TaskTrackers, RegionServers and so on as well as gateway roles
for the service.
If roles for multiple services are running on the same host (for example, a DataNode role and a TaskTracker role on
the same host) then the client configurations for both roles are deployed on that host, with the alternatives priority
determining which configuration takes precedence.
For example, suppose we have six hosts running roles as follows: host H1: HDFS-NameNode; host H2: MR-JobTracker;
host H3: HBase-Master; host H4: MR-TaskTracker, HDFS-DataNode, HBase-RegionServer; host H5: MR-Gateway; host
H6: HBase-Gateway. Client configuration files will be deployed on these hosts as follows: host H1: hdfs-clientconfig
(only); host H2: mapreduce-clientconfig, host H3: hbase-clientconfig; host H4: hdfs-clientconfig, mapreduce-clientconfig,
hbase-clientconfig; host H5: mapreduce-clientconfig; host H6: hbase-clientconfig
If the HDFS NameNode and MapReduce JobTracker were on the same host, then that host would have both
hdfs-clientconfig and mapreduce-clientconfig installed.
Downloading Client Configuration Files
1. Follow the appropriate procedure according to your starting point:
Home page:
1. On the Home > Status tab, click the menu icon to the right of the cluster name and select View Client Configuration URLs. A pop-up window with links to the configuration files for the services you have installed displays.
2. Click a link or save the link URL and download the file using wget or curl.
Note: If you are deploying client configurations on a host that has multiple services installed, some
of the same configuration files, though with different configurations, will be installed in the conf
directories for each service. Cloudera Manager uses the priority parameter in the alternatives
--install command to ensure that the correct configuration directory is made active based on the
combination of services on that host. The priority order is YARN > MapReduce > HDFS. The priority
can be configured under the Gateway sections of the Configuration tab for the appropriate service.
1. On the Home > Status tab, click the menu icon to the right of the cluster name and select Deploy Client Configuration.
2. Click Deploy Client Configuration.
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
Whenever you change and save a set of configuration settings for a service or role instance or a host, Cloudera Manager
saves a revision of the previous settings and the name of the user who made the changes. You can then view past
revisions of the configuration settings, and, if desired, roll back the settings to a previous state.
Viewing Configuration Changes
1. For a service, role, or host, click the Configuration tab.
2. Click the History and Rollback button. The most recent revision, currently in effect, is shown under Current
Revision. Prior revisions are shown under Past Revisions.
• By default, or if you click Show All, a list of all revisions is shown. If you are viewing a service or role instance,
all service/role group related revisions are shown. If you are viewing a host or all hosts, all host/all hosts
related revisions are shown.
• To list only the configuration revisions that were done in a particular time period, use the Time Range Selector
to select a time range. Then, click Show within the Selected Time Range.
3. Click the Details... link. The Revision Details dialog box displays.
Important: This feature can only be used to revert changes to configuration values. You cannot use
this feature to:
• Revert NameNode high availability. You must perform this action by explicitly disabling high
availability.
• Disable Kerberos security.
• Revert role group actions (creating, deleting, or moving membership among groups). You must perform these actions explicitly using the Role Groups feature; see Role Groups on page 48.
Managing Clusters
Cloudera Manager can manage multiple clusters; however, each cluster can only be associated with a single Cloudera
Manager Server or Cloudera Manager HA pair. Once you have successfully installed your first cluster, you can add
additional clusters, running the same or a different version of CDH. You can then manage each cluster and its services
independently.
On the Home > Status tab you can access many cluster-wide actions by clicking the drop-down menu to the right of the cluster name: add a service; start, stop, or restart the cluster; deploy client configurations; enable Kerberos; and perform cluster refresh, rename, upgrade, and maintenance mode actions.
Note:
Cloudera Manager configuration screens offer two layout options: classic and new. The new layout
is the default; however, on each configuration page you can easily switch between layouts using the
Switch to XXX layout link at the top right of the page. For more information, see Configuration Overview
on page 8.
Adding a Cluster
New Hosts:
1. On the Home > Status tab, click the drop-down menu and select Add Cluster. This begins the Installation Wizard, just as if you were installing a cluster for the first time. (See Cloudera Manager Deployment for detailed instructions.)
2. To find new hosts, not currently managed by Cloudera Manager, where you want to install
CDH, enter the hostnames or IP addresses, and click Search. Cloudera Manager lists the hosts
you can use to configure a new cluster. Managed hosts that already have services installed
will not be selectable.
3. Click Continue to install the new cluster. At this point the installation continues through the
wizard the same as it did when you installed your first cluster. You will be asked to select the
version of CDH to install, which services you want and so on, just as previously.
4. Restart the Reports Manager role.
Managed Hosts:
You may have hosts that are already "managed" but are not part of a cluster. You can have managed hosts that are not part of a cluster when you have added hosts to Cloudera Manager either through the Add Host wizard, or by manually installing the Cloudera Manager Agent onto hosts where you have not installed any other services. This will also be the case if you remove all services from a host so that it no longer is part of a cluster.
1. On the Home > Status tab, click the drop-down menu and select Add Cluster. This begins the Installation Wizard, just as if you were installing a
cluster for the first time. (See Cloudera Manager Deployment for detailed instructions.)
2. To see the list of the currently managed hosts, click the Currently Managed Hosts tab. This
tab does not appear if you have no currently managed hosts that are not part of a cluster.
3. To perform the installation, click Continue. Instead of searching for hosts, this will attempt
to install onto any hosts managed by Cloudera Manager that are not already part of a cluster.
It will proceed with the installation wizard as for a new cluster installation.
4. Restart the Reports Manager role.
Deleting a Cluster
1. Stop the cluster.
2. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Delete.
Starting a Cluster
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Start.
Note: The cluster-level Start action starts only CDH and other product services (Impala, Cloudera
Search). It does not start the Cloudera Management Service. You must start the Cloudera Management
Service separately if it is not already running.
Stopping a Cluster
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Stop.
Note: The cluster-level Stop action does not stop the Cloudera Management Service. You must stop
the Cloudera Management Service separately.
Refreshing a Cluster
Runs a cluster refresh action to bring the configuration up to date without restarting all services. For example, certain
masters (for example NameNode and ResourceManager) have some configuration files (for example,
fair-scheduler.xml, mapred_hosts_allow.txt, topology.map) that can be refreshed. If anything changes in
those files then a refresh can be used to update them in the master. Here is a summary of the operations performed
in a refresh action:
Restarting a Cluster
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Restart.
When All services successfully started appears, the task is complete and you can close the Command Details
window.
Renaming a Cluster
Minimum Required Role: Full Administrator
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Rename Cluster.
Cluster-Wide Configuration
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To make configuration changes that apply to an entire cluster, do one of the following to open the configuration page:
• all clusters
1. Select Configuration and then select one of the following classes of properties:
• Advanced Configuration Snippets
• Databases
• Disk Space Thresholds
• Local Data Directories
• Local Data Files
• Log Directories
• Navigator Settings
• Non-default Values - properties whose value differs from the default value
• Non-uniform Values - properties whose values are not uniform across the cluster or clusters
• Port Configurations
• Service Dependencies
You can also select Configuration Issues to view a list of configuration issues for all clusters.
• specific cluster
1. On the Home page, click a cluster name.
2. Select Configuration and then select one of the classes of properties listed above.
You can also apply the following filters to limit the displayed properties:
• Enter a search term in the Search box to search for properties by name or description.
• Expand the Status filter to select options that limit the displayed properties to those with errors or warnings,
properties that have been edited, properties with non-default values, or properties with overrides. Select All to
remove any filtering by Status.
• Expand the Scope filter to display a list of service types. Expand a service type heading to filter on Service-Wide
configurations for a specific service instance or select one of the default role groups listed under each service
type. Select All to remove any filtering by Scope.
• Expand the Category filter to filter using a sub-grouping of properties. Select All to remove any filtering by Category.
Moving a Host Between Clusters
Moving a host between clusters can be accomplished by:
1. Decommissioning the host.
2. Removing all roles from the host (except for the Cloudera Manager management roles). See Deleting Role Instances
on page 47.
3. Deleting the host from the cluster (see Deleting Hosts on page 61), specifically the section on removing a host
from a cluster but leaving it available to Cloudera Manager.
4. Adding the host to the new cluster (see Adding a Host to the Cluster on page 54).
5. Adding roles to the host (optionally using one of the host templates associated with the new cluster). See Adding
a Role Instance on page 45 and Host Templates on page 57.
Managing Services
Cloudera Manager service configuration features let you manage the deployment and configuration of CDH and
managed services. You can add new services and roles if needed, gracefully start, stop and restart services or roles,
and decommission and delete roles or services if necessary. Further, you can modify the configuration properties for
services or for individual role instances. If you have a Cloudera Enterprise license, you can view past configuration
changes and roll back to a previous revision. You can also generate client configuration files, enabling you to easily
distribute them to the users of a service.
The topics in this chapter describe how to configure and use the services on your cluster. Some services have unique
configuration requirements or provide unique features: those are covered in Managing Individual Services on page
77.
Note:
Cloudera Manager configuration screens offer two layout options: classic and new. The new layout
is the default; however, on each configuration page you can easily switch between layouts using the
Switch to XXX layout link at the top right of the page. For more information, see Configuration Overview
on page 8.
Adding a Service
Minimum Required Role: Full Administrator
After initial installation, you can use the Add a Service wizard to add and configure new service instances. For example,
you may want to add a service such as Oozie that you did not select in the wizard during the initial installation.
The binaries for the following services are not packaged in CDH 4 and CDH 5 and must be installed individually before adding the service:
If you do not add the binaries before adding the service, the service will fail to start.
To add a service:
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of
service at a time.
2. Click the radio button next to a service and click Continue. If you are missing required binaries, a pop-up displays
asking if you want to continue with adding the service.
3. Select the radio button next to the services on which the new service should depend. All services must depend
on the same ZooKeeper service. Click Continue.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. Review and modify configuration settings, such as data directory paths and heap sizes and click Continue. The
service is started.
6. Click Continue then click Finish. You are returned to the Home page.
7. Verify the new service is started properly by checking the health status for the new service. If the Health Status
is Good, then the service started properly.
The service's properties will be displayed showing the values for each property for the selected clusters. The filters on
the left side can be used to limit the properties displayed.
You can also view property configuration values that differ between clusters across a deployment by selecting
Non-uniform Values on the Configuration tab of the Cloudera Manager Home > Status tab. For more information, see
Cluster-Wide Configuration on page 35
Add-on Services
Minimum Required Role: Full Administrator
Cloudera Manager supports adding new types of services (referred to as an add-on service) to Cloudera Manager,
allowing such services to leverage Cloudera Manager distribution, configuration, monitoring, resource management,
and life-cycle management features. An add-on service can be provided by Cloudera or an independent software
vendor (ISV). If you have multiple clusters managed by Cloudera Manager, an add-on service can be deployed on any
of the clusters.
Note: If the add-on service is already installed and running on hosts that are not currently being
managed by Cloudera Manager, you must first add the hosts to a cluster that's under management.
See Adding a Host to the Cluster on page 54 for details.
5. Log into the Cloudera Manager Admin Console and restart the Cloudera Management Service.
3. In the Remote Parcel Repository URLs list, click the plus sign to open an additional row.
4. Enter the path to the repository.
5. Click Save Changes to commit the changes.
6. Click Check for New Parcels. The external parcel should appear in the set of parcels available for download.
7. Download, distribute, and activate the parcel. See Managing Parcels.
Adding an Add-on Service
Add the service following the procedure in Adding a Service on page 36.
Uninstalling an Add-on Service
1. Stop all instances of the service.
2. Delete the service from all clusters. If there are other services that depend on the service you are trying to delete,
you must delete those services first.
3. Log on to the Cloudera Manager Server host and remove the CSD file.
4. Restart the Cloudera Manager Server:
sudo service cloudera-scm-server restart
5. After the server has restarted, log into the Cloudera Manager Admin Console and restart the Cloudera Management
Service.
6. Optionally remove the parcel.
Note: If you are unable to start the HDFS service, it's possible that one of the role instances, such
as a DataNode, was running on a host that is no longer connected to the Cloudera Manager Server
host, perhaps because of a hardware or network failure. If this is the case, the Cloudera Manager
Server will be unable to connect to the Cloudera Manager Agent on that disconnected host to start
the role instance, which will prevent the HDFS service from starting. To work around this, you can
stop all services, abort the pending command to start the role instance on the disconnected host, and
then restart all services again without that role instance. For information about aborting a pending
command, see Aborting a Pending Command on page 43.
5. Hive
6. MapReduce or YARN
7. Key-Value Store Indexer
8. HBase
9. Flume
10. Solr
11. HDFS
12. ZooKeeper
13. Cloudera Management Service
Restarting a Service
It is sometimes necessary to restart a service, which is essentially a combination of stopping a service and then starting
it again. For example, if you change the hostname or port where the Cloudera Manager Server is running, or you enable TLS security, you must restart the Cloudera Management Service to update the URL to the Server.
1. On the Home > Status tab, click the drop-down menu to the right of the service name and select Restart.
Rolling Restart
Minimum Required Role: Operator (also provided by Configurator, Cluster Administrator, Full Administrator)
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
Rolling restart allows you to conditionally restart the role instances of Flume, HBase, HDFS, Kafka, MapReduce, YARN,
and ZooKeeper services to update software or use a new configuration. If the service is not running, rolling restart is
not available for that service. You can do a rolling restart of each service individually.
If you have HDFS high availability enabled, you can also perform a cluster-level rolling restart. At the cluster level, the
rolling restart of worker hosts is performed on a host-by-host basis, rather than per service, to avoid all roles for a
service potentially being unavailable at the same time. During a cluster restart, in order to avoid having your NameNode
(and thus the cluster) being unavailable during the restart, Cloudera Manager will force a failover to the standby
NameNode.
MapReduce (MRv1) JobTracker High Availability on page 323 and YARN (MRv2) ResourceManager High Availability on
page 314 are not required for a cluster-level rolling restart. However, if you have JobTracker or ResourceManager high
availability enabled, Cloudera Manager will force a failover to the standby JobTracker or ResourceManager.
Note:
• HDFS - If you do not have HDFS high availability configured, a warning appears reminding
you that the service will become unavailable during the restart while the NameNode is
restarted. Services that depend on that HDFS service will also be disrupted. It is recommended
that you restart the DataNodes one at a time—one host per batch, which is the default.
• HBase
– Administration operations such as any of the following should not be performed during
the rolling restart, to avoid leaving the cluster in an inconsistent state:
– Split
– Create, disable, enable, or drop table
– Metadata changes
– Create, clone, or restore a snapshot. Snapshots rely on the RegionServers being
up; otherwise the snapshot will fail.
– To increase the speed of a rolling restart of the HBase service, set the Region Mover
Threads property to a higher value. This increases the number of regions that can be
moved in parallel, but places additional strain on the HMaster. In most cases, Region
Mover Threads should be set to 5 or lower.
– Another option to increase the speed of a rolling restart of the HBase service is to set
the Skip Region Reload During Rolling Restart property to true. This setting can cause
regions to be moved around multiple times, which can degrade HBase client performance.
• MapReduce - If you restart the JobTracker, all current jobs will fail.
• YARN - If you restart ResourceManager and ResourceManager HA is enabled, current jobs
continue running: they do not restart or fail. ResourceManager HA is supported for CDH 5.2
and higher.
• ZooKeeper and Flume - For both ZooKeeper and Flume, the option to restart roles in batches
is not available. They are always restarted one by one.
Deleting Services
Minimum Required Role: Full Administrator
1. Stop the service. For information on starting and stopping services, see Starting, Stopping, and Restarting Services
on page 39.
2. On the Home > Status tab, click the drop-down menu to the right of the service name and select Delete.
Renaming a Service
Minimum Required Role: Full Administrator
A service is given a name upon installation, and that name is used as an identifier internally. However, Cloudera Manager
allows you to provide a display name for a service, and that name will appear in the Cloudera Manager Admin Console
instead of the original (internal) name.
1. On the Home > Status tab, click the drop-down menu to the right of the service name and select Rename.
Managing Roles
When Cloudera Manager configures a service, it configures hosts in your cluster with one or more functions (called
roles in Cloudera Manager) that are required for that service. The role determines which Hadoop daemons run on a
given host. For example, when Cloudera Manager configures an HDFS service instance it configures one host to run
the NameNode role, another host to run the Secondary NameNode role, another host to run the Balancer role, and
some or all of the remaining hosts to run DataNode roles.
Configuration settings are organized in role groups. A role group includes a set of configuration properties for a specific
group, as well as a list of role instances associated with that role group. Cloudera Manager automatically creates default
role groups.
For role types that allow multiple instances on multiple hosts, such as DataNodes, TaskTrackers, RegionServers (and
many others), you can create multiple role groups to allow one set of role instances to use different configuration
settings than another set of instances of the same role type. In fact, upon initial cluster setup, if you are installing on
identical hosts with limited memory, Cloudera Manager will (typically) automatically create two role groups for each
worker role — one group for the role instances on hosts with only other worker roles, and a separate group for the
instance running on the host that is also hosting master roles.
The HDFS service is an example of this: Cloudera Manager typically creates one role group (DataNode Default Group)
for the DataNode role instances running on the worker hosts, and another group (HDFS-1-DATANODE-1) for the
DataNode instance running on the host that is also running the master roles such as the NameNode, JobTracker, HBase
Master and so on. Typically the configurations for those two classes of hosts will differ in terms of settings such as
memory for JVMs.
Cloudera Manager configuration screens offer two layout options: classic and new. The new layout is the default;
however, on each configuration page you can easily switch between layouts using the Switch to XXX layout link at the
top right of the page. For more information, see Configuration Overview on page 8.
Gateway Roles
A gateway is a special type of role whose sole purpose is to designate a host that should receive a client configuration
for a specific service, when the host does not have any roles running on it. Gateway roles enable Cloudera Manager
to install and manage client configurations on that host. There is no process associated with a gateway role, and its
status will always be Stopped. You can configure gateway roles for HBase, HDFS, Hive, MapReduce, Solr, Spark, Sqoop
1 Client, and YARN.
Role Instances
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. Click Continue.
6. In the Review Changes page, review the configuration changes to be applied. Confirm the settings entered for file
system paths. The file paths required vary based on the services to be installed. For example, you might confirm
the NameNode Data Directory and the DataNode Data Directory for HDFS. Click Continue. The wizard finishes by
performing any actions necessary to prepare the cluster for the new role instances. For example, new DataNodes
are added to the NameNode dfs_hosts_allow.txt file. The new role instance is configured with the default
role group for its role type, even if there are multiple role groups for the role type. If you want to use a different
role group, follow the instructions in Managing Role Groups on page 48 for moving role instances to a different
role group. The new role instances are not started automatically.
5. Select Actions for Selected > Decommission, and then click Decommission again to start the process. A
Decommission Command pop-up displays that shows each step or decommission command as it is run. In the
Details area, click to see the subcommands that are run. Depending on the role, the steps may include adding
the host to an "exclusions list" and refreshing the NameNode, JobTracker, or NodeManager; stopping the Balancer
(if it is running); and moving data blocks or regions. Roles that do not have specific decommission actions are
stopped.
You can abort the decommission process by clicking the Abort button, but you must recommission and restart
the role.
The Commission State facet in the Filters list displays Decommissioning while decommissioning is in progress,
and Decommissioned when the decommissioning process has finished. When the process is complete, a checkmark is added in front of Decommission Command.
Note: Deleting a role instance does not clean up the associated client configurations that have been
deployed in the cluster.
Role Groups
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
A role group is a set of configuration properties for a role type, as well as a list of role instances associated with that
group. Cloudera Manager automatically creates a default role group named Role Type Default Group for each role
type. Each role instance can be associated with only a single role group.
Role groups provide two types of properties: those that affect the configuration of the service itself and those that
affect monitoring of the service, if applicable (the Monitoring subcategory). (Not all services have monitoring properties).
For more information about monitoring properties see Configuring Monitoring Settings.
When you run the installation or upgrade wizard, Cloudera Manager configures the default role groups it adds, and
adds any other required role groups for a given role type. For example, a DataNode role on the same host as the
NameNode might require a different configuration than DataNode roles running on other hosts. Cloudera Manager
creates a separate role group for the DataNode role running on the NameNode host and uses the default configuration
for DataNode roles running on other hosts.
You can modify the settings of the default role group, or you can create new role groups and associate role instances
to whichever role group is most appropriate. This simplifies the management of role configurations when one group
of role instances may require different settings than another group of instances of the same role type—for example,
due to differences in the hardware the roles run on. You modify the configuration for any of the service's role groups
through the Configuration tab for the service. You can also override the settings inherited from a role group for a role
instance.
If there are multiple role groups for a role type, you can move role instances from one group to another. When you
move a role instance to a different group, it inherits the configuration settings for its new group.
Creating a Role Group
1. Go to a service status page.
2. Click the Instances or Configuration tab.
3. Click Role Groups.
4. Click Create new group....
5. Provide a name for the group.
6. Select the role type for the group. You can select role types that allow multiple instances and that exist for the
service you have selected.
7. In the Copy From field, select the source of the basic configuration information for the role group:
• An existing role group of the appropriate type.
• None.... The role group is set up with generic default values that are not the same as the values Cloudera
Manager sets in the default role group, as Cloudera Manager specifically sets the appropriate configuration
properties for the services and roles it installs. After you create the group you must edit the configuration to
set missing properties (for example the TaskTracker Local Data Directory List property, which is not populated
if you select None) and clear other validation warnings and errors.
Managing Role Groups
Rename
1. Click the role group name, click the menu next to the role group name on the right, and select Rename.
Delete
You cannot delete any of the default groups. The group must first be empty; if you want to delete a group you've created, you must move any role instances to a different role group.
1. Click the role group name.
2. Click the menu next to the role group name on the right, select Delete, and confirm by clicking Delete. Deleting a role group removes it from host templates.
Managing Hosts
Cloudera Manager provides a number of features that let you configure and manage the hosts in your clusters.
The Hosts screen has the following tabs:
Role Assignments
You can view the assignment of roles to hosts as follows:
1. Click the Roles tab.
2. Click a cluster name or All Clusters.
Disks Overview
Click the Disks Overview tab to display an overview of the status of all disks in the deployment. The statistics exposed
match or build on those in iostat, and are shown in a series of histograms that by default cover every physical disk
in the system.
Adjust the endpoints of the time line to see the statistics for different time periods. Specify a filter in the box to limit
the displayed data. For example, to see the disks for a single rack rack1, set the filter to: logicalPartition =
false and rackId = "rack1" and click Filter. Click a histogram to drill down and identify outliers. Mouse over
the graph and click to display additional information about the chart.
• Number of cores
• System load averages for the past 1, 5, and 15 minutes
• Memory usage
• File system disks, their mount points, and usage
• Health test results for the host
• Charts showing a variety of metrics and health test results over time.
• Role instances running on the host and their health
• CPU, memory, and disk resources used for each role instance
To view detailed host information:
1. Click the Hosts tab.
2. Click the name of one of the hosts. The Status page is displayed for the host you selected.
3. Click tabs to access specific categories of information. Each tab provides various categories of information about
the host, its services, components, and configuration.
From the status page you can view details about several categories of information.
Status
The Status page is displayed when a host is initially selected and provides summary information about the status of
the selected host. Use this page to gain a general understanding of work being done by the system, the configuration,
and health status.
If this host has been decommissioned or is in maintenance mode, you will see icons indicating those states in the top bar of the page next to the status message.
Details
This panel provides basic system configuration such as the host's IP address, rack, health status summary, and disk
and CPU resources. This information summarizes much of the detailed information provided in other panes on this
tab. To view details about the Host agent, click the Host Agent link in the Details section.
Health Tests
Cloudera Manager monitors a variety of metrics that are used to indicate whether a host is functioning as expected.
The Health Tests panel shows health test results in an expandable/collapsible list, typically with the specific metrics
that the test returned. (You can Expand All or Collapse All from the links at the upper right of the Health Tests panel).
• The color of the text (and the background color of the field) for a health test result indicates the status of the
results. The tests are sorted by their health status – Good, Concerning, Bad, or Disabled. The list of entries for
good and disabled health tests are collapsed by default; however, Bad or Concerning results are shown expanded.
• The text of a health test also acts as a link to further information about the test. Clicking the text will pop up a
window with further information, such as the meaning of the test and its possible results, suggestions for actions
you can take or how to make configuration changes related to the test. The help text for a health test also provides
a link to the relevant monitoring configuration section for the service. See Configuring Monitoring Settings for
more information.
Health History
The Health History provides a record of state transitions of the health tests for the host.
• Click the arrow symbol at the left to view the description of the health test state change.
• Click the View link to open a new page that shows the state of the host at the time of the transition. In this view
some of the status settings are greyed out, as they reflect a time in the past, not the current status.
File Systems
The File systems panel provides information about disks, their mount points and usage. Use this information to determine
if additional disk space is required.
Roles
Use the Roles panel to see the role instances running on the selected host, as well as each instance's status and health.
Hosts are configured with one or more role instances, each of which corresponds to a service. The role indicates which
daemon runs on the host. Some examples of roles include the NameNode, Secondary NameNode, Balancer, JobTrackers,
DataNodes, RegionServers and so on. Typically a host will run multiple roles in support of the various services running
in the cluster.
Clicking the role name takes you to the role instance's status page.
You can delete a role from the host from the Instances tab of the Service page for the parent service of the role. You
can add a role to a host in the same way. See Role Instances on page 45.
Charts
Charts are shown for each host instance in your cluster.
See Viewing Charts for Cluster, Service, Role, and Host Instances for detailed information on the charts that are
presented, and the ability to search and display metrics of your choice.
Processes
The Processes page provides information about each of the processes that are currently running on this host. Use this
page to access management web UIs, check process status, and access log information.
Note: The Processes page may display exited startup processes. Such processes are cleaned up within
a day.
You can use the host inspector to gather information about hosts that Cloudera Manager is currently managing. You
can review this information to better understand system status and troubleshoot any existing issues. For example, you
might use this information to investigate potential DNS misconfiguration.
The inspector runs tests to gather information for functional areas including:
• Networking
• System time
• User and group configuration
• HDFS settings
• Component versions
Common cases in which this information is useful include:
• Installing components
• Upgrading components
• Adding hosts to a cluster
• Removing hosts from a cluster
Running the Host Inspector
1. Click the Hosts tab.
2. Click Host Inspector. Cloudera Manager begins several tasks to inspect the managed hosts.
3. After the inspection completes, click Download Result Data or Show Inspector Results to review the results.
The results of the inspection display a list of all the validations and their results, and a summary of all the components
installed on your managed hosts.
If the validation process finds problems, the Validations section will indicate the problem. In some cases the message
may indicate actions you can take to resolve the problem. If an issue exists on multiple hosts, you may be able to view
the list of occurrences by clicking a small triangle that appears at the end of the message.
The Version Summary section shows all the components that are available from Cloudera, their versions (if known)
and the CDH distribution to which they belong (CDH 4 or CDH 5).
If you are running multiple clusters with both CDH 4 and CDH 5, the lists will be organized by distribution (CDH 4 or
CDH 5). The hosts running that version are shown at the top of each list.
Viewing Past Host Inspector Results
You can view the results of a past host inspection by looking for the Host Inspector command using the Recent
Commands feature.
1. Click the Running Commands indicator just to the left of the Search box at the right hand side of the navigation
bar.
2. Click the Recent Commands button.
3. If the command is too far in the past, you can use the Time Range Selector to move the time range back to cover
the time period you want.
4. When you find the Host Inspector command, click its name to display its subcommands.
5. Click the Show Inspector Results button to view the report.
See Viewing Running and Recent Commands for more information about viewing past command activity.
The Add Hosts wizard does not create roles on the new host; once you have successfully added the host(s) you can
either add roles, one service at a time, or apply a host template, which can define role configurations for multiple roles.
Important:
• All hosts in a single cluster must be running the same version of CDH.
• When you add a new host, you must install the same version of CDH to enable the new host to
work with the other hosts in the cluster. The installation wizard lets you select the version of CDH
to install, and you can choose a custom repository to ensure that the version you install matches
the version on the other hosts.
• If you are managing multiple clusters, select the version of CDH that matches the version in use
on the cluster where you plan to add the new host.
• When you add a new host, the following occurs:
– YARN topology.map is updated to include the new host
– Any service that includes topology.map in its configuration—Flume, Hive, Hue, Oozie, Solr,
Spark, Sqoop 2, YARN—is marked stale
At a convenient point after adding the host you should restart the stale services to pick up the
new configuration.
Important: This step leaves the existing hosts in an unmanageable state; they are still configured to use TLS, and so cannot communicate with the Cloudera Manager Server.
that should run on a host. When you have created the template, it will appear in the list of host templates
from which you can choose.
c. Select the host template you want to use.
d. By default Cloudera Manager will automatically start the roles specified in the host template on your newly
added hosts. To prevent this, uncheck the option to start the newly-created roles.
6. When the wizard is finished, you can verify the Agent is connecting properly with the Cloudera Manager Server
by clicking the Hosts tab and checking the health status for the new host. If the Health Status is Good and the
value for the Last Heartbeat is recent, then the Agent is connecting properly with the Cloudera Manager Server.
If you did not specify a host template during the Add Hosts wizard, then no roles will be present on your new hosts
until you add them. You can do this by adding individual roles under the Instances tab for a specific service, or by using
a host template. See Role Instances on page 45 for information about adding roles for a specific service. See Host
Templates on page 57 to create a host template that specifies a set of roles (from different services) that should run
on a host.
Enable Kerberos
If you have previously enabled Kerberos on your cluster:
• Install the packages required to kinit on the new host. For the list of packages required for each OS, see Kerberos
Prerequisites.
• If you have set up Cloudera Manager to manage krb5.conf, it will automatically deploy the file on the new host.
• If Cloudera Manager does not manage krb5.conf, you must manually update the file at /etc/krb5.conf.
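For the first step, the kinit utility is typically provided by the following packages; confirm against Kerberos Prerequisites for your OS and version:

sudo yum install krb5-workstation krb5-libs   # RHEL/CentOS
sudo apt-get install krb5-user                # Debian/Ubuntu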
Adding a Host by Installing the Packages Using Your Own Method
If you used a different mechanism to install the Oracle JDK, CDH, and Cloudera Manager Agent packages, you can use that same mechanism to install them on the new host and then start the Cloudera Manager Agent.
1. Install the Oracle JDK, CDH, and Cloudera Manager Agent packages using your own method. For instructions on
installing these packages, see Installation Path B - Manual Installation Using Cloudera Manager Packages.
2. After installation is complete, start the Cloudera Manager Agent. For instructions, see Starting, Stopping, and Restarting Cloudera Manager Agents on page 436; example commands follow this procedure.
3. After the Agent is started, you can verify the Agent is connecting properly with the Cloudera Manager Server by
clicking the Hosts tab and checking the health status for the new host. If the Health Status is Good and the value
for the Last Heartbeat is recent, then the Agent is connecting properly with the Cloudera Manager Server.
4. If you have enabled TLS security on your cluster, you must enable and configure TLS on each new host. Otherwise,
ignore this step.
a. Enable and configure TLS on each new host by specifying 1 for the use_tls property in the
/etc/cloudera-scm-agent/config.ini configuration file.
b. Configure the same level(s) of TLS security on the new hosts by following the instructions in Configuring TLS
Security for Cloudera Manager.
5. If you have previously enabled TLS/SSL on your cluster, and you plan to start these roles on this new host, make
sure you install a new host certificate to be configured from the same path and naming convention as the rest of
your hosts. Since the new host and the roles configured on it are inheriting their configuration from the previous
host, ensure that the keystore or truststore passwords and locations are the same on the new host. For instructions
on configuring TLS/SSL, see Configuring TLS/SSL Encryption for CDH Services.
6. If you have previously enabled Kerberos on your cluster:
• Install the packages required to kinit on the new host. For the list of packages required for each OS, see
Kerberos Prerequisites.
• If you have set up Cloudera Manager to manage krb5.conf, it will automatically deploy the file on the new
host.
• If Cloudera Manager does not manage krb5.conf, you must manually update the file at /etc/krb5.conf.
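As a rough sketch of steps 2 and 4 above, assuming the default Cloudera Manager Agent package layout:

# Step 2: start the Cloudera Manager Agent once the packages are installed
sudo service cloudera-scm-agent start

# Step 4a: if TLS is enabled on your cluster, set use_tls=1 in the agent config
# (edit /etc/cloudera-scm-agent/config.ini), then restart the agent
sudo service cloudera-scm-agent restart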
Host Templates
Minimum Required Role: Full Administrator
Host templates let you designate a set of role groups that can be applied in a single operation to a host or a set of
hosts. This significantly simplifies the process of configuring new hosts when you need to expand your cluster. Host
templates are supported for both CDH 4 and CDH 5 cluster hosts.
Important: A host template can only be applied on a host with a version of CDH that matches the
CDH version running on the cluster to which the host template belongs.
You can create and manage host templates under the Templates tab from the Hosts page.
1. Click the Hosts tab on the main Cloudera Manager navigation bar.
2. Click the Templates tab on the Hosts page.
Templates are not required; Cloudera Manager assigns roles and role groups to the hosts of your cluster when you
perform the initial cluster installation. However, if you want to add new hosts to your cluster, a host template can
make this much easier.
If there are existing host templates, they are listed on the page, along with links to each role group included in the
template.
If you are managing multiple clusters, you must create separate host templates for each cluster, as the templates
specify role configurations specific to the roles in a single cluster. Existing host templates are listed under the cluster
to which they apply.
• You can click a role group name to be taken to the Edit configuration page for that role group, where you can
modify the role group settings.
• From the Actions menu associated with the template you can edit the template, clone it, or delete it.
Creating a Host Template
1. From the Templates tab, click Click here
2. In the Create New Host Template pop-up window that appears:
• Type a name for the template.
• For each role, select the appropriate role group. There may be multiple role groups for a given role type —
you want to select the one with the configuration that meets your needs.
3. Click Create to create the host template.
Editing a Host Template
1. From the Hosts tab, click the Templates tab.
2. Pull down the Actions menu for the template you want to modify, and click Edit. This puts you into the Edit Host Template pop-up window. This works exactly like the Create New Host Template window — you can modify the template name or any of the role group selections.
3. Click OK when you have finished.
Applying a Host Template to a Host
You can use a host template to apply configurations for multiple roles in a single operation.
You can apply a template to a host that has no roles on it, or that has roles from the same services as those included
in the host template. New roles specified in the template that do not already exist on the host will be added. A role
on the host that is already a member of the role group specified in the template will be left unchanged. If a role on the
host matches a role in the template, but is a member of a different role group, it will be moved to the role group
specified by the template.
For example, suppose you have two role groups for a DataNode (DataNode Default Group and DataNode (1)). The host
has a DataNode role that belongs to DataNode Default Group. If you apply a host template that specifies the DataNode
(1) group, the role on the host will be moved from DataNode Default Group to DataNode (1).
However, if you have two instances of a service, such as MapReduce (for example, mr1 and mr2) and the host has a
TaskTracker role from service mr2, you cannot apply a TaskTracker role from service mr1.
A host may have no roles on it if you have just added the host to your cluster, or if you decommissioned a managed
host and removed its existing roles.
Also, the host must have the same version of CDH installed as is running on the cluster whose host templates you are
applying.
If a host belongs to a different cluster than the one for which you created the host template, you can apply the host
template if the "foreign" host either has no roles on it, or has only management roles on it. When you apply the host
template, the host will then become a member of the cluster whose host template you applied. The following instructions
assume you have already created the appropriate host template.
1. Go to the Hosts page, Status tab.
2. Select the host(s) to which you want to apply your host template.
3. From the Actions for Selected menu, select Apply Host Template.
4. In the pop-up window that appears, select the host template you want to apply.
5. Optionally you can have Cloudera Manager start the roles created per the host template – check the box to enable
this.
6. Click Confirm to initiate the action.
Decommissioning Hosts
Minimum Required Role: Limited Operator (also provided by Operator, Configurator, Cluster Administrator, or Full
Administrator)
You cannot decommission a DataNode or a host with a DataNode if the number of DataNodes equals the replication
factor (which by default is three) of any file stored in HDFS. For example, if the replication factor of any file is three,
and you have three DataNodes, you cannot decommission a DataNode or a host with a DataNode. If you attempt to
decommission a DataNode or a host with a DataNode in such situations, the DataNode will be decommissioned, but
the decommission process will not complete. You will have to abort the decommission and recommission the DataNode.
To decommission hosts:
1. If the host has a DataNode, perform the steps in Tuning HDFS Prior to Decommissioning DataNodes on page 59.
2. Click the Hosts tab.
3. Select the checkboxes next to one or more hosts.
4. Select Actions for Selected > Hosts Decommission.
A confirmation pop-up informs you of the roles that will be decommissioned or stopped on the hosts you have
selected.
5. Click Confirm. A Decommission Command pop-up displays that shows each step or decommission command as
it is run, service by service. In the Details area, click to see the subcommands that are run for decommissioning
a given service. Depending on the service, the steps may include adding the host to an "exclusions list" and
refreshing the NameNode, JobTracker, or NodeManager; stopping the Balancer (if it is running); and moving data
blocks or regions. Roles that do not have specific decommission actions are stopped.
You can abort the decommission process by clicking the Abort button, but you must recommission and restart
each role that has been decommissioned.
The Commission State facet in the Filters list displays Decommissioning while decommissioning is in progress, and Decommissioned when the decommissioning process has finished. When the process is complete, a checkmark is added in front of Decommission Command.
Note: When a DataNode is decommissioned, the data blocks are not removed from the storage
directories. You must delete the data manually.
Tuning HDFS Prior to Decommissioning DataNodes
1. Raise the heap size of the DataNodes. DataNodes should be configured with at least 4 GB heap size to allow for
the increase in iterations and max streams.
a. Go to the HDFS service page.
b. Click the Configuration tab.
c. Select Scope > DataNode.
d. Select Category > Resource Management.
e. Set the Java Heap Size of DataNode in Bytes property as recommended.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
f. Click Save Changes to commit the changes.
2. Set the DataNode balancing bandwidth:
a. Select Scope > DataNode.
b. Expand the Category > Performance category.
c. Configure the DataNode Balancing Bandwidth property to the bandwidth you have on your disks and network.
d. Click Save Changes to commit the changes.
3. Increase the replication work multiplier per iteration to a larger number (the default is 2, however 10 is
recommended):
a. Select Scope > NameNode.
b. Expand the Category > Advanced category.
c. Configure the Replication Work Multiplier Per Iteration property to a value such as 10.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
d. Click Save Changes to commit the changes.
4. Increase the replication maximum threads and maximum replication thread hard limits:
a. Select Scope > NameNode.
b. Expand the Category > Advanced category.
c. Configure the Maximum number of replication threads on a DataNode and Hard limit on the number of
replication threads on a DataNode properties to 50 and 100 respectively.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
d. Click Save Changes to commit the changes.
5. Restart the HDFS service.
Recommissioning Hosts
Minimum Required Role: Operator (also provided by Configurator, Cluster Administrator, Full Administrator)
Only hosts that are decommissioned using Cloudera Manager can be recommissioned.
1. Click the Hosts tab.
2. Select one or more hosts to recommission.
3. Select Actions for Selected > Recommission and Confirm. A Recommission Command pop-up displays that shows
each step or recommission command as it is run. When the process is complete, a checkmark is added in front of
Recommission Command. The host and roles are marked as commissioned, but the roles themselves are not
restarted.
Deleting Hosts
Minimum Required Role: Full Administrator
You can remove a host from a cluster in two ways:
• Delete the host entirely from Cloudera Manager.
• Remove a host from a cluster, but leave it available to other clusters managed by Cloudera Manager.
Both methods decommission the hosts, delete roles, and remove managed service software, but preserve data
directories.
Maintenance Mode
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
Maintenance mode allows you to suppress alerts for a host, service, role, or an entire cluster. This can be useful when
you need to take actions in your cluster (make configuration changes and restart various elements) and do not want
to see the alerts that will be generated due to those actions.
Putting an entity into maintenance mode does not prevent events from being logged; it only suppresses the alerts that
those events would otherwise generate. You can see a history of all the events that were recorded for entities during
the period that those entities were in maintenance mode.
You can view the maintenance mode status of a cluster by clicking the drop-down menu to the right of the cluster name and selecting View Maintenance Mode Status.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
init(8) starts some daemons when the system is booted. Depending on the distribution, init executes scripts from
either the /etc/init.d directory or the /etc/rc2.d directory. The CDH packages link the files in init.d and rc2.d
so that modifying one set of files automatically updates the other.
To start system services at boot time and on restarts, enable their init scripts on the systems on which the services
will run, using the appropriate tool:
• chkconfig is included in the RHEL and CentOS distributions. Debian and Ubuntu users can install the chkconfig
package.
• update-rc.d is included in the Debian and Ubuntu distributions.
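For example, to have the HDFS NameNode init script run at boot on the host where that role runs (script name as installed by the CDH packages):

sudo chkconfig hadoop-hdfs-namenode on          # RHEL/CentOS
sudo update-rc.d hadoop-hdfs-namenode defaults  # Debian/Ubuntu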
Configuring init to Start Core Hadoop System Services in an MRv1 Cluster
Important:
Cloudera does not support running MRv1 and YARN daemons on the same nodes at the same time;
it will degrade performance and may result in an unstable cluster deployment.
Important:
Do not run MRv1 and YARN on the same set of nodes at the same time. This is not recommended; it
degrades your performance and may result in an unstable MapReduce cluster deployment.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use an earlier version of CDH, see the
documentation for that version located at Cloudera Documentation.
Starting HBase
When starting HBase, it is important to start the HMaster, followed by the RegionServers, then the Thrift server.
1. To start an HBase cluster using the command line, start the HBase Master by using the sudo hbase-master start command on RHEL or SuSE, or the sudo hadoop-hbase-master start command on Ubuntu or Debian. The HMaster starts the RegionServers automatically.
2. To start a RegionServer manually, use the sudo hbase-regionserver start command on RHEL or SuSE, or
the sudo hadoop-hbase-regionserver start command on Ubuntu or Debian.
3. To start the Thrift server, use the hbase-thrift start command on RHEL or SuSE, or the hadoop-hbase-thrift start command on Ubuntu or Debian.
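Expressed with the service wrapper, the start sequence looks like the following; the script names shown are the RHEL/SLES names, so substitute the hadoop-hbase-* names on Ubuntu or Debian:

sudo service hbase-master start        # on the HMaster host
sudo service hbase-regionserver start  # on a RegionServer host, if starting one manually
sudo service hbase-thrift start        # on the Thrift server host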
Stopping HBase
When stopping HBase, it is important to stop the Thrift server, followed by each RegionServer, followed by any backup
HMasters, and finally the main HMaster.
1. Shut down the Thrift server by using the hbase-thrift stop command on the Thrift server host:
sudo service hbase-thrift stop
2. Shut down each RegionServer by using the hadoop-hbase-regionserver stop command on the RegionServer
host.
3. Shut down backup HMasters, followed by the main HMaster, by using the hbase-master stop command.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
To shut down all Hadoop Common system services (HDFS, YARN, MRv1), run the following on each host in the cluster:
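One way to do this is to loop over the Hadoop init scripts present on the host; the exact script names depend on which services are installed:

for service in $(cd /etc/init.d; ls hadoop-* 2>/dev/null); do
  sudo service "$service" stop
done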
To verify that no Hadoop processes are running, run the following command on each host in the cluster:
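For example, any remaining Hadoop daemons show up as Java processes, so a simple (if coarse) check is:

# Any remaining Hadoop daemons appear as java processes
ps -aef | grep java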
To stop system services individually, use the instructions in the table below.
Important: Stop services in the order listed in the table. (You can start services in the reverse order.)
4. Hive. Exit the Hive console and ensure no Hive scripts are running. Stop the Hive server, HCatalog, and metastore daemon on each client:
sudo service hiveserver2 stop
sudo service hive-webhcat-server stop
sudo service hive-metastore stop
10. HBase. Stop the Thrift server and clients, followed by RegionServers and finally the Master:
sudo service hbase-thrift stop
sudo service hbase-rest stop
MapReduce (MRv1). Stop the JobTracker and TaskTrackers:
sudo service hadoop-0.20-mapreduce-jobtracker stop
sudo service hadoop-0.20-mapreduce-tasktracker stop
13. HDFS. Stop HttpFS and the NFS Gateway (if present). Stop the Secondary NameNode, then the primary NameNode, followed by JournalNodes (if present) and then each of the DataNodes:
sudo service hadoop-httpfs stop
sudo service hadoop-hdfs-nfs3 stop
sudo service hadoop-hdfs-secondarynamenode stop
sudo service hadoop-hdfs-namenode stop
14. KMS (Key Management Server). Only present if HDFS at rest encryption is enabled:
sudo service hadoop-kms-server stop
15. ZooKeeper:
sudo service zookeeper-server stop
Copying Cluster Data Using DistCp
You can use the distcp tool to copy data between clusters. For example:
$ hadoop distcp <source-uri> <destination-uri>
Important:
• Do not run distcp as the hdfs user which is blacklisted for MapReduce jobs by default.
• Do not use Hadoop shell commands (such as cp, copyfromlocal, put, get) for large copying
jobs or you may experience I/O bottlenecks.
You can also use a specific path, such as /hbase, to move HBase data. For example:
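One hypothetical form, with placeholder NameNode hostnames, copies just the HBase data directory from a CDH 4 source cluster to a CDH 5 destination cluster:
$ hadoop distcp hftp://cdh4-namenode:50070/hbase hdfs://cdh5-nameservice/hbase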
HFTP Protocol
The HFTP protocol provides read-only access to HDFS over HTTP. When copying with distcp across different
versions of CDH, use hftp:// for the source file system and hdfs:// for the destination file system, and run distcp
from the destination cluster. The default port for HFTP is 50070 and the default port for HDFS is 8020.
Example of a source URI: hftp://namenode-location:50070/basePath
• hftp:// is the source protocol.
• namenode-location is the CDH 4 (source) NameNode hostname as defined by its configured fs.default.name.
• 50070 is the NameNode's HTTP server port, as defined by the configured dfs.http.address.
Example of a destination URI: hdfs://nameservice-id/basePath or hdfs://namenode-location
• hdfs:// is the destination protocol
• nameservice-id or namenode-location is the CDH 5 (destination) NameNode hostname as defined by its
configured fs.defaultFS.
• basePath in both examples refers to the directory you want to copy, if one is specifically needed.
Important:
• HFTP is a read-only protocol and can only be used for the source cluster, not the destination.
• HFTP cannot be used when copying with distcp from an insecure cluster to a secure cluster.
S3 Protocol
Amazon S3 block and native filesystems are also supported with the s3a:// protocol.
Example of an Amazon S3 Block Filesystem URI: s3a://bucket_name/path/to/file
S3 credentials can be provided in a configuration file (for example, core-site.xml):
<property>
<name>fs.s3a.access.key</name>
<value>...</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>...</value>
</property>
<property>
<name>ipc.client.fallback-to-simple-auth-allowed</name>
<value>true</value>
</property>
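With credentials configured, a sketch of a distcp invocation that copies from HDFS into S3 (the nameservice, bucket, and paths are placeholders):
$ hadoop distcp hdfs://nameservice1/source/path s3a://bucket_name/destination/path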
Note: Copying between a secure cluster and an insecure cluster is only supported with CDH 5.1.3
and higher (CDH 5.1.3+) in accordance with HDFS-6776.
Note: JDK version 1.7.x is required on both clusters when copying data between Kerberized clusters
that are in different realms. For information about supported JDK versions, see Supported JDK Versions.
[realms]
HADOOP.QA.domain.COM = {
  kdc = kdc.domain.com:88
  admin_server = admin.test.com:749
  default_domain = domain.com
  supported_enctypes = arcfour-hmac:normal des-cbc-crc:normal des-cbc-md5:normal des:normal des:v4 des:norealm des:onlyrealm des:afs3
}
[domain_realm]
.domain.com = HADOOP.test.domain.COM
domain.com = HADOOP.test.domain.COM
test03.domain.com = HADOOP.QA.domain.COM
<property>
<name>dfs.namenode.kerberos.principal.pattern</name>
<value>*</value>
</property>
<property>
<name>ssl.client.truststore.location</name>
<value>path_to_truststore</value>
</property>
<property>
<name>ssl.client.truststore.password</name>
<value>XXXXXX</value>
</property>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
</property>
If launching distcp fails, force Kerberos to use TCP instead of UDP by adding the following parameter to the krb5.conf
file on the client.
[libdefaults]
udp_preference_limit = 1
Copying Data between a Secure and an Insecure Cluster using DistCp and WebHDFS
You can use DistCp and WebHDFS to copy data between a secure cluster and an insecure cluster by doing the following:
1. Set ipc.client.fallback-to-simple-auth-allowed to true in core-site.xml on the secure cluster side:
<property>
<name>ipc.client.fallback-to-simple-auth-allowed</name>
<value>true</value>
</property>
2. Use commands such as the following from the secure cluster side only:
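A sketch of such commands, run from the secure cluster (the hostnames and paths are placeholders; 50070 is the default NameNode web port used by WebHDFS in CDH 5):
$ hadoop distcp webhdfs://insecure-namenode:50070/source/path webhdfs://secure-namenode:50070/destination/path
$ hadoop distcp webhdfs://secure-namenode:50070/source/path webhdfs://insecure-namenode:50070/destination/path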
Post-migration Verification
After migrating data between the two clusters, it is a good idea to use hadoop fs -ls /basePath to verify the
permissions, ownership and other aspects of your files, and correct any problems before using the files in your new
cluster.
Managing Flume
The Flume packages are installed by the Installation wizard, but the service is not created. This page documents how
to add, configure, and start the Flume service.
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of service at a time.
2. Select the Flume service and click Continue.
3. Select the radio button next to the services on which the new service should depend. All services must depend
on the same ZooKeeper service. Click Continue.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
Important: The name-value property pairs in the Configuration File property must include an
equal sign (=). For example, tier1.channels.channel1.capacity = 10000.
4. Locate the Agent Name property or search for it by typing its name in the Search box.
5. Enter a name for the agent.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
6. Click Save Changes to commit the changes.
Note: If you are using Flume with HBase, make sure that the /etc/zookeeper/conf/zoo.cfg file
either does not exist on the host of the Flume agent that is using an HBase sink, or that it contains
the correct ZooKeeper quorum.
The HBase coprocessor framework provides a way to extend HBase with custom functionality. To configure these
properties in Cloudera Manager:
1. Select the HBase service.
2. Click the Configuration tab.
3. Select Scope > All.
4. Select Category > All.
5. Type HBase Coprocessor in the Search box.
6. You can configure the values of the following properties:
• HBase Coprocessor Abort on Error (Service-Wide)
• HBase Coprocessor Master Classes (Master Default Group)
• HBase Coprocessor Region Classes (RegionServer Default Group)
7. Click Save Changes to commit the changes.
Enabling Hedged Reads on HBase
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
1. Go to the HBase service.
2. Click the Configuration tab.
3. Select Scope > HBASE-1 (Service-Wide).
4. Select Category > Performance.
5. Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The
descriptions for each of these properties on the configuration pages provide more information.
6. Click Save Changes to commit the changes.
Advanced Configuration for Write-Heavy Workloads
HBase includes several advanced configuration parameters for adjusting the number of threads available to service
flushes and compactions in the presence of write-heavy workloads. Tuning these parameters incorrectly can severely
degrade performance and is not necessary for most HBase clusters. If you use Cloudera Manager, configure these
options using the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
hbase.hstore.flusher.count
The number of threads available to flush writes from memory to disk. Never increase
hbase.hstore.flusher.count to more than 50% of the number of disks available to HBase. For example, if you
have 8 solid-state drives (SSDs), hbase.hstore.flusher.count should never exceed 4. This allows scanners
and compactions to proceed even in the presence of very high writes.
hbase.regionserver.thread.compaction.large and hbase.regionserver.thread.compaction.small
The number of threads available to handle small and large compactions, respectively. Never increase either of these
options to more than 50% of the number of disks available to HBase.
Ideally, hbase.regionserver.thread.compaction.small should be greater than or equal to
hbase.regionserver.thread.compaction.large, since the large compaction threads do more intense work
and will be in use longer for a given operation.
In addition to the above, if you use compression on some column families, more CPU is used when flushing those
column families to disk and when compacting them. The impact on CPU usage depends on the size of the flush and the
amount of data to be decompressed and compressed during compactions.
Warning: Disabling security on a production HBase system is difficult and could cause data loss.
Contact Cloudera Support if you need to disable security in HBase.
Important:
To enable HBase to work with Kerberos security on your Hadoop cluster, make sure you perform the
installation and configuration steps in Configuring Hadoop Security in CDH 5 and ZooKeeper Security
Configuration.
Note:
These instructions have been tested with CDH and MIT Kerberos 5 only.
Important:
Although an HBase Thrift server can connect to a secured Hadoop cluster, access is not secured from
clients to the HBase Thrift server.
Warning: Disabling security on a production HBase system is difficult and could cause data loss.
Contact Cloudera Support if you need to disable security in HBase.
After you have configured HBase authentication as described in the previous section, you must establish authorization
rules for the resources that a client is allowed to access. HBase currently allows you to establish authorization rules at
the table, column, and cell level. Cell-level authorization was added as an experimental feature in CDH 5.2 and is
still considered experimental.
In a production environment, it is likely that different users will have only one of Admin and Create permissions.
Warning:
In the current implementation, a Global Admin with Admin permission can grant himself Read
and Write permissions on a table and gain access to that table's data. For this reason, only grant
Global Admin permissions to trusted users who actually need them.
Also be aware that a Global Admin with Create permission can perform a Put operation on
the ACL table, simulating a grant or revoke and circumventing the authorization check for
Global Admin permissions. This issue (but not the first one) is fixed in CDH 5.3 and higher, as
well as CDH 5.2.1. It is not fixed in CDH 4.x or CDH 5.1.x.
Due to these issues, be cautious with granting Global Admin privileges.
• Table Admins - A table admin can perform administrative operations only on that table. A table admin with Create
permissions can create snapshots from that table or restore that table from a snapshot. A table admin with Admin
permissions can perform operations such as splits or major compactions on that table.
• Users - Users can read or write data, or both. Users can also execute coprocessor endpoints, if given Executable
permissions.
Important:
If you are using Kerberos principal names when setting ACLs for users, note that Hadoop uses only
the first part (short) of the Kerberos principal when converting it to the user name. Hence, for the
principal ann/[email protected], HBase ACLs should only be
set for user ann.
This table shows some typical job descriptions at a hypothetical company and the permissions they might require in
order to get their jobs done using HBase.
Further Reading
• Access Control Matrix
• Security - Apache HBase Reference Guide
Enable HBase Authorization
HBase authorization is built on top of the Coprocessors framework, specifically AccessController Coprocessor.
Note: Once the Access Controller coprocessor is enabled, any user who uses the HBase shell will be
subject to access control. Access control will also be in effect for native (Java API) client access to
HBase.
Optionally, set the hbase.security.exec.permission.checks property to true so that only users with execute permission
have access to execute endpoint coprocessors. For backward compatibility, this option is not enabled automatically when
you enable HBase Secure Authorization:
<property>
<name>hbase.security.exec.permission.checks</name>
<value>true</value>
</property>
5. Optionally, search for and configure HBase Coprocessor Master Classes and HBase Coprocessor Region Classes.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
To enable HBase authorization, add the following properties to the hbase-site.xml file on every HBase server host
(Master or RegionServer):
<property>
<name>hbase.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hbase.security.exec.permission.checks</name>
<value>true</value>
</property>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
Important:
The host running the shell must be configured with a keytab file as described in Configuring Kerberos
Authentication for HBase.
The commands that control ACLs take the following form. Group names are prefixed with the @ symbol.
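A sketch of these forms, reconstructed from the standard HBase Shell grant and revoke syntax:
hbase> grant <user> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]
hbase> revoke <user> [ <table> [ <column family> [ <column qualifier> ] ] ]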
In the above commands, fields encased in <> are variables, and fields in [] are optional. The permissions variable
must consist of zero or more characters from the set "RWXCA".
• R denotes read permissions, which is required to perform Get, Scan, or Exists calls in a given scope.
• W denotes write permissions, which is required to perform Put, Delete, LockRow, UnlockRow,
IncrementColumnValue, CheckAndDelete, CheckAndPut, Flush, or Compact in a given scope.
• X denotes execute permissions, which is required to execute coprocessor endpoints.
• C denotes create permissions, which is required to perform Create, Alter, or Drop in a given scope.
• A denotes admin permissions, which is required to perform Enable, Disable, Snapshot, Restore, Clone,
Split, MajorCompact, Grant, Revoke, and Shutdown in a given scope.
Be sure to review the information in Understanding HBase Access Levels to understand the implications of the different
access levels.
Configuring the HBase Thrift Server Role
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
The Thrift Server role is not added by default when you install HBase, but it is required before you can use certain other
features such as the Hue HBase browser. To add the Thrift Server role:
1. Go to the HBase service.
2. Click the Instances tab.
3. Click the Add Role Instances button.
4. Select the host(s) where you want to add the Thrift Server role (you only need one for Hue) and click Continue.
The Thrift Server role should appear in the instances list for the HBase server.
5. Select the Thrift Server role instance.
6. Select Actions for Selected > Start.
Other HBase Security Topics
• Using BulkLoad On A Secure Cluster on page 118
• Configuring Secure HBase Replication
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
If you need the ability to perform a rolling restart, Cloudera recommends managing your cluster with Cloudera Manager.
1. To start an HBase cluster using the command line, start the HBase Master by using the sudo service hbase-master
start command on RHEL or SuSE, or the sudo service hadoop-hbase-master start command on Ubuntu or Debian.
The HMaster starts the RegionServers automatically.
2. To start a RegionServer manually, use the sudo service hbase-regionserver start command on RHEL or SuSE, or
the sudo service hadoop-hbase-regionserver start command on Ubuntu or Debian. Running multiple RegionServer
processes on the same host is not supported.
3. The Thrift service has no dependencies and can be restarted at any time. To start the Thrift server, use the
sudo service hbase-thrift start command on RHEL or SuSE, or the sudo service hadoop-hbase-thrift start command on Ubuntu or Debian.
Stopping HBase
You can stop a single HBase host, all hosts of a given type, or all hosts in the cluster.
2. To stop or decommission a single HMaster, select the Master and go through the same steps as above.
3. To stop or decommission the entire cluster, select the Actions button at the top of the screen (not Actions for
selected) and select Decommission (Graceful Stop) or Stop.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. Shut down the Thrift server by using the hbase-thrift stop command on the Thrift server host:
sudo service hbase-thrift stop
2. Shut down each RegionServer by using the hadoop-hbase-regionserver stop command on the RegionServer
host.
3. Shut down backup HMasters, followed by the main HMaster, by using the hbase-master stop command.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
The HBase canary is a Java class. To run it from the command line, in the foreground, issue a command similar to the
following, as the HBase user:
$ /usr/bin/hbase org.apache.hadoop.hbase.tool.Canary
To start the canary in the background, add the --daemon option. You can also use this option in your HBase startup
scripts.
The canary has many options. To see usage instructions, add the --help parameter:
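For example (mirroring the invocation above):
$ /usr/bin/hbase org.apache.hadoop.hbase.tool.Canary --help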
Warning: The following hbck options modify HBase metadata and are dangerous. They are not
coordinated by the HMaster and can cause further corruption by conflicting with commands that are
currently in progress or coordinated by the HMaster. Even if the HMaster is down, it may try to recover
the latest operation when it restarts. These options should only be used as a last resort. The hbck
command can only fix actual HBase metadata corruption and is not a general-purpose maintenance
tool. Before running these commands, consider contacting Cloudera Support for guidance. In addition,
running any of these commands requires an HMaster restart.
• If region-level inconsistencies are found, use the -fix argument to direct hbck to try to fix them. The following
sequence of steps is followed:
1. The standard check for inconsistencies is run.
2. If needed, repairs are made to tables.
3. If needed, repairs are made to regions. Regions are closed during repair.
• You can also fix individual region-level inconsistencies separately, rather than fixing them automatically with the
-fix argument.
• To try to repair all inconsistencies and corruption at once, use the -repair option, which includes all the region
and table consistency options.
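As a sketch of how these options are invoked (run as the HBase user, and heed the warning above before using -fix or -repair):
$ hbase hbck             # report inconsistencies only
$ hbase hbck -fix        # attempt to repair region-level inconsistencies
$ hbase hbck -repair     # attempt all region and table repairs; last resort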
For more details about the hbck command, see Appendix C of the HBase Reference Guide.
Hedged Reads
Hadoop 2.4 introduced a new feature called hedged reads. If a read from a block is slow, the HDFS client starts up
another parallel, 'hedged' read against a different block replica. The result of whichever read returns first is used, and
the outstanding read is cancelled. This feature helps in situations where a read occasionally takes a long time rather
than when there is a systemic problem. Hedged reads can be enabled for HBase when the HFiles are stored in HDFS.
This feature is disabled by default.
Enabling Hedged Reads for HBase Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
1. Go to the HBase service.
2. Click the Configuration tab.
3. Select Scope > HBASE-1 (Service-Wide).
4. Select Category > Performance.
5. Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The
descriptions for each of these properties on the configuration pages provide more information.
6. Click Save Changes to commit the changes.
Enabling Hedged Reads for HBase Using the Command Line
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
To enable hedged reads for HBase, edit the hbase-site.xml file on each server. Set
dfs.client.hedged.read.threadpool.size to the number of threads to dedicate to running hedged reads,
and set the dfs.client.hedged.read.threshold.millis configuration property to the number of milliseconds
to wait before starting a second read against a different block replica. Set
dfs.client.hedged.read.threadpool.size to 0 or remove it from the configuration to disable the feature.
After changing these properties, restart your cluster.
The following is an example configuration for hedged reads for HBase.
<property>
<name>dfs.client.hedged.read.threadpool.size</name>
<value>20</value> <!-- 20 threads -->
</property>
<property>
<name>dfs.client.hedged.read.threshold.millis</name>
<value>10</value> <!-- 10 milliseconds -->
</property>
The default blocksize is 64 KB. The appropriate blocksize is dependent upon your data and usage patterns. Use the
following guidelines to tune the blocksize, in combination with testing and benchmarking as appropriate.
Warning: The default blocksize is appropriate for a wide range of data usage patterns, and tuning
the blocksize is an advanced operation. The wrong configuration can negatively impact performance.
• Consider the average key/value size for the column family when tuning the blocksize. You can find the average
key/value size using the HFile utility (see the example after this list).
• Consider the pattern of reads to the table or column family. For instance, if it is common to scan for 500 rows on
various parts of the table, performance might be increased if the blocksize is large enough to encompass 500-1000
rows, so that often, only one read operation on the HFile is required. If your typical scan size is only 3 rows,
returning 500-1000 rows would be overkill.
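A sketch of inspecting an HFile's metadata, which includes the average key and value lengths (the HFile path is a placeholder for a real file under the HBase data directory in HDFS):
$ hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f /hbase/data/default/myTable/<region>/<column family>/<hfile name>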
It is difficult to predict the size of a row before it is written, because the data will be compressed when it is written
to the HFile. Perform testing to determine the correct blocksize for your data.
After changing the blocksize, the HFiles will be rewritten during the next major compaction. To trigger a major
compaction, issue the following command in HBase Shell.
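For example, assuming a table named 'mytable':
hbase> major_compact 'mytable'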
Depending on the size of the table, the major compaction can take some time and have a performance impact while
it is running.
Monitoring Blocksize Metrics
Several metrics are exposed for monitoring the blocksize by monitoring the blockcache itself.
• Your data: Each time a Get or Scan operation occurs, the result is added to the BlockCache if it was not already
cached there. If you use the BucketCache, data blocks are always cached in the BucketCache.
• Row keys: When a value is loaded into the cache, its row key is also cached. This is one reason to make your row
keys as small as possible. A larger row key takes up more space in the cache.
• hbase:meta: The hbase:meta catalog table keeps track of which RegionServer is serving which regions. It can
consume several megabytes of cache if you have a large number of regions, and has in-memory access priority,
which means HBase attempts to keep it in the cache as long as possible.
• Indexes of HFiles: HBase stores its data in HDFS in a format called HFile. These HFiles contain indexes which allow
HBase to seek for data within them without needing to open the entire HFile. The size of an index is a factor of
the block size, the size of your row keys, and the amount of data you are storing. For big data sets, the size can
exceed 1 GB per RegionServer, although the entire index is unlikely to be in the cache at the same time. If you
use the BucketCache, indexes are always cached on-heap.
• Bloom filters: If you use Bloom filters, they are stored in the BlockCache. If you use the BucketCache, Bloom filters
are always cached on-heap.
The sum of the sizes of these objects is highly dependent on your usage patterns and the characteristics of your data.
For this reason, the HBase Web UI and Cloudera Manager each expose several metrics to help you size and tune the
BlockCache.
Deciding Whether To Use the BucketCache
The HBase team has published the results of exhaustive BlockCache testing, which revealed the following guidelines.
• If the result of a Get or Scan typically fits completely in the heap, the default configuration, which uses the on-heap
LruBlockCache, is the best choice, as the L2 cache will not provide much benefit. If the eviction rate is low,
garbage collection can be 50% less than that of the BucketCache, and throughput can be at least 20% higher.
• Otherwise, if your cache is experiencing a consistently high eviction rate, use the BucketCache, which causes
30-50% of the garbage collection of LruBlockCache when the eviction rate is high.
• BucketCache using file mode on solid-state disks has a better garbage-collection profile but lower throughput
than BucketCache using off-heap memory.
Bypassing the BlockCache
If the data needed for a specific but atypical operation does not all fit in memory, using the BlockCache can be
counter-productive, because data that you are still using may be evicted, or even if other data is not evicted, excess
garbage collection can adversely affect performance. For this type of operation, you may decide to bypass the BlockCache.
To bypass the BlockCache for a given Scan or Get, use the setCacheBlocks(false) method.
In addition, you can prevent a specific column family's contents from being cached, by setting its BLOCKCACHE
configuration to false. Use the following syntax in HBase Shell:
hbase> alter 'myTable', CONFIGURATION => {NAME => 'myCF', BLOCKCACHE => 'false'}
To configure a column family for in-memory access, use the following syntax in HBase Shell:
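A sketch of the likely form, assuming a table 'myTable' with a column family 'myCF':
hbase> alter 'myTable', {NAME => 'myCF', IN_MEMORY => 'true'}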
To use the Java API to configure a column family for in-memory access, use the
HColumnDescriptor.setInMemory(true) method.
The following is an example BucketCache configuration that uses an off-heap cache:
<property>
<name>hbase.bucketcache.ioengine</name>
<value>offheap</value>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.2</value>
</property>
<property>
<name>hbase.bucketcache.size</name>
<value>4096</value>
</property>
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. First, verify the RegionServer's off-heap size, and if necessary, tune it by editing the hbase-env.sh file and adding
a line like the following:
HBASE_OFFHEAPSIZE=5G
Set it to a value which will accommodate your desired L2 cache size, in addition to space reserved for cache
management.
2. Edit the parameter HBASE_OPTS in the hbase-env.sh file and add the JVM option
-XX:MaxDirectMemorySize=<size>G, replacing <size> with a value large enough to contain your heap and
off-heap BucketCache, expressed as a number of gigabytes.
3. Next, configure the properties in Table 2: BucketCache Configuration Properties on page 93 as appropriate, using
the example below as a model.
<property>
<name>hbase.bucketcache.ioengine</name>
<value>offheap</value>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.2</value>
</property>
<property>
<name>hbase.bucketcache.size</name>
<value>4194304</value>
</property>
To make sure the timeout period is not too short, you can configure hbase.cells.scanned.per.heartbeat.check
to a minimum number of cells that must be scanned before a timeout check occurs. The default value is 10000. A
smaller value causes timeout checks to occur more often.
Configure the Scanner Heartbeat Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
1. Go to the HBase service.
2. Click the Configuration tab.
3. Select HBase or HBase Service-Wide.
4. Select Category > Main.
5. Locate the RPC Timeout property or search for it by typing its name in the Search box.
6. Edit the property.
7. To modify the default values for hbase.client.scanner.timeout.period or
hbase.cells.scanned.per.heartbeat.check, search for HBase Service Advanced Configuration Snippet
(Safety Valve) for hbase-site.xml. Paste one or both of the following properties into the field and modify the
values as needed.
<property>
<name>hbase.client.scanner.timeout.period</name>
<value>60000</value>
</property>
<property>
<name>hbase.cells.scanned.per.heartbeat.check</name>
<value>10000</value>
</property>
To configure the scanner heartbeat using the command line instead:
1. Add the following properties to hbase-site.xml, adjusting the values as needed:
<property>
<name>hbase.rpc.timeout</name>
<value>60000</value>
</property>
<property>
<name>hbase.client.scanner.timeout.period</name>
<value>60000</value>
</property>
<property>
<name>hbase.cells.scanned.per.heartbeat.check</name>
<value>10000</value>
</property>
2. Distribute the modified hbase-site.xml to all your cluster hosts and restart the HBase master and RegionServer
processes for the change to take effect.
To limit the speed of compactions using the command line:
1. Add the following properties to hbase-site.xml, adjusting the values as needed:
<property>
<name>hbase.regionserver.throughput.controller</name>
<value>org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController</value>
</property>
<property>
<name>hbase.hstore.compaction.throughput.higher.bound</name>
<value>20971520</value>
<description>The default is 20 MB/sec</description>
</property>
<property>
<name>hbase.hstore.compaction.throughput.lower.bound</name>
<value>10485760</value>
<description>The default is 10 MB/sec</description>
</property>
<property>
<name>hbase.hstore.compaction.throughput.offpeak</name>
<value>9223372036854775807</value>
<description>The default is Long.MAX_VALUE, which effectively means no
limitation</description>
</property>
<property>
<name>hbase.offpeak.start.hour</name>
<value>20</value>
<description>When to begin using off-peak compaction settings, expressed as an integer
between 0 and 23.</description>
</property>
<property>
<name>hbase.offpeak.end.hour</name>
<value>6</value>
<description>When to stop using off-peak compaction settings, expressed as an integer between
0 and 23.</description>
</property>
<property>
<name>hbase.hstore.compaction.throughput.tune.period</name>
<value>60000</value>
<description>The default is 60000 (60 seconds).</description>
</property>
2. Distribute the modified hbase-site.xml to all your cluster hosts and restart the HBase master and RegionServer
processes for the change to take effect.
Reading Data from HBase
A Scan retrieves a range of rows. Construct it using one of the following constructors, optionally bounding it with a start row and a stop row:
Scan()
Scan(byte[] startRow)
Scan(byte[] startRow, byte[] stopRow)
• Specify a scanner cache that will be filled before the Scan result is returned, setting setCaching to the number
of rows to cache before returning the result. By default, the caching setting on the table is used. The goal is to
balance IO and network load.
• To limit the number of columns if your table has very wide rows (rows with a large number of columns), use
setBatch(int batch) and set it to the number of columns you want to return in one batch. A large number of columns
is not a recommended design pattern.
• To specify a maximum result size, use setMaxResultSize(long), with the number of bytes. The goal is to
reduce IO and network.
• When you use setCaching and setMaxResultSize together, single server requests are limited by either number
of rows or maximum result size, whichever limit comes first.
• You can limit the scan to specific column families or columns by using addFamily or addColumn. The goal is to
reduce IO and network. IO is reduced because each column family is represented by a Store on each RegionServer,
and only the Stores representing the specific column families in question need to be accessed.
• You can specify a range of timestamps or a single timestamp by specifying setTimeRange or setTimestamp.
• You can use a filter by using setFilter. Filters are discussed in detail in HBase Filtering on page 103 and the Filter
API.
• You can disable the server-side block cache for a specific scan using the API setCacheBlocks(boolean). This
is an expert setting and should only be used if you know what you are doing.
You can also perform scans using HBase Shell. For example:
# Specify a startrow, limit the result to 10 rows, and only return selected columns
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
# Specify a timerange
hbase> scan 't1', {TIMERANGE => [1303668804, 1303668904]}
Hedged Reads
Hadoop 2.4 introduced a new feature called hedged reads. If a read from a block is slow, the HDFS client starts up
another parallel, 'hedged' read against a different block replica. The result of whichever read returns first is used, and
the outstanding read is cancelled. This feature helps in situations where a read occasionally takes a long time rather
than when there is a systemic problem. Hedged reads can be enabled for HBase when the HFiles are stored in HDFS.
This feature is disabled by default.
Enabling Hedged Reads for HBase Using the Command Line
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
To enable hedged reads for HBase, edit the hbase-site.xml file on each server. Set
dfs.client.hedged.read.threadpool.size to the number of threads to dedicate to running hedged reads,
and set the dfs.client.hedged.read.threshold.millis configuration property to the number of milliseconds
to wait before starting a second read against a different block replica. For example:
<property>
<name>dfs.client.hedged.read.threadpool.size</name>
<value>20</value> <!-- 20 threads -->
</property>
<property>
<name>dfs.client.hedged.read.threshold.millis</name>
<value>10</value> <!-- 10 milliseconds -->
</property>
HBase Filtering
When reading data from HBase using Get or Scan operations, you can use custom filters to return a subset of results
to the client. While this does not reduce server-side IO, it does reduce network bandwidth and reduces the amount
of data the client needs to process. Filters are generally used using the Java API, but can be used from HBase Shell for
testing and debugging purposes.
For more information on Gets and Scans in HBase, see Reading Data from HBase on page 101.
Comparison Operators
• LESS (<)
• LESS_OR_EQUAL (<=)
• EQUAL (=)
• NOT_EQUAL (!=)
• GREATER_OR_EQUAL (>=)
• GREATER (>)
• NO_OP (no operation)
Comparators
• BinaryComparator - lexicographically compares against the specified byte array using the
Bytes.compareTo(byte[], byte[]) method.
• BinaryPrefixComparator - lexicographically compares against a specified byte array. It only compares up to the
length of this byte array.
• RegexStringComparator - compares against the specified byte array using the given regular expression. Only EQUAL
and NOT_EQUAL comparisons are valid with this comparator.
• SubStringComparator - tests whether or not the given substring appears in a specified byte array. The comparison
is case insensitive. Only EQUAL and NOT_EQUAL comparisons are valid with this comparator.
Examples
Example1: >, 'binary:abc' will match everything that is lexicographically greater than
"abc"
Example2: =, 'binaryprefix:abc' will match everything whose first 3 characters are
lexicographically equal to "abc"
Example3: !=, 'regexstring:ab*yz' will match everything that doesn't begin with "ab"
and ends with "yz"
Example4: =, 'substring:abc123' will match everything that begins with the substring
"abc123"
Compound Operators
Within an expression, parentheses can be used to group clauses together, and parentheses have the highest order of
precedence.
SKIP and WHILE operators are next, and have the same precedence.
A filter string of the form: “Filter1 AND Filter2 OR Filter3” will be evaluated as:
“(Filter1 AND Filter2) OR Filter3”
A filter string of the form: “Filter1 AND SKIP Filter2 OR Filter3” will be evaluated
as: “(Filter1 AND (SKIP Filter2)) OR Filter3”
Filter Types
HBase includes several filter types, as well as the ability to group filters together and create your own custom filters.
• KeyOnlyFilter - takes no arguments. Returns the key portion of each key-value pair.
Syntax: KeyOnlyFilter ()
• FirstKeyOnlyFilter - takes no arguments. Returns the key portion of the first key-value pair.
Syntax: FirstKeyOnlyFilter ()
• PrefixFilter - takes a single argument, a prefix of a row key. It returns only those key-values present in a row that
starts with the specified row prefix.
• ColumnPrefixFilter - takes a single argument, a column prefix. It returns only those key-values present in a column
that starts with the specified column prefix.
• MultipleColumnPrefixFilter - takes a list of column prefixes. It returns key-values that are present in a column
that starts with any of the specified column prefixes.
• ColumnCountGetFilter - takes one argument, a limit. It returns the first limit number of columns in the table.
• PageFilter - takes one argument, a page size. It returns page size number of rows from the table.
• ColumnPaginationFilter - takes two arguments, a limit and offset. It returns limit number of columns after offset
number of columns. It does this for all the rows.
• InclusiveStopFilter - takes one argument, a row key on which to stop scanning. It returns all key-values present
in rows up to and including the specified row.
• TimeStampsFilter - takes a list of timestamps. It returns those key-values whose timestamps match any of the
specified timestamps.
• RowFilter - takes a compare operator and a comparator. It compares each row key with the comparator using
the compare operator and if the comparison returns true, it returns all the key-values in that row.
• FamilyFilter - takes a compare operator and a comparator. It compares each family name with the comparator
using the compare operator and if the comparison returns true, it returns all the key-values in that family.
• QualifierFilter - takes a compare operator and a comparator. It compares each qualifier name with the comparator
using the compare operator and if the comparison returns true, it returns all the key-values in that column.
• ValueFilter - takes a compare operator and a comparator. It compares each value with the comparator using the
compare operator and if the comparison returns true, it returns that key-value.
• DependentColumnFilter - takes two required arguments, a family and a qualifier. It tries to locate this
column in each row and returns all key-values in that row that have the same timestamp. If the row does not
contain the specified column, none of the key-values in that row will be returned.
The filter can also take an optional boolean argument, dropDependentColumn. If set to true, the column used
for the filter does not get returned.
The filter can also take two more additional optional arguments, a compare operator and a value comparator,
which are further checks in addition to the family and qualifier. If the dependent column is found, its value should
also pass the value check. If it does pass the value check, only then is its timestamp taken into consideration.
• SingleColumnValueFilter - takes a column family, a qualifier, a compare operator and a comparator. If the specified
column is not found, all the columns of that row will be emitted. If the column is found and the comparison with
the comparator returns true, all the columns of the row will be emitted. If the condition fails, the row will not
be emitted.
This filter also takes two additional optional boolean arguments, filterIfColumnMissing and
setLatestVersionOnly.
If the filterIfColumnMissing flag is set to true, the columns of the row will not be emitted if the specified
column to check is not found in the row. The default value is false.
If the setLatestVersionOnly flag is set to false, it will test previous versions (timestamps) in addition to the
most recent. The default value is true.
These flags are optional and dependent on each other: set both of them or neither.
• ColumnRangeFilter - takes either minColumn, maxColumn, or both. Returns only those keys with columns that
are between minColumn and maxColumn. It also takes two boolean variables to indicate whether to include the
minColumn and maxColumn or not. If you don’t want to set the minColumn or the maxColumn, you can pass in
an empty argument.
• Custom Filter - You can create a custom filter by implementing the Filter class. The JAR must be available on all
RegionServers.
The following excerpt from the Apache HBase test class TestSingleColumnValueFilter illustrates how filters and comparators are exercised through the Java API:
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase.filter;
import java.util.regex.Pattern;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.SmallTests;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Before;
import org.junit.Test;
import org.junit.experimental.categories.Category;
/**
* Tests the value filter
*/
@Category(SmallTests.class)
public class TestSingleColumnValueFilter {
private static final byte[] ROW = Bytes.toBytes("test");
private static final byte[] COLUMN_FAMILY = Bytes.toBytes("test");
private static final byte [] COLUMN_QUALIFIER = Bytes.toBytes("foo");
private static final byte[] VAL_1 = Bytes.toBytes("a");
private static final byte[] VAL_2 = Bytes.toBytes("ab");
private static final byte[] VAL_3 = Bytes.toBytes("abc");
private static final byte[] VAL_4 = Bytes.toBytes("abcd");
private static final byte[] FULLSTRING_1 =
Bytes.toBytes("The quick brown fox jumps over the lazy dog.");
private static final byte[] FULLSTRING_2 =
Bytes.toBytes("The slow grey fox trips over the lazy dog.");
private static final String QUICK_SUBSTR = "quick";
private static final String QUICK_REGEX = ".+quick.+";
private static final Pattern QUICK_PATTERN = Pattern.compile("QuIcK",
Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Filter basicFilter;
Filter nullFilter;
Filter substrFilter;
Filter regexFilter;
Filter regexPatternFilter;
@Before
public void setUp() throws Exception {
basicFilter = basicFilterNew();
nullFilter = nullFilterNew();
substrFilter = substrFilterNew();
regexFilter = regexFilterNew();
regexPatternFilter = regexFilterNew(QUICK_PATTERN);
}
assertFalse("basicFilterNotNull", filter.filterRow());
filter.reset();
kv = new KeyValue(ROW, COLUMN_FAMILY, COLUMN_QUALIFIER, VAL_1);
assertTrue("basicFilter4", filter.filterKeyValue(kv) == Filter.ReturnCode.NEXT_ROW);
assertFalse("basicFilterAllRemaining", filter.filterAllRemaining());
assertTrue("basicFilterNotNull", filter.filterRow());
filter.reset();
filter.setLatestVersionOnly(false);
kv = new KeyValue(ROW, COLUMN_FAMILY, COLUMN_QUALIFIER, VAL_1);
assertTrue("basicFilter5", filter.filterKeyValue(kv) == Filter.ReturnCode.INCLUDE);
assertFalse("basicFilterNotNull", filter.filterRow());
}
assertTrue("regexTrue",
filter.filterKeyValue(kv) == Filter.ReturnCode.INCLUDE);
assertFalse("regexFilterAllRemaining", filter.filterAllRemaining());
assertFalse("regexFilterNotNull", filter.filterRow());
}
// Recompose filter.
Filter newFilter = SingleColumnValueFilter.parseFrom(buffer);
return newFilter;
}
/**
* Tests identification of the stop row
* @throws Exception
*/
@Test
public void testStop() throws Exception {
basicFilterTests((SingleColumnValueFilter) basicFilter);
nullFilterTests(nullFilter);
substrFilterTests(substrFilter);
regexFilterTests(regexFilter);
regexPatternFilterTests(regexPatternFilter);
}
/**
* Tests serialization
* @throws Exception
*/
@Test
public void testSerialization() throws Exception {
Filter newFilter = serializationTest(basicFilter);
basicFilterTests((SingleColumnValueFilter)newFilter);
newFilter = serializationTest(nullFilter);
nullFilterTests(newFilter);
newFilter = serializationTest(substrFilter);
substrFilterTests(newFilter);
newFilter = serializationTest(regexFilter);
regexFilterTests(newFilter);
newFilter = serializationTest(regexPatternFilter);
regexPatternFilterTests(newFilter);
}
Variations on Put
There are several different ways to write data into HBase. Some of them are listed below.
• A Put operation writes data into HBase.
• A Delete operation deletes data from HBase. What actually happens during a Delete depends upon several
factors.
• A CheckAndPut operation checks the current value of a cell before attempting the Put, and only does the Put if the
value matches what is expected, providing row-level atomicity.
• A CheckAndDelete operation checks the current value of a cell before attempting the Delete, and only does the Delete
if the value matches what is expected.
• An Increment operation increments values of one or more columns within a single row, and provides row-level
atomicity.
Refer to the API documentation for a full list of methods provided for writing data to HBase. Different methods require
different access levels and have other differences.
Versions
When you put data into HBase, a timestamp is required. The timestamp can be generated automatically by the
RegionServer or can be supplied by you. The timestamp must be unique per version of a given cell, because the
timestamp identifies the version. To modify a previous version of a cell, for instance, you would issue a Put with a
different value for the data itself, but the same timestamp.
HBase's behavior regarding versions is highly configurable. The maximum number of versions defaults to 1 in CDH 5,
and 3 in previous versions. You can change the default value for HBase by configuring hbase.column.max.version
in hbase-site.xml, either using an advanced configuration snippet if you use Cloudera Manager, or by editing the
file directly otherwise.
You can also configure the maximum and minimum number of versions to keep for a given column, or specify a default
time-to-live (TTL), which is the number of seconds before a version is deleted. The following examples all use alter
statements in HBase Shell to create new column families with the given characteristics, but you can use the same
syntax when creating a new table or to alter an existing column family. This is only a fraction of the options you can
specify for a given column family.
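A sketch of such alter statements, assuming a table 't1' and a column family 'f1' (the values are illustrative only):
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
hbase> alter 't1', NAME => 'f1', MIN_VERSIONS => 2
hbase> alter 't1', NAME => 'f1', TTL => 15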
HBase sorts the versions of a cell from newest to oldest, by sorting the timestamps lexicographically. When a version
needs to be deleted because a threshold has been reached, HBase always chooses the "oldest" version, even if it is in
fact the most recent version to be inserted. Keep this in mind when designing your timestamps. Consider using the
default generated timestamps and storing other version-specific data elsewhere in the row, such as in the row key. If
MIN_VERSIONS and TTL conflict, MIN_VERSIONS takes precedence.
Deletion
When you request that HBase delete data, either explicitly using a Delete method or implicitly using a threshold such
as the maximum number of versions or the TTL, HBase does not delete the data immediately. Instead, it writes a
deletion marker, called a tombstone, to the HFile, which is the physical file where a given RegionServer stores its region
of a column family. The tombstone markers are processed during major compaction operations, when HFiles are
rewritten without the deleted data included.
Even after major compactions, "deleted" data may not actually be deleted. You can specify the KEEP_DELETED_CELLS
option for a given column family, and the tombstones will be preserved in the HFile even after major compaction. One
scenario where this approach might be useful is for data retention policies.
Another reason deleted data may not actually be deleted is if the data would be required to restore a table from a
snapshot which has not been deleted. In this case, the data is moved to an archive during a major compaction, and
only deleted when the snapshot is deleted. This is a good reason to monitor the number of snapshots saved in HBase.
Examples
This abbreviated example writes data to an HBase table using HBase Shell and then scans the table to show the result.
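A sketch of the likely shape of that HBase Shell session, assuming a table 'test' with a column family 'cf':
hbase> create 'test', 'cf'
hbase> put 'test', 'row1', 'cf:a', 'value1'
hbase> scan 'test'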
This abbreviated example uses the HBase API to write data to an HBase table, using the automatic timestamp created
by the Region Server.
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Put put = new Put(Bytes.toBytes(row));
put.add(CF, ATTR, Bytes.toBytes(data));
htable.put(put);
This example uses the HBase API to write data to an HBase table, specifying the timestamp.
public static final byte[] CF = "cf".getBytes();
public static final byte[] ATTR = "attr".getBytes();
...
Put put = new Put(Bytes.toBytes(row));
long explicitTimeInMs = 555; // just an example
put.add(CF, ATTR, explicitTimeInMs, Bytes.toBytes(data));
htable.put(put);
Further Reading
• Refer to the HTableInterface and HColumnDescriptor API documentation for more details about configuring tables
and columns, as well as reading and writing to HBase.
• Refer to the Apache HBase Reference Guide for more in-depth information about HBase, including details about
versions and deletions not covered here.
• To move the data from one HBase cluster to another without downtime on either cluster, use replication.
• To migrate data between HBase versions that are not wire compatible, such as from CDH 4 to CDH 5, see Importing
HBase Data From CDH 4 to CDH 5 on page 113.
If the data currently exists outside HBase:
• If possible, write the data to HFile format, and use a BulkLoad to import it into HBase. The data is immediately
available to HBase and you can bypass the normal write path, increasing efficiency.
• If you prefer not to use bulk loads, and you are using a tool such as Pig, you can use it to import your data.
Most likely, at least one of these methods works in your situation. If not, you can use MapReduce directly. Test the
most feasible methods with a subset of your data to determine which one is optimal.
Using CopyTable
CopyTable uses HBase read and write paths to copy part or all of a table to a new table in either the same cluster or
a different cluster. CopyTable causes read load when reading from the source, and write load when writing to the
destination. Region splits occur on the destination table in real time as needed. To avoid these issues, use snapshot
and export commands instead of CopyTable. Alternatively, you can pre-split the destination table to avoid excessive
splits. The destination table can be partitioned differently from the source table. See this section of the Apache HBase
documentation for more information.
Edits to the source table after the CopyTable starts are not copied, so you may need to do an additional CopyTable
operation to copy new data into the destination table. Run CopyTable as follows, using --help to see details about
possible parameters.
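For example:
$ hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help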
The starttime/endtime and startrow/endrow pairs function in a similar way: if you leave out the first of the pair,
the first timestamp or row in the table is the starting point. Similarly, if you leave out the second of the pair, the
operation continues until the end of the table. To copy the table to a new table in the same cluster, you must specify
--new.name, unless you want to write the copy back to the same table, which would add a new version of each cell
(with the same data), or just overwrite the cell with the same value if the maximum number of versions is set to 1 (the
default in CDH 5). To copy the table to a new table in a different cluster, specify --peer.adr and optionally, specify
a new table name.
The following example creates a new table using HBase Shell in non-interactive mode, and then copies data in two
ColumnFamilies in rows starting with timestamp 1265875194289 and including the last row before the CopyTable
started, to the new table.
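A sketch of what those commands might look like (the table names and column families are placeholders; the timestamp comes from the description above):
$ echo "create 'NewTestTable', 'cf1', 'cf2', 'cf3'" | bin/hbase shell --non-interactive
$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --families=cf1,cf2 --new.name=NewTestTable TestTable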
Run the commands without arguments to view the usage instructions. The output below is an example, and may be
different for different HBase versions.
$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import
$ /usr/bin/hbase org.apache.hadoop.hbase.mapreduce.Export
-D mapreduce.output.fileoutputformat.compress.type=BLOCK
Additionally, the following SCAN properties can be specified
to control/limit what is exported..
-D hbase.mapreduce.scan.column.family=<familyName>
-D hbase.mapreduce.include.deleted.rows=true
-D hbase.mapreduce.scan.row.start=<ROWSTART>
-D hbase.mapreduce.scan.row.stop=<ROWSTOP>
For performance consider the following properties:
-Dhbase.client.scanner.caching=100
-Dmapreduce.map.speculative=false
-Dmapreduce.reduce.speculative=false
For tables with very wide rows consider setting the batch size as below:
-Dhbase.export.scanner.batch=10
2. On the CDH 4 cluster, export the contents of the table to sequence files in a given directory using a command like
the following.
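A sketch of such an export command (the table name and output directory are placeholders):
$ hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> /export_directory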
4. Create the table on the CDH 5 cluster using HBase Shell. Column families must be identical to the table on the
CDH 4 cluster.
Warning: Only use this procedure if the destination cluster is a brand new HBase cluster with empty
tables, and is not currently hosting any data. If this is not the case, or if you are unsure, contact Cloudera
Support before following this procedure.
1. Use the distcp command on the CDH 5 cluster to copy the HFiles from the CDH 4 cluster.
2. In the destination cluster, upgrade the HBase tables. In Cloudera Manager, go to Cluster > HBase and choose
Upgrade HBase from the Action menu. This checks that the HBase tables can be upgraded, and then upgrades
them.
3. Start HBase on the CDH 5 cluster. The upgraded tables are available. Verify the data and confirm that no errors
are logged.
Using Snapshots
As of CDH 4.7, Cloudera recommends snapshots instead of CopyTable where possible. A snapshot captures the state
of a table at the time the snapshot was taken. Because no data is copied when a snapshot is taken, the process is very
quick. As long as the snapshot exists, cells in the snapshot are never deleted from HBase, even if they are explicitly
deleted by the API. Instead, they are archived so that the snapshot can restore the table to its state at the time of the
snapshot.
After taking a snapshot, use the clone_snapshot command to copy the data to a new (immediately enabled) table
in the same cluster, or the Export utility to create a new table based on the snapshot, in the same cluster or a new
cluster. This is a copy-on-write operation. The new table shares HFiles with the original table until writes occur in the
new table but not the old table, or until a compaction or split occurs in either of the tables. This can improve performance
in the short term compared to CopyTable.
To export the snapshot to a new cluster, use the ExportSnapshot utility, which uses MapReduce to copy the snapshot
to the new cluster. Run the ExportSnapshot utility on the source cluster, as a user with HBase and HDFS write
permission on the destination cluster, and HDFS read permission on the source cluster. This creates the expected
amount of IO load on the destination cluster. Optionally, you can limit bandwidth consumption, which affects IO on
the destination cluster. After the ExportSnapshot operation completes, you can see the snapshot in the new cluster
using the list_snapshot command, and you can use the clone_snapshot command to create the table in the
new cluster from the snapshot.
For full instructions for the snapshot and clone_snapshot HBase Shell commands, run the HBase Shell and type
help snapshot. The following example takes a snapshot of a table, uses it to clone the table to a new table in the
same cluster, and then uses the ExportSnapshot utility to copy the table to a different cluster, with 16 mappers and
limited to 200 Mb/sec bandwidth.
$ bin/hbase shell
hbase(main):005:0> snapshot 'TestTable', 'TestTableSnapshot'
0 row(s) in 2.3290 seconds
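A continuation of that session might look roughly like the following; the destination cluster address and the new table name are placeholders:
hbase(main):006:0> clone_snapshot 'TestTableSnapshot', 'NewTestTable'
hbase(main):007:0> exit
$ bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot TestTableSnapshot -copy-to hdfs://cluster2-namenode.example.com:8020/hbase -mappers 16 -bandwidth 200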
The job output includes the URL from which you can track the ExportSnapshot job. When it finishes, a new set
of HFiles, comprising all of the HFiles that were part of the table when the snapshot was taken, is created at the HDFS
location you specified.
You can use the SnapshotInfo command-line utility included with HBase to verify or debug snapshots.
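For example, a command along these lines prints the files and a size summary for the snapshot taken above (shown as a sketch):
$ bin/hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot TestTableSnapshot -files -stats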
Using BulkLoad
HBase uses the well-known HFile format to store its data on disk. In many situations, writing HFiles programmatically
with your data, and bulk-loading that data into HBase on the RegionServer, has advantages over other data ingest
mechanisms. BulkLoad operations bypass the write path completely, providing the following benefits:
• The data is available to HBase immediately but does not cause additional load or latency on the cluster when it appears.
• BulkLoad operations do not use the write-ahead log (WAL) and do not cause flushes or split storms.
• BulkLoad operations do not cause excessive garbage collection.
Note: Because they bypass the WAL, BulkLoad operations are not propagated between clusters
using replication. If you need the data on all replicated clusters, you must perform the BulkLoad
on each cluster.
If you use BulkLoads with HBase, your workflow is similar to the following:
1. Extract your data from its existing source. For instance, if your data is in a MySQL database, you might run the
mysqldump command. The process you use depends on your data. If your data is already in TSV or CSV format,
skip this step and use the included ImportTsv utility to process your data into HFiles. See the ImportTsv
documentation for details.
2. Process your data into HFile format. See http://hbase.apache.org/book.html#_hfile_format_2 for details about
HFile format. Usually you use a MapReduce job for the conversion, and you often need to write the Mapper
yourself because your data is unique. The job must emit the row key as the Key, and either a KeyValue, a Put,
or a Delete as the Value. The Reducer is handled by HBase; configure it using
HFileOutputFormat.configureIncrementalLoad() and it does the following:
• Inspects the table to configure a total order partitioner
• Uploads the partitions file to the cluster and adds it to the DistributedCache
• Sets the number of reduce tasks to match the current number of regions
• Sets the output key/value class to match HFileOutputFormat requirements
• Sets the Reducer to perform the appropriate sorting (either KeyValueSortReducer or PutSortReducer)
3. One HFile is created per region in the output folder. Input data is almost completely re-written, so you need
available disk space at least twice the size of the original data set. For example, for a 100 GB output from
mysqldump, you should have at least 200 GB of available disk space in HDFS. You can delete the original input
file at the end of the process.
4. Load the files into HBase. Use the LoadIncrementalHFiles command (more commonly known as the
completebulkload tool), passing it a URL that locates the files in HDFS. Each file is loaded into the relevant region
on the RegionServer for the region. You can limit the number of versions that are loaded by passing the
--versions=N option, where N is the maximum number of versions to include, from newest to oldest
(largest timestamp to smallest timestamp).
If a region was split after the files were created, the tool automatically splits the HFile according to the new
boundaries. This process is inefficient, so if your table is being written to by other processes, you should load as
soon as the transform step is done.
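For example, a command of roughly this form loads the generated HFiles; the HDFS path and table name are placeholders:
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs:///user/hbase/output_hfiles mytable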
• You also need to configure the HMaster to set the permissions of the HBase root directory correctly. If you use
Cloudera Manager, edit the Master Advanced Configuration Snippet (Safety Valve) for hbase-site.xml. Otherwise,
edit hbase-site.xml on the HMaster. Add the following:
<property>
<name>hbase.rootdir.perms</name>
<value>711</value>
</property>
If you skip this step, a previously-working BulkLoad setup will start to fail with permission errors when you restart
the HMaster.
<property>
<name>hbase.bulkload.staging.dir</name>
<value>/tmp/hbase-staging</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>
Cluster replication uses an active-push methodology. An HBase cluster can be a source (also called active, meaning
that it writes new data), a destination (also called passive, meaning that it receives data using replication), or can fulfill
both roles at once. Replication is asynchronous, and the goal of replication is eventual consistency.
When data is replicated from one cluster to another, the original source of the data is tracked with a cluster ID, which
is part of the metadata. In CDH 5, all clusters that have already consumed the data are also tracked. This prevents
replication loops.
Common Replication Topologies
• A central source cluster might propagate changes to multiple destination clusters, for failover or due to geographic
distribution.
• A source cluster might push changes to a destination cluster, which might also push its own changes back to the
original cluster.
• Many different low-latency clusters might push changes to one centralized cluster for backup or resource-intensive
data-analytics jobs. The processed data might then be replicated back to the low-latency clusters.
• Multiple levels of replication can be chained together to suit your needs. The following diagram shows a hypothetical
scenario. Use the arrows to follow the data paths.
At the top of the diagram, the San Jose and Tokyo clusters, shown in red, replicate changes to each other, and each
also replicates changes to a User Data and a Payment Data cluster.
Each cluster in the second row, shown in blue, replicates its changes to the All Data Backup 1 cluster, shown in
grey. The All Data Backup 1 cluster replicates changes to the All Data Backup 2 cluster (also shown in grey),
as well as the Data Analysis cluster (shown in green). All Data Backup 2 also propagates any of its own changes
back to All Data Backup 1.
The Data Analysis cluster runs MapReduce jobs on its data, and then pushes the processed data back to the San
Jose and Tokyo clusters.
Important: To run replication-related HBase commands, your user must have HBase administrator
permissions. If ZooKeeper uses Kerberos, configure HBase Shell to authenticate to ZooKeeper using
Kerberos before attempting to run replication-related commands. There are currently no
replication-related ACLs.
$ kinit -k -t /etc/hbase/conf/hbase.keytab
hbase/[email protected]
5. On the source cluster, in HBase Shell, add the destination cluster as a peer, using the add_peer command. The
syntax is as follows:
add_peer 'ID', 'CLUSTER_KEY'
The ID must be a short integer. To compose the CLUSTER_KEY, use the following template:
hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
If both clusters use the same ZooKeeper cluster, you must use a different zookeeper.znode.parent, because they
cannot write in the same folder.
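For example, the following sketch adds a peer with ID 1; the ZooKeeper hostnames are placeholders:
hbase> add_peer '1', 'cluster2-zk1.example.com,cluster2-zk2.example.com,cluster2-zk3.example.com:2181:/hbase'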
6. On the source cluster, configure each column family to be replicated by setting its REPLICATION_SCOPE to 1,
using commands such as the following in HBase Shell.
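For example, the following sketch enables replication for one column family of a hypothetical table named example_table:
hbase> disable 'example_table'
hbase> alter 'example_table', {NAME => 'example_family', REPLICATION_SCOPE => '1'}
hbase> enable 'example_table'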
7. Verify that replication is occurring by examining the logs on the source cluster for messages such as the following.
8. To verify the validity of replicated data, use the included VerifyReplication MapReduce job on the source
cluster, providing it with the ID of the replication peer and table name to verify. Other options are available, such
as a time range or specific families to verify.
The command has the following form:
hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication
[--starttime=timestamp1] [--stoptime=timestamp] [--families=comma separated list of
families] <peerId> <tablename>
The VerifyReplication command prints GOODROWS and BADROWS counters to indicate rows that did and did
not replicate correctly.
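For example, the following sketch verifies the d column family of the hypothetical example_table against peer 1:
$ hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --families=d 1 example_table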
Note:
Some changes are not replicated and must be propagated by other means, such as Snapshots or
CopyTable. See Initiating Replication When Data Already Exists on page 412 for more details.
• Data that existed in the master before replication was enabled.
• Operations that bypass the WAL, such as when using BulkLoad or API calls such as
writeToWal(false).
2. Using the hadoop fs command, put the data into HDFS. This example places the file into an /imported_data/
directory.
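For example, assuming the TSV file from step 1 is named zones_frequency.tsv (a hypothetical name):
$ hadoop fs -mkdir /imported_data/
$ hadoop fs -put zones_frequency.tsv /imported_data/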
3. Create and register a new HBase table in HCatalog, using the hcat command, passing it a DDL file to represent
your table. You could also register an existing HBase table, using the same command. The DDL file format is
specified as part of the Hive REST API. The following example illustrates the basic mechanism.
CREATE TABLE
zones_frequency_table (id STRING, ngram STRING, year STRING, freq STRING, sources STRING)
STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
TBLPROPERTIES (
'hbase.table.name' = 'zones_frequency_table',
'hbase.columns.mapping' = 'd:ngram,d:year,d:freq,d:sources',
'hcat.hbase.output.bulkMode' = 'true'
);
$ hcat -f zones_frequency_table.ddl
4. Create a Pig file to process the TSV file created in step 1, using the DDL file created in step 3. Modify the file names
and other parameters in this command to match your values if you use data different from this working example.
USING PigStorage('\t') indicates that the input file is tab-delimited. For more details about Pig syntax, see
the Pig Latin reference documentation.
...
HTable table = null;
try {
table = myCode.createTable(tableName, fam);
int i = 1;
List<Put> puts = new ArrayList<Put>();
for (String labelExp : labelExps) {
Put put = new Put(Bytes.toBytes("row" + i));
put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
puts.add(put);
i++;
}
table.put(puts);
} finally {
if (table != null) {
table.flushCommits();
}
}
...
API, which allows you to write HBase applications in Python, C, C++, or another language that Thrift supports. The Thrift
Proxy API is slower than the Java API and may have fewer features. To use the Thrift Proxy API, you need to configure
and run the HBase Thrift server on your cluster. See Installing and Starting the HBase Thrift Server. You also need to
install the Apache Thrift compiler on your development system.
After the Thrift server is configured and running, generate Thrift bindings for the language of your choice, using an IDL
file. An HBase IDL file named Hbase.thrift is included as part of HBase. After generating the bindings, copy the Thrift
libraries for your language into the same directory as the generated bindings. In the following Python example, these
libraries provide the thrift.transport and thrift.protocol libraries. These commands show how you might
generate the Thrift bindings for Python and copy the libraries on a Linux system.
$ mkdir HBaseThrift
$ cd HBaseThrift/
$ thrift -gen py /path/to/Hbase.thrift
$ mv gen-py/* .
$ rm -rf gen-py/
$ mkdir thrift
$ cp -rp ~/Downloads/thrift-0.9.0/lib/py/src/* ./thrift/
The following example shows a simple Python application using the Thrift Proxy API.
mutationsbatch.append(Hbase.BatchMutation(row=rowkey,mutations=mutations))
transport.close()
The Thrift Proxy API does not support writing to HBase clusters that are secured using Kerberos.
This example was modified from the following two blog posts on http://www.cloudera.com. See them for more details.
• Using the HBase Thrift Interface, Part 1
• Using the HBase Thrift Interface, Part 2
The REST Proxy API does not support writing to HBase clusters that are secured using Kerberos.
For full documentation and more examples, see the REST Proxy API documentation.
Using Flume
Apache Flume is a fault-tolerant system designed for ingesting data into HDFS, for use with Hadoop. You can configure
Flume to write data directly into HBase. Flume includes two different sinks designed to work with HBase: HBaseSink
(org.apache.flume.sink.hbase.HBaseSink) and AsyncHBaseSink (org.apache.flume.sink.hbase.AsyncHBaseSink). HBaseSink
supports HBase IPC calls introduced in HBase 0.96, and allows you to write data to an HBase cluster that is secured by
Kerberos, whereas AsyncHBaseSink does not. However, AsyncHBaseSink uses an asynchronous model and guarantees
atomicity at the row level.
You configure HBaseSink and AsyncHBaseSink nearly identically. Following is an example configuration for each. Bold
lines highlight differences in the configurations. For full documentation about configuring HBaseSink and AsyncHBaseSink,
see the Flume documentation. The table, columnFamily, and column parameters correlate to the HBase table,
column family, and column where the data is to be imported. The serializer is the class that converts the data at the
source into something HBase can use. Configure your sinks in the Flume configuration file.
In practice, you usually need to write your own serializer, which implements either AsyncHBaseEventSerializer or
HBaseEventSerializer. The HBaseEventSerializer converts Flume Events into one or more HBase Puts, sends them to
the HBase cluster, and is closed when the HBaseSink stops. AsyncHBaseEventSerializer starts and listens for Events.
When it receives an Event, it calls the setEvent method and then calls the getActions and getIncrements methods.
When the AsyncHBaseSink is stopped, the serializer cleanUp method is called. These methods return PutRequest and
AtomicIncrementRequest, which are part of the asynchbase API.
AsyncHBaseSink:
HBaseSink:
host1.sinks.sink1.serializer.incrementColumn = icol
host1.channels.ch1.type=memory
The following serializer, taken from an Apache Flume blog post by Dan Sandler, splits the event body based on a
delimiter and inserts each split into a different column. The row is defined in the event header. When each event is
received, a counter is incremented to track the number of events received.
/**
* A serializer for the AsyncHBaseSink, which splits the event body into
* multiple columns and inserts them into a row whose key is available in
* the headers
*/
public class SplittingSerializer implements AsyncHbaseEventSerializer {
  private byte[] table;
  private byte[] colFam;
  private Event currentEvent;
  private byte[][] columnNames;
  private final List<PutRequest> puts = new ArrayList<PutRequest>();
  private final List<AtomicIncrementRequest> incs =
      new ArrayList<AtomicIncrementRequest>();
  private byte[] currentRowKey;
  private final byte[] eventCountCol = "eventCount".getBytes();

  @Override
  public void initialize(byte[] table, byte[] cf) {
    this.table = table;
    this.colFam = cf;
  }

  @Override
  public void setEvent(Event event) {
    // Set the event and verify that the rowKey is present
    this.currentEvent = event;
    String rowKeyStr = currentEvent.getHeaders().get("rowKey");
    if (rowKeyStr == null) {
      throw new FlumeException("No row key found in headers!");
    }
    currentRowKey = rowKeyStr.getBytes();
  }

  @Override
  public List<PutRequest> getActions() {
    // Split the event body and get the values for the columns
    String eventStr = new String(currentEvent.getBody());
    String[] cols = eventStr.split(",");
    puts.clear();
    for (int i = 0; i < cols.length; i++) {
      // Generate a PutRequest for each column.
      PutRequest req = new PutRequest(table, currentRowKey, colFam,
          columnNames[i], cols[i].getBytes());
      puts.add(req);
    }
    return puts;
  }

  @Override
  public List<AtomicIncrementRequest> getIncrements() {
    incs.clear();
    // Increment the number of events received
    incs.add(new AtomicIncrementRequest(table, "totalEvents".getBytes(), colFam,
        eventCountCol));
    return incs;
  }

  @Override
  public void cleanUp() {
    table = null;
    colFam = null;
    currentEvent = null;
    columnNames = null;
    currentRowKey = null;
  }

  @Override
  public void configure(Context context) {
    // Get the column names from the configuration
    String cols = context.getString("columns");
    String[] names = cols.split(",");
    // Assign to the columnNames field (not a new local variable) so that
    // getActions() sees the configured column names.
    columnNames = new byte[names.length][];
    int i = 0;
    for (String name : names) {
      columnNames[i++] = name.getBytes();
    }
  }

  @Override
  public void configure(ComponentConfiguration conf) {
    // No-op; required by the ConfigurableComponent interface.
  }
}
Using Spark
You can write data to HBase from Apache Spark by using def saveAsHadoopDataset(conf: JobConf): Unit.
This example is adapted from a post on the spark-users mailing list.
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.client
// ... some other settings
Next, provide the mapping between how the data looks in Spark and how it should look in HBase. The following example
assumes that your HBase table has two column families, col_1 and col_2, and that your data is formatted in sets of
three in Spark, like (row_key, col_1, col_2).
new PairRDDFunctions(localData.map(convert)).saveAsHadoopDataset(jobConfig)
package org.apache.spark.streaming.examples
import java.util.Properties
import kafka.producer._
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.{ PairRDDFunctions, RDD }
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.kafka._
object MetricAggregatorHBase {
def main(args : Array[String]) {
if (args.length < 6) {
System.err.println("Usage: MetricAggregatorTest <master> <zkQuorum> <group> <topics>
<destHBaseTableName> <numThreads>")
System.exit(1)
}
ssc.start
ssc.awaitTermination
}
record.add(Bytes.toBytes("metric"), Bytes.toBytes("col"),
Bytes.toBytes(value.toString))
producer.send(messages : _*)
Thread.sleep(100)
}
}
}
Note: In the current implementation of MultiWAL, incoming edits are partitioned by Region. Therefore,
throughput to a single Region is not increased.
To configure MultiWAL for a RegionServer, set the value of the property hbase.wal.provider to multiwal and
restart the RegionServer. To disable MultiWAL for a RegionServer, unset the property and restart the RegionServer.
RegionServers using the original WAL implementation and those using the MultiWAL implementation can each handle
recovery of either set of WALs, so a zero-downtime configuration update is possible through a rolling restart.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. Edit hbase-site.xml on each RegionServer where you want to enable MultiWAL. Add the following property
by pasting the XML.
<property>
<name>hbase.wal.provider</name>
<value>multiwal</value>
</property>
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
<property>
<name>hfile.format.version</name>
<value>3</value>
</property>
Restart HBase. Changes will take effect for a given region during its next major compaction.
Configuring Columns to Store MOBs
Use the following options to configure a column to store MOBs:
• IS_MOB is a Boolean option, which specifies whether or not the column can store MOBs.
• MOB_THRESHOLD configures the number of bytes at which an object is considered to be a MOB. If you do not
specify a value for MOB_THRESHOLD, the default is 100 KB. If you write a value larger than this threshold, it is
treated as a MOB.
You can configure a column to store MOBs using the HBase Shell or the Java API.
Using HBase Shell:
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400}
hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD =>
102400}
<property>
<name>hbase.mob.cache.evict.period</name>
<value>5000</value>
</property>
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
Because there can be a large number of MOB files at any time, as compared to the number of HFiles, MOB files are
not always kept open. The MOB file reader cache is an LRU cache which keeps the most recently used MOB files open.
To customize the configuration of the MOB file reader's cache on each RegionServer, configure the MOB cache properties
in the RegionServer's hbase-site.xml. Customize the configuration to suit your environment, and restart or rolling
restart the RegionServer. Cloudera recommends testing your configuration with the default settings first. The following
example sets the hbase.mob.cache.evict.period property to 5000 seconds. See Table 3: HBase MOB Cache
Properties on page 130 for a full list of configurable properties for HBase MOB.
<property>
<name>hbase.mob.cache.evict.period</name>
<value>5000</value>
</property>
• threshold is the threshold at which cells are considered to be MOBs. The default is 1 kB, expressed in bytes.
• minMobDataSize is the minimum value for the size of MOB data. The default is 512 B, expressed in bytes.
• maxMobDataSize is the maximum value for the size of MOB data. The default is 5 kB, expressed in bytes.
Compacting MOB Files Manually
You can trigger compaction of MOB files manually, rather than waiting for it to be triggered by your
configuration, using the HBase Shell commands compact_mob and major_compact_mob. Each of these commands
requires the first parameter to be the table name, and takes an optional column family name as the second argument.
If the column family is provided, only that column family's files are compacted. Otherwise, all MOB-enabled column
families' files are compacted.
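For example, the following sketch compacts the MOB-enabled table and column family created earlier:
hbase> compact_mob 't1', 'f1'
hbase> major_compact_mob 't1'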
This functionality is also available using the API, using the Admin.compact and Admin.majorCompact methods.
hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
hbase.sink.ganglia.servers=<Ganglia server>:<port>
hbase.sink.ganglia.period=10
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
7. Click Save Changes to commit the changes.
8. Restart the role.
9. Restart the service.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
hbase.sink.ganglia.servers=<Ganglia server>:<port>
hbase.sink.ganglia.period=10
Managing HDFS
This section contains configuration tasks for the HDFS service. For information on configuring HDFS for high availability,
see HDFS High Availability on page 291.
Important: Configuring a new nameservice shuts down the services that depend upon HDFS. Once
the new nameservice has been started, the services that depend upon HDFS must be restarted, and
the client configurations must be redeployed. (This can be done as part of the Add Nameservice
workflow, as an option.)
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
7. Click Save Changes to commit the changes.
8. Click the Instances tab. The Federation and High Availability section displays with the nameservice listed.
Editing the List of Mountpoints for a Nameservice Using Cloudera Manager
1. Go to the HDFS service.
2. Click the Instances tab. The Federation and High Availability section displays with the nameservices listed.
3. Select Actions > Edit. In the Mount Points field, change the mount point to a list of mount points in the namespace
that the nameservice will manage.
4. Click OK.
Adding a Nameservice Using Cloudera Manager
The instructions below for adding a nameservice assume that one nameservice is already set up. The first nameservice
can be set up either by configuring the first nameservice or by enabling HDFS high availability.
1. Go to the HDFS service.
2. Click the Instances tab. At the top of this page you should see the Federation and High Availability section. If this
section does not appear, it means you do not have any nameservices configured. You must have one nameservice
already configured in order to add a nameservice.
3. Click the Add Nameservice button.
a. In the Nameservice Name field, enter a name for the new nameservice. The name must be unique and can
contain only alphanumeric characters.
b. In the Mount Points field, enter at least one mount point for the nameservice. This defines the portion of
HDFS that will be managed under the new nameservice. (Click the + icon to the right of the field to add a new
mount point). You cannot use "/" as a mount point; you must specify HDFS directories by name.
• The mount points must be unique for this nameservice; you cannot specify any of the same mount points
you have used for other nameservices.
• You can specify mount points that do not yet exist, and create the corresponding directories in a later
step in this procedure.
• If you want to use a mount point previously associated with another nameservice you must first remove
that mount point from that service. You can do this using the Edit command from the Actions menu for
that nameservice, and later add the mount point to the new nameservice.
• After you have brought up the new nameservice, you must create the directories that correspond with
the mount points you specified in the new namespace.
• If a mount point corresponds to a directory that formerly was under a different nameservice, you must
also move any contents of that directory, if appropriate as described in step 8.
• If an HBase service is set to depend on the federated HDFS service, edit the mount points of the existing
nameservice to reference:
– HBase root directory (default /hbase)
– MapReduce system directory (default /tmp/mapred/system)
– MapReduce JobTracker staging root directory (default value /user).
c. If you want to configure high availability for the nameservice, leave the Highly Available checkbox checked.
d. Click Continue.
4. Select the hosts on which the new NameNode and Secondary NameNodes will be created. (These must be hosts
that are not already running other NameNode or SecondaryNameNode instances, and their /dfs/nn and /dfs/snn
directories should be empty if they exist.) Click Continue.
5. Enter or confirm the directory property values (these will differ depending on whether you are enabling high
availability for this nameservice, or not).
6. Select the Start Dependent Services checkbox if you need to create directories or move data onto the new
nameservice. Leave this checked if you want the workflow to restart services and redeploy the client configurations
as the last steps in the workflow.
7. Click Continue. If the process finished successfully, click Finish. The new nameservice displays in the Federation
and High Availability section in the Instances tab of the HDFS service.
8. Create the directories you want under the new nameservice using the CLI:
a. To create a directory in the new namespace, use the command hadoop fs -mkdir
/nameservices/nameservice/directory where nameservice is the new nameservice you just created
and directory is the directory that corresponds to a mount point you specified.
b. To move data from one nameservice to another, use distcp or manual export/import. dfs -cp and dfs
-mv will not work.
c. Verify that the directories and data are where you expect them to be.
9. Restart the dependent services.
Note: The monitoring configurations at the HDFS level apply to all nameservices. If you have two
nameservices, it is not possible to disable a check on one but not the other. Likewise, it's not possible
to have different event thresholds for the two nameservices.
Also see Changing a Nameservice Name for Highly Available HDFS Using Cloudera Manager on page 313.
Nameservice and Quorum-based Storage
With Quorum-based Storage, JournalNodes are shared across nameservices. So, if JournalNodes are present in an
HDFS service, all nameservices will have Quorum-based Storage enabled. To override this:
• The dfs.namenode.shared.edits.dir configuration of the two NameNodes of a high availability nameservice
should be configured to include the value of the dfs.namenode.name.dirs setting, or
• The dfs.namenode.edits.dir configuration of the one NameNode of a non-high availability nameservice
should be configured to include the value of the dfs.namenode.name.dirs setting.
NameNodes
NameNodes maintain the namespace tree for HDFS and a mapping of file blocks to DataNodes where the data is stored.
A simple HDFS cluster can have only one primary NameNode, supported by a secondary NameNode that periodically
compresses the NameNode edits log file that contains a list of HDFS metadata modifications. This reduces the amount
of disk space consumed by the log file on the NameNode, which also reduces the restart time for the primary NameNode.
A high availability cluster contains two NameNodes: active and standby.
Formatting the NameNode and Creating the /tmp Directory
Formatting the NameNode and Creating the /tmp Directory Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
When you add an HDFS service, the wizard automatically formats the NameNode and creates the /tmp directory on
HDFS. If you quit the wizard or it does not finish, you can format the NameNode and create the /tmp directory outside
the wizard by doing these steps:
1. Stop the HDFS service if it is running. See Starting, Stopping, and Restarting Services on page 39.
2. Click the Instances tab.
3. Click the NameNode role instance.
4. Select Actions > Format.
5. Start the HDFS service.
6. Select Actions > Create /tmp Directory.
Formatting the NameNode and Creating the /tmp Directory Using the Command Line
See Formatting the NameNode.
# cd /data/dfs/nn
# tar -cvf /root/nn_backup_data.tar .
./
./current/
./current/fsimage
./current/fstime
./current/VERSION
./current/edits
./image/
./image/fsimage
If there is a file with the extension lock in the NameNode data directory, the NameNode most likely is still running.
Repeat the steps, starting by shutting down the NameNode role.
Restoring HDFS Metadata From a Backup Using Cloudera Manager
The following process assumes a scenario where both NameNode hosts have failed and you must restore from a
backup.
1. Remove the NameNode, JournalNode, and Failover Controller roles from the HDFS service.
2. Add the host on which the NameNode role will run.
3. Create the NameNode data directory, ensuring that the permissions, ownership, and group are set correctly.
4. Copy the backed up files to the NameNode data directory.
5. Add the NameNode role to the host.
6. Add the Secondary NameNode role to another host.
7. Enable high availability. If not all roles are started after the wizard completes, restart the HDFS service. Upon
startup, the NameNode reads the fsimage file and loads it into memory. If the JournalNodes are up and running
and there are edit files present, any edits newer than the fsimage are applied.
Moving NameNode Roles
This section describes two procedures for moving NameNode roles. Both procedures require cluster downtime. If
high availability is enabled for the NameNode, you can use a Cloudera Manager wizard to automate the migration
process. Otherwise you must manually delete and add the NameNode role to a new host.
After moving a NameNode, if you have a Hive or Impala service, perform the steps in NameNode Post-Migration Steps
on page 138.
Moving Highly Available NameNode, Failover Controller, and JournalNode Roles Using the Migrate Roles Wizard
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
The Migrate Roles wizard allows you to move roles of a highly available HDFS service from one host to another. You
can use it to move NameNode, JournalNode, and Failover Controller roles.
• IP addresses
• Rack name
Select the checkboxes next to the desired host. The list of available roles to migrate displays. Deselect any roles
you do not want to migrate. When migrating a NameNode, the co-located Failover Controller must be migrated
as well.
6. Click the Destination Host text field and specify the host to which the roles will be migrated. On destination hosts,
indicate whether to delete data in the NameNode data directories and JournalNode edits directory. If you choose
not to delete data and such role data exists, the Migrate Roles command will not complete successfully.
7. Acknowledge that the migration process incurs service unavailability by selecting the Yes, I am ready to restart
the cluster now checkbox.
8. Click Continue. The Command Progress screen displays listing each step in the migration process.
9. When the migration completes, click Finish.
Moving a NameNode to a Different Host Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
1. If the host to which you want to move the NameNode is not in the cluster, follow the instructions in Adding a Host
to the Cluster on page 54 to add the host.
2. Stop all cluster services.
3. Make a backup of the dfs.name.dir directories on the existing NameNode host. Make sure you back up the
fsimage and edits files. They should be the same across all of the directories specified by the dfs.name.dir
property.
4. Copy the files you backed up from dfs.name.dir directories on the old NameNode host to the host where you
want to run the NameNode.
5. Go to the HDFS service.
6. Click the Instances tab.
7. Select the checkbox next to the NameNode role instance and then click the Delete button. Click Delete again to
confirm.
8. In the Review configuration changes page that appears, click Skip.
9. Click Add Role Instances to add a NameNode role instance.
10. Select the host where you want to run the NameNode and then click Continue.
11. Specify the location of the dfs.name.dir directories where you copied the data on the new host, and then click
Accept Changes.
12. Start cluster services. After the HDFS service has started, Cloudera Manager distributes the new configuration
files to the DataNodes, which will be configured with the IP address of the new NameNode host.
NameNode Post-Migration Steps
After moving a NameNode, if you have a Hive or Impala service, perform the following steps:
1. Go to the Hive service.
2. Stop the Hive service.
3. Select Actions > Update Hive Metastore NameNodes.
4. If you have an Impala service, restart the Impala service or run an INVALIDATE METADATA query.
DataNodes
DataNodes store data in a Hadoop cluster; DataNode is also the name of the daemon that manages the data. File data is replicated
on multiple DataNodes for reliability and so that localized computation can be executed near the data.
How NameNode Manages Blocks on a Failed DataNode
After a period without any heartbeats (which by default is 10.5 minutes), a DataNode is assumed to be failed. The
following describes how the NameNode manages block replication in such cases.
1. NameNode determines which blocks were on the failed DataNode.
2. NameNode locates other DataNodes with copies of these blocks.
3. The DataNodes with block copies are instructed to copy those blocks to other DataNodes to maintain the configured
replication factor.
4. Follow the procedure in Replacing a Disk on a DataNode Host on page 139 or Performing Disk Hot Swap for
DataNodes on page 140 to bring a repaired DataNode back online.
Replacing a Disk on a DataNode Host
Minimum Required Role: Operator (also provided by Configurator, Cluster Administrator, Full Administrator)
For CDH 5.3 and higher, see Performing Disk Hot Swap for DataNodes on page 140.
If one of your DataNode hosts experiences a disk failure, follow this process to replace the disk:
1. Stop managed services.
2. Decommission the DataNode role instance.
3. Replace the failed disk.
4. Recommission the DataNode role instance.
5. Run the HDFS fsck utility to validate the health of HDFS. The utility normally reports over-replicated blocks
immediately after a DataNode is reintroduced to the cluster, which is automatically corrected over time.
6. Start managed services.
Adding and Removing Storage Directories for DataNodes
Adding and Removing Storage Directories Using Cloudera Manager
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
6. Click Save Changes to commit the changes.
7. Restart the role.
Configuring Storage-Balancing for DataNodes Using the Command Line
See Configuring Storage-Balancing for DataNodes.
Performing Disk Hot Swap for DataNodes
This section describes how to replace HDFS disks without shutting down a DataNode. This is referred to as hot swap.
Warning: Change the value of this property only for the specific DataNode instance where
you are planning to hot swap the disk. Do not edit the role group value for this property.
Doing so will cause data loss.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
Use these instructions to perform hot swap of disks in a cluster that is not managed by Cloudera Manager.
To add and remove disks:
1. If you are adding disks, format and mount them.
2. Change the value of dfs.datanode.data.dir in hdfs-site.xml on the DataNode to reflect the directories
that will be used from now on (add new mount points and remove obsolete ones). For more information, see the
instructions for DataNodes under Configuring Local Storage Directories.
3. Start the reconfiguration process:
• If Kerberos is enabled:
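The following commands are a sketch; the keytab path, principal, and HOST:PORT are placeholders:
$ kinit -kt /path/to/hdfs.keytab hdfs/<fully.qualified.domain.name>@<YOUR-REALM>
$ hdfs dfsadmin -reconfig datanode HOST:PORT start
• If Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT start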
where HOST:PORT is the DataNode's dfs.datanode.ipc.address (or its hostname and the port specified in
dfs.datanode.ipc.address; for example dnhost1.example.com:5678)
To check on the progress of the reconfiguration, you can use the status option of the command; for example,
if Kerberos is not enabled:
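A sketch of the status command:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT status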
4. Once the reconfiguration is complete, unmount any disks you have removed from the configuration.
5. Run the HDFS fsck utility to validate the health of HDFS.
To perform maintenance on a disk:
1. Change the value of dfs.datanode.data.dir in hdfs-site.xml on the DataNode to exclude the mount point
directories that reside on the affected disk and reflect only the directories that will be used during the maintenance
window. For more information, see the instructions for DataNodes under Configuring Local Storage Directories.
2. Start the reconfiguration process:
• If Kerberos is enabled:
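As above, these commands are a sketch with placeholder values:
$ kinit -kt /path/to/hdfs.keytab hdfs/<fully.qualified.domain.name>@<YOUR-REALM>
$ hdfs dfsadmin -reconfig datanode HOST:PORT start
• If Kerberos is not enabled:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT start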
where HOST:PORT is the DataNode's dfs.datanode.ipc.address, or its hostname and the port specified in
dfs.datanode.ipc.address.
To check on the progress of the reconfiguration, you can use the status option of the command; for example,
if Kerberos is not enabled:
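A sketch of the status command:
$ sudo -u hdfs hdfs dfsadmin -reconfig datanode HOST:PORT status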
JournalNodes
High-availability clusters use JournalNodes to synchronize active and standby NameNodes. The active NameNode writes
to each JournalNode with changes, or "edits," to HDFS namespace metadata. During failover, the standby NameNode
applies all edits from the JournalNodes before promoting itself to the active state.
Moving the JournalNode Edits Directory
Moving the JournalNode Edits Directory for a Role Instance Using Cloudera Manager
To change the location of the edits directory for one JournalNode instance:
1. Reconfigure the JournalNode Edits Directory.
a. Go to the HDFS service in Cloudera Manager.
b. Click JournalNode under Status Summary.
c. Click the JournalNode link for the instance you are changing.
d. Click the Configuration tab.
e. Set dfs.journalnode.edits.dir to the path of the new jn directory.
cp -a /<old_path_to_jn_dir>/jn /<new_path_to_jn_dir>/jn
mv /<old_path_to_jn_dir>/jn /<old_path_to_jn_dir>/jn_to_delete
cp -a /<old_path_to_jn_dir>/jn /<new_path_to_jn_dir>/jn
mv /<old_path_to_jn_dir>/jn /<old_path_to_jn_dir>/jn_to_delete
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
Configure the following properties in hdfs-site.xml to enable short-circuit reads in a cluster that is not managed
by Cloudera Manager:
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.streams.cache.size</name>
<value>1000</value>
</property>
<property>
<name>dfs.client.read.shortcircuit.streams.cache.expiry.ms</name>
<value>10000</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
Note: The text _PORT appears just as shown; you do not need to substitute a number.
Important:
• The trash feature is disabled by default. Cloudera recommends that you enable it on all production
clusters.
• The trash feature works by default only for files and directories deleted using the Hadoop shell.
Files or directories deleted programmatically using other interfaces (WebHDFS or the Java APIs,
for example) are not moved to trash, even if trash is enabled, unless the program has implemented
a call to the trash functionality. (Hue, for example, implements trash as of CDH 4.4.)
Users can bypass trash when deleting files using the shell by specifying the -skipTrash option
to the hadoop fs -rm -r command. This can be useful when it is necessary to delete files that
are too large for the user's quota.
Note: The trash interval is measured from the point at which the files are moved to trash, not
from the last time the files were modified.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
5. Click Save Changes to commit the changes.
6. Restart all NameNodes.
Configuring HDFS Trash Using the Command Line
See Enabling Trash.
HDFS Balancers
HDFS data might not always be distributed uniformly across DataNodes. One common reason is addition of new
DataNodes to an existing cluster. HDFS provides a balancer utility that analyzes block placement and balances data
across the DataNodes. The balancer moves blocks until the cluster is deemed to be balanced, which means that the
utilization of every DataNode (ratio of used space on the node to total capacity of the node) differs from the utilization
of the cluster (ratio of used space on the cluster to total capacity of the cluster) by no more than a given threshold
percentage. The balancer does not balance between individual volumes on a single DataNode.
Configuring and Running the HDFS Balancer Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
In Cloudera Manager, the HDFS balancer utility is implemented by the Balancer role. The Balancer role usually shows
a health of None on the HDFS Instances tab because it does not run continuously.
The Balancer role is normally added (by default) when the HDFS service is installed. If it has not been added, you must
add a Balancer role in order to rebalance HDFS and to see the Rebalance action.
Configuring and Running the HDFS Balancer Using the Command Line
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
The HDFS balancer re-balances data across the DataNodes, moving blocks from overutilized to underutilized nodes.
As the system administrator, you can run the balancer from the command-line as necessary -- for example, after adding
new DataNodes to the cluster.
Points to note:
• The balancer requires the capabilities of an HDFS superuser (for example, the hdfs user) to run.
• The balancer does not balance between individual volumes on a single DataNode.
• You can run the balancer without parameters, as follows:
Note: If Kerberos is enabled, do not use commands in the form sudo -u <user> hadoop
<command>; they will fail with a security error. Instead, use the following commands: $ kinit
<user> (if you are using a password) or $ kinit -kt <keytab> <principal> (if you are
using a keytab) and then, for each command executed by this user, $ <command>
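A command along the following lines can be used (a sketch, assuming the hdfs user is the HDFS superuser):
$ sudo -u hdfs hdfs balancer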
This runs the balancer with a default threshold of 10%, meaning that the script will ensure that disk usage on each
DataNode differs from the overall usage in the cluster by no more than 10%. For example, if overall usage across
all the DataNodes in the cluster is 40% of the cluster's total disk-storage capacity, the script ensures that each
DataNode's disk usage is between 30% and 50% of that DataNode's disk-storage capacity.
• You can run the script with a different threshold; for example:
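For instance, a sketch with a 5% threshold:
$ sudo -u hdfs hdfs balancer -threshold 5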
This specifies that each DataNode's disk usage must be (or will be adjusted to be) within 5% of the cluster's overall
usage.
• You can adjust the network bandwidth used by the balancer, by running the dfsadmin -setBalancerBandwidth
command before you run the balancer; for example:
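A sketch of such a command, with newbandwidth as a placeholder:
$ sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth newbandwidth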
where newbandwidth is the maximum amount of network bandwidth, in bytes per second, that each DataNode
can use during the balancing operation. For more information about the bandwidth command, see
BalancerBandwidthCommand.
• The balancer can take a long time to run, especially if you are running it for the first time, or do not run it regularly.
Enabling WebHDFS
Enabling WebHDFS Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To enable WebHDFS, proceed as follows:
1. Select the HDFS service.
2. Click the Configuration tab.
3. Select Scope > HDFS-1 (Service Wide)
4. Select the Enable WebHDFS property.
5. Click the Save Changes button.
Adding HttpFS
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Apache Hadoop HttpFS is a service that provides HTTP access to HDFS.
HttpFS has a REST HTTP API supporting all HDFS filesystem operations (both read and write).
Common HttpFS use cases are:
• Read and write data in HDFS using HTTP utilities (such as curl or wget) and HTTP libraries from languages other
than Java (such as Perl).
• Transfer data between HDFS clusters running different versions of Hadoop (overcoming RPC versioning issues),
for example using Hadoop DistCp.
• Read and write data in HDFS in a cluster behind a firewall. (The HttpFS server acts as a gateway and is the only
system that is allowed to send and receive data through the firewall).
HttpFS supports Hadoop pseudo-authentication, HTTP SPNEGO Kerberos, and additional authentication mechanisms
using a plugin API. HttpFS also supports Hadoop proxy user functionality.
The webhdfs client file system implementation can access HttpFS using the Hadoop filesystem command (hadoop
fs), by using Hadoop DistCp, and from Java applications using the Hadoop file system Java API.
The HttpFS HTTP REST API is interoperable with the WebHDFS REST HTTP API.
For more information about HttpFS, see Hadoop HDFS over HTTP.
The HttpFS role is required for Hue when you enable HDFS high availability.
Note:
When you set this property, Cloudera Manager regenerates the keytabs for HttpFS roles. The principal
in these keytabs contains the load balancer hostname.
If there is a Hue service that depends on this HDFS service, the Hue service has the option to use the
load balancer as its HDFS Web Interface Role.
Important:
HDFS does not currently provide ACL support for an NFS gateway.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
The subsections that follow provide information on installing and configuring the gateway.
$ umount /hdfs_nfs_mount
• On a SLES system:
• On a RHEL-compatible system:
• On a SLES system:
6. Now proceed with Starting the NFSv3 Gateway on page 152, and then remount the HDFS gateway mounts.
Installing the Packages for the First Time
On RHEL and similar systems:
Install the following packages on the cluster host you choose for the NFSv3 Gateway machine (referred to as the NFS
server from here on).
• nfs-utils
• nfs-utils-lib
• hadoop-hdfs-nfs3
The first two items are standard NFS utilities; the last is a CDH package.
Use the following command:
On SLES:
Install nfs-utils on the cluster host you choose for the NFSv3 Gateway machine (referred to as the NFS server from
here on):
<property>
<name>dfs.namenode.accesstime.precision</name>
<value>3600000</value>
<description>The access time for an HDFS file is precise up to this value. The
default value is 1 hour.
Setting a value of 0 disables access times for HDFS.</description>
</property>
<property>
<name>dfs.nfs3.dump.dir</name>
<value>/tmp/.hdfs-nfs</value>
</property>
Note:
You should change the location of the file dump directory, which temporarily saves out-of-order
writes before writing them to HDFS. This directory is needed because the NFS client often reorders
writes, and so sequential writes can arrive at the NFS gateway in random order and need to be
saved until they can be ordered correctly. After these out-of-order writes have exceeded 1MB in
memory for any given file, they are dumped to the dfs.nfs3.dump.dir (the memory threshold
is not currently configurable).
Make sure the directory you choose has enough space. For example, if an application uploads 10
files of 100MB each, dfs.nfs3.dump.dir should have roughly 1GB of free space to allow for
a worst-case reordering of writes to every file.
3. Configure the user running the gateway (normally the hdfs user as in this example) to be a proxy for other users.
To allow the hdfs user to be a proxy for all other users, add the following entries to core-site.xml on the
NameNode:
<property>
<name>hadoop.proxyuser.hdfs.groups</name>
<value>*</value>
<description>
Set this to '*' to allow the gateway user to proxy any group.
</description>
</property>
<property>
<name>hadoop.proxyuser.hdfs.hosts</name>
<value>*</value>
<description>
Set this to '*' to allow requests from any hosts to be proxied.
</description>
</property>
$ rpcinfo -p <nfs_server_ip_address>
To verify that the HDFS namespace is exported and can be mounted, use the showmount command.
$ showmount -e <nfs_server_ip_address>
Note:
When you create a file or directory as user hdfs on the client (that is, in the HDFS file system imported
using the NFS mount), the ownership may differ from what it would be if you had created it in HDFS
directly. For example, ownership of a file created on the client might be hdfs:hdfs when the same
operation done natively in HDFS resulted in hdfs:supergroup. This is because in native HDFS, BSD
semantics determine the group ownership of a newly-created file: it is set to the same group as the
parent directory where the file is created. When the operation is done over NFS, the typical Linux
semantics create the file with the group of the effective GID (group ID) of the process creating the
file, and this characteristic is explicitly passed to the NFS gateway and HDFS.
• File counts are based on the intended replication factor for the files; changing the replication factor for a file will
credit or debit quotas.
About disk space limits
• The space quota is a hard limit on the number of bytes used by files in the tree rooted at the directory being
configured.
• Each replica of a block counts against the quota.
• The disk space quota calculation takes replication into account, so it uses the replicated size of each file, not the
user-facing size.
• The disk space quota calculation includes open files (files presently being written), as well as files already written.
• Block allocations for files being written will fail if the quota would not allow a full block to be written.
Setting HDFS Quotas Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
1. From the HDFS service page, select the File Browser tab.
2. Browse the file system to find the directory for which you want to set quotas.
3. Click the directory name so that it appears in the gray panel above the listing of its contents and in the detail
section to the right of the File Browser table.
4. Click the Edit Quota button for the directory. A Manage Quota pop-up displays, where you can set file count or
disk space limits for the directory you have selected.
5. When you have set the limits you want, click OK.
Setting HDFS Quotas Using the Command Line
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
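To set space quotas on a directory, a command of roughly this form can be used (a sketch, run as an HDFS superuser):
$ hdfs dfsadmin -setSpaceQuota n directory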
where n is a number of bytes and directory is the directory the quota applies to. You can specify multiple directories
in a single command; n applies to each.
To remove space quotas from a directory:
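A sketch of the corresponding command:
$ hdfs dfsadmin -clrSpaceQuota directory
To set name quotas on a directory, a command of roughly this form can be used:
$ hdfs dfsadmin -setQuota n directory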
where n is the number of file and directory names in directory. You can specify multiple directories in a single command;
n applies to each.
To remove name quotas from a directory:
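A sketch of the corresponding command:
$ hdfs dfsadmin -clrQuota directory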
Before you start: You must have a working HDFS cluster and know the hostname and port that your NameNode exposes.
To install hadoop-hdfs-fuse on Red Hat-compatible systems:
You now have everything you need to begin mounting HDFS on Linux.
To set up and test your mount point in a non-HA installation:
$ mkdir -p <mount_point>
$ hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point>
$ mkdir -p <mount_point>
$ hadoop-fuse-dfs dfs://<nameservice_id> <mount_point>
where nameservice_id is the value of fs.defaultFS. In this case the port defined for
dfs.namenode.rpc-address.[nameservice ID].[name node ID] is used automatically. See Enabling HDFS
HA on page 293 for more information about these properties.
You can now run operations as if they are on your mount point. Press Ctrl+C to end the fuse-dfs program, and
umount the partition if it is still mounted.
Note:
To find its configuration directory, hadoop-fuse-dfs uses the HADOOP_CONF_DIR configured at
the time the mount command is invoked.
$ umount <mount_point>
You can now add a permanent HDFS mount which persists through reboots.
To add a system mount:
For example:
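A sketch of an /etc/fstab entry of the kind typically used (hostname, port, and mount point are placeholders):
hadoop-fuse-dfs#dfs://<name_node_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0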
Note:
In an HA deployment, use the HDFS nameservice instead of the NameNode URI; that is, use the
value of dfs.nameservices in hdfs-site.xml.
$ mount <mount_point>
Your system is now configured to allow you to use the ls command and use that mount point as if it were a normal
system disk.
For more information, see the help for hadoop-fuse-dfs:
$ hadoop-fuse-dfs --help
Be careful not to set the minimum to a higher value than the maximum.
Use Cases
Centralized cache management is best used for files that are accessed repeatedly. For example, a fact table in Hive
that is often used in JOIN clauses is a good candidate for caching. Caching the input of an annual reporting query is
probably less useful, as the historical data might be read only once.
Centralized cache management is also useful for mixed workloads with performance SLAs. Caching the working set of
a high-priority workload ensures that it does not contend for disk I/O with a low-priority workload.
Architecture
In this architecture, the NameNode is responsible for coordinating all the DataNode off-heap caches in the cluster. The
NameNode periodically receives a "cache report" from each DataNode which describes all the blocks cached on a given
DataNode. The NameNode manages DataNode caches by piggybacking cache and uncache commands on the DataNode
heartbeat.
The NameNode queries its set of cache directives to determine which paths should be cached. Cache directives are
persistently stored in the fsimage and edit log, and can be added, removed, and modified using Java and command-line
APIs. The NameNode also stores a set of cache pools, which are administrative entities used to group cache directives
together for resource management and enforcing permissions.
The NameNode periodically rescans the namespace and active cache directives to determine which blocks need to
be cached or uncached, and assigns caching to DataNodes. Rescans can also be triggered by user actions such as adding
or removing a cache directive or removing a cache pool.
We do not currently cache blocks which are under construction, corrupt, or otherwise incomplete. If a cache directive
covers a symlink, the symlink target is not cached. Caching is currently done on a per-file basis, although we would like
to add block-level granularity in the future.
Concepts
Cache Directive
A cache directive defines a path that should be cached. Paths can be either directories or files. Directories are cached
non-recursively, meaning only files in the first-level listing of the directory will be cached. Directives also specify
additional parameters, such as the cache replication factor and expiration time. The replication factor specifies the
number of block replicas to cache. If multiple cache directives refer to the same file, the maximum cache replication
factor is applied.
The expiration time is specified on the command line as a time-to-live (TTL), a relative expiration time in the future.
After a cache directive expires, it is no longer considered by the NameNode when making caching decisions.
Cache Pool
A cache pool is an administrative entity used to manage groups of cache directives. Cache pools have UNIX-like
permissions that restrict which users and groups have access to the pool. Write permissions allow users to add and
remove cache directives to the pool. Read permissions allow users to list the cache directives in a pool, as well as
additional metadata. Execute permissions are unused.
Cache pools are also used for resource management. Pools can enforce a maximum limit, which restricts the number
of bytes that can be cached in aggregate by directives in the pool. Normally, the sum of the pool limits will approximately
equal the amount of aggregate memory reserved for HDFS caching on the cluster. Cache pools also track a number of
statistics to help cluster users determine what is and should be cached.
Pools also enforce a maximum time-to-live. This restricts the maximum expiration time of directives being added to
the pool.
cacheadmin Command-Line Interface
On the command-line, administrators and users can interact with cache pools and directives using the hdfs cacheadmin
subcommand. Cache directives are identified by a unique, non-repeating 64-bit integer ID. IDs are not reused even if
a cache directive is later removed. Cache pools are identified by a unique string name.
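As an illustration of how these subcommands fit together, a hypothetical session that creates a pool, caches a
warehouse table into it, lists the result, and removes the directive (the pool name, path, and directive ID are
placeholders):
$ hdfs cacheadmin -addPool sales_pool -owner hive -group hive -limit 10737418240
$ hdfs cacheadmin -addDirective -path /user/hive/warehouse/fact_sales -pool sales_pool -replication 2 -ttl 7d
$ hdfs cacheadmin -listDirectives -pool sales_pool
$ hdfs cacheadmin -removeDirective 1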
Cache Directive Commands
addDirective
Description: Add a new cache directive.
Usage: hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication
<replication>] [-ttl <time-to-live>]
time-to-live: Time period for which the directive is valid. Can be specified in seconds, minutes, hours, and days,
for example: 30m, 4h, 2d. The value never indicates a directive that never expires. If unspecified, the directive never
expires.
removeDirective
Description: Remove a cache directive.
Usage: hdfs cacheadmin -removeDirective <id>
Where, id: The id of the cache directive to remove. You must have write permission on the pool of the directive in
order to remove it. To see a list of PathBasedCache directive IDs, use the -listDirectives command.
removeDirectives
Description: Remove every cache directive with the specified path.
Usage: hdfs cacheadmin -removeDirectives <path>
Where, path: The path of the cache directives to remove. You must have write permission on the pool of the directive
in order to remove it.
listDirectives
Description: List PathBasedCache directives.
addPool
Description: Add a new cache pool.
Usage: hdfs cacheadmin -addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>]
[-limit <limit>] [-maxTtl <maxTtl>]
group: Group of the pool. Defaults to the primary group name of the current user.
mode: UNIX-style permissions for the pool. Permissions are specified in octal, for example: 0755. By default, this is set
to 0755.
limit: The maximum number of bytes that can be cached by directives in this pool, in aggregate. By default, no limit
is set.
maxTtl: The maximum allowed time-to-live for directives being added to the pool. This can be specified in seconds,
minutes, hours, and days, for example: 120s, 30m, 4h, 2d. By default, no maximum is set. A value of never specifies
that there is no limit.
modifyPool
Description: Modify the metadata of an existing cache pool.
Usage: hdfs cacheadmin -modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>]
[-limit <limit>] [-maxTtl <maxTtl>]
maxTtl: The maximum allowed time-to-live for directives being added to the pool.
removePool
Description: Remove a cache pool. This also uncaches paths associated with the pool.
Usage: hdfs cacheadmin -removePool <name>
Where, name: Name of the cache pool to remove.
listPools
Description: Display information about one or more cache pools, for example: name, owner, group, permissions, and
so on.
Usage: hdfs cacheadmin -listPools [-stats] [<name>]
help
Description: Get detailed help about a command.
Usage: hdfs cacheadmin -help <command-name>
Where, command-name: The command for which to get detailed help. If no command is specified, print detailed help
for all commands.
Configuration
Native Libraries
In order to lock block files into memory, the DataNode relies on native JNI code found in libhadoop.so. Be sure to
enable JNI if you are using HDFS centralized cache management.
Configuration Properties
Required
Be sure to configure the following in /etc/hadoop/conf/hdfs-site.xml:
• dfs.datanode.max.locked.memory: The maximum amount of memory a DataNode will use for caching (in
bytes). The "locked-in-memory size" ulimit (ulimit -l) of the DataNode user also needs to be increased to
match this parameter (see OS Limits). When setting this value, remember that you will need space in memory for
other things as well, such as the DataNode and application JVM heaps and the operating system page cache.
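For example, a minimal hdfs-site.xml sketch, assuming you want to reserve 4 GB per DataNode for caching (the value
is an illustration only):
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <!-- 4 GB expressed in bytes; size this for your DataNode hosts and raise ulimit -l to match -->
  <value>4294967296</value>
</property>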
Optional
The following properties are not required, but may be specified for tuning:
• dfs.namenode.path.based.cache.refresh.interval.ms: The number of milliseconds between subsequent path cache
rescans. Each rescan calculates the blocks to cache and the DataNodes that should cache them. By default, this
parameter is set to 300000, which is five minutes.
• dfs.datanode.fsdatasetcache.max.threads.per.volume: The DataNode uses this as the maximum
number of threads per volume to use for caching new data. By default, this parameter is set to 4.
• dfs.cachereport.intervalMsec: The number of milliseconds between the DataNode sending full reports of its cache
state to the NameNode. By default, this parameter is set to 10000, which is 10 seconds.
• dfs.namenode.path.based.cache.block.map.allocation.percent: The percentage of the Java heap
which we will allocate to the cached blocks map. The cached blocks map is a hash map which uses chained hashing.
Smaller maps may be accessed more slowly if the number of cached blocks is large; larger maps will consume
more memory. By default, this parameter is set to 0.25 percent.
OS Limits
If you get the error Cannot start datanode because the configured max locked memory size... is
more than the datanode's available RLIMIT_MEMLOCK ulimit, that means that the operating system is
imposing a lower limit on the amount of memory that you can lock than what you have configured. To fix this, you
must adjust the ulimit -l value that the DataNode runs with. Usually, this value is configured in
/etc/security/limits.conf. However, it will vary depending on what operating system and distribution you are
using.
You will know that you have correctly configured this value when you can run ulimit -l from the shell and get back
either a higher value than what you have configured with dfs.datanode.max.locked.memory, or the string
unlimited, indicating that there is no limit. Note that it's typical for ulimit -l to output the memory lock limit in
KB, but dfs.datanode.max.locked.memory must be specified in bytes.
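For example, assuming the DataNode runs as the hdfs user, an /etc/security/limits.conf entry such as the following
raises the lock limit (the value is an illustration, expressed in KB):
# Allow the hdfs user to lock up to 4 GB of memory (pam_limits values are in KB)
hdfs - memlock 4194304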
Note: This documentation covers only the Cloudera Manager portion of using EMC Isilon storage
with Cloudera Manager. For information about tasks performed on Isilon OneFS, see the information
hub for Cloudera on the EMC Community Network: https://community.emc.com/docs/DOC-39529.
Supported Versions
The following versions of Cloudera and Isilon products are supported:
Example:
/ifs/your-access-zone/hdfs
Note: The above is simply an example; the HDFS root directory does not have to begin with ifs
or end with hdfs.
a. Create a tmp directory in the access zone and set ownership to hdfs:supergroup, and permissions to 1777.
Example:
cd hdfs_root_directory
isi_run -z zone_id mkdir tmp
isi_run -z zone_id chown hdfs:supergroup tmp
isi_run -z zone_id chmod 1777 tmp
b. Create a user directory in the access zone and set ownership to hdfs:supergroup, and permissions to
755
Example:
cd hdfs_root_directory
isi_run -z zone_id mkdir user
isi_run -z zone_id chown hdfs:supergroup user
isi_run -z zone_id chmod 755 user
3. Create the service-specific users, groups, or directories for each CDH service you plan to use. Create the directories
under the access zone you have created.
Note: Many of the values provided in the examples below are default values in Cloudera Manager
and must match the Cloudera Manager configuration settings. The format for the examples is:
dir user:group permission. Create the directories below under the access zone you have
created, for example, /ifs/your-access-zone/hdfs/
Example:
hdfs_root_directory/hbase hbase:hbase 755
• YARN (MR2)
– Create mapred group with mapred user.
– Create history directory for YARN:
Example:
hdfs_root_directory/user/history mapred:hadoop 777
Example:
hdfs_root_directory/tmp/logs mapred:hadoop 775
• Oozie
– Create oozie group with oozie user.
– Create the user directory for Oozie:
Example:
hdfs_root_directory/user/oozie oozie:oozie 775
• Flume
Example:
hdfs_root_directory/user/flume flume:flume 775
• Hive
– Create hive group with hive user.
– Create the user directory for Hive:
Example:
hdfs_root_directory/user/hive hive:hive 775
Example:
hdfs_root_directory/user/hive/warehouse hive:hive 1777
Example:
hdfs_root_directory/tmp/hive hive:supergroup 777
• Solr
– Create solr group with solr user.
– Create the data directory for Solr:
Example:
hdfs_root_directory/solr solr:solr 775
• Sqoop
– Create sqoop group with sqoop2 user.
– Create the user directory for Sqoop:
Example:
hdfs_root_directory/user/sqoop2 sqoop2:sqoop 775
• Hue
– Create hue group with hue user.
– Create sample group with sample user.
• Spark
– Create spark group with spark user.
– Create the user directory for Spark:
Example:
hdfs_root_directory/user/spark spark:spark 751
Example:
hdfs_root_directory/user/spark/applicationHistory spark:spark 1777
Once the users, groups, and directories are created in Isilon OneFS, you are ready to install Cloudera Manager.
Installing Cloudera Manager with Isilon
To install Cloudera Manager follow the instructions provided in Installation.
• The simplest installation procedure, suitable for development or proof of concept, is Installation Path A, which
uses embedded databases that are installed as part of the Cloudera Manager installation process.
• For production environments, Installation Path B - Manual Installation Using Cloudera Manager Packages describes
configuring external databases for Cloudera Manager and CDH storage needs.
If you choose parcel installation on the Cluster Installation screen, the installation wizard will point to the latest parcels
of CDH available.
On the installation wizard's Cluster Setup page, choose Custom Services, and choose the services you want installed
in the cluster. Be sure to choose Isilon among the selected services, do not select the HDFS service, and do not check
Include Cloudera Navigator at the bottom of the Cluster Setup page. Also, on the Role Assignments page, be sure to
specify the hosts that will serve as gateway roles for the Isilon service. You can add gateway roles to one, some, or all
nodes in the cluster.
Installing a Secure Cluster with Isilon
To set up a secure cluster with Isilon using Kerberos, perform the following steps:
1. Create an unsecure Cloudera Manager cluster as described above in Installing Cloudera Manager with Isilon on
page 164.
2. Follow the Isilon documentation to enable Kerberos for your access zone:
https://community.emc.com/docs/DOC-39529. This includes adding a Kerberos authentication provider to your
Isilon access zone.
3. Add the following proxy users in Isilon if your Cloudera Manager cluster includes the corresponding CDH services.
The procedure for configuring proxy users is described in the Isilon documentation,
https://community.emc.com/docs/DOC-39529.
• proxy user hdfs for hdfs user.
• proxy user mapred for mapred user.
• proxy user hive for hive user.
• proxy user impala for impala user.
• proxy user oozie for oozie user
• proxy user flume for flume user
• proxy user hue for hue user
4. Follow the Cloudera Manager documentation for information on configuring a secure cluster with Kerberos:
Configuring Authentication in Cloudera Manager.
Upgrading a Cluster with Isilon
To upgrade CDH and Cloudera Manager in a cluster that uses Isilon:
1. If required, upgrade OneFS to a version compatible with the version of CDH to which you are upgrading. For
compatibility information, see Product Compatibility Matrix for EMC Isilon. For OneFS upgrade instructions, see
the EMC Isilon documentation.
2. (Optional) Upgrade Cloudera Manager. See Upgrading Cloudera Manager.
The Cloudera Manager minor version must always be equal to or greater than the CDH minor version because
older versions of Cloudera Manager may not support features in newer versions of CDH. For example, if you want
to upgrade to CDH 5.4.8 you must first upgrade to Cloudera Manager 5.4 or higher.
3. Upgrade CDH. See Upgrading CDH and Managed Services Using Cloudera Manager.
The typical use case for Impala and Isilon together is to use Isilon for the default filesystem, replacing HDFS entirely.
In this configuration, when you create a database, table, or partition, the data always resides on Isilon storage and you
do not need to specify any special LOCATION attribute. If you do specify a LOCATION attribute, its value refers to a
path within the Isilon filesystem. For example:
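A hypothetical illustration (the table name, columns, and path are placeholders; the path resolves within the Isilon
filesystem):
CREATE TABLE sales_facts (id INT, amount DOUBLE)
LOCATION '/ifs/your-access-zone/hdfs/user/impala/sales_facts';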
Impala can write to, delete, and rename data files and database, table, and partition directories on Isilon storage.
Therefore, Impala statements such as CREATE TABLE, DROP TABLE, CREATE DATABASE, DROP DATABASE, ALTER
TABLE, and INSERT work the same with Isilon storage as with HDFS.
When the Impala spill-to-disk feature is activated by a query that approaches the memory limit, Impala writes all the
temporary data to a local (not Isilon) storage device. Because the I/O bandwidth for the temporary data depends on
the number of local disks, and clusters using Isilon storage might not have as many local disks attached, pay special
attention on Isilon-enabled clusters to any queries that use the spill-to-disk feature. Where practical, tune the queries
or allocate extra memory for Impala to avoid spilling. Although you can specify an Isilon storage device as the destination
for the temporary data for the spill-to-disk feature, that configuration is not recommended due to the need to transfer
the data both ways using remote I/O.
When tuning Impala queries on HDFS, you typically try to avoid any remote reads. When the data resides on Isilon
storage, all the I/O consists of remote reads. Do not be alarmed when you see non-zero numbers for remote read
measurements in query profile output. The benefit of the Impala and Isilon integration is primarily convenience of not
having to move or copy large volumes of data to HDFS, rather than raw query performance. You can increase the
performance of Impala I/O for Isilon systems by increasing the value for the num_remote_hdfs_io_threads
configuration parameter, in the Cloudera Manager user interface for clusters using Cloudera Manager, or through the
--num_remote_hdfs_io_threads startup option for the impalad daemon on clusters not using Cloudera Manager.
For information about managing Isilon storage devices through Cloudera Manager, see Using CDH with Isilon Storage
on page 161.
Required Configurations
Specify the following configurations in Cloudera Manager on the Clusters > Isilon Service > Configuration tab:
• In HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml hdfs-site.xml and the
Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml properties for the Isilon service,
set the value of the dfs.client.file-block-storage-locations.timeout.millis property to 10000.
• In the Isilon Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml property for the Isilon
service, set the value of the hadoop.security.token.service.use_ip property to FALSE.
• If you see errors that reference the .Trash directory, make sure that the Use Trash property is selected.
Managing Hive
Use the following procedures to manage HiveServer2 and the Hive metastore. To configure high availability for the
Hive metastore, see Hive Metastore High Availability on page 344.
Number of Concurrent Connections | HiveServer2 Heap Size Minimum Recommendation | Hive Metastore Heap Size Minimum Recommendation
Up to 40 concurrent connections (Cloudera recommends splitting HiveServer2 into multiple instances and load balancing once you start allocating more than 12 GB to HiveServer2; the objective is to reduce the impact of Java garbage collection on active processing by the service) | 12 GB | 12 GB
Up to 20 concurrent connections | 6 GB | 10 GB
Up to 10 concurrent connections | 4 GB | 8 GB
Single connection | 2 GB | 4 GB
Important: These numbers are general guidance only, and may be affected by factors such as number
of columns, partitions, complex joins, and client activity among other things. It is important to review
and refine through testing based on your anticipated deployment to arrive at best values for your
environment.
In addition, the Beeline CLI should use a heap size of at least 2 GB.
The permGenSize should be set to 512M for all.
The settings to change are in bold. All of these lines are commented out (prefixed with a # character) by default.
Uncomment the lines by removing the # character.
else
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xmx12288m -Xms10m
-XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
fi
fi
export HADOOP_HEAPSIZE=2048
You can choose whether to use the Concurrent Collector or the New Parallel Collector for garbage collection, by passing
-XX:+UseParNewGC or -XX:+UseConcMarkSweepGC in the HADOOP_OPTS lines above, and you can tune the garbage
collection overhead limit by setting -XX:-UseGCOverheadLimit. To enable the garbage collection overhead limit,
remove the setting or change it to -XX:+UseGCOverheadLimit.
export PYTHON_CMD=/usr/bin/python
• The Hive metastore database password and JDBC drivers don’t need to be shared with every Hive client; only the
Hive Metastore Server does. Sharing passwords with many hosts is a security concern.
• You can control activity on the Hive metastore database. To stop all activity on the database, just stop the Hive
Metastore Server. This makes it easy to perform tasks such as backup and upgrade, which require all Hive activity
to stop.
Information about the initial configuration of a remote Hive metastore database with Cloudera Manager can be found
at Cloudera Manager and Managed Service Datastores.
Perform one of the following procedures depending on whether you want to create permanent or temporary functions.
hive.server2.builtin.udf.blacklist A comma separated list of built-in UDFs that are not allowed to be executed.
A UDF that is included in the list will return an error if invoked from a query.
Default value: Empty
Note: If the Hive Metastore is running on a different host, create the same directory there that
you created on the HiveServer2 host. You do not need to copy the JAR file onto the Hive Metastore
host, but the same directory must be there. For example, if you copied the JAR file to
/opt/local/hive/lib/ on the HiveServer2 host, you must create the same directory on the
Hive Metastore host. If the same directory is not present on the Hive Metastore host, Hive
Metastore service will not start.
8. Click Save Changes. The JARs are added to HIVE_AUX_JARS_PATH environment variable.
9. Redeploy the Hive client configuration.
a. In the Cloudera Manager Admin Console, go to the Hive service.
b. From the Actions menu at the top right of the service page, select Deploy Client Configuration.
c. Click Deploy Client Configuration.
10. Restart the Hive service.
11. With Sentry enabled - Grant privileges on the JAR files to the roles that require access. Log in to Beeline as user
hive and use the Hive SQL GRANT statement to do so. For example:
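A hypothetical example, assuming a role named udf_role and a JAR uploaded to /user/hive/udfs/my_udf.jar in HDFS:
GRANT ALL ON URI 'hdfs:///user/hive/udfs/my_udf.jar' TO ROLE udf_role;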
12. Run the CREATE FUNCTION command to create the UDF from the JAR file and point to the JAR file location in
HDFS. For example:
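A hypothetical example; the function name, implementing class, and JAR location are placeholders:
CREATE FUNCTION my_lower AS 'com.example.hive.udf.MyLower' USING JAR 'hdfs:///user/hive/udfs/my_udf.jar';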
Note: If the Hive Metastore is running on a different host, create the same directory there that
you created on the HiveServer2 host. You do not need to copy the JAR file onto the Hive Metastore
host, but the same directory must be there. For example, if you copied the JAR file to
/opt/local/hive/lib/ on the HiveServer2 host, you must create the same directory on the
Hive Metastore host. If the same directory is not present on the Hive Metastore host, Hive
Metastore service will not start.
hive.aux.jars.path=file:///opt/local/hive/lib/my.jar
3. Copy the JAR file (and its dependent libraries) to the host running HiveServer2/Impala. Make sure the hive user
has read, write, and execute access to these files on the HiveServer2/Impala host.
4. On the HiveServer2/Impala host, open /etc/default/hive-server2 and set the AUX_CLASSPATH variable
to a comma-separated list of the fully-qualified paths to the JAR file and any dependent libraries.
AUX_CLASSPATH=/opt/local/hive/lib/my.jar
5. Restart HiveServer2.
6. If Sentry is enabled - Grant privileges on the JAR files to the roles that require access. Log in to Beeline as user
hive and use the Hive SQL GRANT statement to do so. For example:
If you are using Sentry policy files, you can grant the URI privilege as follows:
udf_r = server=server1->uri=file:///opt/local/hive/lib
udf_r = server=server1->uri=hdfs:///path/to/jar
7. Run the CREATE FUNCTION command and point to the JAR from Hive:
hive.aux.jars.path=file:///opt/local/hive/lib/my.jar
2. Copy the JAR file (and its dependent libraries) to the host running HiveServer2/Impala. Make sure the hive user
has read, write, and execute access to these files on the HiveServer2/Impala host.
3. On the HiveServer2/Impala host, open /etc/default/hive-server2 and set the AUX_CLASSPATH variable
to a comma-separated list of the fully-qualified paths to the JAR file and any dependent libraries.
AUX_CLASSPATH=/opt/local/hive/lib/my.jar
4. If Sentry is enabled - Grant privileges on the local JAR files to the roles that require access. Log in to Beeline as
user hive and use the Hive SQL GRANT statement to do so. For example:
If you are using Sentry policy files, you can grant the URI privilege as follows:
udf_r = server=server1->uri=file:///opt/local/hive/lib
5. Restart HiveServer2.
6. Run the CREATE TEMPORARY FUNCTION command and point to the JAR from Hive:
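A hypothetical example; because the JAR is already on the AUX_CLASSPATH configured above, only the implementing
class needs to be named (the function name and class are placeholders):
CREATE TEMPORARY FUNCTION my_lower AS 'com.example.hive.udf.MyLower';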
Important: Hive on Spark is included in CDH 5.5 but is not currently supported nor recommended
for production use. To try this feature in CDH 5.5, use it in a test environment.
Important: Hive on Spark is included in CDH 5.5 but is not currently supported nor recommended
for production use. To try this feature in CDH 5.5, use it in a test environment.
This topic explains the configuration properties you set up to run Hive on Spark.
Note: We recommend that you use HiveServer2 with Beeline. The following content, except for
Configuring Hive on Spark for Hive CLI on page 174, is based on this assumption.
Installation Considerations
For Hive to work on Spark, you must deploy Spark gateway roles on the same machine that hosts HiveServer2. Otherwise,
Hive on Spark cannot read from Spark configurations and cannot submit Spark jobs. For more information about
gateway roles, see Managing Roles on page 44.
After installation, run the following command in Hive so that Hive will use Spark as the back-end engine for all subsequent
queries.
set hive.execution.engine=spark;
Configuration Properties
Property Description
hive.stats.collect.rawdatasize Hive on Spark uses statistics to determine the threshold
for converting common join to map join. There are two
types of statistics about the size of data:
• totalSize: The approximate size of data on the disk
• rawDataSize: The approximate size of data in
memory
When both metrics are available, Hive chooses
rawDataSize.
Default: True
hive.auto.convert.join.noconditionaltask.size The threshold for the sum of all the small table size (by
default, rawDataSize), for map join conversion. You can
increase the value if you want better performance by
converting more common joins to map joins. However, if
you set this value too high, tasks may fail because too
much memory is being used by data from small tables.
Default: 20MB
Configuring Hive
For improved performance, Cloudera recommends that you configure the following additional properties for Hive. In
Cloudera Manager, set these properties in the advanced configuration snippet for HiveServer2.
• hive.stats.fetch.column.stats=true
• hive.optimize.index.filter=true
Important: Hive on Spark is included in CDH 5.5 but is not currently supported nor recommended
for production use. To try this feature in CDH 5.5, use it in a test environment.
Problem: Delayed result from the first query after starting a new Hive on Spark session
The first query after starting a new Hive on Spark session might be delayed due to the start-up time for the Spark
on YARN cluster. The query waits for YARN containers to initialize. Subsequent queries will be faster.
Problem: Exception Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)
and HiveServer2 is down
HiveServer2 memory is set too small. For more information, see STDOUT for HiveServer2. To fix this issue:
1. In Cloudera Manager, go to HIVE.
2. Click Configuration.
3. Search for Java Heap Size of HiveServer2 in Bytes, and change it to be a larger value. Cloudera
recommends a minimum value of 256 MB.
4. Restart HiveServer2.
Problem: Out-of-memory error
You might get an out-of-memory error similar to the following:
This error indicates that the Spark driver does not have enough off-heap memory. Increase the off-heap memory
by setting spark.yarn.driver.memoryOverhead or spark.driver.memory.
Problem: Hive on Spark does not work with HBase
Hive on Spark with HBase is not supported. If you use HBase, use Hive on MapReduce instead of Hive on Spark.
Problem: Spark applications stay alive forever and occupy cluster resources
This can occur if there are multiple concurrent Hive sessions. To manually terminate the Spark applications:
1. Find the YARN application IDs for the applications by going to Cloudera Manager and clicking Yarn >
ResourceManager > ResourceManager Web UI.
2. Log in to the YARN ResourceManager host.
3. Open a terminal and run:
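For example, to stop a lingering application (the application ID is a placeholder):
$ yarn application -kill application_1473876926742_0001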
Important: Hive on Spark is included in CDH 5.5 but is not currently supported nor recommended
for production use. To try this feature in CDH 5.5, use it in a test environment.
Managing Hue
Hue is a set of web UIs that enable you to interact with a CDH cluster. This section describes tasks for managing Hue.
1. On the Home > Status tab, click the drop-down menu to the right of the cluster name and select Add a Service. A list of service types displays.
2. Select Hue.
3. Click Continue.
A page displays where you can specify the dependencies for the Hue service.
4. Select the row with the Hue dependencies required for your cluster. For more information, see Hue Dependencies.
5. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
6. Click Continue.
Cloudera Manager starts the Hue service.
7. Click Continue.
8. Click Finish.
9. If your cluster uses Kerberos, Cloudera Manager will automatically add a Hue Kerberos Ticket Renewer role to
each host where you assigned the Hue Server role instance. Also see, Enable Hue to Work with Hadoop Security
using Cloudera Manager.
Adding a Hue Role Instance
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
1. In Cloudera Manager Administration Console, go to the Hue service.
2. Click the Instances tab.
3. Click the Add Role Instances button.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. If your cluster uses Kerberos, you must add the Hue Kerberos Ticket Renewer role to each host where you assigned
the Hue Server role instance. Cloudera Manager will throw a validation error if the new Hue Server role does not
have a colocated KT Renewer role. Also see, Enable Hue to Work with Hadoop Security using Cloudera Manager.
6. Click Continue.
h. Locate the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini property
and add the following property:
[hbase]
hbase_conf_dir=/etc/hbase/conf
Important: Cloudera strongly recommends an external database for clusters with multiple Hue
servers (or "hue" users). With the default embedded database (one per server), in a multi-server
environment, the data on server "A" appears lost when working on server "B" and vice versa. Use an
external database, and configure each server to point to it to ensure that no matter which server is
being used by Hue, your data is always accessible.
To configure Hue with any of the supported external databases, the high-level steps are:
1. Stop Hue service.
2. Backup default SQLite database (if applicable).
3. Install database software and dependencies.
4. Create and configure database and load data.
5. Start Hue service.
See the tasks on this page for details. If you do not need to migrate a SQLite database, you can skip the steps on
dumping the database and editing the JSON objects.
Configuring the Hue Server to Store Data in MariaDB
For information about installing and configuring a MariaDB database, see MariaDB Database.
1. In the Cloudera Manager Admin Console, go to the Hue service status page.
2. Select Actions > Stop. Confirm you want to stop the service by clicking Stop.
3. Select Actions > Dump Database. Confirm you want to dump the database by clicking Dump Database.
4. Note the host to which the dump was written under Step in the Dump Database Command window. You can also
find it by selecting Commands > Recent Commands > Dump Database.
5. Open a terminal window for the host and go to the dump file in /tmp/hue_database_dump.json.
6. Remove all JSON objects with useradmin.userprofile in the model field, for example:
{
"pk": 14,
"model": "useradmin.userprofile",
"fields":
{ "creation_method": "EXTERNAL", "user": 14, "home_directory": "/user/tuser2" }
},
[mysqld]
sql_mode=STRICT_ALL_TABLES
8. Create a new database and grant privileges to a Hue user to manage this database. For example:
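A representative sequence using the mysql client (the password is a placeholder):
$ mysql -u root -p
mysql> create database hue;
mysql> grant all on hue.* to 'hue'@'localhost' identified by 'secretpassword';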
c. (InnoDB only) Drop the foreign key that you retrieved in the previous step.
e. In Hue service instance page, click Actions > Load Database. Confirm you want to load the database by clicking
Load Database.
f. (InnoDB only) Add back the foreign key.
mysql > ALTER TABLE auth_permission ADD FOREIGN KEY (content_type_id) REFERENCES
django_content_type (id);
Note: Cloudera recommends InnoDB over MyISAM as the Hue MySQL engine. On CDH 5, Hue requires
InnoDB.
For information about installing and configuring a MySQL database, see MySQL Database.
1. In the Cloudera Manager Admin Console, go to the Hue service status page.
2. Select Actions > Stop. Confirm you want to stop the service by clicking Stop.
3. Select Actions > Dump Database. Confirm you want to dump the database by clicking Dump Database.
4. Note the host to which the dump was written under Step in the Dump Database Command window. You can also
find it by selecting Commands > Recent Commands > Dump Database.
5. Open a terminal window for the host and go to the dump file in /tmp/hue_database_dump.json.
6. Remove all JSON objects with useradmin.userprofile in the model field, for example:
{
"pk": 14,
"model": "useradmin.userprofile",
"fields":
{ "creation_method": "EXTERNAL", "user": 14, "home_directory": "/user/tuser2" }
},
[mysqld]
sql_mode=STRICT_ALL_TABLES
8. Create a new database and grant privileges to a Hue user to manage this database. For example:
c. (InnoDB only) Drop the foreign key that you retrieved in the previous step.
e. In Hue service instance page, click Actions > Load Database. Confirm you want to load the database by clicking
Load Database.
f. (InnoDB only) Add back the foreign key.
mysql > ALTER TABLE auth_permission ADD FOREIGN KEY (content_type_id) REFERENCES
django_content_type (id);
{
"pk": 14,
"model": "useradmin.userprofile",
"fields":
{ "creation_method": "EXTERNAL", "user": 14, "home_directory": "/user/tuser2" }
},
SLES
Ubuntu or Debian
b. Set the authentication methods for local to trust and for host to password and add the following line at
the end.
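One possible pg_hba.conf entry, assuming the hue database and user created later in this procedure (adjust the
address range and authentication method to your environment):
host hue hue 0.0.0.0/0 md5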
$ su - postgres
# /usr/bin/postgres -D /var/lib/pgsql/data > logfile 2>&1 &
12. Create the hue database and grant privileges to a hue user to manage the database.
# psql -U postgres
postgres=# create database hue;
postgres=# \c hue;
You are now connected to database 'hue'.
postgres=# create user hue with password 'secretpassword';
postgres=# grant all privileges on database hue to hue;
postgres=# \q
SLES
Ubuntu or Debian
e. Set Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini with the
following:
[desktop]
[[database]]
engine=postgresql_psycopg2
name=hue
host=localhost
port=5432
user=hue
password=secretpassword
Note: If you set Hue Database Hostname, Hue Database Port, Hue Database Username,
and Hue Database Password at the service-level, under Service-Wide > Database, you can
omit those properties from the server-level configuration above and avoid storing the Hue
password as plain text. In either case, set engine and name in the server-level safety-valve.
bash# su - postgres
$ psql -h localhost -U hue -d hue
postgres=# \d auth_permission;
c. Drop the foreign key that you retrieved in the previous step.
e. In Hue service instance page, Actions > Load Database. Confirm you want to load the database by clicking
Load Database.
f. Add back the foreign key you dropped.
bash# su - postgres
$ psql -h localhost -U hue -d hue
postgres=# ALTER TABLE auth_permission ADD CONSTRAINT content_type_id_refs_id_XXXXXX
FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY
DEFERRED;
Important: Configure the database for character set AL32UTF8 and national character set UTF8.
RHEL
SLES:
Python devel packages are not included in SLES. Add the SLES Software Development Kit (SDK) as a repository and
then install:
Ubuntu or Debian
2. Download and add the Oracle Client parcel to the Cloudera Manager remote parcel repository URL list and
download, distribute, and activate the parcel.
3. For CDH versions lower than 5.3, install the Python Oracle library:
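This typically uses the pip bundled with Hue, for example (HUE_HOME is described in the note below):
$ HUE_HOME/build/env/bin/pip install cx_Oracle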
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
5. In the Cloudera Manager Admin Console, go to the Hue service status page.
6. Select Actions > Stop. Confirm you want to stop the service by clicking Stop.
7. Select Actions > Dump Database. Confirm you want to dump the database by clicking Dump Database.
8. Click the Configuration tab.
9. Select Scope > All.
10. Select Category > Advanced.
11. Set the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini property.
Note: If you set Hue Database Hostname, Hue Database Port, Hue Database Username, and
Hue Database Password at the service-level, under Service-Wide > Database, you can omit those
properties from the server-level configuration above and avoid storing the Hue password as plain
text. In either case, set engine and name in the server-level safety-valve.
Add the following options (and modify accordingly for your setup):
[desktop]
[[database]]
host=localhost
port=1521
engine=oracle
user=hue
password=secretpassword
name=<SID of the Oracle database, for example, 'XE'>
For CDH 5.1 and higher you can use an Oracle service name. To use the Oracle service name instead of the SID,
use the following configuration instead:
port=0
engine=oracle
user=hue
password=secretpassword
name=oracle.example.com:1521/orcl.example.com
The directive port=0 allows Hue to use a service name. The name string is the connect string, including hostname,
port, and service name.
To add support for a multithreaded environment, set the threaded option to true under the
[desktop]>[[database]] section.
options={"threaded":true}
13. Go to the Hue Server instance in Cloudera Manager and select Actions > Synchronize Database.
14. Ensure you are connected to Oracle as the hue user, then run the following command to delete all data from
Oracle tables:
commit;
17. Load the data that you dumped. Go to the Hue Server instance and select Actions > Load Database. This step is
not necessary if you have a fresh Hue install with no data or if you don’t want to save the Hue data.
18. Start the Hue service.
Configuring the Hue Server to Store Data in Oracle (Package Installation)
If you have a parcel-based environment, see Configuring the Hue Server to Store Data in Oracle (Parcel Installation)
on page 183.
Important: Configure the database for character set AL32UTF8 and national character set UTF8.
1. Download the Oracle libraries at Instant Client for Linux x86-64 Version 11.1.0.7.0, Basic and SDK (with headers)
zip files to the same directory.
2. Unzip the Oracle client zip files.
3. Set environment variables to reference the libraries.
$ export ORACLE_HOME=oracle_download_directory
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
$ cd $ORACLE_HOME
$ ln -sf libclntsh.so.11.1 libclntsh.so
SLES:
Python devel packages are not included in SLES. Add the SLES Software Development Kit (SDK) as a repository and
then install:
Ubuntu or Debian
6. For CDH versions lower than 5.3, install the Python Oracle library:
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
8. In the Cloudera Manager Admin Console, go to the Hue service status page.
9. Select Actions > Stop. Confirm you want to stop the service by clicking Stop.
10. Select Actions > Dump Database. Confirm you want to dump the database by clicking Dump Database.
11. Click the Configuration tab.
12. Select Scope > All.
13. Select Category > Advanced.
14. Set the Hue Service Environment Advanced Configuration Snippet (Safety Valve) property to
ORACLE_HOME=oracle_download_directory
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:oracle_download_directory
15. Set the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini property.
Note: If you set Hue Database Hostname, Hue Database Port, Hue Database Username, and
Hue Database Password at the service-level, under Service-Wide > Database, you can omit those
properties from the server-level configuration above and avoid storing the Hue password as plain
text. In either case, set engine and name in the server-level safety-valve.
Add the following options (and modify accordingly for your setup):
[desktop]
[[database]]
host=localhost
port=1521
engine=oracle
user=hue
password=secretpassword
name=<SID of the Oracle database, for example, 'XE'>
For CDH 5.1 and higher you can use an Oracle service name. To use the Oracle service name instead of the SID,
use the following configuration instead:
port=0
engine=oracle
user=hue
password=secretpassword
name=oracle.example.com:1521/orcl.example.com
The directive port=0 allows Hue to use a service name. The name string is the connect string, including hostname,
port, and service name.
To add support for a multithreaded environment, set the threaded option to true under the
[desktop]>[[database]] section.
options={"threaded":true}
17. Go to the Hue Server instance in Cloudera Manager and select Actions > Synchronize Database.
18. Ensure you are connected to Oracle as the hue user, then run the following command to delete all data from
Oracle tables:
commit;
21. Load the data that you dumped. Go to the Hue Server instance and select Actions > Load Database. This step is
not necessary if you have a fresh Hue install with no data or if you don’t want to save the Hue data.
22. Start the Hue service.
Using an External Database for Hue Using the Command Line
The Hue server requires a SQL database to store small amounts of data such as user account information, job submissions,
and Hive queries. SQLite is the default embedded database. Hue also supports several types of external databases.
This page explains how to configure Hue with a selection of external Supported Databases.
Important: Cloudera strongly recommends an external database for clusters with multiple Hue users.
To configure Hue with any of the supported external databases, the high-level steps are:
# sqlite3 /var/lib/hue/desktop.db
SQLite version 3.6.22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select username from auth_user;
admin
test
sample
sqlite>
Important: It is strongly recommended that you avoid making any modifications to the database
directly using sqlite3, though sqlite3 is useful for management or troubleshooting.
Note: Cloudera recommends InnoDB over MyISAM as the Hue MySQL engine. On CDH 5, Hue requires
InnoDB.
2. Open <some-temporary-file>.json and remove all JSON objects with useradmin.userprofile in the
model field. Here are some examples of JSON objects that should be deleted.
{
"pk": 1,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1,
"home_directory": "/user/alice"
}
},
{
"pk": 2,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1100714,
"home_directory": "/user/bob"
}
},
.....
RHEL: $ sudo yum install mariadb-devel
RHEL: $ sudo yum install mariadb-connector-java
RHEL: $ sudo yum install mariadb-server
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=<ip-address>
default-storage-engine=InnoDB
sql_mode=STRICT_ALL_TABLES
9. Configure MariaDB to use a strong password. In the following procedure, your current root password is blank.
Press the Enter key when you're prompted for the root password.
$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
RHEL:
$ sudo /sbin/chkconfig mariadb on
$ sudo /sbin/chkconfig --list mariadb
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
11. Create the Hue database and grant privileges to a hue user to manage the database.
host=localhost
port=3306
engine=mysql
user=hue
password=<secretpassword>
name=hue
14. As the hue user, load the existing data and create the necessary database tables using syncdb and migrate
commands. When running these commands, Hue will try to access a logs directory, located at
/opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the case, first create the
logs directory and give the hue user and group ownership of the directory.
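A sketch of what this step typically looks like for a parcel install (HUE_HOME is described in the note below):
$ sudo mkdir -p /opt/cloudera/parcels/CDH/lib/hue/logs
$ sudo chown hue:hue /opt/cloudera/parcels/CDH/lib/hue/logs
$ cd HUE_HOME
$ sudo -u hue ./build/env/bin/hue syncdb --noinput
$ sudo -u hue ./build/env/bin/hue migrate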
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
Note: Cloudera recommends InnoDB over MyISAM as the Hue MySQL engine. On CDH 5, Hue requires
InnoDB.
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
3. Open <some-temporary-file>.json and remove all JSON objects with useradmin.userprofile in the
model field. Here are some examples of JSON objects that should be deleted.
{
"pk": 1,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1,
"home_directory": "/user/alice"
}
},
{
"pk": 2,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1100714,
"home_directory": "/user/bob"
}
},
.....
RHEL: $ sudo yum install mysql-devel
RHEL: $ sudo yum install mysql-connector-java
RHEL: $ sudo yum install mysql-server
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=<ip-address>
default-storage-engine=InnoDB
sql_mode=STRICT_ALL_TABLES
RHEL: $ sudo service mysqld start
10. Configure MySQL to use a strong password. In the following procedure, your current root password is blank.
Press the Enter key when you're prompted for the root password.
$ sudo /usr/bin/mysql_secure_installation
[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!
RHEL:
$ sudo /sbin/chkconfig mysqld on
$ sudo /sbin/chkconfig --list mysqld
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
12. Create the Hue database and grant privileges to a hue user to manage the database.
host=localhost
port=3306
engine=mysql
user=hue
password=<secretpassword>
name=hue
15. As the hue user, load the existing data and create the necessary database tables using syncdb and migrate
commands. When running these commands, Hue will try to access a logs directory, located at
/opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the case, first create the
logs directory and give the hue user and group ownership of the directory.
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
3. Open <some-temporary-file>.json and remove all JSON objects with useradmin.userprofile in the
model field. Here are some examples of JSON objects that should be deleted.
{
"pk": 1,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1,
"home_directory": "/user/alice"
}
},
{
"pk": 2,
"model": "useradmin.userprofile",
"fields": {
"creation_method": "HUE",
"user": 1100714,
"home_directory": "/user/bob"
}
},
.....
RHEL: $ sudo yum install postgresql-devel gcc python-devel
RHEL: $ sudo yum install postgresql-server
$ su - postgres
# /usr/bin/postgres -D /var/lib/pgsql/data > logfile 2>&1 &
11. Create the hue database and grant privileges to a hue user to manage the database.
# psql -U postgres
postgres=# create database hue;
postgres=# \c hue;
You are now connected to database 'hue'.
postgres=# create user hue with password '<secretpassword>';
postgres=# grant all privileges on database hue to hue;
postgres=# \q
RHEL:
$ sudo /sbin/chkconfig postgresql on
$ sudo /sbin/chkconfig --list postgresql
postgresql 0:off 1:off 2:on 3:on 4:on 5:on 6:off
host=localhost
port=5432
engine=postgresql_psycopg2
user=hue
password=<secretpassword>
name=hue
17. As the hue user, configure Hue to load the existing data and create the necessary database tables. You will need
to run both the migrate and syncdb commands. When running these commands, Hue will try to access a logs
directory, located at /opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the
case, first create the logs directory and give the hue user and group ownership of the directory.
bash# su - postgres
$ psql -h localhost -U hue -d hue
postgres=# \d auth_permission;
19. Drop the foreign key that you retrieved in the previous step.
bash# su - postgres
$ psql -h localhost -U hue -d hue
postgres=# ALTER TABLE auth_permission ADD CONSTRAINT content_type_id_refs_id_<XXXXXX>
FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY
DEFERRED;
Important: Configure the database for character set AL32UTF8 and national character set UTF8.
1. Ensure Python 2.6 or higher is installed on the server Hue is running on.
2. Download the Oracle client libraries at Instant Client for Linux x86-64 Version 11.1.0.7.0, Basic and SDK (with
headers) zip files to the same directory.
3. Unzip the zip files.
4. Set environment variables to reference the libraries.
$ export ORACLE_HOME=oracle_download_directory
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
$ cd $ORACLE_HOME
$ ln -sf libclntsh.so.11.1 libclntsh.so
Note: HUE_HOME is a reference to the location of your Hue installation. For package installs, this
is usually /usr/lib/hue; for parcel installs, this is usually, /opt/cloudera/parcels/<parcel
version>/lib/hue/.
7. Edit the Hue configuration file hue.ini. Directly below the [[database]] line in the [desktop] section,
add the following options (and modify accordingly for your setup):
host=localhost
port=1521
engine=oracle
user=hue
password=<secretpassword>
name=<SID of the Oracle database, for example, 'XE'>
To use the Oracle service name instead of the SID, use the following configuration instead:
port=0
engine=oracle
user=hue
password=password
name=oracle.example.com:1521/orcl.example.com
The directive port=0 allows Hue to use a service name. The name string is the connect string, including hostname,
port, and service name.
To add support for a multithreaded environment, set the threaded option to true under the
[desktop]>[[database]] section.
options={'threaded':true}
9. As the hue user, configure Hue to load the existing data and create the necessary database tables. You will need
to run both the syncdb and migrate commands. When running these commands, Hue will try to access a logs
directory, located at /opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the
case, first create the logs directory and give the hue user and group ownership of the directory.
10. Ensure you are connected to Oracle as the hue user, then run the following command to delete all data from
Oracle tables:
commit;
Managing Impala
This section explains how to configure Impala to accept connections from applications that use popular programming
APIs:
• Post-Installation Configuration for Impala on page 201
• Configuring Impala to Work with ODBC on page 203
• Configuring Impala to Work with JDBC on page 205
This type of configuration is especially useful when using Impala in combination with Business Intelligence tools, which
use these standard interfaces to query different kinds of database and Big Data systems.
You can also configure these other aspects of Impala:
• Overview of Impala Security
• Modifying Impala Startup Options
Note:
When you set this property, Cloudera Manager regenerates the keytabs for Impala Daemon roles.
The principal in these keytabs contains the load balancer hostname.
If there is a Hue service that depends on this Impala service, it also uses the load balancer to
communicate with Impala.
Important: If Cloudera Manager cannot find the .pem file on the host for a specific role instance,
that role will fail to start.
When you access the Web UI for the Impala Catalog Server, Daemon, and StateStore, https will be used.
If Impala is installed using Cloudera Manager, some of these configurations are completed automatically; you must
still enable short-circuit reads manually. If you installed Impala without Cloudera Manager, or if you want to customize
your environment, consider making the changes described in this topic.
In some cases, depending on the level of Impala, CDH, and Cloudera Manager, you might need to add particular
component configuration details in one of the free-form fields on the Impala configuration pages within Cloudera
Manager. In Cloudera Manager 4, these fields are labelled Safety Valve; in Cloudera Manager 5, they are called
Advanced Configuration Snippet.
• You must enable short-circuit reads, whether or not Impala was installed through Cloudera Manager. This setting
goes in the Impala configuration settings, not the Hadoop-wide settings.
• If you installed Impala in an environment that is not managed by Cloudera Manager, you must enable block location
tracking, and you can optionally enable native checksumming for optimal performance.
• If you deployed Impala using Cloudera Manager see Testing Impala Performance to confirm proper configuration.
Note: If you use Cloudera Manager, you can enable short-circuit reads through a checkbox in the
user interface and that setting takes effect for Impala as well.
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hdfs-sockets/dn</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout.millis</name>
<value>10000</value>
</property>
Note: If you are also going to enable block location tracking, you can skip copying configuration
files and restarting DataNodes and go straight to Optional: Block Location Tracking. Configuring
short-circuit reads and block location tracking require the same process of copying files and
restarting services, so you can complete that process once when you have completed all
configuration changes. Whether you copy files and restart services now or during configuring
block location tracking, short-circuit reads are not enabled until you complete those final steps.
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
2. Copy the client core-site.xml and hdfs-site.xml configuration files from the Hadoop configuration directory
to the Impala configuration directory. The default Impala configuration location is /etc/impala/conf.
3. After applying these changes, restart all DataNodes.
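For example, on a cluster where the Hadoop client configuration lives in /etc/hadoop/conf and Impala was installed from packages, these two steps might look like the following sketch (the paths and service name are assumptions based on a typical package installation):
$ sudo cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml /etc/impala/conf/
$ # Run on every DataNode host to pick up the new settings
$ sudo service hadoop-hdfs-datanode restart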
Important: As of late 2015, most business intelligence applications are certified with the 2.x ODBC
drivers. Although the instructions on this page cover both the 2.x and 1.x drivers, expect to use the
2.x drivers exclusively for most ODBC applications connecting to Impala.
On macOS, for example, the typical installation sequence is: first the underlying iODBC driver needed for non-Windows systems, then the Cloudera ODBC Connector, and finally the BI tool itself. The following walkthrough illustrates that sequence.
$ ls -1
Cloudera-ODBC-Driver-for-Impala-Install-Guide.pdf
BI_Tool_Installer.dmg
iodbc-sdk-3.52.7-macosx-10.5.dmg
ClouderaImpalaODBC.dmg
$ open iodbc-sdk-3.52.7-macosx-10.5.dmg
Install the IODBC driver using its installer
$ open ClouderaImpalaODBC.dmg
Install the Cloudera ODBC Connector using its installer
$ installer_dir=$(pwd)
$ cd /opt/cloudera/impalaodbc
$ ls -1
Cloudera ODBC Driver for Impala Install Guide.pdf
Readme.txt
Setup
lib
ErrorMessages
Release Notes.txt
Tools
$ cd Setup
$ ls
odbc.ini odbcinst.ini
$ cp odbc.ini ~/.odbc.ini
$ vi ~/.odbc.ini
$ cat ~/.odbc.ini
[ODBC]
# Specify any global ODBC configuration here such as ODBC tracing.
# Values for HOST, PORT, KrbFQDN, and KrbServiceName should be set here.
# They can also be specified on the connection string.
HOST=hostname.sample.example.com
PORT=21050
Schema=default
TrustedCerts=/opt/cloudera/impalaodbc/lib/universal/cacerts.pem
# General settings
TSaslTransportBufSize=1000
RowsFetchedPerBlock=10000
SocketTimeout=0
StringColumnLength=32767
UseNativeQuery=0
$ pwd
/opt/cloudera/impalaodbc/Setup
$ cd $installer_dir
$ open BI_Tool_Installer.dmg
Install the BI tool using its installer
$ ls /Applications | grep BI_Tool
BI_Tool.app
$ open -a BI_Tool.app
In the BI tool, connect to a data source using port 21050
Notes about JDBC and ODBC Interaction with Impala SQL Features
Most Impala SQL features work equivalently through the impala-shell interpreter or the JDBC or ODBC APIs. The
following are some exceptions to keep in mind when switching between the interactive shell and applications using
the APIs:
Note: If your JDBC or ODBC application connects to Impala through a load balancer such as haproxy,
be cautious about reusing the connections. If the load balancer has set up connection timeout values,
either check the connection frequently so that it never sits idle longer than the load balancer timeout
value, or check the connection validity before using it and create a new one if the connection has
been closed.
• The Impala complex types (STRUCT, ARRAY, or MAP) are available in CDH 5.5 / Impala 2.3 and higher. To use these
types with JDBC requires version 2.5.28 or higher of the Cloudera JDBC Connector for Impala. To use these types
with ODBC requires version 2.5.30 or higher of the Cloudera ODBC Connector for Impala. Consider upgrading all
JDBC and ODBC drivers at the same time you upgrade from CDH 5.5 or higher.
• Although the result sets from queries involving complex types consist of all scalar values, the queries involve join
notation and column references that might not be understood by a particular JDBC or ODBC connector. Consider
defining a view that represents the flattened version of a table containing complex type columns, and pointing
the JDBC or ODBC application at the view. See Complex Types (CDH 5.5 or higher only) for details.
If you want Impala to listen for client connections on a different port, specify that alternative port number with the --hs2_port option when starting impalad. See Starting Impala
for details about Impala startup options. See Ports Used by Impala for information about all ports used for communication
between Impala and clients or between Impala components.
Choosing the JDBC Driver
In Impala 2.0 and later, you have the choice between the Cloudera JDBC Connector and the Hive 0.13 JDBC driver.
Cloudera recommends using the Cloudera JDBC Connector where practical.
If you are already using JDBC applications with an earlier Impala release, you must update your JDBC driver to one of
these choices, because the Hive 0.12 driver that was formerly the only choice is not compatible with Impala 2.0 and
later.
Both the Cloudera JDBC 2.5 Connector and the Hive JDBC driver provide a substantial speed increase for JDBC applications
with Impala 2.0 and higher, for queries that return large result sets.
Complex type considerations:
The Impala complex types (STRUCT, ARRAY, or MAP) are available in CDH 5.5 / Impala 2.3 and higher. To use these
types with JDBC requires version 2.5.28 or higher of the Cloudera JDBC Connector for Impala. To use these types with
ODBC requires version 2.5.30 or higher of the Cloudera ODBC Connector for Impala. Consider upgrading all JDBC and
ODBC drivers at the same time you upgrade from CDH 5.5 or higher.
Although the result sets from queries involving complex types consist of all scalar values, the queries involve join
notation and column references that might not be understood by a particular JDBC or ODBC connector. Consider
defining a view that represents the flattened version of a table containing complex type columns, and pointing the
JDBC or ODBC application at the view. See Complex Types (CDH 5.5 or higher only) for details.
Enabling Impala JDBC Support on Client Systems
Note: The latest JDBC driver, corresponding to Hive 0.13, provides substantial performance
improvements for Impala queries that return large result sets. Impala 2.0 and later are compatible
with the Hive 0.13 driver. If you already have an older JDBC driver installed, and are running Impala
2.0 or higher, consider upgrading to the latest Hive JDBC driver for best performance with JDBC
applications.
If you are using JDBC-enabled applications on hosts outside the CDH cluster, you cannot use the CDH install procedure
on the non-CDH hosts. Install the JDBC driver on at least one CDH host using the preceding procedure. Then download
the JAR files to each client machine that will use JDBC with Impala:
commons-logging-X.X.X.jar
hadoop-common.jar
hive-common-X.XX.X-cdhX.X.X.jar
hive-jdbc-X.XX.X-cdhX.X.X.jar
hive-metastore-X.XX.X-cdhX.X.X.jar
hive-service-X.XX.X-cdhX.X.X.jar
httpclient-X.X.X.jar
httpcore-X.X.X.jar
libfb303-X.X.X.jar
libthrift-X.X.X.jar
log4j-X.X.XX.jar
slf4j-api-X.X.X.jar
slf4j-logXjXX-X.X.X.jar
To enable JDBC support for Impala on the system where you run the JDBC application:
1. Download the JAR files listed above to each client machine.
Note: For Maven users, see this sample github page for an example of the dependencies you
could add to a pom file instead of downloading the individual JARs.
2. Store the JAR files in a location of your choosing, ideally a directory already referenced in your CLASSPATH setting.
For example:
• On Linux, you might use a location such as /opt/jars/.
• On Windows, you might use a subdirectory underneath C:\Program Files.
3. To successfully load the Impala JDBC driver, client programs must be able to locate the associated JAR files. This
often means setting the CLASSPATH for the client process to include the JARs. Consult the documentation for
your JDBC client for more details on how to install new JDBC drivers, but some examples of how to set CLASSPATH
variables include:
• On Linux, if you extracted the JARs to /opt/jars/, you might issue the following command to prepend the
JAR files path to an existing classpath:
export CLASSPATH=/opt/jars/*.jar:$CLASSPATH
• On Windows, use the System Properties control panel item to modify the Environment Variables for your
system. Modify the environment variables to include the path to which you extracted the files.
Note: If the existing CLASSPATH on your client machine refers to some older version of the
Hive JARs, ensure that the new JARs are the first ones listed. Either put the new JAR files
earlier in the listings, or delete the other references to Hive JAR files.
Note: If your JDBC or ODBC application connects to Impala through a load balancer such as haproxy,
be cautious about reusing the connections. If the load balancer has set up connection timeout values,
either check the connection frequently so that it never sits idle longer than the load balancer timeout
value, or check the connection validity before using it and create a new one if the connection has
been closed.
With the Cloudera JDBC Connector, the driver and data source class names depend on which JDBC version of the connector you install, for example:
• com.cloudera.impala.jdbc4.Driver
• com.cloudera.impala.jdbc4.DataSource
• com.cloudera.impala.jdbc3.Driver
• com.cloudera.impala.jdbc3.DataSource
The connection string has the following format:
jdbc:impala://Host:Port[/Schema];Property1=Value;Property2=Value;...
To connect through the Hive JDBC driver to an instance of Impala that does not use Kerberos or LDAP authentication, use a connection string of the form jdbc:hive2://host:port/;auth=noSasl. For example, you might use:
jdbc:hive2://myhost.example.com:21050/;auth=noSasl
To connect to an instance of Impala that requires Kerberos authentication, use a connection string of the form
jdbc:hive2://host:port/;principal=principal_name. The principal must be the same user principal you
used when starting Impala. For example, you might use:
jdbc:hive2://myhost.example.com:21050/;principal=impala/[email protected]
To connect to an instance of Impala that requires LDAP authentication, use a connection string of the form
jdbc:hive2://host:port/db_name;user=ldap_userid;password=ldap_password. For example, you might
use:
jdbc:hive2://myhost.example.com:21050/test_db;user=fred;password=xyz123
Note:
Currently, the Hive JDBC driver does not support connections that use both Kerberos authentication
and SSL encryption. To use both of these security features with Impala through a JDBC application,
use the Cloudera JDBC Connector as the JDBC driver.
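One quick way to verify such a connection string before wiring it into an application is to point a HiveServer2-compatible command-line client such as beeline at the Impala JDBC port. The following is a minimal sketch; it assumes beeline is installed on the client host and that the cluster does not use Kerberos:
$ beeline -u "jdbc:hive2://myhost.example.com:21050/;auth=noSasl" -e "SELECT 1"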
Notes about JDBC and ODBC Interaction with Impala SQL Features
Most Impala SQL features work equivalently through the impala-shell interpreter or the JDBC or ODBC APIs. The
following are some exceptions to keep in mind when switching between the interactive shell and applications using
the APIs:
• Complex type considerations:
– Queries involving the complex types (ARRAY, STRUCT, and MAP) require notation that might not be available
in all levels of JDBC and ODBC drivers. If you have trouble querying such a table due to the driver level or
inability to edit the queries used by the application, you can create a view that exposes a “flattened” version
of the complex columns and point the application at the view. See Complex Types (CDH 5.5 or higher only)
for details.
– The complex types available in CDH 5.5 / Impala 2.3 and higher are supported by the JDBC getColumns()
API. Both MAP and ARRAY are reported as the JDBC SQL Type ARRAY, because this is the closest matching Java
SQL type. This behavior is consistent with Hive. STRUCT types are reported as the JDBC SQL Type STRUCT.
To be consistent with Hive's behavior, the TYPE_NAME field is populated with the primitive type name for
scalar types, and with the full toSql() for complex types. The resulting type names are somewhat inconsistent,
because nested types are printed differently than top-level types. For example, the following list shows how
toSQL() for Impala types are translated to TYPE_NAME values:
1. On the Home > Status tab, click to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of
service at a time.
2. Select the Key-Value Store Indexer service and click Continue.
3. Select the radio button next to the services on which the new service should depend. All services must depend
on the same ZooKeeper service. Click Continue.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. Click Continue.
6. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths
required vary based on the services to be installed. If you chose to add the Sqoop service, indicate whether to use
the default Derby database or the embedded PostgreSQL database. If the latter, type the database name, host,
and user credentials that you specified when you created the database.
Warning: Do not place DataNode data directories on NAS devices. When resizing a NAS, block
replicas can be deleted, which will result in reports of missing blocks.
• For information on configuring MapReduce and YARN resource management features, see Resource Management
on page 236.
Once you have migrated to YARN and deleted the MapReduce service, you can remove local data from each TaskTracker
node. The mapred.local.dir parameter is a directory on the local filesystem of each TaskTracker that contains
temporary data for MapReduce. Once the service is stopped, you can remove this directory to free disk space on each
node.
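For example, if mapred.local.dir on a TaskTracker points to the hypothetical directory /data/1/mapred/local, you might reclaim the space as follows once the MapReduce service is stopped (a sketch; confirm the actual directory from your configuration before deleting anything):
$ sudo rm -rf /data/1/mapred/local/*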
For detailed information on migrating from MapReduce to YARN, see Migrating from MapReduce 1 (MRv1) to MapReduce
2 (MRv2, YARN).
Managing MapReduce
For an overview of computation frameworks, insight into their usage and restrictions, and examples of common tasks
they perform, see Managing MapReduce and YARN on page 210.
Configuring the MapReduce Scheduler
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
The MapReduce service is configured by default to use the FairScheduler. You can change the scheduler type to FIFO
or Capacity Scheduler. You can also modify the Fair Scheduler and Capacity Scheduler configuration. For further
information on schedulers, see Schedulers on page 236.
Managing YARN
For an overview of computation frameworks, insight into their usage and restrictions, and examples of common tasks
they perform, see Managing MapReduce and YARN on page 210.
Adding the YARN Service
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
1. On the Home > Status tab, click
to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of
service at a time.
2. Click the YARN (MR2 Included) radio button and click Continue.
3. Select the radio button next to the services on which the new service should depend. All services must depend
on the same ZooKeeper service. Click Continue.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
Cloudera Manager Property Name | CDH Property Name | Default Configuration | Cloudera Tuning Guidelines
Container Memory Minimum | yarn.scheduler.minimum-allocation-mb | 1 GB | 0
Container Memory Maximum | yarn.scheduler.maximum-allocation-mb | 64 GB | Amount of memory on largest node
Container Memory Increment | yarn.scheduler.increment-allocation-mb | 512 MB | Use a fairly large value, such as 128 MB
Container Memory | yarn.nodemanager.resource.memory-mb | 8 GB | 8 GB
Configuring Directories
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
When you upgrade from CDH 4 to CDH 5, you can import MapReduce configurations to YARN as part of the upgrade
wizard. If you do not import configurations during upgrade, you can manually import the configurations at a later time:
1. Go to the YARN service page.
2. Stop the YARN service.
3. Select Actions > Import MapReduce Configuration. The import wizard presents a warning letting you know that
it will import your configuration, restart the YARN service and its dependent services, and update the client
configuration.
4. Click Continue to proceed. The next page indicates some additional configuration required by YARN.
5. Verify or modify the configurations and click Continue. The Switch Cluster to MR2 step proceeds.
6. When all steps have been completed, click Finish.
7. (Optional) Remove the MapReduce service.
a. Click the Cloudera Manager logo to return to the Home page.
b. In the MapReduce row, right-click
Exit code 154 is used in RecoveredContainerLaunch#call to indicate containers that were lost between
NodeManager restarts without an exit code being recorded. This is usually a bug, and requires investigation.
Other exit codes The JVM might exit if there is an unrecoverable error while executing a task. The exit code and the
message logged should provide more detail. A Java stack trace might also be logged as part of the exit. These exits
should be investigated further to discover a root cause.
In the case of a streaming MapReduce job, the exit code of the JVM is the same as the mapper or reducer in use. The
mapper or reducer can be a shell script or Python script. This means that the underlying script dictates the exit code:
in streaming jobs, you should take this into account during your investigation.
Managing Oozie
This section describes tasks for managing Oozie.
Requirements
The requirements for Oozie high availability are:
• Multiple active Oozie servers, preferably identically configured.
• JDBC JAR in the same location across all Oozie hosts (for example, /var/lib/oozie/).
• External database that supports multiple concurrent connections, preferably with HA support. The default Derby
database does not support multiple concurrent connections.
• ZooKeeper ensemble with distributed locks to control database access, and service discovery for log aggregation.
• Load balancer (preferably with HA support, for example HAProxy), virtual IP, or round-robin DNS to provide a
single entry point for the multiple active servers, and for callbacks from the Application Master or JobTracker.
For information on setting up TLS/SSL communication with Oozie HA enabled, see Additional Considerations when
Configuring TLS/SSL for Oozie HA.
Configuring Oozie High Availability Using Cloudera Manager
Minimum Required Role: Full Administrator
Important: Enabling or disabling high availability makes the previous monitoring history unavailable.
3. Select the one host to run the Oozie server and click Continue. Cloudera Manager stops the Oozie service, removes
the additional Oozie servers, configures Hue to reference the Oozie service, and restarts the Oozie service and
dependent services.
Configuring Oozie High Availability Using the Command Line
For installation and configuration instructions for configuring Oozie HA using the command line, see
https://archive.cloudera.com/cdh5/cdh/5/oozie.
Note: If your instance of Cloudera Manager uses an external database, you must also configure Oozie
with an external database. See Configuring an External Database for Oozie.
Note: Releases are only included in the following tables if a schema was added or removed. If a
release is not in the table, it has the same set of schemas as the previous release that is in the table.
CDH 5.5.0 CDH 5.4.0 CDH 5.2.0 CDH 5.1.0 CDH 5.0.1 CDH 5.0.0
distcp distcp-action-0.1 distcp-action-0.1 distcp-action-0.1 distcp-action-0.1 distcp-action-0.1 distcp-action-0.1
distcp-action-0.2 distcp-action-0.2 distcp-action-0.2 distcp-action-0.2 distcp-action-0.2 distcp-action-0.2
CDH 5.5.0 CDH 5.4.0 CDH 5.2.0 CDH 5.1.0 CDH 5.0.1 CDH 5.0.0
oozie-bundle-0.2 oozie-bundle-0.2 oozie-bundle-0.2
-Duser.timezone=GMT
• To set the timezone just for Oozie in MySQL, add the following argument to
oozie.service.JPAService.jdbc.url:
useLegacyDatetimeCode=false&serverTimezone=GMT
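For example, a JDBC URL carrying these arguments might look like the following (the hostname and database name are placeholders):
jdbc:mysql://mysql-host.example.com:3306/oozie?useLegacyDatetimeCode=false&serverTimezone=GMT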
Important: Changing the timezone on an existing Oozie database while Coordinators are already
running might cause Coordinators to shift by the offset of their timezone from GMT one time after
you make this change.
For more information about how to set your database's timezone, see your database's documentation.
Location
Set the scheduling information in the frequency attribute of the coordinator.xml file. A simple file looks like the
following example. The frequency attribute and scheduling information appear in bold.
</action>
</coordinator-app>
Important: Before CDH 5, Oozie used fixed-frequency scheduling: you could schedule jobs only at a
set interval of minutes or at a set time configured through an EL (Expression Language) function. The
cron-like syntax allows more flexibility.
The following table summarizes the valid values for each field.
Field | Allowed Values | Allowed Special Characters
Hour | 0-23 | , - * /
Day-of-month | 0-31 | , - * ? / L W
For more information about Oozie cron-like syntax, see Cron syntax in coordinator frequency.
Important: Some cron implementations accept 0-6 as the range for days of the week. Oozie accepts
1-7 instead.
Scheduling Examples
The following examples show cron scheduling in Oozie. Oozie's processing time zone is UTC. If you are in a different
time zone, add or subtract the appropriate offset in these examples.
Run at the 30th minute of every hour
Set the minute field to 30 and the remaining fields to * so they match every value.
frequency="30 * * * *"
frequency="30 14 * * *"
frequency="30 14 * 2 *"
Run every 20 minutes between 5:00-10:00 a.m. and between 12:00-2:00 p.m. on the fifth day of each month
Set the minute field to 0/20, the hour field to 5-9,12-14, the day-of-month field to 0/5, and the remaining fields
to *.
frequency="0/20 5-9,12-14 0/5 * *"
Run at 5:00 a.m. every Monday
Set the minute field to 0, the hour field to 5, the day-of-month field to ?, the month field to *, and the day-of-week field to MON.
frequency="0 5 ? * MON"
Note: If the ? was set to *, this expression would run the job every day at 5:00 a.m., not just
Mondays.
frequency="0 5 L * ?"
Run at 5:00 a.m. on the weekday closest to the 15th day of each month
Set the minute field to 0, the hour field to 5, the day-of-month field to 15W, the month field to *, and the day-of-week
field to ?.
Run every 33 minutes from 9:00 a.m.-3:00 p.m. on the first Monday of every month
Set the minute field to 0/33, the hour field to 9-14, the day-of-month field to ?, the day-of-week field to 2#1 (the
first Monday), and the remaining fields to *.
frequency="0/33 9-14 ? * 2#1"
Oozie uses Quartz, a job scheduler library, to parse the cron syntax. For more examples, go to the CronTrigger Tutorial
on the Quartz website. Quartz has two fields (second and year) that Oozie does not support.
Managing Solr
You can install the Solr service through the Cloudera Manager installation wizard, using either parcels or packages.
See Installing Search.
You can elect to have the service created and started as part of the Installation wizard. If you elect not to create the
service using the Installation wizard, you can use the Add Service wizard to perform the installation. The wizard will
automatically configure and start the dependent services and the Solr service. See Adding a Service on page 36 for
instructions.
For further information on the Solr service, see Cloudera Search Guide.
The following sections describe how to configure other CDH components to work with the Solr service.
Configuring the Flume Morphline Solr Sink for Use with the Solr Service
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To use a Flume Morphline Solr sink, the Flume service must be running on your cluster. See the Flume Near Real-Time
Indexing Reference (CDH 5) for information about the Flume Morphline Solr Sink and Managing Flume on page 77.
1. Go to the Flume service.
2. Click the Configuration tab.
3. Select Scope > Agent
4. Select Category > Flume-NG Solr Sink.
5. Edit the following settings, which are templates that you must modify for your deployment:
• Morphlines File (morphlines.conf) - Configures Morphlines for Flume agents. You must use $ZK_HOST in
this field instead of specifying a ZooKeeper quorum. Cloudera Manager automatically replaces the $ZK_HOST
variable with the correct value during the Flume configuration deployment.
• Custom MIME-types File (custom-mimetypes.xml) - Configuration for the detectMimeTypes command.
See the Cloudera Morphlines Reference Guide for details on this command.
• Grok Dictionary File (grok-dictionary.conf) - Configuration for the grok command. See the Cloudera
Morphlines Reference Guide for details on this command.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
Once configuration is complete, Cloudera Manager automatically deploys the required files to the Flume agent process
directory when it starts the Flume agent. Therefore, you can reference the files in the Flume agent configuration using
their relative path names. For example, you can use the name morphlines.conf to refer to the location of the
Morphlines configuration file.
[search]
## URL of the Solr Server
solr_url=http://SOLR_HOST:8983/solr
Important: If you are using parcels with CDH 4.3, you must register the "hue-search" application
manually or access will fail. You do not need to do this if you are using CDH 4.4 or higher.
1. Stop the Hue service.
2. From the command line do the following:
a.
cd /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.pXXX/share/hue
(Substitute your own local repository path for the /opt/cloudera/parcels/... if yours
is different, and specify the appropriate name of the CDH 4.3 parcel that exists in your
repository.)
b.
./build/env/bin/python ./tools/app_reg/app_reg.py
--install
/opt/cloudera/parcels/SOLR-0.9.0-1.cdh4.3.0.pXXX/share/hue/apps/search
c.
sed -i 's/\.\/apps/..\/..\/..\/..\/..\/apps/g'
./build/env/lib/python2.X/site-packages/hue.pth
where python2.X should be the version you are using (for example, python2.4).
3. Start the Hue service.
Note:
When you set this property, Cloudera Manager regenerates the keytabs for Solr roles. The principal
in these keytabs contains the load balancer hostname.
If there is a Hue service that depends on this Solr service, it also uses the load balancer to
communicate with Solr.
http://example.com:8983/solr/admin/collections?action=ADDREPLICA&collection=email&shard=email_shard1&node=192.0.2.2:7542_solr
2. Verify that the replica creation succeeds and moves from recovery state to ACTIVE. You can check the replica
status in the Cloud view, which can be found at a URL similar to:
http://destination.example.com:8983/solr/#/~cloud.
Note: Do not delete the original replica until the new one is in the ACTIVE state. When the newly
added replica is listed as ACTIVE, the index has been fully replicated to the newly added replica.
The total time to replicate an index varies according to factors such as network bandwidth and
the size of the index. Replication times on the scale of hours are not uncommon and do not
necessarily indicate a problem.
3. Use the CLUSTERSTATUS API to retrieve information about the cluster, including current cluster status:
http://example.com:8983/solr/admin/collections?action=clusterstatus&wt=json
http://example.com:8983/solr/admin/collections?action=DELETEREPLICA&collection=email&shard=shard1&replica=core_node2
Managing Spark
Apache Spark is a general framework for distributed computing that offers high performance for both batch and
interactive processing.
To run applications distributed across a cluster, Spark requires a cluster manager. Cloudera supports two cluster
managers: YARN and Spark Standalone. When run on YARN, Spark application processes are managed by the YARN
ResourceManager and NodeManager roles. When run on Spark Standalone, Spark application processes are managed
by Spark Master and Worker roles.
In CDH 5, Cloudera recommends running Spark applications on a YARN cluster manager instead of on a Spark Standalone
cluster manager, for the following benefits:
• You can dynamically share and centrally configure the same pool of cluster resources among all frameworks that
run on YARN.
• You can use all the features of YARN schedulers for categorizing, isolating, and prioritizing workloads.
• You choose the number of executors to use; in contrast, Spark Standalone requires each application to run an
executor on every host in the cluster.
• Spark can run against Kerberos-enabled Hadoop clusters and use secure authentication between its processes.
Related Information
• Spark Guide
• Monitoring Spark Applications
• Tuning Spark Applications on page 275
• Spark Authentication
• Cloudera Spark forum
• Apache Spark documentation
This section describes how to manage Spark services.
You can install, add, and start Spark through the Cloudera Manager Installation wizard using parcels. For more
information, see Installing Spark.
If you do not add the Spark service using the Installation wizard, you can use the Add Service wizard to create the
service. The wizard automatically configures dependent services and the Spark service. For instructions, see Adding a
Service on page 36.
When you upgrade from Cloudera Manager 5.1 or lower to Cloudera Manager 5.2 or higher, Cloudera Manager does not migrate
an existing Spark service, which runs Spark Standalone, to a Spark on YARN service.
For information on Spark applications, see Spark Application Overview.
Important: This item is deprecated and will be removed in a future release. Cloudera supports items
that are deprecated until they are removed. For more information about deprecated and removed
items, see Deprecated Items.
This section describes how to configure and start Spark Standalone services.
For information on installing Spark using the command line, see Spark Installation. For information on configuring and
starting the Spark History Server, see Configuring and Running the Spark History Server Using the Command Line on
page 231.
For information on Spark applications, see Spark Application Overview.
Configuring Spark Standalone
Before running Spark Standalone, do the following on every host in the cluster:
• Edit /etc/spark/conf/spark-env.sh and change hostname in the last line to the name of the host where
the Spark Master will run:
###
### === IMPORTANT ===
### Change the following to specify the Master host
###
export STANDALONE_SPARK_MASTER_HOST=`hostname`
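For example, to hard-code the Master hostname on every host rather than relying on the hostname command, you might run a command like the following sketch (the hostname spark-master.example.com is a placeholder):
$ sudo sed -i 's/^export STANDALONE_SPARK_MASTER_HOST=.*/export STANDALONE_SPARK_MASTER_HOST=spark-master.example.com/' /etc/spark/conf/spark-env.sh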
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. Create the /user/spark/applicationHistory/ directory in HDFS and set ownership and permissions as
follows:
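A typical sequence of commands for this step looks like the following sketch, assuming the HDFS superuser is hdfs and the Spark service runs as the spark user and group:
$ sudo -u hdfs hadoop fs -mkdir -p /user/spark/applicationHistory
$ sudo -u hdfs hadoop fs -chown -R spark:spark /user/spark
$ sudo -u hdfs hadoop fs -chmod 1777 /user/spark/applicationHistory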
2. On hosts from which you will launch Spark jobs, do the following:
a. Create /etc/spark/conf/spark-defaults.conf:
cp /etc/spark/conf/spark-defaults.conf.template /etc/spark/conf/spark-defaults.conf
b. Add the following properties to /etc/spark/conf/spark-defaults.conf:
spark.eventLog.dir=hdfs://namenode_host:namenode_port/user/spark/applicationHistory
spark.eventLog.enabled=true
or
spark.eventLog.dir=hdfs://name_service_id/user/spark/applicationHistory
spark.eventLog.enabled=true
To link the YARN ResourceManager directly to the Spark History Server, set the spark.yarn.historyServer.address
property in /etc/spark/conf/spark-defaults.conf:
spark.yarn.historyServer.address=http://spark_history_server:history_port
By default, history_port is 18088. This causes Spark applications to write their history to the directory that the History
Server reads.
1. On the Home > Status tab, click to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of
service at a time.
2. Select the Sqoop 1 Client service and click Continue.
3. Select the radio button next to the services on which the new service should depend. All services must depend
on the same ZooKeeper service. Click Continue.
4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the
hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to
which the HDFS DataNode role is assigned. You can reassign role instances if necessary.
Click a field below a role to display a dialog containing a list of hosts. If you click a field containing multiple hosts,
you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. Click Continue. The client configuration deployment command runs.
6. Click Continue and click Finish.
Managing Sqoop 2
Cloudera Manager can install the Sqoop 2 service as part of the CDH installation.
You can elect to have the service created and started as part of the Installation wizard if you choose to add it in Custom
Services. If you elect not to create the service using the Installation wizard, you can use the Add Service wizard to
perform the installation. The wizard will automatically configure and start the dependent services and the Sqoop 2
service. See Adding a Service on page 36 for instructions.
Managing ZooKeeper
Minimum Required Role: Full Administrator
When adding the ZooKeeper service, the Add Service wizard automatically initializes the data directories. If you quit
the Add Service wizard or it does not finish successfully, you can initialize the directories outside the wizard by doing
these steps:
1. Go to the ZooKeeper service.
2. Select Actions > Initialize.
3. Click Initialize again to confirm.
Note: If the data directories are not initialized, the ZooKeeper servers cannot be started.
In a production environment, you should deploy ZooKeeper as an ensemble with an odd number of servers. As long
as a majority of the servers in the ensemble are available, the ZooKeeper service will be available. The minimum
recommended ensemble size is three ZooKeeper servers, and Cloudera recommends that each server run on a separate
machine. In addition, the ZooKeeper server process should have its own dedicated disk storage if possible.
com.cloudera.cmf.command.CmdExecException:java.lang.RuntimeException:
java.lang.IllegalStateException: Assumption violated:
getAllDependencies returned multiple distinct services of the same type
at SeqFlowCmd.java line 120
in com.cloudera.cmf.command.flow.SeqFlowCmd run()
CDH services that are not dependent can use different ZooKeeper services. For example, Kafka does not depend on
any services other than ZooKeeper. You might have one ZooKeeper service for Kafka, and one ZooKeeper service for
the rest of your CDH services.
HDFS
1. Go to the HDFS service.
2. Click the Configuration tab.
3. Search for the io.compression.codecs property.
4. In the Compression Codecs property, click in the field, then click the + sign to open a new value field.
5. Add the following two codecs:
• com.hadoop.compression.lzo.LzoCodec
• com.hadoop.compression.lzo.LzopCodec
6. Save your configuration changes.
7. Restart HDFS.
8. Redeploy the HDFS client configuration.
Oozie
1. Go to /var/lib/oozie on each Oozie server and, even if the LZO JAR is present, symlink the Hadoop LZO JAR from the appropriate location (see the example command after this procedure):
• CDH 5 - /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar
• CDH 4 - /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar
2. Restart Oozie.
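For example, on a CDH 5 parcel deployment the symlink command from step 1 might look like the following sketch (adjust the parcel path to match your installation):
$ sudo ln -s /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar /var/lib/oozie/hadoop-lzo.jar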
HBase
Restart HBase.
Impala
Restart Impala.
Hive
Restart the Hive server.
Sqoop 2
1. Add the following entries to the Sqoop Service Environment Advanced Configuration Snippet:
• HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/*
• JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native
Resource Management
Resource management helps ensure predictable behavior by defining the impact of different services on cluster
resources. The goals of resource management features are to:
• Guarantee completion in a reasonable time frame for critical workloads
• Support reasonable cluster scheduling between groups of users based on fair allocation of resources per group
• Prevent users from depriving other users access to the cluster
Schedulers
A scheduler is responsible for deciding which tasks get to run and where and when to run them. The MapReduce and
YARN computation frameworks support the following schedulers:
• FIFO - Allocates resources based on arrival time.
• Fair - Allocates resources to weighted pools, with fair sharing within each pool.
– CDH 4 Fair Scheduler
– CDH 5 Fair Scheduler
• Capacity - Allocates resources to pools, with FIFO scheduling within each pool.
– CDH 4 Capacity Scheduler
– CDH 5 Capacity Scheduler
Static Allocation
Cloudera Manager 4 introduced the ability to partition resources across HBase, HDFS, Impala, MapReduce, and YARN
services by allowing you to set configuration properties that were enforced by Linux control groups (Linux cgroups).
With Cloudera Manager 5, the ability to statically allocate resources using cgroups is configurable through a single
static service pool wizard. You allocate services a percentage of total resources and the wizard configures the cgroups.
For example, the following figure illustrates static pools for HBase, HDFS, Impala, and YARN services that are respectively
assigned 20%, 30%, 20%, and 30% of cluster resources.
Dynamic Allocation
Cloudera Manager provides mechanisms for dynamically apportioning the resources that are statically allocated to
YARN and Impala, using dynamic resource pools.
Depending on the version of CDH you are using, dynamic resource pools in Cloudera Manager support the following
resource management (RM) scenarios:
• (CDH 5) YARN Independent RM - YARN manages the virtual cores, memory, running applications, and scheduling
policy for each pool. In the preceding diagram, three dynamic resource pools - Dev, Product, and Mktg with weights
3, 2, and 1 respectively - are defined for YARN. If an application starts and is assigned to the Product pool, and
other applications are using the Dev and Mktg pools, the Product resource pool will receive 30% x 2/6 (or 10%)
of the total cluster resources. If there are no applications using the Dev and Mktg pools, the YARN Product pool
will be allocated 30% of the cluster resources.
• (CDH 5) YARN and Impala Independent RM - YARN manages the virtual cores, memory, running applications, and
scheduling policy for each pool; Impala manages memory for pools running queries and limits the number of
running and queued queries in each pool.
• (CDH 5 and CDH 4) Impala Independent RM - Impala manages memory for pools running queries and limits the
number of running and queued queries in each pool.
• (CDH 5) YARN and Impala Integrated RM -
Note: Though Impala can be used together with YARN via simple configuration of Static Service
Pools in Cloudera Manager, the use of the general-purpose component Llama for integrated
resource management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.
YARN manages memory for pools running Impala queries; Impala limits the number of running and queued queries
in each pool. In the YARN and Impala integrated RM scenario, Impala services can reserve resources through
YARN, effectively sharing the static YARN service pool and resource pools with YARN applications. The integrated
resource management scenario, where both YARN and Impala use the YARN resource management framework,
requires the Impala Llama role.
In the following figure, the YARN and Impala services have a 50% static share which is subdivided among the
original resource pools with an additional resource pool designated for the Impala service. If YARN applications
are using all the original pools, and Impala uses its designated resource pool, Impala queries will have the same
resource allocation 50% x 4/8 = 25% as in the first scenario. However, when YARN applications are not using the
original pools, Impala queries will have access to 50% of the cluster resources.
The scenarios where YARN manages resources, whether for independent RM or integrated RM, map to the YARN
scheduler configuration. The scenarios where Impala independently manages resources employ the Impala admission
control feature.
To submit a YARN application to a specific resource pool, specify the mapreduce.job.queuename property. The
YARN application's queue property is mapped to a resource pool. To submit an Impala query to a specific resource
pool, specify the REQUEST_POOL option.
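For example, the following sketch submits a MapReduce example job to a hypothetical pool named root.dev and then routes an Impala query to the same pool; the jar path, pool name, host, and table name are placeholders:
$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi -D mapreduce.job.queuename=root.dev 10 100
$ impala-shell -i impalad-host.example.com
[impalad-host.example.com:21000] > SET REQUEST_POOL=root.dev;
[impalad-host.example.com:21000] > SELECT COUNT(*) FROM sample_table;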
For details on how to configure specific resource management features, see the following topics:
Linux control groups (cgroups) enable isolation of compute frameworks from one another. Resource allocation is implemented by setting properties
for the services and roles.
Table 8: RHEL-compatible
Distribution CPU Shares I/O Weight Memory Soft Limit Memory Hard Limit
Red Hat Enterprise Linux,
CentOS, and Oracle Enterprise
Linux 7
Red Hat Enterprise Linux,
CentOS, and Oracle Enterprise
Linux 6
Red Hat Enterprise Linux,
CentOS, and Oracle Enterprise
Linux 5
Table 9: SLES
Distribution CPU Shares I/O Weight Memory Soft Limit Memory Hard Limit
SUSE Linux Enterprise Server
11
Distribution CPU Shares I/O Weight Memory Soft Limit Memory Hard Limit
Ubuntu 14.04 LTS
Ubuntu 12.04 LTS
Ubuntu 10.04 LTS
Distribution CPU Shares I/O Weight Memory Soft Limit Memory Hard Limit
Debian 7.1
Debian 7.0
Debian 6.0
The exact level of support can be found in the Cloudera Manager Agent log file, shortly after the Agent has started.
See Viewing the Cloudera Manager Server Log to find the Agent log. In the log file, look for an entry like this:
'default_cpu_shares': 1024,
'has_blkio': True}
The has_cpu and similar entries correspond directly to support for the CPU, I/O, and memory parameters.
Further Reading
• http://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
• http://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt
• http://www.kernel.org/doc/Documentation/cgroups/memory.txt
• MANAGING SYSTEM RESOURCES ON RED HAT ENTERPRISE LINUX 6
• MANAGING SYSTEM RESOURCES ON RED HAT ENTERPRISE LINUX 7
Limitations
• Role group and role instance overrides of cgroup-based resource management parameters must be saved one at a
time. Otherwise some of the changes that should be reflected dynamically will be ignored.
• The role group abstraction is an imperfect fit for resource management parameters, where the goal is often to
take a numeric value for a host resource and distribute it amongst running roles. The role group represents a
"horizontal" slice: the same role across a set of hosts. However, the cluster is often viewed in terms of "vertical"
slices, each being a combination of worker roles (such as TaskTracker, DataNode, RegionServer, Impala Daemon,
and so on). Nothing in Cloudera Manager guarantees that these disparate horizontal slices are "aligned" (meaning,
that the role assignment is identical across hosts). If they are unaligned, some of the role group values will be
incorrect on unaligned hosts. For example, a host whose role groups have been configured with memory limits
but that is missing a role will probably have unassigned memory.
Action Procedure
CPU 1. Leave DataNode and TaskTracker role group CPU shares at 1024.
2. Set Impala Daemon role group's CPU shares to 256.
3. The TaskTracker role group should be configured with a Maximum Number of Simultaneous
Map Tasks of 2 and a Maximum Number of Simultaneous Reduce Tasks of 1. This yields an
upper bound of three MapReduce tasks at any given time; this is an important detail for
memory sizing.
Memory 1. Set Impala Daemon role group memory limit to 1024 MB.
2. Leave DataNode maximum Java heap size at 1 GB.
3. Leave TaskTracker maximum Java heap size at 1 GB.
4. Leave MapReduce Child Java Maximum Heap Size for Gateway at 1 GB.
5. Leave cgroups hard memory limits alone. We'll rely on "cooperative" memory limits
exclusively, as they yield a nicer user experience than the cgroups-based hard memory limits.
I/O 1. Leave DataNode and TaskTracker role group I/O weight at 500.
2. Set the Impala Daemon role group's I/O weight to 125.
When you're done with configuration, restart all services for these changes to take effect. The results are:
1. When MapReduce jobs are running, all Impala queries together will consume up to a fifth of the cluster's CPU
resources.
2. Individual Impala Daemons won't consume more than 1 GB of RAM. If this figure is exceeded, new queries will
be cancelled.
3. DataNodes and TaskTrackers can consume up to 1 GB of RAM each.
4. We expect up to 3 MapReduce tasks at a given time, each with a maximum heap size of 1 GB of RAM. That's up
to 3 GB for MapReduce tasks.
5. The remainder of each host's available RAM (6 GB) is reserved for other host processes.
6. When MapReduce jobs are running, read requests issued by Impala queries will receive a fifth of the priority of
either HDFS read requests or MapReduce read requests.
Note:
• I/O allocation only works when short-circuit reads are enabled.
• I/O allocation does not handle write side I/O because cgroups in the Linux kernel do not currently
support buffered writes.
3. Step 2 of 4: Review Changes - The allocation of resources for each resource type and role displays with the new values
as well as the values previously in effect. The values for each role are set by role group; if there is more than one
role group for a given role type (for example, for RegionServers or DataNodes) then resources will be allocated
separately for the hosts in each role group. Take note of changed settings. If you have previously customized these
settings, check these over carefully.
• Click the to the right of each percentage to display the allocations for a single service. Click to the right
of the Total (100%) to view all the allocations in a single page.
• Click the Back button to go to the previous page and change your allocations.
When you are satisfied with the allocations, click Continue.
4. Step 3 of 4: Restart Services - To apply the new allocation percentages, click Restart Now to restart the cluster.
To skip this step, click Restart Later. If HDFS High Availability is enabled, you will have the option to choose a
rolling restart.
5. Step 4 of 4: Progress displays the status of the restart commands. Click Finished after the restart commands
complete.
displays while the settings are propagated to the service configuration files. You can also manually refresh the files.
– Dominant Resource Fairness (DRF) (default) - An extension of fair scheduling for more than one
resource—it determines resource shares (CPU, memory) for a job separately based on the availability
of those resources and the needs of the job.
– Fair Scheduler (FAIR) - Determines resource shares based on memory.
– First-In, First-Out (FIFO) - Determines resource shares based on when the job was added.
• If you have enabled Fair Scheduler preemption, optionally set a preemption timeout to specify how long a
job in this pool must wait before it can preempt resources from jobs in other pools. To enable preemption,
click the Fair Scheduler Preemption link or follow the procedure in Enabling Preemption on page 247.
5. Do one or more of the following:
• Click the YARN tab.
1. Click a configuration set.
2. Specify a weight that indicates that pool's share of resources relative to other pools, minimum and
maximums for virtual cores and memory, and a limit on the number of applications that can run
simultaneously in the pool.
• Click the Impala tab.
1. Click a configuration set.
2. Specify the maximum number of concurrently running and queued queries in the pool.
6. If you have enabled ACLs and specified users or groups, optionally click the Submission and Administration Access
Control tabs to specify which users and groups can submit applications and which users can view all and kill
applications. The default is that anyone can submit, view all, and kill applications. To restrict either of these
permissions, select the Allow these users and groups radio button and provide a comma-delimited list of users
and groups in the Users and Groups fields respectively. Click OK.
Adding Sub-Pools
Pools can be nested as sub-pools. They share among their siblings the resources of the parent pool. Each sub-pool can
have its own resource restrictions; if those restrictions fall within the configuration of the parent pool, then the limits
for the sub-pool take effect. If the limits for the sub-pool exceed those of the parent, then the parent limits take effect.
Once you create sub-pools, jobs cannot be submitted to the parent pool; they must be submitted to a sub-pool.
1. Select Clusters > Cluster name > Resource Management > Dynamic Resource Pools. If the cluster has a YARN
service, the Dynamic Resource Pools Status tab displays. If the cluster has only an Impala service enabled for
dynamic resource pools, the Dynamic Resource Pools Configuration tab displays.
2. If the Status page is displayed, click the Configuration tab. A list of the currently configured pools with their
configured limits displays.
3.
Click at the right of a resource pool row and select Add Sub Pool. Configure sub-pool properties.
4. Click OK.
Enabling ACLs
To specify whether ACLs are checked:
1. Select Clusters > Cluster name > Resource Management > Dynamic Resource Pools.
2. Click the Configuration tab.
3. Click Other Settings.
4. In the Enable ResourceManager ACLs property, click . The YARN service configuration page displays.
5. Select the checkbox.
6. Click Save Changes to commit the changes.
7. Click to invoke the cluster restart wizard.
8. Click Restart Cluster.
9. Click Restart Now.
10. Click Finish.
Configuring ACLs
To configure which users and groups can submit and kill YARN applications in any resource pool:
1. Enable ACLs.
2. Select Clusters > Cluster name > Resource Management > Dynamic Resource Pools.
3. Click the Configuration tab.
4. Click Other Settings.
5. In the Admin ACL property, click . The YARN service configuration page displays.
6. Specify which users and groups can submit and kill applications.
7. Click Save Changes to commit the changes.
8. Click to invoke the cluster restart wizard.
9. Click Restart Cluster.
10. Click Restart Now.
11. Click Finish.
Enabling Preemption
You can enable the Fair Scheduler to preempt applications in other pools if a pool's minimum share is not met for some
period of time. When you create a pool you can specify how long a pool must wait before other applications are
preempted.
1. Select Clusters > Cluster name > Resource Management > Dynamic Resource Pools.
2. Click the Configuration tab.
3. Click the User Limits tab. The table shows you a list of users and the maximum number of jobs each user can
submit.
4. Click Other Settings.
5. In the Fair Scheduler Preemption property, click . The YARN service configuration page displays.
6. Select the checkbox.
7. Click Save Changes to commit the changes.
8. Click to invoke the cluster restart wizard.
9. Click Restart Cluster.
10. Click Restart Now.
11. Click Finish.
Note: YARN and Impala pools created "on-the-fly" are deleted when you restart the YARN and Impala
services.
Configuration Sets
A configuration set defines the allocation of resources across pools that may be active at a given time. For example,
you can define "weekday" and "weekend" configuration sets, which define different resource pool configurations for
different days of the week.
The weekend configuration set assigns the production and development pools an equal share of the resources:
The default configuration set assigns the production pool twice the resources of the development pool:
Scheduling Rules
A scheduling rule defines when a configuration set is active. The configuration set is updated in affected services every
hour.
Example Scheduling Rules
Consider the example weekday and weekend configuration sets. To specify that the weekday configuration set is active
every weekday, weekend configuration set is active on the weekend (weekly on Saturday and Sunday), and the default
configuration set is active all other times, define the following rules:
until certain conditions are met, such as too many queries or too much total memory used across the cluster. When
one of these thresholds is reached, incoming queries wait to begin execution. These queries are queued and are
admitted (that is, begin executing) when the resources become available.
For further information on Impala admission control, see Admission Control and Query Queuing on page 255.
Type: int64
Default: 200
default_pool_mem_limit
Purpose: Maximum amount of memory (across the entire cluster) that all outstanding requests in this pool can
use before new requests to this pool are queued. Specified in bytes, megabytes, or gigabytes by a number followed
by the suffix b (optional), m, or g, either uppercase or lowercase. You can specify floating-point values for megabytes
and gigabytes, to represent fractional numbers such as 1.5. You can also specify it as a percentage of the physical
memory by specifying the suffix %. 0 or no setting indicates no limit. Defaults to bytes if no unit is given. Because
this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately
or queue them, it is a soft limit; the overall memory used by concurrent queries might be slightly higher during
times of heavy load. Ignored if fair_scheduler_config_path and llama_site_path are set.
Note: Impala relies on the statistics produced by the COMPUTE STATS statement to estimate
memory usage for each query. See COMPUTE STATS Statement for guidelines about how and
when to use this statement.
Type: string
Default: "" (empty string, meaning unlimited)
disable_admission_control
Purpose: Turns off the admission control feature entirely, regardless of other configuration option settings.
Type: Boolean
Default: true
disable_pool_max_requests
Purpose: Disables all per-pool limits on the maximum number of running requests.
Type: Boolean
Default: false
disable_pool_mem_limits
Purpose: Disables all per-pool mem limits.
Type: Boolean
Default: false
fair_scheduler_allocation_path
Purpose: Path to the fair scheduler allocation file (fair-scheduler.xml).
Type: string
Default: "" (empty string)
Usage notes: Admission control only uses a small subset of the settings that can go in this file, as described below.
For details about all the Fair Scheduler configuration settings, see the Apache wiki.
llama_site_path
Purpose: Path to the Llama configuration file (llama-site.xml). If set, fair_scheduler_allocation_path
must also be set.
Type: string
Default: "" (empty string)
Usage notes: Admission control only uses a small subset of the settings that can go in this file, as described below.
For details about all the Llama configuration settings, see the documentation on Github.
queue_wait_timeout_ms
Purpose: Maximum amount of time (in milliseconds) that a request waits to be admitted before timing out.
Type: int64
Default: 60000
5. Click Save Changes to commit the changes.
6. Restart the Impala service.
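On clusters not managed by Cloudera Manager, the same admission control settings are passed as impalad startup flags instead. The following is a minimal sketch, assuming your init scripts read IMPALA_SERVER_ARGS from /etc/default/impala; keep any flags you already use in place:
# In /etc/default/impala, append the admission control flags to the existing
# IMPALA_SERVER_ARGS definition ("... existing flags ..." is a placeholder):
IMPALA_SERVER_ARGS=" \
    ... existing flags ... \
    -default_pool_mem_limit=50% \
    -queue_wait_timeout_ms=60000"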
The Impala Llama ApplicationMaster (Llama) reserves and releases YARN-managed resources for Impala, thus reducing
resource management overhead when performing Impala queries. Llama is used when you want to enable integrated
resource management.
By default, YARN allocates resources bit-by-bit as needed by MapReduce jobs. Impala needs all resources available at
the same time, so that intermediate results can be exchanged between cluster nodes, and queries do not stall partway
through waiting for new resources to be allocated. Llama is the intermediary process that ensures all requested
resources are available before each Impala query actually begins.
For more information about Llama, see Llama - Low Latency Application MAster.
For information on enabling Llama high availability, see Llama High Availability on page 349.
The Enable Integrated Resource Management wizard starts and displays information about resource management
options and the actions performed by the wizard.
2. Click Continue.
3. Leave the Enable Cgroup-based Resource Management checkbox checked and click Continue.
4. Click the Impala Llama ApplicationMaster Hosts field to display a dialog for choosing Llama hosts.
The following shortcuts for specifying hostname patterns are supported:
• Range of hostnames (without the domain portion)
• IP addresses
• Rack name
5. Specify or select one or more hosts and click OK.
6. Click Continue. A progress screen displays with a summary of the wizard actions.
7. Click Continue.
8. Click Restart Now to restart the cluster and apply the configuration changes or click leave this wizard to restart
at a later time.
9. Click Finish.
The Disable Integrated Resource Management wizard starts and displays information about resource management
options and the actions performed by the wizard.
Note: Though Impala can be used together with YARN via simple configuration of Static Service Pools
in Cloudera Manager, the use of the general-purpose component Llama for integrated resource
management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.
are successful and perform well during times with less concurrent load. Admission control works as a safeguard to
avoid out-of-memory conditions during heavy concurrent usage.
Important:
• Cloudera strongly recommends you upgrade to CDH 5 or higher to use admission control. In CDH
4, admission control will only work if you do not have Hue deployed; unclosed Hue queries will
accumulate and exceed the queue size limit. On CDH 4, to use admission control, you must
explicitly enable it by specifying --disable_admission_control=false in the impalad
command-line options.
• Use the COMPUTE STATS statement for large tables involved in join queries, and follow other
steps from Tuning Impala for Performance to tune your queries. Although COMPUTE STATS is
an important statement to help optimize query performance, it is especially important when
admission control is enabled:
– When queries complete quickly and are tuned for optimal memory usage, there is less chance
of performance or capacity problems during times of heavy load.
– The admission control feature also relies on the statistics produced by the COMPUTE STATS
statement to generate accurate estimates of memory usage for complex queries. If the
estimates are inaccurate due to missing statistics, Impala might hold back queries
unnecessarily even though there is sufficient memory to run them, or might allow queries
to run that end up exceeding the memory limit and being cancelled.
How Admission Control works with Impala Clients (JDBC, ODBC, HiveServer2)
Most aspects of admission control work transparently with client interfaces such as JDBC and ODBC:
• If a SQL statement is put into a queue rather than running immediately, the API call blocks until the statement is
dequeued and begins execution. At that point, the client program can request to fetch results, which might also
block until results become available.
• If a SQL statement is cancelled because it has been queued for too long or because it exceeded the memory limit
during execution, the error is returned to the client program with a descriptive error message.
If you want to submit queries to different resource pools through the REQUEST_POOL query option, as described in
REQUEST_POOL Query Option, note that in Impala 2.0 and higher you can change that query option through a SQL SET
statement submitted from the client application within the same session. Prior to Impala 2.0, that option could only be
set for a session through the impala-shell SET command, or cluster-wide through an impalad startup option.
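For example, a client application could switch resource pools for its session by issuing a statement such as the following (the pool name is illustrative):

SET REQUEST_POOL='development';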
Admission control has the following limitations or special behavior when used with JDBC or ODBC applications:
• The MEM_LIMIT query option, sometimes useful to work around problems caused by inaccurate memory estimates
for complicated queries, is only settable through the impala-shell interpreter and cannot be used directly
through JDBC or ODBC applications.
• Admission control does not use the other resource-related query options, RESERVATION_REQUEST_TIMEOUT or
V_CPU_CORES. Those query options only apply to the YARN resource management framework.
Type: int64
Default: 200
default_pool_max_requests
Purpose: Maximum number of concurrent outstanding requests allowed to run before incoming requests are
queued. Because this limit applies cluster-wide, but each Impala node makes independent decisions to run queries
immediately or queue them, it is a soft limit; the overall number of concurrent queries might be slightly higher
during times of heavy load. A negative value indicates no limit. Ignored if fair_scheduler_config_path and
llama_site_path are set.
Type: int64
Default: 200
default_pool_mem_limit
Purpose: Maximum amount of memory (across the entire cluster) that all outstanding requests in this pool can use
before new requests to this pool are queued. Specified in bytes, megabytes, or gigabytes by a number followed by
the suffix b (optional), m, or g, either uppercase or lowercase. You can specify floating-point values for megabytes
and gigabytes, to represent fractional numbers such as 1.5. You can also specify it as a percentage of the physical
memory by specifying the suffix %. 0 or no setting indicates no limit. Defaults to bytes if no unit is given. Because
this limit applies cluster-wide, but each Impala node makes independent decisions to run queries immediately or
queue them, it is a soft limit; the overall memory used by concurrent queries might be slightly higher during times
of heavy load. Ignored if fair_scheduler_config_path and llama_site_path are set.
Note: Impala relies on the statistics produced by the COMPUTE STATS statement to estimate
memory usage for each query. See COMPUTE STATS Statement for guidelines about how and when
to use this statement.
Type: string
Default: "" (empty string, meaning unlimited)
disable_admission_control
Purpose: Turns off the admission control feature entirely, regardless of other configuration option settings.
Type: Boolean
Default: true
disable_pool_max_requests
Purpose: Disables all per-pool limits on the maximum number of running requests.
Type: Boolean
Default: false
disable_pool_mem_limits
Purpose: Disables all per-pool mem limits.
Type: Boolean
Default: false
fair_scheduler_allocation_path
Purpose: Path to the fair scheduler allocation file (fair-scheduler.xml).
Type: string
Default: "" (empty string)
Usage notes: Admission control only uses a small subset of the settings that can go in this file, as described below.
For details about all the Fair Scheduler configuration settings, see the Apache wiki.
llama_site_path
Purpose: Path to the Llama configuration file (llama-site.xml). If set, fair_scheduler_allocation_path
must also be set.
Type: string
Default: "" (empty string)
Usage notes: Admission control only uses a small subset of the settings that can go in this file, as described below.
For details about all the Llama configuration settings, see the documentation on Github.
queue_wait_timeout_ms
Purpose: Maximum amount of time (in milliseconds) that a request waits to be admitted before timing out.
Type: int64
Default: 60000
Configuring Admission Control Using Cloudera Manager
In Cloudera Manager, you can configure pools to manage queued Impala queries, and the options for the limit on
number of concurrent queries and how to handle queries that exceed the limit. For details, see Managing Resources
with Cloudera Manager.
See Examples of Admission Control Configurations on page 260 for a sample setup for admission control under Cloudera
Manager.
Configuring Admission Control Using the Command Line
If you do not use Cloudera Manager, you use a combination of startup options for the Impala daemon, and optionally
editing or manually constructing the configuration files fair-scheduler.xml and llama-site.xml.
For a straightforward configuration using a single resource pool named default, you can specify configuration options
on the command line and skip the fair-scheduler.xml and llama-site.xml configuration files.
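For example, a single-pool setup of this kind might pass startup flags such as the following to each impalad daemon; the limit values shown are placeholders rather than recommendations:

impalad --default_pool_max_requests=10 \
    --default_pool_mem_limit=8g \
    --queue_wait_timeout_ms=60000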
For an advanced configuration with multiple resource pools using different settings, set up the fair-scheduler.xml
and llama-site.xml configuration files manually. Provide the paths to each one using the Impala daemon
command-line options, --fair_scheduler_allocation_path and --llama_site_path respectively.
The Impala admission control feature only uses the Fair Scheduler configuration settings to determine how to map
users and groups to different resource pools. For example, you might set up different resource pools with separate
memory limits, and maximum number of concurrent and queued queries, for different categories of users within your
organization. For details about all the Fair Scheduler configuration settings, see the Apache wiki.
The Impala admission control feature only uses a small subset of possible settings from the llama-site.xml
configuration file:
llama.am.throttling.maximum.placed.reservations.queue_name
llama.am.throttling.maximum.queued.reservations.queue_name
For details about all the Llama configuration settings, see Llama Default Configuration.
See Example Admission Control Configurations Using Configuration Files on page 261 for sample configuration files for
admission control using multiple resource pools, without Cloudera Manager.
Examples of Admission Control Configurations
Figure 1: Sample Settings for Cloudera Manager Dynamic Resource Pools Page
The following figure shows a sample of the Placement Rules page in Cloudera Manager, accessed through the Clusters >
Cluster name > Resource Management > Dynamic Resource Pools menu choice and then the Configuration > Placement
Rules tabs. The settings demonstrate a reasonable configuration of a pool named default to service all requests
where the specified resource pool does not exist, is not explicitly set, or the user or group is not authorized for the
specified pool.
<allocations>
    <queue name="root">
        <aclSubmitApps> </aclSubmitApps>
        <queue name="default">
            <maxResources>50000 mb, 0 vcores</maxResources>
            <aclSubmitApps>*</aclSubmitApps>
        </queue>
        <queue name="development">
            <maxResources>200000 mb, 0 vcores</maxResources>
            <aclSubmitApps>user1,user2 dev,ops,admin</aclSubmitApps>
        </queue>
        <queue name="production">
            <maxResources>1000000 mb, 0 vcores</maxResources>
            <aclSubmitApps> ops,admin</aclSubmitApps>
        </queue>
    </queue>
    <queuePlacementPolicy>
        <rule name="specified" create="false"/>
        <rule name="default" />
    </queuePlacementPolicy>
</allocations>
llama-site.xml:
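The llama-site.xml listing is not reproduced in this excerpt. A minimal sketch using the two throttling settings listed earlier, with an illustrative queue name and limits, would look like this:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>llama.am.throttling.maximum.placed.reservations.root.default</name>
    <value>10</value>
  </property>
  <property>
    <name>llama.am.throttling.maximum.queued.reservations.root.default</name>
    <value>50</value>
  </property>
</configuration>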
The statements affected by the admission control feature are primarily queries, but also include statements that write
data such as INSERT and CREATE TABLE AS SELECT. Most write operations in Impala are not resource-intensive,
but inserting into a Parquet table can require substantial memory due to buffering 1 GB of data before writing out
each Parquet data block. See Loading Data into Parquet Tables for instructions about inserting data efficiently into
Parquet tables.
Although admission control does not scrutinize memory usage for other kinds of DDL statements, if a query is queued
due to a limit on concurrent queries or memory usage, subsequent statements in the same session are also queued
so that they are processed in the correct order.
If you set up different resource pools for different users and groups, consider reusing any classifications and hierarchy
you developed for use with Sentry security. See Enabling Sentry Authorization for Impala for details.
For details about all the Fair Scheduler configuration settings, see Fair Scheduler Configuration, in particular the tags
such as <queue> and <aclSubmitApps> to map users and groups to particular resource pools (queues).
Note: Though Impala can be used together with YARN via simple configuration of Static Service Pools
in Cloudera Manager, the use of the general-purpose component Llama for integrated resource
management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.
• Limits on memory usage are enforced by Impala's process memory limit (the MEM_LIMIT query option setting).
The admission control feature checks this setting to decide how many queries can be safely run at the same time.
Then the Impala daemon enforces the limit by activating the spill-to-disk mechanism when necessary, or cancelling
a query altogether if the limit is exceeded at runtime.
Performance Management
This section describes mechanisms and best practices for improving performance.
Related Information
• Tuning Impala for Performance
• Tuning YARN on page 282
Important: Work with your network administrators and hardware vendors to ensure that you have
the proper NIC firmware, drivers, and configurations in place and that your network performs properly.
Cloudera recognizes that network setup and upgrade are challenging problems, and will do its best
to share useful experiences.
1. To see whether transparent hugepage compaction is enabled, run the following command and check the output:
$ cat defrag_file_pathname
You can also disable transparent hugepage compaction interactively (but remember this will not survive a reboot).
To disable transparent hugepage compaction temporarily as root:
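The command itself is not shown in this excerpt. Assuming defrag_file_pathname refers to the defrag file identified in step 1, the command is typically:

echo never > defrag_file_pathname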
cat /proc/sys/vm/swappiness
<property>
<name>mapreduce.tasktracker.outofband.heartbeat</name>
<value>true</value>
</property>
Reduce the interval for JobClient status reports on single node systems
The jobclient.progress.monitor.poll.interval property defines the interval (in milliseconds) at which
JobClient reports status to the console and checks for job completion. The default value is 1000 milliseconds; you may
want to set this to a lower value to make tests run faster on a single-node cluster. Adjusting this value on a large
production cluster may lead to unwanted client-server traffic.
<property>
<name>jobclient.progress.monitor.poll.interval</name>
<value>10</value>
</property>
<property>
<name>mapreduce.jobtracker.heartbeat.interval.min</name>
<value>10</value>
</property>
<property>
<name>mapred.reduce.slowstart.completed.maps</name>
<value>0</value>
</property>
To add JARs to the classpath, use -libjars jar1,jar2. This copies the local JAR files to HDFS and uses the distributed
cache mechanism to ensure they are available on the task nodes and added to the task classpath.
The advantage of this, over JobConf.setJar, is that if the JAR is on a task node, it does not need to be copied again
if a second task from the same job runs on that node, though it will still need to be copied from the launch machine
to HDFS.
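For example, a driver that uses ToolRunner might be launched as follows; the JAR, class, and path names are placeholders:

hadoop jar my-app.jar com.example.MyDriver -libjars mylib1.jar,mylib2.jar /input /output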
Note: -libjars works only if your MapReduce driver uses ToolRunner. If it does not, you would
need to use the DistributedCache APIs (Cloudera does not recommend this).
For more information, see item 1 in the blog post How to Include Third-Party Libraries in Your MapReduce Job.
Changing the Logging Level on a Job (MRv1)
You can change the logging level for an individual job. You do this by setting the following properties in the job
configuration (JobConf):
• mapreduce.map.log.level
• mapreduce.reduce.log.level
Valid values are NONE, INFO, WARN, DEBUG, TRACE, and ALL.
Example:
conf.set("mapreduce.map.log.level", "DEBUG");
conf.set("mapreduce.reduce.log.level", "TRACE");
...
General Guidelines
• You need to balance the processing capacity required to compress and uncompress the data, the disk IO required
to read and write the data, and the network bandwidth required to send the data across the network. The correct
balance of these factors depends upon the characteristics of your cluster and your data, as well as your usage
patterns.
• Compression is not recommended if your data is already compressed (such as images in JPEG format). In fact, the
resulting file can actually be larger than the original.
• GZIP compression uses more CPU resources than Snappy or LZO, but provides a higher compression ratio. GZip
is often a good choice for cold data, which is accessed infrequently. Snappy or LZO are a better choice for hot
data, which is accessed frequently.
• BZip2 can also produce more compression than GZip for some types of files, at the cost of some speed when
compressing and decompressing. HBase does not support BZip2 compression.
• Snappy often performs better than LZO. It is worth running tests to see if you detect a significant difference.
• For MapReduce, if you need your compressed data to be splittable, BZip2 and LZO formats can be split. Snappy
and GZip blocks are not splittable, but files with Snappy blocks inside a container file format such as SequenceFile
or Avro can be split. Snappy is intended to be used with a container format, like SequenceFiles or Avro data files,
rather than being used directly on plain text, for example, since the latter is not splittable and cannot be processed
in parallel using MapReduce. Splittability is not relevant to HBase data.
• For MapReduce, you can compress either the intermediate data, the output, or both. Adjust the parameters you
provide for the MapReduce job accordingly. The following examples compress both the intermediate data and
the output. MR2 is shown first, followed by MR1.
– MRv2
– MRv1
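The example listings are not included in this excerpt. One way to sketch the equivalent settings for Snappy from the command line (the application JAR, class, and paths are placeholders; the property names are the standard Hadoop ones) is:

MRv2:

hadoop jar my-app.jar com.example.MyDriver \
    -Dmapreduce.map.output.compress=true \
    -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
    -Dmapreduce.output.fileoutputformat.compress=true \
    -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
    /input /output

MRv1:

hadoop jar my-app.jar com.example.MyDriver \
    -Dmapred.compress.map.output=true \
    -Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec \
    -Dmapred.output.compress=true \
    -Dmapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec \
    /input /output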
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
To configure support for LZO in CDH, see Step 5: (Optional) Install LZO and Configuring LZO. Snappy support is included
in CDH.
To use Snappy in a MapReduce job, see Using Snappy for MapReduce Compression. Use the same method for LZO,
with the codec com.hadoop.compression.lzo.LzopCodec instead.
Further Reading
For more information about compression algorithms in Hadoop, see Hadoop: The Definitive Guide by Tom White.
<luceneMatchVersion>4.4</luceneMatchVersion>
General Tuning
The following tuning categories can be completed at any time. It is less important to implement these changes before
beginning to use your system.
General Tips
• Enabling multi-threaded faceting can provide better performance for field faceting. When multi-threaded faceting
is enabled, field faceting tasks are completed in parallel, with a thread working on each field faceting task
simultaneously. Performance improvements do not occur in all cases, but improvements are likely when all of the
following are true:
– The system uses highly concurrent hardware.
– Faceting operations apply to large data sets over multiple fields.
– There is not an unusually high number of queries occurring simultaneously on the system. Systems that are
lightly loaded or that are mainly engaged with ingestion and indexing may be helped by multi-threaded
faceting; for example, a system ingesting articles and being queried by a researcher. Systems heavily loaded
by user queries are less likely to be helped by multi-threaded faceting; for example, an e-commerce site with
heavy user-traffic.
Note: Multi-threaded faceting only applies to field faceting and not to query faceting.
• Field faceting identifies the number of unique entries for a field. For example, multi-threaded
faceting could be used to simultaneously facet for the number of unique entries for the
fields, "color" and "size". In such a case, there would be two threads, and each thread would
work on faceting one of the two fields.
• Query faceting identifies the number of unique entries that match a query for a field. For
example, query faceting could be used to find the number of unique entries in the "size"
field are between 1 and 5. Multi-threaded faceting does not apply to these operations.
To enable multi-threaded faceting, add facet.threads to queries. For example, to use up to 1000 threads, you
might use a query as follows:
http://localhost:8983/solr/collection1/select?q=*:*&facet=true&fl=id&facet.field=f0_ws&facet.threads=1000
Configuration
The following parameters control caching. They can be configured at the Solr process level by setting the respective
system property or by editing the solrconfig.xml directly.
Note:
Increasing the direct memory cache size may make it necessary to increase the maximum direct
memory size allowed by the JVM. Each Solr slab allocates the slab's memory, which is 128 MB by
default, as well as allocating some additional direct memory overhead. Therefore, ensure that the
MaxDirectMemorySize is set comfortably above the value expected for slabs alone. The amount of
additional memory required varies according to multiple factors, but for most cases, setting
MaxDirectMemorySize to at least 20-30% more than the total memory configured for slabs is
sufficient. Setting the MaxDirectMemorySize to the number of slabs multiplied by the slab size does
not provide enough memory.
To set MaxDirectMemorySize using Cloudera Manager
1. Go to the Solr service.
2. Click the Configuration tab.
3. In the Search box, type Java Direct Memory Size of Solr Server in Bytes.
4. Set the new direct memory value.
5. Restart Solr servers after editing the parameter.
Solr HDFS optimizes caching when performing NRT indexing using Lucene's NRTCachingDirectory.
Lucene caches a newly created segment if both of the following conditions are true:
• The segment is the result of a flush or a merge and the estimated size of the merged segment is <=
solr.hdfs.nrtcachingdirectory.maxmergesizemb.
• The total cached bytes is <= solr.hdfs.nrtcachingdirectory.maxcachedmb.
The following parameters control NRT caching behavior:
<directoryFactory name="DirectoryFactory">
    <bool name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</bool>
    <int name="solr.hdfs.blockcache.slab.count">${solr.hdfs.blockcache.slab.count:1}</int>
    <bool name="solr.hdfs.blockcache.direct.memory.allocation">${solr.hdfs.blockcache.direct.memory.allocation:true}</bool>
    <int name="solr.hdfs.blockcache.blocksperbank">${solr.hdfs.blockcache.blocksperbank:16384}</int>
    <bool name="solr.hdfs.blockcache.read.enabled">${solr.hdfs.blockcache.read.enabled:true}</bool>
    <bool name="solr.hdfs.blockcache.write.enabled">${solr.hdfs.blockcache.write.enabled:true}</bool>
    <bool name="solr.hdfs.nrtcachingdirectory.enable">${solr.hdfs.nrtcachingdirectory.enable:true}</bool>
    <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}</int>
    <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}</int>
</directoryFactory>
The following example illustrates passing Java options by editing the /etc/default/solr or
/opt/cloudera/parcels/CDH-*/etc/default/solr configuration file:
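The example itself is not included in this excerpt; a representative line, with illustrative heap and direct memory sizes that must be adjusted for your hosts, might be:

CATALINA_OPTS="-Xmx10g -XX:MaxDirectMemorySize=20g -XX:+UseLargePages"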
For better performance, Cloudera recommends setting the Linux swap space on all Solr server hosts as shown below:
# minimize swappiness
sudo sysctl vm.swappiness=1
sudo bash -c 'echo "vm.swappiness=1">> /etc/sysctl.conf'
# disable swap space until next reboot:
sudo /sbin/swapoff -a
Threads
Configure the Tomcat server to have more threads per Solr instance. Note that this is only effective if your hardware
is sufficiently powerful to accommodate the increased threads. 10,000 threads is a reasonable number to try in many
cases.
To change the maximum number of threads, add a maxThreads attribute to the Connector definition in the Tomcat
server's server.xml configuration file. For example, if you installed Search for CDH 5 from parcels, you would modify
the existing Connector element in the Tomcat server.xml deployed with the parcel so that it specifies maxThreads, as
sketched below.
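A sketch of the change follows; every attribute other than maxThreads is a placeholder for whatever your existing Connector element contains:

<Connector port="8983" protocol="HTTP/1.1" connectionTimeout="20000"/>

becomes:

<Connector port="8983" protocol="HTTP/1.1" connectionTimeout="20000" maxThreads="10000"/>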
Garbage Collection
Choose different garbage collection options for best performance in different environments. Some garbage collection
options typically chosen include:
• Concurrent low pause collector: Use this collector in most cases. This collector attempts to minimize "Stop the
World" events. Avoiding these events can reduce connection timeouts, such as with ZooKeeper, and may improve
user experience. This collector is enabled using -XX:+UseConcMarkSweepGC.
• Throughput collector: Consider this collector if raw throughput is more important than user experience. This
collector typically uses more "Stop the World" events so this may negatively affect user experience and connection
timeouts such as ZooKeeper heartbeats. This collector is enabled using -XX:+UseParallelGC. If UseParallelGC
"Stop the World" events create problems, such as ZooKeeper timeouts, consider using the UseParNewGC collector
as an alternative collector with similar throughput benefits.
You can also affect garbage collection behavior by increasing the Eden space to accommodate new objects. With
additional Eden space, garbage collection does not need to run as frequently on new objects.
Replicated Data
You can adjust the degree to which different data is replicated.
Replicas
If you have sufficient additional hardware, add more replicas for a linear boost of query throughput. Note that adding
replicas may slow write performance on the first replica, but otherwise this should have minimal negative consequences.
Transaction Log Replication
Beginning with CDH 5.4.1, Search for CDH supports configurable transaction log replication levels for replication logs
stored in HDFS.
Configure the replication factor by modifying the tlogDfsReplication setting in solrconfig.xml. The
tlogDfsReplication setting is new in the updateLog settings area. The following excerpt from solrconfig.xml
shows where the transaction log replication factor is set:
<updateHandler class="solr.DirectUpdateHandler2">
<!-- Enables a transaction log, used for real-time get, durability, and
solr cloud replica recovery. The log can grow as big as
uncommitted changes to the index, so use of a hard autoCommit
is recommended (see below).
"dir" - the target directory for transaction logs, defaults to the
solr data directory. -->
<updateLog>
<str name="dir">${solr.ulog.dir:}</str>
<int name="tlogDfsReplication">3</int>
</updateLog>
You might want to increase the replication level from the default level of 1 to some higher value such as 3. Increasing
the transaction log replication level can:
• Reduce the chance of data loss, especially when the system is otherwise configured to have single replicas of
shards. For example, having single replicas of shards is reasonable when autoAddReplicas is enabled, but
without additional transaction log replicas, the risk of data loss during a node failure would increase.
• Facilitate rolling upgrade of HDFS while Search is running. If you have multiple copies of the log, when a node with
the transaction log becomes unavailable during the rolling upgrade process, another copy of the log can continue
to collect transactions.
• Facilitate HDFS write lease recovery.
Initial testing shows no significant performance regression for common use cases.
Shards
In some cases, oversharding can help improve performance including intake speed. If your environment includes
massively parallel hardware and you want to use these available resources, consider oversharding. You might increase
the number of replicas per host from 1 to 2 or 3. Making such changes creates complex interactions, so you should
continue to monitor your system's performance to ensure that the benefits of oversharding do not outweigh the costs.
Commits
Changing commit values may improve performance in some situations. These changes result in tradeoffs and may not
be beneficial in all cases.
• For hard commit values, the default value of 60000 (60 seconds) is typically effective, though changing this value
to 120 seconds may improve performance in some cases. Note that setting this value to higher values, such as
600 seconds, may result in undesirable performance tradeoffs.
• Consider increasing the auto-commit value from 15000 (15 seconds) to 120000 (120 seconds).
• Enable soft commits and set the value to the largest value that meets your requirements. The default value of
1000 (1 second) is too aggressive for some environments.
Other Resources
• General information on Solr caching is available on the SolrCaching page on the Solr Wiki.
• Information on issues that influence performance is available on the SolrPerformanceFactors page on the Solr
Wiki.
• Resource Management describes how to use Cloudera Manager to manage resources, for example with Linux
cgroups.
• For information on improving querying performance, see How to make searching faster.
• For information on improving indexing performance, see How to make indexing faster.
Shuffle Overview
A Spark dataset comprises a fixed number of partitions, each of which comprises a number of records. For the datasets
returned by narrow transformations, such as map and filter, the records required to compute the records in a single
partition reside in a single partition in the parent dataset. Each object is only dependent on a single object in the parent.
Operations such as coalesce can result in a task processing multiple input partitions, but the transformation is still
considered narrow because the input records used to compute any single output record can still only reside in a limited
subset of the partitions.
Spark also supports transformations with wide dependencies, such as groupByKey and reduceByKey. In these
dependencies, the data required to compute the records in a single partition can reside in many partitions of the parent
dataset. To perform these transformations, all of the tuples with the same key must end up in the same partition,
processed by the same task. To satisfy this requirement, Spark performs a shuffle, which transfers data around the
cluster and results in a new stage with a new set of partitions.
For example, consider the following code:
sc.textFile("someFile.txt").map(mapFunc).flatMap(flatMapFunc).filter(filterFunc).count()
It runs a single action, count, which depends on a sequence of three transformations on a dataset derived from a text
file. This code runs in a single stage, because none of the outputs of these three transformations depend on data that
comes from different partitions than their inputs.
In contrast, this Scala code finds how many times each character appears in all the words that appear more than 1,000
times in a text file:
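The code listing for this example is not included in this excerpt. A sketch of such a computation, assuming the input path and the 1,000-occurrence threshold described above, might look like this:

val tokenized = sc.textFile("someFile.txt").flatMap(_.split(" "))
// Count occurrences of each word (first shuffle).
val wordCounts = tokenized.map((_, 1)).reduceByKey(_ + _)
// Keep only words that appear at least 1,000 times.
val filtered = wordCounts.filter(_._2 >= 1000)
// Count occurrences of each character in the remaining words (second shuffle).
val charCounts = filtered.flatMap(_._1.toCharArray).map((_, 1)).reduceByKey(_ + _)
charCounts.collect()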
This example has three stages. The two reduceByKey transformations each trigger stage boundaries, because computing
their outputs requires repartitioning the data by keys.
A final example is this more complicated transformation graph, which includes a join transformation with multiple
dependencies:
The pink boxes show the resulting stage graph used to run it:
At each stage boundary, data is written to disk by tasks in the parent stages and then fetched over the network by
tasks in the child stage. Because they incur high disk and network I/O, stage boundaries can be expensive and should
be avoided when possible. The number of data partitions in a parent stage may be different than the number of
partitions in a child stage. Transformations that can trigger a stage boundary typically accept a numPartitions
argument, which specifies into how many partitions to split the data in the child stage. Just as the number of reducers
is an important parameter in MapReduce jobs, the number of partitions at stage boundaries can determine an
application's performance. Tuning the Number of Partitions on page 280 describes how to tune this number.
This results in unnecessary object creation because a new set must be allocated for each record.
Instead, use aggregateByKey, which performs the map-side aggregation more efficiently:
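The before-and-after listings are not included in this excerpt. As a sketch of the pattern being described, collecting the distinct string values for each key with aggregateByKey (the RDD name someRdd is a placeholder) might look like this:

import scala.collection.mutable
val distinctValuesPerKey = someRdd.aggregateByKey(mutable.Set[String]())(
  (set, value) => set += value,   // add each value to the per-partition set
  (set1, set2) => set1 ++= set2)  // merge the per-partition sets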
• flatMap-join-groupBy. When two datasets are already grouped by key and you want to join them and keep
them grouped, use cogroup. This avoids the overhead associated with unpacking and repacking the groups.
rdd1 = someRdd.reduceByKey(...)
rdd2 = someOtherRdd.reduceByKey(...)
rdd3 = rdd1.join(rdd2)
Because no partitioner is passed to reduceByKey, the default partitioner is used, resulting in rdd1 and rdd2 both
being hash-partitioned. These two reduceByKey transformations result in two shuffles. If the datasets have the same
number of partitions, a join requires no additional shuffling. Because the datasets are partitioned identically, the set
of keys in any single partition of rdd1 can only occur in a single partition of rdd2. Therefore, the contents of any single
output partition of rdd3 depends only on the contents of a single partition in rdd1 and single partition in rdd2, and
a third shuffle is not required.
For example, if someRdd has four partitions, someOtherRdd has two partitions, and both the reduceByKeys use
three partitions, the set of tasks that run would look like this:
If rdd1 and rdd2 use different partitioners or use the default (hash) partitioner with different numbers of partitions,
only one of the datasets (the one with fewer partitions) needs to be reshuffled for the join:
To avoid shuffles when joining two datasets, you can use broadcast variables. When one of the datasets is small enough
to fit in memory in a single executor, it can be loaded into a hash table on the driver and then broadcast to every
executor. A map transformation can then reference the hash table to do lookups.
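A sketch of that pattern follows; the RDD names are placeholders, and it assumes the smaller dataset fits comfortably in memory on the driver and each executor:

// Collect the small dataset to the driver and broadcast it as a lookup table.
val smallLookup = sc.broadcast(smallRdd.collectAsMap())
// Perform the join as a map-side lookup, avoiding a shuffle of largeRdd.
val joined = largeRdd.flatMap { case (key, value) =>
  smallLookup.value.get(key).map(other => (key, (value, other)))
}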
while not generating enough partitions to use all available cores. In this case, invoking repartition with a high number
of partitions (which triggers a shuffle) after loading the data allows the transformations that follow to use more of the
cluster's CPU.
Another example arises when using the reduce or aggregate action to aggregate data into the driver. When
aggregating over a high number of partitions, the computation can quickly become bottlenecked on a single thread in
the driver merging all the results together. To lighten the load on the driver, first use reduceByKey or aggregateByKey
to perform a round of distributed aggregation that divides the dataset into a smaller number of partitions. The values
in each partition are merged with each other in parallel, before being sent to the driver for a final round of aggregation.
See treeReduce and treeAggregate for examples of how to do that.
This method is especially useful when the aggregation is already grouped by a key. For example, consider an application
that counts the occurrences of each word in a corpus and pulls the results into the driver as a map. One approach,
which can be accomplished with the aggregate action, is to compute a local map at each partition and then merge
the maps at the driver. The alternative approach, which can be accomplished with aggregateByKey, is to perform
the count in a fully distributed way, and then simply collectAsMap the results to the driver.
Secondary Sort
The repartitionAndSortWithinPartitions transformation repartitions the dataset according to a partitioner
and, within each resulting partition, sorts records by their keys. This transformation pushes sorting down into the
shuffle machinery, where large amounts of data can be spilled efficiently and sorting can be combined with other
operations.
For example, Apache Hive on Spark uses this transformation inside its join implementation. It also acts as a vital
building block in the secondary sort pattern, in which you group records by key and then, when iterating over the
values that correspond to a key, have them appear in a particular order. This scenario occurs in algorithms that need
to group events by user and then analyze the events for each user, based on the time they occurred.
• The --executor-memory/spark.executor.memory property controls the executor heap size, but executors
can also use some memory off heap, for example, Java NIO direct buffers. The value of the
spark.yarn.executor.memoryOverhead property is added to the executor memory to determine the full
memory request to YARN for each executor. It defaults to max(384, .07 * spark.executor.memory).
• YARN may round the requested memory up slightly. The yarn.scheduler.minimum-allocation-mb and
yarn.scheduler.increment-allocation-mb properties control the minimum and increment request values,
respectively.
The following diagram (not to scale with defaults) shows the hierarchy of memory properties in Spark and YARN:
As described in Spark Execution Model, Spark groups datasets into stages. The number of tasks in a stage is the same
as the number of partitions in the last dataset in the stage. The number of partitions in a dataset is the same as the
number of partitions in the datasets on which it depends, with the following exceptions:
• The coalesce transformation creates a dataset with fewer partitions than its parent dataset.
• The union transformation creates a dataset with the sum of its parents' number of partitions.
• The cartesian transformation creates a dataset with the product of its parents' number of partitions.
Datasets with no parents, such as those produced by textFile or hadoopFile, have their partitions determined by
the underlying MapReduce InputFormat used. Typically, there is a partition for each HDFS block being read. The
number of partitions for datasets produced by parallelize is taken from the argument passed to the method, or from
spark.default.parallelism if no argument is given. To determine the number of partitions in a dataset, call
rdd.partitions().size().
If the number of tasks is smaller than the number of slots available to run them, CPU usage is suboptimal. In addition, more
memory is used by any aggregation operations that occur in each task. In join, cogroup, or *ByKey operations,
objects are held in hashmaps or in-memory buffers to group or sort. join, cogroup, and groupByKey use these
data structures in the tasks for the stages that are on the fetching side of the shuffles they trigger. reduceByKey and
aggregateByKey use data structures in the tasks for the stages on both sides of the shuffles they trigger. If the records
in these aggregation operations exceed memory, the following issues can occur:
• Holding a high number of records in these data structures increases garbage collection, which can lead to pauses in
computation.
• Spark spills them to disk, causing disk I/O and sorting that leads to job stalls.
To increase the number of partitions if the stage is reading from Hadoop:
• Use the repartition transformation, which triggers a shuffle.
• Configure your InputFormat to create more splits.
• Write the input data to HDFS with a smaller block size.
If the stage is receiving input from another stage, the transformation that triggered the stage boundary accepts a
numPartitions argument:
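The code snippet is not included in this excerpt; the pattern looks roughly like the following, where X is the partition count being tuned:

val rdd2 = rdd1.reduceByKey(_ + _, numPartitions = X)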
Determining the optimal value for X requires experimentation. Find the number of partitions in the parent dataset,
and then multiply that by 1.5 until performance stops improving.
You can also calculate X in a more formulaic way, but some quantities in the formula are difficult to calculate. The main
goal is to run enough tasks so that the data destined for each task fits in the memory available to that task. The memory
available to each task is:
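The formula itself is not reproduced in this excerpt. With the pre-unified shuffle memory settings used by the Spark versions this guide covers, the memory available to each task is approximately:

(spark.executor.memory * spark.shuffle.memoryFraction * spark.shuffle.safetyFraction) / spark.executor.cores

where spark.shuffle.memoryFraction defaults to 0.2 and the safety fraction to 0.8.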
The in-memory size of the total shuffle data is more difficult to determine. The closest heuristic is to find the ratio
between shuffle spill memory and the shuffle spill disk for a stage that ran. Then, multiply the total shuffle write by
this number. However, this can be compounded if the stage is performing a reduction:
Then, round up slightly, because too many partitions is usually better than too few.
When in doubt, err on the side of a larger number of tasks (and thus partitions). This contrasts with recommendations
for MapReduce, which unlike Spark, has a high startup overhead for tasks.
Tuning YARN
This topic applies to YARN clusters only, and describes how to tune and optimize YARN for your cluster.
Note: Download the Cloudera YARN tuning spreadsheet to help calculate YARN configurations. For
a short video overview, see Tuning YARN Applications.
Overview
This overview provides an abstract description of a YARN cluster and the goals of YARN tuning.
There are three phases to YARN tuning. The phases correspond to the tabs in the YARN tuning spreadsheet.
1. Cluster configuration, where you configure your hosts.
2. YARN configuration, where you quantify memory and vcores.
3. MapReduce configuration, where you allocate minimum and maximum resources for specific map and reduce
tasks.
YARN and MapReduce have many configurable properties. For a complete list, see Cloudera Manager Configuration
Properties. The YARN tuning spreadsheet lists the essential subset of these properties that are most likely to improve
performance for common MapReduce applications.
Cluster Configuration
In the Cluster Configuration tab, you define the worker host configuration and cluster size for your YARN implementation.
As with any system, the more memory and CPU resources available, the faster the cluster can process large amounts
of data. A machine with 8 CPUs, each with 6 cores, provides 48 vcores per host.
3 TB hard drives in a 2-unit server installation with 12 available slots in JBOD (Just a Bunch Of Disks) configuration is a
reasonable balance of performance and pricing at the time the spreadsheet was created. The cost of storage decreases
over time, so you might consider 4 TB disks. Larger disks are expensive and not required for all use cases.
Two 1-Gigabit Ethernet ports provide sufficient throughput at the time the spreadsheet was published, but 10-Gigabit
Ethernet ports are an option where price is of less concern than speed.
Start with at least 8 GB for your operating system, and 1 GB for Cloudera Manager. If services outside of CDH require
additional resources, add those numbers under Other Services.
The HDFS DataNode uses a minimum of 1 core and about 1 GB of memory. The same requirements apply to the YARN
NodeManager.
The spreadsheet lists three optional services. For Impala, allocate at least 16 GB for the daemon. HBase RegionServer
requires 12-16 GB of memory. Solr Server requires a minimum of 1 GB of memory.
Any remaining resources are available for YARN applications (Spark and MapReduce). In this example, 44 CPU cores
are available. Set the multiplier for vcores you want on each physical core to calculate the total available vcores.
YARN Configuration
On the YARN Configuration tab, you verify your available resources and set minimum and maximum limits for each
container.
MapReduce Configuration
On the MapReduce Configuration tab, you can plan for increased task-specific memory capacity.
High Availability
This guide is for Apache Hadoop system administrators who want to enable continuous availability by configuring
clusters without single points of failure.
Background
In a standard configuration, the NameNode is a single point of failure (SPOF) in an HDFS cluster. Each cluster has a
single NameNode, and if that host or process becomes unavailable, the cluster as a whole is unavailable until the
NameNode is either restarted or brought up on a new host. The Secondary NameNode does not provide failover
capability.
The standard configuration reduces the total availability of an HDFS cluster in two major ways:
• In the case of an unplanned event such as a host crash, the cluster is unavailable until an operator restarts the
NameNode.
• Planned maintenance events such as software or hardware upgrades on the NameNode machine result in periods
of cluster downtime.
HDFS HA addresses the above problems by providing the option of running two NameNodes in the same cluster, in an
active/passive configuration. These are referred to as the active NameNode and the standby NameNode. Unlike the
Secondary NameNode, the standby NameNode is a hot standby, allowing a fast automatic failover to a new NameNode
in the case that a host crashes, or a graceful administrator-initiated failover for the purpose of planned maintenance.
You cannot have more than two NameNodes.
Implementation
Cloudera Manager 5 and CDH 5 support Quorum-based Storage to implement HA.
Quorum-based Storage
Quorum-based Storage refers to the HA implementation that uses a Quorum Journal Manager (QJM).
In order for the standby NameNode to keep its state synchronized with the active NameNode in this implementation,
both nodes communicate with a group of separate daemons called JournalNodes. When any namespace modification
is performed by the active NameNode, it durably logs a record of the modification to a majority of the JournalNodes.
The standby NameNode is capable of reading the edits from the JournalNodes, and is constantly watching them for
changes to the edit log. As the standby NameNode sees the edits, it applies them to its own namespace. In the event of a
failover, the standby ensures that it has read all of the edits from the JournalNodes before promoting itself to the
active state. This ensures that the namespace state is fully synchronized before a failover occurs.
In order to provide a fast failover, it is also necessary that the standby NameNode has up-to-date information regarding
the location of blocks in the cluster. In order to achieve this, DataNodes are configured with the location of both
NameNodes, and they send block location information and heartbeats to both.
It is vital for the correct operation of an HA cluster that only one of the NameNodes be active at a time. Otherwise,
the namespace state would quickly diverge between the two, risking data loss or other incorrect results. In order to
ensure this property and prevent the so-called "split-brain scenario," JournalNodes only ever allow a single NameNode
to be a writer at a time. During a failover, the NameNode which is to become active simply takes over the role of writing
to the JournalNodes, which effectively prevents the other NameNode from continuing in the active state, allowing the
new active NameNode to safely proceed with failover.
Note: Because of this, fencing is not required, but it is still useful; see Enabling HDFS HA on page 293.
Automatic Failover
Automatic failover relies on two additional components in an HDFS deployment: a ZooKeeper quorum, and the
ZKFailoverController process (abbreviated as ZKFC). In Cloudera Manager, the ZKFC process maps to the HDFS
Failover Controller role.
Apache ZooKeeper is a highly available service for maintaining small amounts of coordination data, notifying clients
of changes in that data, and monitoring clients for failures. The implementation of HDFS automatic failover relies on
ZooKeeper for the following functions:
• Failure detection - each of the NameNode machines in the cluster maintains a persistent session in ZooKeeper.
If the machine crashes, the ZooKeeper session will expire, notifying the other NameNode that a failover should
be triggered.
• Active NameNode election - ZooKeeper provides a simple mechanism to exclusively elect a node as active. If the
current active NameNode crashes, another node can take a special exclusive lock in ZooKeeper indicating that it
should become the next active NameNode.
The ZKFailoverController (ZKFC) is a ZooKeeper client that also monitors and manages the state of the NameNode.
Each of the hosts that run a NameNode also run a ZKFC. The ZKFC is responsible for:
• Health monitoring - the ZKFC contacts its local NameNode on a periodic basis with a health-check command. So
long as the NameNode responds promptly with a healthy status, the ZKFC considers the NameNode healthy. If
the NameNode has crashed, frozen, or otherwise entered an unhealthy state, the health monitor marks it as
unhealthy.
• ZooKeeper session management - when the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper.
If the local NameNode is active, it also holds a special lock znode. This lock uses ZooKeeper's support for
"ephemeral" nodes; if the session expires, the lock node is automatically deleted.
• ZooKeeper-based election - if the local NameNode is healthy, and the ZKFC sees that no other NameNode currently
holds the lock znode, it will itself try to acquire the lock. If it succeeds, then it has "won the election", and is
responsible for running a failover to make its local NameNode active. The failover process is similar to the manual
failover described above: first, the previous active is fenced if necessary, and then the local NameNode transitions
to active state.
system can tolerate at most (N - 1) / 2 failures and continue to function normally. If the requisite quorum is not
available, the NameNode will not format or start, and you will see an error similar to this:
12/10/01 17:34:18 WARN namenode.FSEditLog: Unable to determine input streams from QJM
to [10.0.1.10:8485, 10.0.1.10:8486, 10.0.1.10:8487]. Skipping.
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
Note: In an HA cluster, the standby NameNode also performs checkpoints of the namespace state,
and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an
HA cluster. In fact, to do so would be an error. If you are reconfiguring a non-HA-enabled HDFS cluster
to be HA-enabled, you can reuse the hardware which you had previously dedicated to the Secondary
NameNode.
Enabling HDFS HA
An HDFS high availability (HA) cluster uses two NameNodes—an active NameNode and a standby NameNode. Only
one NameNode can be active at any point in time. HDFS HA depends on maintaining a log of all namespace modifications
in a location available to both NameNodes, so that in the event of a failure, the standby NameNode has up-to-date
information about the edits and location of blocks in the cluster.
Important: Enabling and disabling HA causes a service outage for the HDFS service and all services
that depend on HDFS. Before enabling or disabling HA, ensure that there are no jobs running on your
cluster.
Important:
• Enabling or disabling HA causes the previous monitoring history to become unavailable.
• Some parameters will be automatically set as follows once you have enabled JobTracker HA. If
you want to change the value from the default for these parameters, use an advanced
configuration snippet.
– mapred.jobtracker.restart.recover: true
– mapred.job.tracker.persist.jobstatus.active: true
– mapred.ha.automatic-failover.enabled: true
– mapred.ha.fencing.methods: shell(/bin/true)
a. Specify a name for the nameservice or accept the default name nameservice1 and click Continue.
b. In the NameNode Hosts field, click Select a host. The host selection dialog box displays.
c. Check the checkbox next to the hosts where you want the standby NameNode to be set up and click OK. The
standby NameNode cannot be on the same host as the active NameNode, and the host that is chosen should
have the same hardware configuration (RAM, disk space, number of cores, and so on) as the active NameNode.
d. In the JournalNode Hosts field, click Select hosts. The host selection dialog box displays.
e. Check the checkboxes next to an odd number of hosts (a minimum of three) to act as JournalNodes and click
OK. JournalNodes should be hosted on hosts with similar hardware specification as the NameNodes. Cloudera
recommends that you put a JournalNode each on the same hosts as the active and standby NameNodes, and
the third JournalNode on similar hardware, such as the JobTracker.
f. Click Continue.
g. In the JournalNode Edits Directory property, enter a directory location for the JournalNode edits directory
into the fields for each JournalNode host.
• You may enter only one directory for each JournalNode. The paths do not need to be the same on every
JournalNode.
• The directories you specify should be empty, and must have the appropriate permissions.
h. Extra Options: Decide whether Cloudera Manager should clear existing data in ZooKeeper, standby NameNode,
and JournalNodes. If the directories are not empty (for example, you are re-enabling a previous HA
configuration), Cloudera Manager will not automatically delete the contents—you can select to delete the
contents by keeping the default checkbox selection. The recommended default is to clear the directories. If
you choose not to do so, the data should be in sync across the edits directories of the JournalNodes and
should have the same version data as the NameNodes.
i. Click Continue.
Cloudera Manager executes a set of commands that will stop the dependent services, delete, create, and configure
roles and directories as appropriate, create a nameservice and failover controller, and restart the dependent
services and deploy the new client configuration.
5. If you want to use other services in a cluster with HA configured, follow the procedures in Configuring Other CDH
Components to Use HDFS HA on page 306.
rmr /hadoop-ha/nameservice1
Fencing Methods
To ensure that only one NameNode is active at a time, a fencing method is required for the shared edits directory.
During a failover, the fencing method is responsible for ensuring that the previous active NameNode no longer has
access to the shared edits directory, so that the new active NameNode can safely proceed writing to it.
By default, Cloudera Manager configures HDFS to use a shell fencing method
(shell(./cloudera_manager_agent_fencer.py)) that takes advantage of the Cloudera Manager Agent. However,
you can configure HDFS to use the sshfence method, or you can add your own shell fencing scripts, instead of or in
addition to the one Cloudera Manager provides.
The fencing parameters are found in the Service-Wide > High Availability category under the configuration properties
for your HDFS service.
For details of the fencing methods supplied with CDH 5, and how fencing is configured, see Fencing Configuration on
page 299.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
This section describes the software configuration required for HDFS HA in CDH 5 and explains how to set configuration
properties and use the command line to deploy HDFS HA.
Configuring Software for HDFS HA
Configuration Overview
As with HDFS federation configuration, HA configuration is backward compatible and allows existing single NameNode
configurations to work without change. The new configuration is designed such that all the nodes in the cluster can
have the same configuration without the need for deploying different configuration files to different machines based
on the type of the node.
HA clusters reuse the Nameservice ID to identify a single HDFS instance that may consist of multiple HA NameNodes.
In addition, there is a new abstraction called NameNode ID. Each distinct NameNode in the cluster has a different
NameNode ID. To support a single configuration file for all of the NameNodes, the relevant configuration parameters
include the Nameservice ID as well as the NameNode ID.
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
• For MRv1:
<property>
<name>fs.default.name</name>
<value>hdfs://mycluster</value>
</property>
Choose a logical name for this nameservice, for example mycluster, and use this logical name for the value of this
configuration option. The name you choose is arbitrary. It will be used both for configuration and as the authority
component of absolute HDFS paths in the cluster.
Note: If you are also using HDFS federation, this configuration setting should also include the list of
other Nameservices, HA or otherwise, as a comma-separated list.
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
Configure a list of comma-separated NameNode IDs. This will be used by DataNodes to determine all the NameNodes
in the cluster. For example, if you used mycluster as the NameService ID previously, and you wanted to use nn1 and
nn2 as the individual IDs of the NameNodes, you would configure this as follows:
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
Note: In this release, you can configure a maximum of two NameNodes per nameservice.
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>machine1.example.com:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>machine2.example.com:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>machine1.example.com:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>machine2.example.com:50070</value>
</property>
Note: If you have Hadoop Kerberos security features enabled, and you intend to use HSFTP, you
should also set the https-address similarly for each NameNode.
Configure dfs.namenode.shared.edits.dir
dfs.namenode.shared.edits.dir - the location of the shared storage directory
Configure the addresses of the JournalNodes which provide the shared edits storage, written to by the Active NameNode
and read by the Standby NameNode to stay up-to-date with all the file system changes the Active NameNode makes.
Though you must specify several JournalNode addresses, you should only configure one of these URIs. The URI should
be in the form:
qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>
The Journal ID is a unique identifier for this nameservice, which allows a single set of JournalNodes to provide storage
for multiple federated namesystems. Though it is not a requirement, it's a good idea to reuse the Nameservice ID for
the journal identifier.
For example, if the JournalNodes for this cluster were running on the machines node1.example.com,
node2.example.com, and node3.example.com, and the nameservice ID were mycluster, you would use the
following as the value for this setting (the default port for the JournalNode is 8485):
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
</property>
Configure dfs.journalnode.edits.dir
dfs.journalnode.edits.dir - the path where the JournalNode daemon will store its local state
On each JournalNode machine, configure the absolute path where the edits and other local state information used by
the JournalNodes will be stored; use only a single path per JournalNode. (The other JournalNodes provide redundancy;
you can also configure this directory on a locally-attached RAID-1 or RAID-10 array.)
For example:
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/1/dfs/jn</value>
</property>
Now create the directory (if it doesn't already exist) and make sure its owner is hdfs, for example:
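A sketch of these commands, using the dfs.journalnode.edits.dir value above (the hdfs user and group are the CDH defaults):
$ sudo mkdir -p /data/1/dfs/jn
$ sudo chown -R hdfs:hdfs /data/1/dfs/jn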
Configure dfs.client.failover.proxy.provider.[nameservice ID]
This is the Java class that HDFS clients use to determine which NameNode is currently active. For example:
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
Fencing Configuration
dfs.ha.fencing.methods - a list of scripts or Java classes which will be used to fence the active NameNode during
a failover
It is desirable for correctness of the system that only one NameNode be in the active state at any given time.
Important: When you use Quorum-based Storage, only one NameNode will ever be allowed to write
to the JournalNodes, so there is no potential for corrupting the file system metadata in a "split-brain"
scenario. But when a failover occurs, it is still possible that the previously active NameNode could
serve read requests to clients - and these requests may be out of date - until that NameNode shuts
down when it tries to write to the JournalNodes. For this reason, it is still desirable to configure some
fencing methods even when using Quorum-based Storage.
To improve the availability of the system in the event the fencing mechanisms fail, it is advisable to configure a fencing
method which is guaranteed to return success as the last fencing method in the list.
Note: If you choose to use no actual fencing methods, you still must configure something for this
setting, for example shell(/bin/true).
The fencing methods used during a failover are configured as a carriage-return-separated list, and these will be
attempted in order until one of them indicates that fencing has succeeded.
There are two fencing methods which ship with Hadoop:
• sshfence
• shell
For information on implementing your own custom fencing method, see the org.apache.hadoop.ha.NodeFencer
class.
Configuring the sshfence fencing method
sshfence - SSH to the active NameNode and kill the process
The sshfence option uses SSH to connect to the target node and uses fuser to kill the process listening on the
service's TCP port. In order for this fencing option to work, it must be able to SSH to the target node without providing
a passphrase. Thus, you must also configure the dfs.ha.fencing.ssh.private-key-files option, which is a
comma-separated list of SSH private key files.
Important: The files must be accessible to the user running the NameNode processes (typically the
hdfs user on the NameNode hosts).
For example:
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/exampleuser/.ssh/id_rsa</value>
</property>
Optionally, you can configure a non-standard username or port to perform the SSH as shown below. You can also
configure a timeout, in milliseconds, for the SSH, after which this fencing method will be considered to have failed:
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence([[username][:port]])</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
<description>
SSH connection timeout, in milliseconds, to use with the builtin
sshfence fencer.
</description>
</property>
Configuring the shell fencing method
shell - run an arbitrary shell command to fence the active NameNode
The shell fencing method runs an arbitrary shell command, which you can configure as shown below:
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
</property>
The string between '(' and ')' is passed directly to a bash shell and cannot include any closing parentheses.
When executed, the first argument to the configured script will be the address of the NameNode to be fenced, followed
by all arguments specified in the configuration.
The shell command will be run with an environment set up to contain all of the current Hadoop configuration variables,
with the '_' character replacing any '.' characters in the configuration keys. The configuration used has already had any
NameNode-specific configurations promoted to their generic forms - for example dfs_namenode_rpc-address will
contain the RPC address of the target node, even though the configuration may specify that variable as
dfs.namenode.rpc-address.ns1.nn1.
The following variables referring to the target node to be fenced are also available:
Variable Description
$target_host Hostname of the node to be fenced
$target_port IPC port of the node to be fenced
$target_address The two variables above, combined as host:port
$target_nameserviceid The nameservice ID of the NameNode to be fenced
$target_namenodeid The NameNode ID of the NameNode to be fenced
You can also use these environment variables as substitutions in the shell command itself. For example:
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid
$target_host:$target_port)</value>
</property>
If the shell command returns an exit code of 0, the fencing is determined to be successful. If it returns any other exit
code, the fencing was not successful and the next fencing method in the list will be attempted.
Note: This fencing method does not implement any timeout. If timeouts are necessary, they should
be implemented in the shell script itself (for example, by forking a subshell to kill its parent in some
number of seconds).
Note: Before you begin configuring automatic failover, you must shut down your cluster. It is not
currently possible to transition from a manual failover setup to an automatic failover setup while the
cluster is running.
Configuring automatic failover requires two additional configuration parameters. In your hdfs-site.xml file, add:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
This specifies that the cluster should be set up for automatic failover. In your core-site.xml file, add:
<property>
<name>ha.zookeeper.quorum</name>
<value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
There are several other configuration parameters which you can set to control the behavior of automatic failover, but
they are not necessary for most installations. See the configuration section of the Hadoop documentation for details.
Initializing the HA state in ZooKeeper
After you have added the configuration keys, the next step is to initialize the required state in ZooKeeper. You can do
so by running the following command from one of the NameNode hosts.
Note: The ZooKeeper ensemble must be running when you use this command; otherwise it will not
work properly.
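The command in question is the ZKFC format command; run it as the hdfs user (if Kerberos is enabled, kinit as hdfs first instead of using sudo -u):
$ sudo -u hdfs hdfs zkfc -formatZK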
This will create a znode in ZooKeeper in which the automatic failover system stores its data.
Securing access to ZooKeeper
If you are running a secure cluster, you will probably want to ensure that the information stored in ZooKeeper is also
secured. This prevents malicious clients from modifying the metadata in ZooKeeper or potentially triggering a false
failover.
In order to secure the information in ZooKeeper, first add the following to your core-site.xml file:
<property>
<name>ha.zookeeper.auth</name>
<value>@/path/to/zk-auth.txt</value>
</property>
<property>
<name>ha.zookeeper.acl</name>
<value>@/path/to/zk-acl.txt</value>
</property>
Note the '@' character in these values – this specifies that the configurations are not inline, but rather point to a file
on disk.
The first configured file specifies a list of ZooKeeper authentications, in the same format as used by the ZooKeeper
CLI. For example, you may specify something like digest:hdfs-zkfcs:mypassword where hdfs-zkfcs is a unique
username for ZooKeeper, and mypassword is some unique string used as a password.
Next, generate a ZooKeeper Access Control List (ACL) that corresponds to this authentication, using a command such
as the following:
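One way to generate the digest is the utility class bundled with ZooKeeper; the classpath shown below is illustrative and depends on where the ZooKeeper JARs are installed on your system:
$ java -cp $ZK_HOME/lib/*:$ZK_HOME/zookeeper.jar \
org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
hdfs-zkfcs:mypassword->hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=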
Copy and paste the section of this output after the '->' string into the file zk-acl.txt, prefixed by the string "digest:".
For example:
digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda
To put these ACLs into effect, rerun the zkfc -formatZK command as described above.
After doing so, you can verify the ACLs from the ZooKeeper CLI as follows:
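For example, using getAcl on the HA parent znode (the output shown is representative):
[zk: localhost:2181(CONNECTED) 1] getAcl /hadoop-ha
'digest,'hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=
: cdrwa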
Important: Before you start: Make sure you have performed all the configuration and setup tasks
described under Configuring Hardware for HDFS HA on page 292 and Configuring Software for HDFS
HA on page 296, including initializing the HA state in ZooKeeper if you are deploying automatic failover.
2. Start the JournalNode daemons on each of the machines where they will run:
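For example, on a package-based CDH installation:
$ sudo service hadoop-hdfs-journalnode start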
Wait for the daemons to start before formatting the primary NameNode (in a new cluster) and before starting the
NameNodes (in all cases).
Format the NameNode (if new cluster)
If you are setting up a new HDFS cluster, format the NameNode you will use as your primary NameNode; see Formatting
the NameNode.
Important: Make sure the JournalNodes have started. Formatting will fail if you have configured the
NameNode to communicate with the JournalNodes, but have not started the JournalNodes.
Initialize the Shared Edits directory (if converting existing non-HA cluster)
If you are converting a non-HA NameNode to HA, initialize the shared edits directory with the edits data from the local
NameNode edits directories:
Note: If Kerberos is enabled, do not use commands in the form sudo -u <user> <command>;
they will fail with a security error. Instead, use the following commands: $ kinit <user> (if
you are using a password) or $ kinit -kt <keytab> <principal> (if you are using a keytab)
and then, for each command executed by this user, $ <command>
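A sketch of the command, run on the existing (primary) NameNode host (the sudo form assumes Kerberos is not enabled; otherwise follow the kinit pattern in the note above):
$ sudo -u hdfs hdfs namenode -initializeSharedEdits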
Starting the standby NameNode with the -bootstrapStandby option copies over the contents of the primary
NameNode's metadata directories (including the namespace information and most recent checkpoint) to the standby
NameNode. (The location of the directories containing the NameNode metadata is configured using the configuration
options dfs.namenode.name.dir and dfs.namenode.edits.dir.)
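A sketch of the corresponding commands, run on the standby NameNode host (package-based installation assumed; with Kerberos, follow the kinit pattern described earlier instead of using sudo -u):
$ sudo -u hdfs hdfs namenode -bootstrapStandby
$ sudo service hadoop-hdfs-namenode start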
You can visit each NameNode's web page by browsing to its configured HTTP address. Notice that next to the configured
address is the HA state of the NameNode ("Standby" or "Active"). Whenever an HA NameNode starts and
automatic failover is not enabled, it is initially in the Standby state. If automatic failover is enabled the first NameNode
that is started will become active.
Restart Services (if converting existing non-HA cluster)
If you are converting from a non-HA to an HA configuration, you need to restart the JobTracker and TaskTracker (for
MRv1, if used), or ResourceManager, NodeManager, and JobHistory Server (for YARN), and the DataNodes:
On each DataNode:
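For example, on a package-based installation:
$ sudo service hadoop-hdfs-datanode restart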
On each NodeManager system (YARN; typically the same ones where DataNode service runs):
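For example:
$ sudo service hadoop-yarn-nodemanager restart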
It is not important that you start the ZKFC and NameNode daemons in a particular order. On any given node you can
start the ZKFC before or after its corresponding NameNode.
You should add monitoring on each host that runs a NameNode to ensure that the ZKFC remains running. In some
types of ZooKeeper failures, for example, the ZKFC may unexpectedly exit, and should be restarted to ensure that the
system is ready for automatic failover.
Additionally, you should monitor each of the servers in the ZooKeeper quorum. If the ZooKeeper cluster crashes, no
automatic failovers will be triggered; however, HDFS continues to run without any impact, and when ZooKeeper is
restarted, HDFS reconnects with no issues.
Verifying Automatic Failover
After the initial deployment of a cluster with automatic failover enabled, you should test its operation. To do so, first
locate the active NameNode. As mentioned above, you can tell which node is active by visiting the NameNode web
interfaces.
Once you have located your active NameNode, you can cause a failure on that node. For example, you can use kill
-9 <pid of NN> to simulate a JVM crash. Or you can power-cycle the machine or its network interface to simulate
different kinds of outages. After you trigger the outage you want to test, the other NameNode should automatically
become active within several seconds. The amount of time required to detect a failure and trigger a failover depends
on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc daemons as well as the
NameNode daemons in order to further diagnose the issue.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
If you need to unconfigure HA and revert to using a single NameNode, either permanently or for upgrade or testing
purposes, proceed as follows.
Important: Only Quorum-based storage is supported in CDH 5. If you are already using Quorum-based
storage, you do not need to unconfigure it to upgrade.
2. Check each host to make sure that there are no processes running as the hdfs, yarn, mapred, or httpfs users. Run
this check as root:
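For example (any remaining java processes owned by those users indicate the services are still running):
# ps -aef | grep java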
Step 2: Unconfigure HA
1. Disable the software configuration.
• If you are using Quorum-based storage and want to unconfigure it, unconfigure the HA properties described
under Enabling HDFS HA Using the Command Line on page 295.
If you intend to redeploy HDFS HA later, comment out the HA properties rather than deleting them.
2. Move the NameNode metadata directories on the standby NameNode. The location of these directories is
configured by dfs.namenode.name.dir and dfs.namenode.edits.dir. Move them to a backup location.
2. Stop the cluster by shutting down the Master and the RegionServers:
• Use the following command on the Master host:
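For example, on a package-based installation:
$ sudo service hbase-master stop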
Configure hbase.rootdir
Change the distributed file system URI in hbase-site.xml to the name specified in the dfs.nameservices property
in hdfs-site.xml. The clients must also have access to hdfs-site.xml's dfs.client.* settings to properly use
HA.
For example, suppose the HDFS HA property dfs.nameservices is set to ha-nn in hdfs-site.xml. To configure
HBase to use the HA NameNodes, specify that same value as part of your hbase-site.xml's hbase.rootdir value:
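A sketch of the resulting hbase.rootdir property (the /hbase path shown is the typical default root directory):
<property>
<name>hbase.rootdir</name>
<value>hdfs://ha-nn/hbase</value>
</property>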
Restart HBase
1. Start the HBase Master.
2. Start each of the HBase RegionServers.
HBase-HDFS HA Troubleshooting
Problem: HMasters fail to start.
Solution: Check for this error in the HMaster log:
If so, verify that Hadoop's hdfs-site.xml and core-site.xml files are in your hbase/conf directory. This may
be necessary if you put your configurations in non-standard places.
Note: You may want to stop the Hue and Impala services first, if present, because they depend
on the Hive service.
Note: Before attempting to upgrade the Hive metastore to use HDFS HA, shut down the metastore
and back it up to a persistent store.
If you are unsure which version of Avro SerDe is used, use both the serdePropKey and tablePropKey arguments.
For example:
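A sketch of such a metatool invocation, using the locations described in the list that follows (verify the exact syntax with hive --service metatool -help before running it):
$ hive --service metatool -updateLocation hdfs://nameservice1 hdfs://oldnamenode.com/user/hive/warehouse \
-tablePropKey avro.schema.url -serdePropKey schema.url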
where:
• hdfs://oldnamenode.com/user/hive/warehouse identifies the NameNode location.
• hdfs://nameservice1 specifies the new location and should match the value of the dfs.nameservices
property.
• tablePropKey is a table property key whose value field may reference the HDFS NameNode location and hence
may require an update. To update the Avro SerDe schema URL, specify avro.schema.url for this argument.
• serdePropKey is a SerDe property key whose value field may reference the HDFS NameNode location and hence
may require an update. To update the Haivvreo schema URL, specify schema.url for this argument.
Note: The Hive metatool is a best effort service that tries to update as many Hive metastore records
as possible. If it encounters an error during the update of a record, it skips to the next record.
Example of an Oozie workflow action that refers to HDFS by the nameservice (ha-nn) rather than by a specific NameNode
URI in the <name-node> element:
<action name="mr-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>hdfs://ha-nn</name-node>
...
</map-reduce>
...
</action>
Note: For advanced use only: You can set the Force Failover checkbox to force the selected
NameNode to be active, irrespective of its state or the other NameNode's state. Forcing a failover
will first attempt to failover the selected NameNode to active mode and the other NameNode
to standby mode. It will do so even if the selected NameNode is in safe mode. If this fails, it will
proceed to transition the selected NameNode to active mode. To avoid having two NameNodes
be active, use this only if the other NameNode is either definitely stopped, or can be transitioned
to standby mode by the first failover step.
Manually Failing Over to the Standby NameNode Using the Command Line
To initiate a failover between two NameNodes, run the command hdfs haadmin -failover.
This command causes a failover from the first provided NameNode to the second. If the first NameNode is in the
Standby state, this command simply transitions the second to the Active state without error. If the first NameNode is
in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing
methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one of the methods succeeds.
Only after this process will the second NameNode be transitioned to the Active state. If no fencing method succeeds,
the second NameNode will not be transitioned to the Active state, and an error will be returned.
Note: Running hdfs haadmin -failover from the command line works whether you have
configured HA from the command line or using Cloudera Manager. This means you can initiate a
failover manually even if Cloudera Manager is unavailable.
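For example, using the NameNode IDs configured earlier, the following makes nn2 active and transitions nn1 to standby:
$ hdfs haadmin -failover nn1 nn2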
4. Make sure these services are not set to restart on boot. If you are not planning to use nn2 as a NameNode again,
you may want to remove the services.
Note: You do not need to shut down the cluster to do this if automatic failover is already
configured as your failover method; shutdown is required only if you are switching from manual
to automatic failover.
Step 5: Copy the contents of the dfs.name.dir and dfs.journalnode.edits.dir directories to nn2-alt
Use rsync or a similar tool to copy the contents of the dfs.name.dir directory, and the
dfs.journalnode.edits.dir directory if you are moving the JournalNode, from nn2 to nn2-alt.
Step 7: If you are using automatic failover, install the zkfc daemon on nn2-alt
For instructions, see Deploy Automatic Failover (if it is configured) on page 305, but do not start the daemon yet.
getServiceState
getServiceState - Determine whether the given NameNode is active or standby.
Connect to the provided NameNode to determine its current state, printing either "standby" or "active" to STDOUT as
appropriate. This subcommand might be used by cron jobs or monitoring scripts, which need to behave differently
based on whether the NameNode is currently active or standby.
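For example:
$ hdfs haadmin -getServiceState nn1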
checkHealth
checkHealth - Check the health of the given NameNode.
Connect to the provided NameNode to check its health. The NameNode can perform some diagnostics on itself,
including checking if internal services are running as expected. This command returns 0 if the NameNode is healthy,
non-zero otherwise. You can use this command for monitoring purposes.
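For example, checking the health of nn1 and then inspecting the exit code:
$ hdfs haadmin -checkHealth nn1
$ echo $?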
For a full list of dfsadmin command options, run: hdfs dfsadmin -help.
2. Although the standby NameNode role is removed, its name directories are not deleted. Empty these directories.
3. Enable HA with Quorum-based storage.
Converting From an NFS-mounted Shared Edits Directory to Quorum-based Storage Using the Command Line
To switch from shared storage using NFS to Quorum-based storage, proceed as follows:
1. Disable HA.
2. Redeploy HA using Quorum-based storage.
Changing a Nameservice Name for Highly Available HDFS Using Cloudera Manager
For background on HDFS namespaces and HDFS high availability, see Managing Federated Nameservices on page 133
and Enabling HDFS HA Using Cloudera Manager on page 293.
1. Stop all services except ZooKeeper.
2. On a ZooKeeper server host, run zookeeper-client.
a. Execute the following to remove the configured nameservice. This example assumes the name of the
nameservice is nameservice1. You can identify the nameservice from the Federation and High Availability
section on the HDFS Instances tab:
rmr /hadoop-ha/nameservice1
3. In the Cloudera Manager Admin Console, update the NameNode nameservice name.
a. Go to the HDFS service.
b. Click the Configuration tab.
c. Type nameservice in the Search field.
d. For the NameNode Nameservice property, type the nameservice name in the NameNode (instance_name)
field. The name must be unique and can contain only alphanumeric characters.
e. Type quorum in the Search field.
f. For the Quorum-based Storage Journal name property, type the nameservice name in the NameNode
(instance_name) field.
g. Click Save Changes to commit the changes.
4. Click the Instances tab.
5. In the Federation and High Availability pane, select Actions > Initialize High Availability State in ZooKeeper.
6. Go to the Hive service.
7. Select Actions > Update Hive Metastore NameNodes.
8. Go to the HDFS service.
9. Click the Instances tab.
10. Select the checkboxes next to the JournalNode role instances.
11. Select Actions for Selected > Start.
12. Click a NameNode role instance.
13. Select Actions > Initialize Shared Edits Directory.
14. Click the Cloudera Manager logo to return to the Home page.
15. Redeploy client configuration files.
16. Start all services except ZooKeeper.
Architecture
ResourceManager HA is implemented by means of an active-standby pair of ResourceManagers. On start-up, each
ResourceManager is in the standby state; the process is started, but the state is not loaded. When one of the
ResourceManagers is transitioning to the active state, the ResourceManager loads the internal state from the designated
state store and starts all the internal services. The stimulus to transition to active comes from either the administrator
(through the CLI) or through the integrated failover controller when automatic failover is enabled. The subsections
that follow provide more details about the components of ResourceManager HA.
ResourceManager Restart
Restarting the ResourceManager allows for the recovery of in-flight applications if recovery is enabled. To achieve this,
the ResourceManager stores its internal state, primarily application-related data and tokens, to the
ResourceManagerStateStore; the cluster resources are re-constructed when the NodeManagers connect. The
available alternatives for the state store are MemoryRMStateStore (a memory-based implementation),
FileSystemRMStateStore (a file system-based implementation; HDFS can be used for the file
system), and ZKRMStateStore (a ZooKeeper-based implementation).
Fencing
When running two ResourceManagers, a split-brain situation can arise where both ResourceManagers assume they
are active. To avoid this, only a single ResourceManager should be able to perform active operations and the other
ResourceManager should be "fenced". The ZooKeeper-based state store (ZKRMStateStore) allows
only a single ResourceManager to make changes to the stored state, implicitly fencing the other ResourceManager.
This is accomplished by the ResourceManager claiming exclusive create-delete permissions on the root znode. The
ACLs on the root znode are automatically created based on the ACLs configured for the store; in case of secure clusters,
Cloudera recommends that you set ACLs for the root node such that both ResourceManagers share read-write-admin
access, but have exclusive create-delete access. The fencing is implicit and doesn't require explicit configuration (as
fencing in HDFS and MRv1 does). You can plug in a custom "Fencer" if you choose to – for example, to use a different
implementation of the state store.
Configuration and FailoverProxy
In an HA setting, you should configure two ResourceManagers to use different ports (for example, ports on different
hosts). To facilitate this, YARN uses the notion of a ResourceManager Identifier (rm-id). Each ResourceManager has
a unique rm-id, and all the RPC configurations (<rpc-address>; for example yarn.resourcemanager.address) for
that ResourceManager can be configured via <rpc-address>.<rm-id>. Clients, ApplicationMasters, and
NodeManagers use these RPC addresses to talk to the active ResourceManager automatically, even after a failover.
To achieve this, they cycle through the list of ResourceManagers in the configuration. This is done automatically and
doesn't require any configuration (as it does in HDFS and MapReduce (MRv1)).
Automatic Failover
By default, ResourceManager HA uses ZKFC (ZooKeeper-based failover controller) for automatic failover in case the
active ResourceManager is unreachable or goes down. Internally, the ActiveStandbyElector is used to elect the active
ResourceManager. The failover controller runs as part of the ResourceManager (not as a separate process as in HDFS
and MapReduce v1) and requires no further setup after the appropriate properties are configured in yarn-site.xml.
Important: Enabling or disabling HA will cause the previous monitoring history to become unavailable.
Note: ResourceManager HA doesn't affect the JobHistory Server (JHS). JHS doesn't maintain any
state, so if the host fails you can simply assign it to a new host. You can also enable process auto-restart
by doing the following:
1. Go to the YARN service.
2. Click the Configuration tab.
3. Select Scope > JobHistory Server.
4. Select Category > Advanced.
5. Locate the Automatically Restart Process property or search for it by typing its name in the Search
box.
6. Click Edit Individual Values
7. Select the JobHistory Server Default Group.
8. Restart the JobHistory Server role.
Configuring YARN (MRv2) ResourceManager High Availability Using the Command Line
To configure and start ResourceManager HA, proceed as follows.
Note:
Configure the following properties in yarn-site.xml as shown, whether you are configuring manual
or automatic failover. They are sufficient to configure manual failover. You need to configure additional
properties for automatic failover.
The following is a sample yarn-site.xml showing these properties configured, including work preserving recovery
for both the ResourceManager and the NodeManager:
<configuration>
<!-- Resource Manager Configs -->
<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>2000</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>pseudo-yarn-rm-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>localhost:2181</value>
</property>
<property>
<name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
<value>5000</value>
</property>
<property>
<name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
<value>true</value>
</property>
</configuration>
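In addition to the properties shown, each ResourceManager's RPC addresses are configured per rm-id, following the <rpc-address>.<rm-id> pattern described earlier. A minimal sketch (the hostnames and port numbers are illustrative only):
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>host1:23140</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>host2:23140</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>host1:23188</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>host2:23188</value>
</property>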
You administer ResourceManager HA with the yarn rmadmin command, whose HA-related options are:
yarn rmadmin [-transitionToActive <serviceId>]
[-transitionToStandby <serviceId>]
[-getServiceState <serviceId>]
[-checkHealth <serviceId>]
[-help <command>]
Note: Even though -help lists the -failover option, it is not supported by yarn rmadmin.
Note: YARN does not support high availability for the Job History Server (JHS). If the JHS goes down,
Cloudera Manager will restart it automatically.
Prerequisites
To use work preserving recovery for the ResourceManager, you need to first enable High Availability for the
ResourceManager. See YARN (MRv2) ResourceManager High Availability on page 314 for more information.
To disable work preserving recovery for the ResourceManager, set the following property to false in yarn-site.xml:
<property>
<name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
<value>false</value>
</property>
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
After enabling YARN (MRv2) ResourceManager High Availability on page 314, add the following property elements to
yarn-site.xml on the ResourceManager and all NodeManagers.
<property>
<name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
<value>true</value>
<description>Whether to enable work preserving recovery for the Resource
Manager</description>
</property>
<property>
<name>yarn.nodemanager.recovery.enabled</name>
<value>true</value>
<description>Whether to enable work preserving recovery for the Node
Manager</description>
</property>
<property>
<name>yarn.nodemanager.recovery.dir</name>
<value>/home/cloudera/recovery</value>
<description>The location for stored state on the Node Manager, if work preserving
recovery
is enabled.</description>
</property>
Set yarn.nodemanager.address to a specific port (rather than an ephemeral port) so that the NodeManager's RPC
address does not change across restarts:
<property>
<name>yarn.nodemanager.address</name>
<value>0.0.0.0:45454</value>
</property>
Important: Enabling or disabling JobTracker HA will cause the previous monitoring history to become
unavailable.
• If the directories are not empty, Cloudera Manager will not delete the contents.
5. Optionally use the checkbox under Advanced Options to force initialize the ZooKeeper znode for auto-failover.
6. Click Continue. Cloudera Manager executes a set of commands that stop the MapReduce service, add a standby
JobTracker and Failover controller, initialize the JobTracker high availability state in ZooKeeper, create the job
status directory, restart MapReduce, and redeploy the relevant client configurations.
Configuring MapReduce (MRv1) JobTracker High Availability Using the Command Line
If you are running MRv1, you can configure the JobTracker to be highly available. You can configure either manual or
automatic failover to a warm-standby JobTracker.
Note:
• As with HDFS High Availability on page 291, the JobTracker high availability feature is backward
compatible; that is, if you do not want to enable JobTracker high availability, you can simply keep
your existing configuration after updating your hadoop-0.20-mapreduce,
hadoop-0.20-mapreduce-jobtracker, and hadoop-0.20-mapreduce-tasktracker
packages, and start your services as before. You do not need to perform any of the actions
described on this page.
To use the high availability feature, you must create a new configuration. This new configuration is designed such that
all the hosts in the cluster can have the same configuration; you do not need to deploy different configuration files to
different hosts depending on each host's role in the cluster.
In an HA setup, the mapred.job.tracker property is no longer a host:port string, but instead specifies a logical
name to identify JobTracker instances in the cluster (active and standby). Each distinct JobTracker in the cluster has a
different JobTracker ID. To support a single configuration file for all of the JobTrackers, the relevant configuration
parameters are suffixed with the JobTracker logical name as well as the JobTracker ID.
The HA JobTracker is packaged separately from the original (non-HA) JobTracker.
Important: You cannot run both HA and non-HA JobTrackers in the same cluster. Do not install the
HA JobTracker unless you need a highly available JobTracker. If you install the HA JobTracker and later
decide to revert to the non-HA JobTracker, you will need to uninstall the HA JobTracker and re-install
the non-HA JobTracker.
Important: The HA JobTracker cannot be installed on a node on which the non-HA JobTracker is
installed, and vice versa. If the JobTracker is installed, uninstall it following the instructions below
before installing the HA JobTracker. Uninstall the non-HA JobTracker whether or not you intend to
install the HA JobTracker on the same node.
Remove the non-HA JobTracker package and install the HA JobTracker package on the JobTracker hosts, using the
package manager for your operating system (RHEL-compatible, SLES, or Ubuntu).
Note: The instructions for automatic failover assume that you have set up a ZooKeeper cluster running
on three or more nodes, and have verified its correct operation by connecting using the ZooKeeper
command-line interface (CLI). See the ZooKeeper documentation for instructions on how to set up a
ZooKeeper ensemble.
If you are configuring automatic failover, also install the failover controller package on the JobTracker hosts, using the
package manager for your operating system.
Make changes and additions similar to the following to mapred-site.xml on each node.
Note: It is simplest to configure all the parameters on all nodes, even though not all of the parameters
will be used on any given node. This also makes for robustness if you later change the roles of the
nodes in your cluster.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>logicaljt</value>
<!-- host:port string is replaced with a logical name -->
</property>
<property>
<name>mapred.jobtrackers.logicaljt</name>
<value>jt1,jt2</value>
<description>Comma-separated list of JobTracker IDs.</description>
</property>
<property>
<name>mapred.jobtracker.rpc-address.logicaljt.jt1</name>
<!-- RPC address for jt1 -->
<value>myjt1.myco.com:8021</value>
</property>
<property>
<name>mapred.jobtracker.rpc-address.logicaljt.jt2</name>
<!-- RPC address for jt2 -->
<value>myjt2.myco.com:8022</value>
</property>
<property>
<name>mapred.job.tracker.http.address.logicaljt.jt1</name>
<!-- HTTP bind address for jt1 -->
<value>0.0.0.0:50030</value>
</property>
<property>
<name>mapred.job.tracker.http.address.logicaljt.jt2</name>
<!-- HTTP bind address for jt2 -->
<value>0.0.0.0:50031</value>
</property>
<property>
<name>mapred.ha.jobtracker.rpc-address.logicaljt.jt1</name>
<!-- RPC address for jt1 HA daemon -->
<value>myjt1.myco.com:8023</value>
</property>
<property>
<name>mapred.ha.jobtracker.rpc-address.logicaljt.jt2</name>
<!-- RPC address for jt2 HA daemon -->
<value>myjt2.myco.com:8024</value>
</property>
<property>
<name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt1</name>
<!-- HTTP redirect address for jt1 -->
<value>myjt1.myco.com:50030</value>
</property>
<property>
<name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt2</name>
<!-- HTTP redirect address for jt2 -->
<value>myjt2.myco.com:50031</value>
</property>
<property>
<name>mapred.jobtracker.restart.recover</name>
<value>true</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>true</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>1</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.dir</name>
<value>/jobtracker/jobsInfo</value>
</property>
<property>
<name>mapred.client.failover.proxy.provider.logicaljt</name>
<value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>mapred.client.failover.max.attempts</name>
<value>15</value>
</property>
<property>
<name>mapred.client.failover.sleep.base.millis</name>
<value>500</value>
</property>
<property>
<name>mapred.client.failover.sleep.max.millis</name>
<value>1500</value>
</property>
<property>
<name>mapred.client.failover.connection.retries</name>
<value>0</value>
</property>
<property>
<name>mapred.client.failover.connection.retries.on.timeouts</name>
<value>0</value>
</property>
<property>
<name>mapred.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>
</configuration>
Note:
In pseudo-distributed mode you need to specify mapred.ha.jobtracker.id for each JobTracker,
so that the JobTracker knows its identity.
But in a fully-distributed setup, where the JobTrackers run on different nodes, there is no need to set
mapred.ha.jobtracker.id, since the JobTracker can infer the ID from the matching address in
one of the mapred.jobtracker.rpc-address.<name>.<id> properties.
Note:
• You must be the mapred user to use mrhaadmin commands.
• If Kerberos is enabled, do not use sudo -u mapred when using the hadoop mrhaadmin
command. Instead, you must log in with the mapred Kerberos credentials (the short name must
be mapred). See Configuring Hadoop Security in CDH 5 for more information.
Unless automatic failover is configured, both JobTrackers will be in a standby state after the jobtrackerha daemons
start up.
If Kerberos is not enabled, use the following commands:
To find out what state each JobTracker is in:
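A sketch of the command (the mrhaadmin subcommand names follow the standard Hadoop HA admin interface; verify with hadoop mrhaadmin -help):
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>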
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our
sample mapred-site.xml files.
To transition one of the JobTrackers to active and then verify that it is active:
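For example:
$ sudo -u mapred hadoop mrhaadmin -transitionToActive <id>
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>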
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our
sample mapred-site.xml files.
With Kerberos enabled, log in as the mapred user and use the following commands:
$ sudo su - mapred
$ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our
sample mapred-site.xml files.
To transition one of the JobTrackers to active and then verify that it is active:
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our
sample mapred-site.xml files.
$ sudo su - mapred
$ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>
To cause a failover from the currently active to the currently inactive JobTracker:
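A sketch of the command (run as the mapred user, as described above):
$ hadoop mrhaadmin -failover <id of active JobTracker> <id of inactive JobTracker>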
Note: If you are already using a ZooKeeper ensemble for automatic failover, use the same ensemble
for automatic JobTracker failover.
<property>
<name>mapred.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>mapred.ha.zkfc.port</name>
<value>8018</value>
<!-- Pick a different port for each failover controller when running both on one machine -->
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181 </value>
<!-- ZK ensemble addresses -->
</property>
Note: If you have already configured automatic failover for HDFS, this property is already properly
configured; you use the same ZooKeeper ensemble for HDFS and JobTracker HA.
Note: The ZooKeeper ensemble must be running when you use this command; otherwise it will not
work properly.
This will create a znode in ZooKeeper in which the automatic failover system stores its data.
Note: If you are running a secure cluster, see also Securing access to ZooKeeper.
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our
sample mapred-site.xml files.
Once you have located your active JobTracker, you can cause a failure on that node. For example, you can use kill
-9 <pid of JobTracker> to simulate a JVM crash. Or you can power-cycle the machine or its network interface
to simulate different kinds of outages. After you trigger the outage you want to test, the other JobTracker should
automatically become active within several seconds. The amount of time required to detect a failure and trigger a
failover depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc and jobtrackerha
daemons to diagnose the problem.
Usage Notes
Using the JobTracker Web UI
To use the JobTracker Web UI, use the HTTP address of either JobTracker (that is, the value of
mapred.job.tracker.http.address.<name>.<id> for either the active or the standby JobTracker). Note the
following:
• If you use the URL of the standby JobTracker, you will be redirected to the active JobTracker.
• If you use the URL of a JobTracker that is down, you will not be redirected - you will simply get a "Not Found" error
from your browser.
Turning off Job Recovery
After a failover, the newly active JobTracker by default restarts all jobs that were running when the failover occurred.
For Sqoop 1 and HBase jobs, this is undesirable because they are not idempotent (that is, they do not behave the same
way on repeated execution). For these jobs you should consider setting mapred.job.restart.recover to false
in the job configuration (JobConf).
Cloudera recommends monitoring both Key Trustee Servers. If a Key Trustee Server fails catastrophically, restore it
from backup to a new host with the same hostname and IP address as the failed host. See Backing Up and Restoring
Key Trustee Server for more information. Cloudera does not support PostgreSQL promotion to convert a passive Key
Trustee Server to an active Key Trustee Server.
Important: You must assign the Key Trustee Server and Database roles to the same host. Assign the
Active Key Trustee Server and Active Database roles to one host, and the Passive Key Trustee Server
and Passive Database roles to a separate host.
After completing the Add Role Instances wizard, the Passive Key Trustee Server and Passive Database roles fail to start.
Complete the following manual actions to start these roles:
1. Stop the Key Trustee Server service (Key Trustee Server service > Actions > Stop).
2. Run the Set Up Key Trustee Server Database command (Key Trustee Server service > Actions > Set Up Key Trustee
Server Database).
3. Run the following command on the Active Key Trustee Server:
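A hedged sketch of this step: replicate the Key Trustee configuration directory (default /var/lib/keytrustee/.keytrustee) to the passive host, excluding host-specific SSL material (the --exclude .ssl flag is an assumption about your certificate layout):
$ sudo rsync -zav --exclude .ssl /var/lib/keytrustee/.keytrustee keytrustee02.example.com:/var/lib/keytrustee/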
Replace keytrustee02.example.com with the hostname of the Passive Key Trustee Server.
4. Run the following command on the Passive Key Trustee Server:
5. Start the Key Trustee Server service (Key Trustee Server service > Actions > Start).
Important: Starting or restarting the Key Trustee Server service attempts to start the Active
Database and Passive Database roles. If the Active Database is not running when the Passive
Database attempts to start, the Passive Database fails to start. If this occurs, manually restart the
Passive Database role after confirming that the Active Database role is running.
6. Enable synchronous replication (Key Trustee Server service > Actions > Setup Enable Synchronous Replication
in HA mode).
7. Restart the Key Trustee Server service (Key Trustee Server service > Actions > Restart).
Configuring Key Trustee Server High Availability Using the Command Line
Install and configure a second Key Trustee Server following the instructions in Installing Cloudera Navigator Key Trustee
Server.
Once you have installed and configured the second Key Trustee Server, initialize the active Key Trustee Server by
running the following commands on the active Key Trustee Server host:
Important: For Key Trustee Server 5.4.0 and higher, the ktadmin init-master command is
deprecated, and should not be used. Use the ktadmin init command instead. If you are using SSH
software other than OpenSSH, pre-create the SSH key on the active Key Trustee Server before
continuing:
Replace keytrustee01.example.com with the fully qualified domain name (FQDN) of the active Key Trustee Server.
Replace keytrustee02.example.com with the FQDN of the passive Key Trustee Server. Cloudera recommends using
the default /var/lib/keytrustee/db directory for the PostgreSQL database.
To use a different port for the database, modify the ktadmin init and ktadmin db commands as follows:
If you use a database directory other than /var/lib/keytrustee/db, create or edit the
/etc/sysconfig/keytrustee-db file and add the following line:
ARGS="--pg-rootdir /path/to/db"
The ktadmin init command generates a self-signed certificate that the Key Trustee Server uses for HTTPS
communication.
Initialize the passive Key Trustee Server by running the following commands on the passive host:
Replace keytrustee02.example.com with the fully qualified domain name (FQDN) of the passive Key Trustee Server.
Replace keytrustee01.example.com with the FQDN of the active Key Trustee Server. Cloudera recommends using
the default /var/lib/keytrustee/db directory for the PostgreSQL database.
To use a different port for the database, modify the ktadmin init-slave command as follows:
If you use a database directory other than /var/lib/keytrustee/db, create or edit the
/etc/sysconfig/keytrustee-db file and add the following line:
ARGS="--pg-rootdir /path/to/db"
The ktadmin init-slave command performs an initial database sync by running the pg_basebackup command.
The database directory must be empty for this step to work. For information on performing an incremental backup,
see the PostgreSQL documentation.
Note: The /etc/init.d/postgresql script does not work when the PostgreSQL database is started
by Key Trustee Server, and cannot be used to monitor the status of the database. Use
/etc/init.d/keytrustee-db instead.
The ktadmin init command generates a self-signed certificate that the Key Trustee Server uses for HTTPS
communication. Instructions for using alternate certificates (for example, if you have obtained certificates from a
trusted Certificate Authority) are provided later.
To enable synchronous replication, run the following command on the active Key Trustee Server:
If you modified the default database location, replace /var/lib/keytrustee/db with the modified path.
Important: Because clients connect to Key Trustee Server using its fully qualified domain name
(FQDN), certificates must be issued to the FQDN of the Key Trustee Server host. If you are using
CA-signed certificates, ensure that the generated certificates use the FQDN, and not the short name.
If you have a CA-signed certificate for Key Trustee Server, see Managing Key Trustee Server Certificates for instructions
on how to replace the self-signed certificates.
Warning: It is very important that you perform this step. Failure to do so leaves Key Trustee KMS
in a state where keys are intermittently inaccessible, depending on which Key Trustee KMS host
a client interacts with, because cryptographic key material encrypted by one Key Trustee KMS
host cannot be decrypted by another. If you are already running multiple Key Trustee KMS hosts
with different private keys, immediately back up all Key Trustee KMS hosts, and contact Cloudera
Support for assistance correcting the issue.
To determine whether the Key Trustee KMS private keys are different, compare the MD5 hash
of the private keys. On each Key Trustee KMS host, run the following command:
$ md5sum /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg
If the outputs are different, contact Cloudera Support for assistance. Do not attempt to synchronize
existing keys. If you overwrite the private key and do not have a backup, any keys encrypted by
that private key are permanently inaccessible, and any data encrypted by those keys is permanently
irretrievable. If you are configuring Key Trustee KMS high availability for the first time, continue
synchronizing the private keys.
Cloudera recommends following security best practices and transferring the private key using offline media, such
as a removable USB drive. For convenience (for example, in a development or testing environment where maximum
security is not required), you can copy the private key over the network by running the following rsync command
on the original Key Trustee KMS host:
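A sketch of such an rsync, assuming the default /var/lib/kms-keytrustee path shown earlier:
$ sudo rsync -zav /var/lib/kms-keytrustee/keytrustee/.keytrustee ktkms02.example.com:/var/lib/kms-keytrustee/keytrustee/.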
Replace ktkms02.example.com with the hostname of the Key Trustee KMS host that you are adding.
9. Restart the cluster.
10. Redeploy the client configuration (Home > Cluster-wide > Deploy Client Configuration).
11. Re-run the steps in Validating Hadoop Key Operations.
List each host on which a backup Master should be started in the conf/backup-masters file. Each host that will run a Master needs to have all of the configuration files available. In
general, it is a good practice to distribute the entire conf/ directory across all cluster nodes.
After saving and distributing the file, restart your cluster for the changes to take effect. When the master starts the
backup Masters, messages are logged. In addition, you can verify that an HMaster process is listed in the output of
the jps command on the nodes where the backup Master should be running.
$ jps
15930 HRegionServer
16194 Jps
15838 HQuorumPeer
16010 HMaster
To stop a backup Master without stopping the entire cluster, first find its process ID using the jps command, then
issue the kill command against its process ID.
$ kill 16010
Note:
Before you enable read-replica support, make sure to account for their increased heap memory
requirements. Although no additional copies of HFile data are created, read-only replica regions have
the same memory footprint as normal regions and need to be considered when calculating the amount
of increased heap memory required. For example, if your table requires 8 GB of heap memory, when
you enable three replicas, you need about 24 GB of heap memory.
To enable support for read replicas in HBase, several properties must be set.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. Add the properties from Table 13: HBase Read Replica Properties on page 341 to hbase-site.xml on each
RegionServer in your cluster, and configure each of them to a value appropriate to your needs. The following
example configuration shows the syntax.
<property>
<name>hbase.regionserver.storefile.refresh.period</name>
<value>0</value>
</property>
<property>
<name>hbase.ipc.client.allowsInterrupt</name>
<value>true</value>
<description>Whether to enable interruption of RPC threads at the client. The default
value of true is
required to enable Primary RegionServers to access other RegionServers in secondary
mode. </description>
</property>
<property>
<name>hbase.client.primaryCallTimeout.get</name>
<value>10</value>
</property>
<property>
<name>hbase.client.primaryCallTimeout.multiget</name>
<value>10</value>
</property>
At Table Creation
To create a new table with read replication capabilities enabled, set the REGION_REPLICATION property on the table.
Use a command like the following, in HBase Shell:
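For example, to create a table with three region replicas (the table and column family names are illustrative):
hbase> create 'myTable', 'myCF', {REGION_REPLICATION => 3}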
Get Request
Scan Request
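Reads are served from secondary replicas only when the request allows timeline consistency; for example, in HBase Shell (the table, row, and column family names are illustrative):
hbase> get 'myTable', 'row1', {CONSISTENCY => 'TIMELINE'}
hbase> scan 'myTable', {CONSISTENCY => 'TIMELINE'}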
Prerequisites
• Cloudera recommends that each instance of the metastore runs on a separate cluster host, to maximize high
availability.
• Hive metastore HA requires a database that is also highly available, such as MySQL with replication in active-active
mode. Refer to the documentation for your database of choice to configure it correctly.
Limitations
Sentry HDFS synchronization does not support Hive metastore HA.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
1. Configure Hive on each of the cluster hosts where you want to run a metastore, following the instructions at
Configuring the Hive Metastore.
2. On the server where the master metastore instance runs, edit the /etc/hive/conf.server/hive-site.xml
file, setting the hive.metastore.uris property's value to a list of URIs where a Hive metastore is available for
failover.
<property>
<name>hive.metastore.uris</name>
<value>thrift://metastore1.example.com,thrift://metastore2.example.com,thrift://metastore3.example.com</value>
</property>
3. If you use a secure cluster, enable the Hive token store by configuring the value of the
hive.cluster.delegation.token.store.class property to
org.apache.hadoop.hive.thrift.DBTokenStore.
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.DBTokenStore</value>
</property>
6. Test your configuration by stopping your main metastore instance, and then attempting to connect to one of the
other metastores from a client. The following is an example of doing this on a RHEL or Fedora system. The example
first stops the local metastore, then connects to the metastore on the host metastore2.example.com and runs
the SHOW TABLES command.
Important: Cloudera strongly recommends an external database for clusters with multiple Hue
servers (or "hue" users). With the default embedded database (one per server), in a multi-server
environment, the data on server "A" appears lost when working on server "B" and vice versa. Use an
external database, and configure each server to point to it to ensure that no matter which server is
being used by Hue, your data is always accessible.
Prerequisite
• An external database was added and each Hue server is configured to use it. See Using an External Database for
Hue Using Cloudera Manager on page 178.
3. Add one or more Hue servers to an existing Hue server role. At least two Hue server roles are required for high
availability:
a. Click Add Role Instances.
b. Click Select hosts under Hue Server (HS).
c. Check the box for each host on which you want a Hue server (which adds HS icons).
d. Click OK.
4. Add one load balancer:
a. Click Select hosts under Load Balancer (LB).
b. Check the box for the host on which you want the load balancer (which adds an LB icon).
c. Click OK.
5. Click Continue.
6. Select the Configuration tab and review the options to ensure they meet your needs. Some items to consider are:
• Hue Load Balancer Port - The Apache Load Balancer listens on this port. The default is 8889.
• Path to TLS/SSL Certificate File - The TLS/SSL certificate file.
• Path to TLS/SSL Private Key File - The TLS/SSL private key file.
7. Save any configuration changes.
8. Start the load balancer and new Hue servers:
a. Check the box by each new role type.
b. Select Actions for Selected > Start.
c. Click Start and Close.
For more information, see Automatic High Availability and Load Balancing of Hue.
Configuring a Cluster for Hue High Availability Using the Command Line
This section applies to unmanaged deployments without Cloudera Manager. It explains how to install and configure
nginx from the command line. To make the Hue service highly available, configure at least two instances of the Hue
service, each on a different host. Also configure the nginx load balancer to direct traffic to an alternate host if the
primary host becomes unavailable. For advanced configurations, see the nginx documentation.
Prerequisites
• The Hue service is installed and two or more Hue server roles are configured.
• You have network access through SSH to the host machines with the Hue server role(s).
• An external database was added and each Hue server is configured to use it. See Using an External Database for Hue
Using Cloudera Manager on page 178.
Install nginx on the load balancer host. For example, on Debian/Ubuntu systems:
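$ sudo apt-get install nginx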
Create an nginx configuration file for Hue, for example /etc/nginx/conf.d/hue.conf, with content similar to the following:
server {
server_name NGINX_HOSTNAME;
charset utf-8;
listen 8001;
client_max_body_size 0;
location / {
proxy_pass http://hue;
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
# Or if the upstream Hue instances are running behind https://
## proxy_pass https://hue;
}
location /static/ {
# Uncomment to expose the static file directories.
## autoindex on;
# If Hue was installed from packages:
## alias /usr/lib/hue/build/static/;
# Or if on a parcel install:
alias /opt/cloudera/parcels/CDH/lib/hue/build/static/;
expires 30d;
add_header Cache-Control public;
}
}
upstream hue {
ip_hash;
# List all the Hue server hosts (hostname:port) here; the hostnames and port below are examples.
server myhost-1.myco.com:8888;
server myhost-2.myco.com:8888;
}
• In the location/static block, comment or uncomment the alias lines, depending on whether your
cluster was installed using parcels or packages. (The comment indicator is #.)
• In the upstream hue block, list all the hostnames, including port number, of the Hue instances in your cluster,
as shown in the example above.
• If the upstream Hue instances are running behind TLS/SSL, uncomment the ## proxy_pass https://hue; line,
comment out the proxy_pass http://hue; line, and update the listen 8001 directive with the TLS/SSL certificate
and key paths for your cluster.
6. Start nginx:
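The exact command depends on how nginx was installed; a typical sketch on a host using SysV init scripts is:
$ service nginx start
# On systemd-based systems, the equivalent is: systemctl start nginx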
7. Test your installation by opening the Hue application in a web browser, using the following URL:
• Without TLS/SSL: http://NGINX_HOSTNAME:8001
• With TLS/SSL: https://NGINX_HOSTNAME:8001
NGINX_HOSTNAME is the name of the host machine where you installed nginx.
Note: Though Impala can be used together with YARN via simple configuration of Static Service Pools
in Cloudera Manager, the use of the general-purpose component Llama for integrated resource
management within YARN is no longer supported with CDH 5.5 / Impala 2.3 and higher.
Llama high availability (HA) uses an active-standby architecture, in which the active Llama is automatically elected
using the ZooKeeper-based ActiveStandbyElector. The active Llama accepts RPC Thrift connections and communicates
with YARN. The standby Llama monitors the leader information in ZooKeeper, but doesn't accept RPC Thrift connections.
Fencing
Only one of the Llamas should be active to ensure the resources are not partitioned. Llama uses ZooKeeper access
control lists (ACLs) to claim exclusive ownership of the cluster when transitioning to active, and monitors this ownership
periodically. If another Llama takes over, the first one realizes it within this period.
• IP addresses
• Rack name
4. Specify or select one or more hosts and click OK.
5. Click Continue.
6. Click Continue. A progress screen displays with a summary of the wizard actions.
7. Click Finish.
Requirements
The requirements for Oozie high availability are:
• Multiple active Oozie servers, preferably identically configured.
• JDBC JAR in the same location across all Oozie hosts (for example, /var/lib/oozie/).
• External database that supports multiple concurrent connections, preferably with HA support. The default Derby
database does not support multiple concurrent connections.
• ZooKeeper ensemble with distributed locks to control database access, and service discovery for log aggregation.
• Load balancer (preferably with HA support, for example HAProxy), virtual IP, or round-robin DNS to provide a
single entry point to the multiple active servers, and for callbacks from the Application Master or JobTracker.
For information on setting up TLS/SSL communication with Oozie HA enabled, see Additional Considerations when
Configuring TLS/SSL for Oozie HA.
Important: Enabling or disabling high availability makes the previous monitoring history unavailable.
• Configuration parameter isProductionMode=true (Production mode): Cloudera Search logs and ignores
unrecoverable exceptions, enabling mission-critical large-scale online production systems to make progress without
downtime, despite some issues.
If Cloudera Search throws an exception according to the rules described above, the caller, meaning Flume Solr Sink and
MapReduceIndexerTool, can catch the exception and retry the task if it meets the criteria for such retries.
agent.sinks.solrSink.isProductionMode = true
agent.sinks.solrSink.isIgnoringRecoverableExceptions = true
In addition, Flume SolrSink automatically attempts to load balance and fail over among the hosts of a SolrCloud before
it considers rolling back and retrying the transaction. Load balancing and failover are done with the help of ZooKeeper, which
itself can be configured to be highly available.
Further, Cloudera Manager can configure Flume so it automatically restarts if its process crashes.
To tolerate extended periods of Solr downtime, you can configure Flume to use a high-performance transactional
persistent queue in the form of a FileChannel. A FileChannel can use any number of local disk drives to buffer significant
amounts of data. For example, you might buffer many terabytes of events corresponding to a week of data. Further,
using the Replicating Channel Selector Flume feature, you can configure Flume to replicate the same data both into
HDFS as well as into Solr. Doing so ensures that if the Flume SolrSink channel runs out of disk space, data is still
delivered to HDFS, and this data can later be ingested from HDFS into Solr using MapReduce.
Many machines with many Flume Solr Sinks and FileChannels can be used in a failover and load balancing configuration
to improve high availability and scalability. Flume SolrSink servers can be either co-located with live Solr servers serving
end user queries, or Flume SolrSink servers can be deployed on separate industry standard hardware for improved
scalability and reliability. By spreading indexing load across a large number of Flume SolrSink servers you can improve
scalability. Indexing load can be replicated across multiple Flume SolrSink servers for high availability, for example
using Flume features such as Load balancing Sink Processor.
For example:
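A minimal sketch of such a configuration, assuming a Flume agent named agent with two SolrSinks (solrSink1 and
solrSink2, both illustrative names) grouped under a Load balancing Sink Processor:
agent.sinks = solrSink1 solrSink2
agent.sinkgroups = sg1
agent.sinkgroups.sg1.sinks = solrSink1 solrSink2
agent.sinkgroups.sg1.processor.type = load_balance
agent.sinkgroups.sg1.processor.backoff = true
agent.sinkgroups.sg1.processor.selector = round_robin
Each sink still needs its own channel and Solr-specific settings, which are omitted from this sketch.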
Important: Cloudera Support supports all of the configuration and modification to Cloudera software
detailed in this document. However, Cloudera Support is unable to assist with issues or failures with
the third-party software that is used. Use of any third-party software, or software not directly covered
by Cloudera Support, is at the risk of the end user.
• A multi-homed TCP load balancer, or two TCP load balancers, capable of proxying requests on specific ports to
one server from a set of backing servers.
– The load balancer does not need to support termination of TLS/SSL connections.
– This load balancer can be hardware or software based, but should be capable of proxying multiple ports.
HTTP/HTTPS-based load balancers are insufficient because Cloudera Manager uses several non-HTTP-based
protocols internally.
– This document uses HAProxy, a small, open-source, TCP-capable load balancer, to demonstrate a workable
configuration.
• A networked storage device that you can configure to be highly available. Typically this is an NFS store, a SAN
device, or a storage array that satisfies the read/write throughput requirements of the Cloudera Management
Service. This document assumes the use of NFS due to the simplicity of its configuration and because it is an easy,
vendor-neutral illustration.
• The procedures in this document require ssh access to all the hosts in the cluster where you are enabling high
availability for Cloudera Manager.
• Questionable reliance on outdated Address Resolution Protocol (ARP) behavior to ensure that the IP-to-MAC
translation works correctly to resolve to the new MAC address on failure.
• Split-brain scenarios that lead to problems with routing.
• A requirement that the virtual IP address subnet be shared between the primary and the secondary hosts, which
can be onerous if you deploy your secondaries off site.
Therefore, Cloudera no longer recommends the use of virtual IP addresses, and instead recommends using a dedicated
load balancer.
Important:
Unless stated otherwise, run all commands mentioned in this topic as the root user.
You do not need to stop the CDH cluster to configure Cloudera Manager high availability.
Note: The hostnames used here are placeholders and are used throughout this document. When
configuring your cluster, substitute the actual names of the hosts you use in your environment.
Note: HAProxy is used here for demonstration purposes. Production-level performance requirements
determine the load balancer that you select for your installation.
HAProxy version 1.5.2 is used for these procedures.
1. Reserve two hostnames in your DNS system, and assign one to each of the load balancer hosts. (The names
CMSHostname and MGMTHostname are used in this example; substitute the correct hostnames for your
environment.) These hostnames will be the externally accessible hostnames for Cloudera Manager Server and
Cloudera Management Service. (Alternatively, use one load balancer with separate, resolvable IP addresses—one
each to back CMSHostname and MGMTHostname respectively).
• CMSHostname is used to access Cloudera Manager Admin Console.
• MGMTHostname is used for internal access to the Cloudera Management Service from Cloudera Manager
Server and Cloudera Manager Agents.
2. Set up two hosts using any supported Linux distribution (RHEL, CentOS, Ubuntu or SUSE; see Supported Operating
Systems) with the hostnames listed above. See the HAProxy documentation for recommendations on configuring
the hardware of these hosts.
3. Install the version of HAProxy that is recommended for the version of Linux installed on the two hosts:
RHEL/CentOS:
Ubuntu (use a current Personal Package Archive (PPA) for 1.5 from http://haproxy.debian.net):
SUSE:
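A typical set of install commands, assuming HAProxy 1.5 is available from your package repositories (and from the
PPA noted above on Ubuntu), is:
$ yum install haproxy       # RHEL/CentOS
$ apt-get install haproxy   # Ubuntu
$ zypper install haproxy    # SUSE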
$ chkconfig haproxy on
Ubuntu:
5. Configure HAProxy.
• On CMSHostname, edit the /etc/haproxy/haproxy.cfg file and make sure that the ports listed at Ports
Used by Cloudera Manager and Cloudera Navigator for “Cloudera Manager Server” are proxied. For Cloudera
Manager 5, this list includes the following ports as defaults:
– 7180
– 7182
– 7183
• On MGMTHostname, edit the /etc/haproxy/haproxy.cfg file and make sure that the ports for Cloudera
Management Service are proxied (see Ports Used by Cloudera Manager and Cloudera Navigator). For Cloudera
Manager 5, this list includes the following ports as defaults:
– 5678
– 7184
– 7185
– 7186
– 7187
– 8083
– 8084
– 8086
– 8087
– 8091
– 9000
– 9994
– 9995
– 9996
– 9997
– 9998
– 9999
– 10101
option tcplog
server mgmt17a MGMT1 check
server mgmt17b MGMT2 check
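For reference, a complete listen block for a single proxied port follows the same pattern as the fragment above; the
block name, server names, and port below are illustrative, and the pattern is repeated for each port in the list:
listen mgmt-5678
    bind :5678
    mode tcp
    option tcplog
    server mgmt-5678a MGMT1:5678 check
    server mgmt-5678b MGMT2:5678 check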
After updating the configuration, restart HAProxy on both the MGMTHostname and CMSHostname hosts:
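A typical restart command (a sketch; adjust for your init system) is:
$ service haproxy restart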
Important: The embedded Postgres database cannot be configured for high availability and
should not be used in a high-availability configuration.
2. Configure your databases to be highly available. Consult the vendor documentation for specific information.
MySQL, PostgreSQL, and Oracle each have many options for configuring high availability. See Database High
Availability Configuration on page 381 for some external references on configuring high availability for your Cloudera
Manager databases.
Note: NFS is used here as the shared storage mechanism for demonstration purposes. Refer to
your Linux distribution documentation on production NFS configuration and security. Production-level
performance requirements determine the storage that you select for your installation.
This section describes how to configure an NFS server and assumes that you understand how to configure highly
available remote storage devices. Further details are beyond the scope and intent of this guide.
There are no intrinsic limitations on where this NFS server is located, but because overlapping failure domains can
cause problems with fault containment and error tracing, Cloudera recommends that you not co-locate the NFS server
with any CDH or Cloudera Manager servers or the load-balancer hosts detailed in this document.
1. Install NFS on your designated server:
RHEL/CentOS
Ubuntu
SUSE
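A typical set of install commands (package names vary by distribution; verify against your repositories) is:
$ yum install nfs-utils               # RHEL/CentOS
$ apt-get install nfs-kernel-server   # Ubuntu
$ zypper install nfs-kernel-server    # SUSE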
RHEL/CentOS:
$ chkconfig nfs on
$ service rpcbind start
$ service nfs start
Ubuntu:
SUSE:
$ chkconfig nfs on
$ service rpcbind start
$ service nfs-kernel-server start
Note: Later sections describe mounting the shared directories and sharing them between the primary
and secondary instances.
Step 2: Installing and Configuring Cloudera Manager Server for High Availability
You can use an existing Cloudera Manager installation and extend it to a high-availability configuration, as long as you
are not using the embedded PostgreSQL database.
This section describes how to install and configure a failover secondary for Cloudera Manager Server that can take
over if the primary fails.
This section does not cover installing instances of Cloudera Manager Agent on CMS1 or CMS2 and configuring them to
be highly available. See Installation.
Setting up NFS Mounts for Cloudera Manager Server
1. Create the following directories on the NFS server you created in a previous step:
$ mkdir -p /media/cloudera-scm-server
2. Mark these mounts by adding these lines to the /etc/exports file on the NFS server:
/media/cloudera-scm-server CMS1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-server CMS2(rw,sync,no_root_squash,no_subtree_check)
3. Export the mounts by running the following command on the NFS server:
$ exportfs -a
Ubuntu:
SUSE:
$ rm -rf /var/lib/cloudera-scm-server
$ mkdir -p /var/lib/cloudera-scm-server
c. Mount the following directory to the NFS mounts, on both CMS1 and CMS2:
d. Set up fstab to persist the mounts across restarts by editing the /etc/fstab file on CMS1 and CMS2 and
adding the following lines:
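A minimal sketch of the mount command (step c) and the matching /etc/fstab entry (step d), assuming the NFS server
is reachable as NFS and using illustrative default mount options:
$ mount -t nfs NFS:/media/cloudera-scm-server /var/lib/cloudera-scm-server
# /etc/fstab entry
NFS:/media/cloudera-scm-server /var/lib/cloudera-scm-server nfs defaults 0 0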
Important: Deleting the Cloudera Management Service leads to loss of all existing data from the Host
Monitor and Service Monitor roles that store health and monitoring information for your cluster on
the local disk associated with the host(s) where those roles are installed.
Fresh Installation
Use either of the installation paths (B or C) specified in the documentation to install Cloudera Manager Server, but do
not add “Cloudera Management Service” to your deployment until you complete Step 3: Installing and Configuring
Cloudera Management Service for High Availability on page 364, which describes how to set up the Cloudera Management
Service.
See:
• Installation Path B - Manual Installation Using Cloudera Manager Packages
• Installation Path C - Manual Installation Using Cloudera Manager Tarballs
You can now start the freshly-installed Cloudera Manager Server on CMS1:
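A typical start command (sketch) is:
$ service cloudera-scm-server start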
Before proceeding, verify that you can access the Cloudera Manager Admin Console at http://CMS1:7180.
If you have just installed Cloudera Manager, click the Cloudera Manager logo to skip adding new hosts and to gain
access to the Administration menu, which you need for the following steps.
HTTP Referer Configuration
Cloudera recommends that you disable the HTTP Referer check because it causes problems for some proxies and load
balancers. Check the configuration manual of your proxy or load balancer to determine if this is necessary.
To disable HTTP Referer in the Cloudera Manager Admin Console:
1. Select Administration > Settings.
2. Select Category > Security.
3. Deselect the HTTP Referer Check property.
Before proceeding, verify that you can access the Cloudera Manager Admin Console through the load balancer at
http://CMSHostname:7180.
$ mkdir -p /etc/cloudera-scm-server
$ scp [<ssh-user>@]CMS1:/etc/cloudera-scm-server/db.properties
/etc/cloudera-scm-server/db.properties
3. If you configured Cloudera Manager TLS encryption or authentication, or Kerberos authentication in your primary
installation, see TLS and Kerberos Configuration for Cloudera Manager High Availability on page 382 for additional
configuration steps.
4. Do not start the cloudera-scm-server service on this host yet, and disable autostart on the secondary to avoid
automatically starting the service on this host.
RHEL/CentOS/SUSE:
Ubuntu:
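Typical commands for disabling autostart (a sketch; adjust for your init system) are:
$ chkconfig cloudera-scm-server off           # RHEL/CentOS/SUSE
$ update-rc.d -f cloudera-scm-server remove   # Ubuntu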
(You will also disable autostart on the primary when you configure automatic failover in a later step.) Data corruption
can result if both primary and secondary Cloudera Manager Server instances are running at the same time; running
both instances at once is not supported.
Testing Failover
Test failover manually by using the following steps:
1. Stop cloudera-scm-server on your primary host (CMS1):
2. Start cloudera-scm-server on your secondary host (CMS2):
3. Wait a few minutes for the service to load, and then access the Cloudera Manager Admin Console through a web
browser, using the load-balanced hostname (for example: http://CMSHostname:CMS_port).
Now, fail back to the primary before configuring the Cloudera Management Service on your installation:
1. Stop cloudera-scm-server on your secondary machine (CMS2):
2. Start cloudera-scm-server on your primary machine (CMS1):
3. Wait a few minutes for the service to load, and then access the Cloudera Manager Admin Console through a web
browser, using the load-balanced hostname (for example: http://CMSHostname:7180).
Updating Cloudera Manager Agents to use the Load Balancer
After completing the primary and secondary installation steps listed previously, update the Cloudera Manager Agent
configuration on all of the hosts associated with this Cloudera Manager installation, except the MGMT1, MGMT2, CMS1,
and CMS2 hosts, to use the load balancer address:
1. Connect to a shell on each host where CDH processes are installed and running. (The MGMT1, MGMT2, CMS1, and
CMS2 hosts do not need to be modified as part of this step.)
2. Update the /etc/cloudera-scm-agent/config.ini file and change the server_host line:
server_host = <CMSHostname>
3. Restart the agent (this command starts the agents if they are not running):
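A typical command (sketch) is:
$ service cloudera-scm-agent restart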
Step 3: Installing and Configuring Cloudera Management Service for High Availability
This section demonstrates how to set up shared mounts on MGMT1 and MGMT2, and then install Cloudera Management
Service to use those mounts on the primary and secondary servers.
Important: Do not start the primary and secondary servers that are running Cloudera Management
Service at the same time. Data corruption can result.
$ mkdir -p /media/cloudera-host-monitor
$ mkdir -p /media/cloudera-scm-agent
$ mkdir -p /media/cloudera-scm-eventserver
$ mkdir -p /media/cloudera-scm-headlamp
$ mkdir -p /media/cloudera-service-monitor
$ mkdir -p /media/cloudera-scm-navigator
$ mkdir -p /media/etc-cloudera-scm-agent
2. Mark these mounts by adding the following lines to the /etc/exports file on the NFS server:
/media/cloudera-host-monitor MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-agent MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-eventserver MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-headlamp MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-service-monitor MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-navigator MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/etc-cloudera-scm-agent MGMT1(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-host-monitor MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-agent MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-eventserver MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-headlamp MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-service-monitor MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/cloudera-scm-navigator MGMT2(rw,sync,no_root_squash,no_subtree_check)
/media/etc-cloudera-scm-agent MGMT2(rw,sync,no_root_squash,no_subtree_check)
3. Export the mounts by running the following command on the NFS server:
$ exportfs -a
Ubuntu:
SUSE:
$ mkdir -p /var/lib/cloudera-host-monitor
$ mkdir -p /var/lib/cloudera-scm-agent
$ mkdir -p /var/lib/cloudera-scm-eventserver
$ mkdir -p /var/lib/cloudera-scm-headlamp
$ mkdir -p /var/lib/cloudera-service-monitor
$ mkdir -p /var/lib/cloudera-scm-navigator
$ mkdir -p /etc/cloudera-scm-agent
c. Mount the following directories to the NFS mounts, on both MGMT1 and MGMT2 (NFS refers to the server NFS
hostname or IP address):
5. Set up fstab to persist the mounts across restarts. Edit the /etc/fstab file and add these lines:
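A sketch of the pattern for one of the directories, assuming the NFS server is reachable as NFS and using illustrative
default mount options; repeat for each of the directories created above (note that /media/etc-cloudera-scm-agent
maps to /etc/cloudera-scm-agent):
$ mount -t nfs NFS:/media/cloudera-host-monitor /var/lib/cloudera-host-monitor
# /etc/fstab entry
NFS:/media/cloudera-host-monitor /var/lib/cloudera-host-monitor nfs defaults 0 0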
server_host=CMSHostname
listening_hostname=MGMTHostname
b. Edit the /etc/hosts file and add MGMTHostname as an alias for your public IP address for MGMT1 by adding
a line like this at the end of your /etc/hosts file:
<MGMT1-IP> MGMTHostname
c. Confirm that the alias has taken effect by running the ping command. For example:
d. Make sure that the cloudera-scm user and the cloudera-scm group have access to the mounted directories
under /var/lib, by using the chown command on cloudera-scm. For example, run the following on MGMT1:
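A sketch, assuming the default cloudera-scm user and group and the mounted directories created above:
$ chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-host-monitor /var/lib/cloudera-service-monitor \
    /var/lib/cloudera-scm-eventserver /var/lib/cloudera-scm-headlamp /var/lib/cloudera-scm-navigator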
Note: The cloudera-scm user and the cloudera-scm group are the default owners as
specified in Cloudera Management Service advanced configuration. If you alter these settings,
or are using single-user mode, modify the above chown instructions to use the altered user
or group name.
e. Restart the agent on MGMT1 (this also starts the agent if it is not running):
g. Make sure you install all of the roles of the Cloudera Management Service on the host named MGMTHostname.
h. Proceed through the steps to configure the roles of the service to use your database server, and use defaults
for the storage directory for Host Monitor or Service Monitor.
i. After you have completed the steps, wait for the Cloudera Management Service to finish starting, and verify
the health status of your clusters as well as the health of the Cloudera Management Service as reported in
the Cloudera Manager Admin Console. The health status indicators should be green, as shown:
The service health for Cloudera Management Service might, however, show as red:
In this case, you need to identify whether the health test failure is caused by the Hostname and Canonical
Name Health Check for the MGMTHostname host, which might look like this:
This test can fail in this way because of the way you modified /etc/hosts on MGMT1 and MGMT2 to allow
the resolution of MGMTHostname locally. This test can be safely disabled on the MGMTHostname host from
the Cloudera Manager Admin Console.
j. If you are configuring Kerberos and TLS/SSL, see TLS and Kerberos Configuration for Cloudera Manager High
Availability on page 382 for configuration changes as part of this step.
server_host=<CMSHostname>
listening_hostname=<MGMTHostname>
b. Edit the /etc/hosts file and add MGMTHostname as an alias for your public IP address for MGMT1, by adding
a line like this at the end of your /etc/hosts file:
<MGMT2-IP> <MGMTHostname>
c. Confirm that the alias is working by running the ping command. For example:
6. Log into the Cloudera Manager Admin Console in a web browser and start all Cloudera Management Service roles.
This starts the Cloudera Management Service on MGMT2.
a. Wait for the Cloudera Manager Admin Console to report that the services have started.
b. Confirm that the services have started on this host by running the following command on MGMT2:
You should see ten total processes running on that host, including the eight Cloudera Management Service
processes, a Cloudera Manager Agent process, and a Supervisor process.
c. Test the secondary installation through the Cloudera Management Admin Console, and inspect the health
of the Cloudera Management Service roles, before proceeding.
Note:
Make sure that the UID and GID for the cloudera-scm user on the primary and secondary Cloudera
Management Service hosts are the same; this ensures that the correct permissions are available on the
shared directories after failover.
Note: The versions referred to for setting up automatic failover in this document are Pacemaker
1.1.11 and Corosync 1.4.7. See http://clusterlabs.org/wiki/Install to determine what works best
for your Linux distribution.
RHEL/CentOS:
Ubuntu:
SUSE:
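Typical commands for installing Pacemaker and Corosync (a sketch; package names and versions depend on your
distribution and repositories) are:
$ yum install pacemaker corosync       # RHEL/CentOS
$ apt-get install pacemaker corosync   # Ubuntu
$ zypper install pacemaker corosync    # SUSE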
2. Make sure that the crm tool exists on all of the hosts. This procedure uses the crm tool, which works with Pacemaker
configuration. If this tool is not installed when you installed Pacemaker (verify this by running which crm), you
can download and install the tool for your distribution using the instructions at http://crmsh.github.io/installation.
About Corosync and Pacemaker
• By default, Corosync and Pacemaker are not autostarted as part of the boot sequence. Cloudera recommends
leaving this as is. If the machine crashes and restarts, manually make sure that failover was successful and determine
the cause of the restart before manually starting these processes to achieve higher availability.
– If the /etc/default/corosync file exists, make sure that START is set to yes in that file:
START=yes
– Make sure that Corosync is not set to start automatically, by running the following command:
RHEL/CentOS/SUSE:
Ubuntu:
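Typical commands (sketch; adjust for your init system):
$ chkconfig corosync off           # RHEL/CentOS/SUSE
$ update-rc.d -f corosync remove   # Ubuntu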
• Note which version of Corosync is installed. The contents of the configuration file for Corosync (corosync.conf)
that you edit varies based on the version suitable for your distribution. Sample configurations are supplied in this
document and are labeled with the Corosync version.
• This document does not demonstrate configuring Corosync with authentication (with secauth set to on). The
Corosync website demonstrates a mechanism to encrypt traffic using symmetric keys.
• Firewall configuration:
Corosync uses UDP transport on ports 5404 and 5405, and these ports must be open for both inbound and
outbound traffic on all hosts. If you are using iptables, run a command similar to the following:
$ sudo iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j
ACCEPT
$ sudo iptables -I OUTPUT -m state --state NEW -p udp -m multiport --sports 5404,5405
-j ACCEPT
1. Edit the /etc/corosync/corosync.conf file on CMS1, and replace the entire contents with the following text
(use the correct version for your environment):
Corosync version 1.x:
compatibility: whitetank
totem {
version: 2
secauth: off
interface {
member {
memberaddr: CMS1
}
member {
memberaddr: CMS2
}
ringnumber: 0
bindnetaddr: CMS1
mcastport: 5405
}
transport: udpu
}
logging {
fileline: off
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
#
}
Corosync version 2.x:
totem {
version: 2
secauth: off
cluster_name: cmf
transport: udpu
}
nodelist {
node {
ring0_addr: CMS1
nodeid: 1
}
node {
ring0_addr: CMS2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
2. Edit the /etc/corosync/corosync.conf file on CMS2, and replace the entire contents with the following text
(use the correct version for your environment):
Corosync version 1.x:
compatibility: whitetank
totem {
version: 2
secauth: off
interface {
member {
memberaddr: CMS1
}
member {
memberaddr: CMS2
}
ringnumber: 0
bindnetaddr: CMS2
mcastport: 5405
}
transport: udpu
}
logging {
fileline: off
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
#
}
Corosync version 2.x:
totem {
version: 2
secauth: off
cluster_name: cmf
transport: udpu
}
nodelist {
node {
ring0_addr: CMS1
nodeid: 1
}
node {
ring0_addr: CMS2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
3. Restart Corosync on CMS1 and CMS2 so that the new configuration takes effect:
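A typical restart command (sketch) is:
$ service corosync restart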
Setting up Pacemaker
You use Pacemaker to set up Cloudera Manager Server as a cluster resource.
See the Pacemaker configuration reference at
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ for more details about Pacemaker
options.
The following steps demonstrate one way, recommended by Cloudera, to configure Pacemaker for simple use:
1. Disable autostart for Cloudera Manager Server (because you manage its lifecycle through Pacemaker) on both
CMS1 and CMS2:
RHEL/CentOS/SUSE:
Ubuntu:
2. Make sure that Pacemaker has been started on both CMS1 and CMS2:
$ /etc/init.d/pacemaker start
# crm status
Last updated: Wed Mar 4 18:55:27 2015
Last change: Wed Mar 4 18:38:40 2015 via crmd on CMS1
Stack: corosync
Current DC: CMS1 (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
0 Resources configured
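A minimal crmsh sketch for registering Cloudera Manager Server as an lsb-managed resource, run on either CMS1 or
CMS2 (the monitor interval shown is illustrative):
$ crm configure primitive cloudera-scm-server lsb:cloudera-scm-server op monitor interval=30s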
$ crm_mon
For example:
$ crm_mon
Last updated: Tue Jan 27 15:01:35 2015
Last change: Mon Jan 27 14:10:11 2015
Stack: classic openais (with plugin)
Current DC: CMS1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ CMS1 CMS2 ]
cloudera-scm-server (lsb:cloudera-scm-server): Started CMS1
At this point, Pacemaker manages the status of the cloudera-scm-server service on hosts CMS1 and CMS2, ensuring
that only one instance is running at a time.
Note: Pacemaker expects all lifecycle actions, such as start and stop, to go through Pacemaker;
therefore, running direct service start or service stop commands breaks that assumption.
Test the resource move by connecting to a shell on CMS2 and verifying that the cloudera-scm-server process is
now active on that host. It usually takes a few minutes for the new services to come up on the new host.
Enabling STONITH (Shoot the other node in the head)
The following link provides an explanation of the problem of fencing and ensuring (within reasonable limits) that only
one host is running a shared resource at a time:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457872046640
As noted in that link, you can use several methods (such as IPMI) to achieve reasonable guarantees on remote host
shutdown. Cloudera recommends enabling STONITH, based on the hardware configuration in your environment.
Setting up the Cloudera Management Service
Setting Up Corosync
1. Edit the /etc/corosync/corosync.conf file on MGMT1 and replace the entire contents with the contents
below; make sure to use the correct section for your version of Corosync:
Corosync version 1.x:
compatibility: whitetank
totem {
version: 2
secauth: off
interface {
member {
memberaddr: MGMT1
}
member {
memberaddr: MGMT2
}
ringnumber: 0
bindnetaddr: MGMT1
mcastport: 5405
}
transport: udpu
}
logging {
fileline: off
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
#
}
Corosync version 2.x:
totem {
version: 2
secauth: off
cluster_name: mgmt
transport: udpu
}
nodelist {
node {
ring0_addr: MGMT1
nodeid: 1
}
node {
ring0_addr: MGMT2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
2. Edit the /etc/corosync/corosync.conf file on MGMT2 and replace the contents with the contents below:
Corosync version 1.x:
compatibility: whitetank
totem {
version: 2
secauth: off
interface {
member {
memberaddr: MGMT1
}
member {
memberaddr: MGMT2
}
ringnumber: 0
bindnetaddr: MGMT2
mcastport: 5405
}
transport: udpu
}
logging {
fileline: off
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
#
}
Corosync version 2.x:
totem {
version: 2
secauth: off
cluster_name: mgmt
transport: udpu
}
nodelist {
node {
ring0_addr: MGMT1
nodeid: 1
}
node {
ring0_addr: MGMT2
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
3. Restart Corosync on MGMT1 and MGMT2 for the new configuration to take effect:
4. Test whether Corosync has set up a cluster, by using the corosync-cmapctl or corosync-objctl commands.
You should see two members with status joined:
Setting Up Pacemaker
Use Pacemaker to set up Cloudera Management Service as a cluster resource.
See the Pacemaker configuration reference at
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/ for more information about
Pacemaker options.
Because the lifecycle of Cloudera Management Service is managed through the Cloudera Manager Agent, you configure
the Cloudera Manager Agent to be highly available.
The following steps demonstrate one way, recommended by Cloudera, to configure Pacemaker for simple use:
1. Disable autostart for the Cloudera Manager Agent (because Pacemaker manages its lifecycle) on both MGMT1 and
MGMT2:
RHEL/CentOS/SUSE
Ubuntu:
2. Make sure that Pacemaker has been started on both MGMT1 and MGMT2:
$ /etc/init.d/pacemaker start
3. Make sure that the crm command reports two nodes in the cluster; you can run this command on either host:
# crm status
Last updated: Wed Mar 4 18:55:27 2015
Last change: Wed Mar 4 18:38:40 2015 via crmd on MGMT1
Stack: corosync
As with Cloudera Manager Server Pacemaker configuration, this step disables quorum checks, disables STONITH
explicitly, and reduces the likelihood of resources being moved between hosts.
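A crmsh sketch of these settings (the property names are standard Pacemaker properties; the stickiness value is
illustrative):
$ crm configure property no-quorum-policy=ignore
$ crm configure property stonith-enabled=false
$ crm configure rsc_defaults resource-stickiness=100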
5. Create an Open Cluster Framework (OCF) provider on both MGMT1 and MGMT2 for Cloudera Manager Agent for
use with Pacemaker:
a. Create an OCF directory for creating OCF resources for Cloudera Manager:
$ mkdir -p /usr/lib/ocf/resource.d/cm
b. Create a file named agent at /usr/lib/ocf/resource.d/cm/agent containing one of the two scripts shown
below (the scripts differ only in the agent_stop function):
#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
OCF_SUCCESS=0
OCF_ERROR=1
OCF_STOPPED=7
#######################################################################
meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>
<longdesc lang="en">
This OCF agent handles simple monitoring, start, stop of the Cloudera
Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>
<parameters />
<actions>
<action name="start" timeout="20" />
<action name="stop" timeout="20" />
<action name="monitor" timeout="20" interval="10" depth="0"/>
<action name="meta-data" timeout="5" />
</actions>
</resource-agent>
END
}
#######################################################################
agent_usage() {
cat <<END
usage: $0 {start|stop|monitor|meta-data}
Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager Agent and
managed processes lifecycle for use with Pacemaker.
END
}
agent_start() {
service cloudera-scm-agent start
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_ERROR
}
agent_stop() {
service cloudera-scm-agent next_stop_hard
service cloudera-scm-agent stop
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_ERROR
}
agent_monitor() {
# Monitor _MUST!_ differentiate correctly between running
# (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
# That is THREE states, not just yes/no.
service cloudera-scm-agent status
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_STOPPED
}
case $__OCF_ACTION in
meta-data) meta_data
exit $OCF_SUCCESS
;;
start) agent_start;;
stop) agent_stop;;
monitor) agent_monitor;;
usage|help) agent_usage
exit $OCF_SUCCESS
;;
*) agent_usage
exit $OCF_ERR_UNIMPLEMENTED
;;
esac
rc=$?
exit $rc
#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
OCF_SUCCESS=0
OCF_ERROR=1
OCF_STOPPED=7
#######################################################################
meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>
<longdesc lang="en">
This OCF agent handles simple monitoring, start, stop of the Cloudera
Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>
<parameters />
<actions>
<action name="start" timeout="20" />
<action name="stop" timeout="20" />
<action name="monitor" timeout="20" interval="10" depth="0"/>
<action name="meta-data" timeout="5" />
</actions>
</resource-agent>
END
}
#######################################################################
agent_usage() {
cat <<END
usage: $0 {start|stop|monitor|meta-data}
Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager Agent and
managed processes lifecycle for use with Pacemaker.
END
}
agent_start() {
service cloudera-scm-agent start
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_ERROR
}
agent_stop() {
service cloudera-scm-agent hard_stop_confirmed
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_ERROR
}
agent_monitor() {
# Monitor _MUST!_ differentiate correctly between running
# (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
# That is THREE states, not just yes/no.
service cloudera-scm-agent status
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
return $OCF_STOPPED
}
case $__OCF_ACTION in
meta-data) meta_data
exit $OCF_SUCCESS
;;
start) agent_start;;
stop) agent_stop;;
monitor) agent_monitor;;
usage|help) agent_usage
exit $OCF_SUCCESS
;;
*) agent_usage
exit $OCF_ERR_UNIMPLEMENTED
;;
esac
rc=$?
exit $rc
$ /usr/lib/ocf/resource.d/cm/agent monitor
This script should return the current running status of the SCM agent.
7. Add Cloudera Manager Agent as an OCF-managed resource (either on MGMT1 or MGMT2):
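A minimal crmsh sketch (the monitor interval is illustrative; ocf:cm:agent refers to the OCF script created above):
$ crm configure primitive cloudera-scm-agent ocf:cm:agent op monitor interval=30s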
8. Verify that the primitive has been picked up by Pacemaker by running the following command:
$ crm_mon
For example:
$ crm_mon
Last updated: Tue Jan 27 15:01:35 2015
Last change: Mon Jan 27 14:10:11 2015
Stack: classic openais (with plugin)
Current DC: CMS1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ MGMT1 MGMT2 ]
cloudera-scm-agent (ocf:cm:agent): Started MGMT2
Pacemaker starts managing the status of the cloudera-scm-agent service on hosts MGMT1 and MGMT2, ensuring
that only one instance is running at a time.
Note: Pacemaker expects that all lifecycle actions, such as start and stop, go through Pacemaker;
therefore, running direct service start or service stop commands on one of the hosts breaks
that assumption and could cause Pacemaker to start the service on the other host.
Test the resource move by connecting to a shell on MGMT2 and verifying that the cloudera-scm-agent and the
associated Cloudera Management Services processes are now active on that host. It usually takes a few minutes for
the new services to come up on the new host.
Database-Specific Mechanisms
• MariaDB:
Configuring MariaDB for high availability requires configuring MariaDB for replication. For more information, see
https://mariadb.com/kb/en/mariadb/setting-up-replication/.
• MySQL:
Configuring MySQL for high availability requires configuring MySQL for replication. Replication configuration
depends on which version of MySQL you are using. For version 5.1,
http://dev.mysql.com/doc/refman/5.1/en/replication-howto.html provides an introduction.
MySQL GTID-based replication is not supported.
• PostgreSQL:
PostgreSQL has extensive documentation on high availability, especially for versions 9.0 and higher. For information
about options available for version 9.1, see http://www.postgresql.org/docs/9.1/static/high-availability.html.
• Oracle:
Oracle supports a wide variety of free and paid upgrades to their database technology that support increased
availability guarantees, such as their Maximum Availability Architecture (MAA) recommendations. For more
information, see
http://www.oracle.com/technetwork/database/features/availability/oracle-database-maa-best-practices-155386.html.
Disk-Based Mechanisms
DRBD is an open-source Linux-based disk replication mechanism that works at the individual write level to replicate
writes on multiple machines. Although not directly supported by major database vendors (at the time of writing of
this document), it provides a way to inexpensively configure redundant distributed disk for disk-consistent databases
(such as MySQL, PostgreSQL, and Oracle). For information, see http://drbd.linbit.com.
https://[CMSHostname]:[TLS_Port]
Note: Remember to restart cloudera-scm-agent after making changes to these files or configuration
options.
This message is expected, and can be corrected by generating Kerberos credentials for these roles using the
Cloudera Manager Admin Console. Select Administration > Kerberos > Credentials > Generate Credentials.
For additional instructions on configuring Kerberos authentication for Cloudera Manager, see Configuring
Authentication in Cloudera Manager.
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
You can also use Cloudera Manager to schedule, save, and restore snapshots of HDFS directories and HBase tables.
Cloudera Manager provides key functionality in the Cloudera Manager Admin Console:
• Select - Choose datasets that are critical for your business operations.
• Schedule - Create an appropriate schedule for data replication and snapshots. Trigger replication and snapshots
as required for your business needs.
• Monitor - Track progress of your snapshots and replication jobs through a central console and easily identify issues
or files that failed to be transferred.
• Alert - Issue alerts when a snapshot or replication job fails or is aborted so that the problem can be diagnosed
quickly.
Replication works seamlessly across Hive and HDFS—you can set it up on files or directories in HDFS and on tables in
Hive—without manual translation of Hive datasets to HDFS datasets, or vice versa. Hive metastore information is also
replicated, so applications that depend on table definitions stored in Hive will work correctly on both the replica side
and the source side as table definitions are updated.
Replication is built on a hardened version of distcp. It uses the scalability and availability of MapReduce and YARN
to copy files in parallel, using a specialized MapReduce job or YARN application that runs diffs and transfers only
changed files from each mapper to the replica side. Files are selected for copying based on their size and checksums.
You can also perform a “dry run” to verify configuration and understand the cost of the overall operation before actually
copying the entire dataset.
Table 14:
See Ports for more information, including how to verify the current values for these ports.
Data Replication
Cloudera Manager enables you to replicate data across datacenters for disaster recovery scenarios. Replications can
include data stored in HDFS, data stored in Hive tables, Hive metastore data, and Impala metadata (catalog server
metadata) associated with Impala tables registered in the Hive metastore. When critical data is stored on HDFS, Cloudera
Manager helps to ensure that the data is available at all times, even in case of complete shutdown of a datacenter.
You can also use the HBase shell to replicate HBase data. (Cloudera Manager does not manage HBase replications.)
For recommendations on using data replication and Sentry authorization, see Configuring Sentry to Enable BDR
Replication.
The following sections describe license requirements and supported and unsupported replication scenarios.
Workaround for replicated data that includes a directory that contains several hundred thousand files or subdirectories:
1. On the destination Cloudera Manager instance, go to the HDFS service page.
2. Click the Configuration tab.
3. Select Scope > HDFS service name (Service-Wide) and Category > Advanced.
4. Locate the HDFS Replication Advanced Configuration Snippet property.
5. Increase the heap size by adding a key-value pair, for instance, HADOOP_CLIENT_OPTS=-Xmx1g. In this example,
1g sets the heap size to 1 GB. This value should be adjusted depending on the number of files and directories
being replicated.
6. Click Save Changes to commit the changes.
The Cloudera Manager Server that you are logged into is the destination for replications set up using that Cloudera
Manager instance. From the Admin Console of this destination Cloudera Manager instance, you can designate a peer
Cloudera Manager Server as a source of HDFS and Hive data for replication.
Note: If your cluster uses SAML Authentication, see Configuring Peers with SAML Authentication on
page 388 before configuring a peer.
1. Go to the Peers page by selecting Administration > Peers. If there are no existing peers, you will see only an Add
Peer button in addition to a short message. If peers already exist, they display in the Peers list.
2. Click the Add Peer button.
3. In the Add Peer dialog box, provide a name, the URL (including the port) of the Cloudera Manager Server source
for the data to be replicated, and the login credentials for that server.
Important: The role assigned to the login on the source server must be either a User Administrator
or a Full Administrator.
Cloudera recommends that TLS/SSL be used. A warning is shown if the URL scheme is http instead of https.
After configuring both peers to use TLS/SSL, add the remote source Cloudera Manager TLS/SSL certificate to the
local Cloudera Manager truststore, and vice versa. See Configuring TLS Encryption Only for Cloudera Manager.
4. Click the Add Peer button in the dialog box to create the peer relationship.
The peer is added to the Peers list. Cloudera Manager automatically tests the connection between the Cloudera
Manager Server and the peer. You can also click Test Connectivity to test the connection.
Modifying Peers
1. Go to the Peers page by selecting Administration > Peers. If there are no existing peers, you will see only an Add
Peer button in addition to a short message. If peers already exist, they display in the Peers list.
2. Do one of the following:
• Edit
1. In the row for the peer, select Edit.
2. Make your changes.
3. Click Update Peer to save your changes.
• Delete - In the row for the peer, click Delete.
HDFS Replication
Minimum Required Role: BDR Administrator (also provided by Full Administrator)
HDFS replication enables you to copy (replicate) your HDFS data from one HDFS service to another, synchronizing the
data set on the destination service with the data set on the source service, based on a specified replication schedule.
The destination service must be managed by the Cloudera Manager Server where the replication is being set up, and
the source service can be managed by that same server or by a peer Cloudera Manager Server. You can also replicate
HDFS data within a cluster by specifying different source and destination directories.
When you perform a replication, ensure that the source directory is not modified. A file added during replication does
not get replicated. If you delete a file during replication, the replication fails.
Important: To use HDFS replication, both the destination and source HDFS services must use Kerberos
authentication, or both must not use Kerberos authentication. See Enabling Replication Between
Clusters in Different Kerberos Realms on page 405.
<property>
<name>hadoop.proxyuser.hdfsdest.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfsdest.hosts</name>
<value>*</value>
</property>
Deploy the client configuration and restart all services on the source cluster.
3. If the source cluster is managed by a different Cloudera Manager server than the destination cluster, configure a
peer relationship.
4. Do one of the following:
1. Select Backup > Replications
2. Click Schedule HDFS Replication.
or
1. Select Clusters > HDFS Service Name.
2. Select Quick Links > Replication.
3. Click Schedule HDFS Replication.
The Create Replication dialog box displays.
5. Click the Source field and select the source HDFS service. You can select HDFS services managed by a peer Cloudera
Manager Server, or local HDFS services (managed by the Cloudera Manager Server for the Admin Console you are
logged into).
6. Enter the Path to the directory (or file) you want to replicate (the source).
7. Click the Destination field and select the destination HDFS service from the HDFS services managed by the Cloudera
Manager Server for the Admin Console you are logged into.
8. Enter the Path where the source files should be saved.
9. Select a Schedule:
• Immediate - Run the schedule Immediately.
• Once - Run the schedule one time in the future. Set the date and time.
• Recurring - Run the schedule periodically in the future. Set the date, time, and interval between runs.
10. Click the Add Exclusion link to exclude one or more paths from the replication.
The Regular Expression-Based Path Exclusion field displays, where you can enter a regular expression-based path.
Important: You must skip checksum checks to prevent replication failure due to
nonmatching checksums in the following cases:
• Replications from an encrypted zone on the source cluster to an encrypted zone
on a destination cluster.
• Replications from an encryption zone on the source cluster to an unencrypted zone
on the destination cluster.
• Replications from an unencrypted zone on the source cluster to an encrypted zone
on the destination cluster.
See Replication of Encrypted Data on page 407 and HDFS Transparent Encryption.
• Replication Strategy - Whether file replication tasks should be distributed among the mappers statically or
dynamically. (The default is Dynamic.) Static replication distributes file replication tasks among the mappers
up front to achieve a uniform distribution based on the file sizes. Dynamic replication distributes file replication
tasks in small sets to the mappers, and as each mapper completes its tasks, it dynamically acquires and
processes the next unallocated set of tasks. There are additional tuning options you can use to improve
performance when using the Dynamic strategy. See HDFS Replication Tuning on page 396.
• Delete Policy - Whether files that were deleted on the source should also be deleted from the destination
directory. This policy also determines the handling of files in the destination location that are unrelated to
the source. Options include:
– Keep Deleted Files - Retains the destination files even when they no longer exist at the source. (This is
the default.)
– Delete to Trash - If the HDFS trash is enabled, files are moved to the trash folder.
– Delete Permanently - Uses the least amount of space; use with caution.
• Preserve - Whether to preserve the block size, replication count, permissions (including ACLs), and extended
attributes (XAttrs) as they exist on the source file system, or to use the settings as configured on the destination
file system. By default source system settings are preserved. When Permission is checked, and both the
source and destination clusters support ACLs, replication preserves ACLs. Otherwise, ACLs are not replicated.
When Extended attributes is checked, and both the source and destination clusters support extended
attributes, replication preserves them.
• Alerts - Whether to generate alerts for various state changes in the replication workflow. You can alert on
failure, on start, on success, or when the replication workflow is aborted.
12. Click Save Schedule.
The replication task now appears as a row in the Replications Schedule table. (It can take up to 15 seconds for
the task to appear.)
Note: If your replication job takes a long time to complete, and files change before the replication
finishes, the replication may fail. Consider making the directories snapshottable, so that the replication
job creates snapshots of the directories before copying the files and then copies files from these
snapshottable directories when executing the replication. See Using Snapshots with Replication on
page 405.
HOST_WHITELIST=host-1.mycompany.com,host-2.mycompany.com
Only one job corresponding to a replication schedule can occur at a time; if another job associated with that same
replication schedule starts before the previous one has finished, the second one is canceled.
You can limit the replication jobs that are displayed by selecting filters on the left. If you do not see an expected
schedule, adjust or clear the filters. Use the search box to search the list of schedules for path, database, or table
names.
The Replication Schedules columns are described in the following table.
Column Description
ID An internally generated ID number that identifies the schedule. Provides a convenient way to
identify a schedule.
Click the ID column label to sort the replication schedule table by ID.
Last Run The date and time when the replication last ran. Displays None if the scheduled replication has
not yet been run. Click the date and time link to view the Replication History page for the
replication.
Displays one of the following icons:
• - Successful. Displays the date and time of the last run replication.
• - Failed. Displays the date and time of a failed replication.
• - None. This scheduled replication has not yet run.
• - Running. Displays a spinner and bar showing the progress of the replication.
Click the Last Run column label to sort the Replication Schedules table by the last run date.
Next Run The date and time when the next replication is scheduled, based on the schedule parameters
specified for the schedule. Hover over the date to view additional details about the scheduled
replication.
Click the Next Run column label to sort the Replication Schedules table by the next run date.
Actions The following items are available from the Action button:
• Show History - Opens the Replication History page for a replication. See Viewing Replication
History on page 394.
• Edit Configuration - Opens the Edit Replication Schedule page.
• Dry Run - Simulates a run of the replication task but does not actually copy any files or
tables. After a Dry Run, you can select Show History, which opens the Replication History
page where you can view any error messages and the number and size of files or tables
that would be copied in an actual replication.
• Click Collect Diagnostic Data to open the Send Diagnostic Data screen, which allows you
to collect replication-specific diagnostic data for the last 10 runs of the schedule:
1. Select Send Diagnostic Data to Cloudera to automatically send the bundle to Cloudera
Support. You can also enter a ticket number and comments when sending the bundle.
2. Click Collect and Send Diagnostic Data to generate the bundle and open the
Replications Diagnostics Command screen.
3. When the command finishes, click Download Result Data to download a zip file
containing the bundle.
• Run Now - Runs the replication task immediately.
• Disable | Enable - Disables or enables the replication schedule. No further replications are
scheduled for disabled replication schedules.
• Delete - Deletes the schedule. Deleting a replication schedule does not delete copied files
or tables.
• While a job is in progress, the Last Run column displays a spinner and progress bar, and each stage of the replication
task is indicated in the message beneath the job's row. Click the Command Details link to view details about the
execution of the command.
• If the job is successful, the number of files copied is indicated. If there have been no changes to a file at the source
since the previous job, then that file is not copied. As a result, after the initial job, only a subset of the files may
actually be copied, and this is indicated in the success message.
• If the job fails, the icon displays.
• To view more information about a completed job, select Actions > Show History. See Viewing Replication History
on page 394.
The Replication History page displays a table of previously run replication jobs with the following columns:
Column Description
Start Time Time when the replication job started.
Click to expand the display and show details of the replication. In this screen, you can:
• Click the View link to open the Command Details page, which displays details and
messages about each step in the execution of the command. Click to expand the display
for a Step to:
– View the actual command string.
– View the Start time and duration of the command.
– Click the Context link to view the service status page relevant to the command.
– Select one of the tabs to view the Role Log, stdout, and stderr for the command.
See Viewing Running and Recent Commands.
• Click Collect Diagnostic Data to open the Send Diagnostic Data screen, which allows
you to collect replication-specific diagnostic data for this run of the schedule:
1. Select Send Diagnostic Data to Cloudera to automatically send the bundle to
Cloudera Support. You can also enter a ticket number and comments when sending
the bundle.
2. Click Collect and Send Diagnostic Data to generate the bundle and open the
Replications Diagnostics Command screen.
3. When the command finishes, click Download Result Data to download a zip file
containing the bundle.
• (HDFS only) Link to view details on the MapReduce Job used for the replication. See
Viewing and Filtering MapReduce Activities.
• (Dry Run only) View the number of Replicable Files. Displays the number of files that
would be replicated during an actual replication.
• (Dry Run only) View the number of Replicable Bytes. Displays the number of bytes that
would be replicated during an actual replication.
• Link to download a CSV file containing a Replication Report. This file lists the databases
and tables that were replicated.
• View the number of Errors that occurred during the replication.
• View the number of Impala UDFs replicated. (Displays only for Hive replications where
Replicate Impala Metadata is selected.)
• Click the link to download a CSV file containing a Download Listing. This file lists the files
and directories that were replicated.
• Click the link to download a CSV file containing Download Status.
• If a user was specified in the Run as field when creating the replication job, the selected
user displays.
• View messages returned from the replication job.
Column Description
Files Skipped Number of files skipped during the replication. The replication process skips files that already
exist in the destination and have not changed.
Note: These configurations apply to all HDFS replication jobs, and might not improve performance
for replications with different total file counts.
You can configure the number of chunks that will be generated by setting the following base parameters:
distcp.dynamic.max.chunks.ideal
The “ideal” chunk count. Identifies the goal for how many chunks need to be configured. (The default value is 100.)
distcp.dynamic.min.records_per_chunk
The minimum number of records per chunk. Ensures that each chunk is at least a certain size. (The default value is 5.)
The distcp.dynamic.max.chunks.ideal parameter is the most important, and controls how many chunks are
generated in the general case. The distcp.dynamic.min.records_per_chunk parameter identifies the minimum
number of records that are packed into a chunk. This effectively lowers the number of chunks created if the computed
records per chunk falls below this number.
In addition, there are two other parameters with default values, which validate the settings in the base parameters:
distcp.dynamic.max.chunks.tolerable
The maximum chunk count. Set to a value that is greater than or equal to the value set for
distcp.dynamic.max.chunks.ideal. Identifies the maximum number of chunks that are allowed to be
generated. An error condition is triggered if your configuration causes this value to be exceeded. (The default value
is 400.)
distcp.dynamic.split.ratio
Validates that each mapper has at least a certain number of chunks. (The default value is 2.)
Consider the following examples:
• If you use the default values for all parameters, and you have 500 files in your replication and 20 mappers, the
replication job generates 100 chunks with 5 files in each chunk (and 5 chunks per mapper, on average).
• If you have only 200 total files, the replication job automatically packages them into 40 chunks to satisfy the
requirement defined by the distcp.dynamic.min.records_per_chunk parameter.
• If you used 250 mappers, the default of 2 chunks per mapper set by the distcp.dynamic.split.ratio parameter
would require 500 total chunks. This exceeds the maximum of 400 set by the
distcp.dynamic.max.chunks.tolerable parameter and triggers an error.
To configure chunking:
1. Open the Cloudera Manager Admin Console for the destination cluster and go to the MapReduce service for your
cluster.
2. Click the Configuration tab.
3. Search for the HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml property.
4. Click to add each of the following new properties:
• distcp.dynamic.max.chunks.ideal
Set the ideal number for the total chunks generated. (Default value is 100.)
• distcp.dynamic.max.chunks.tolerable
Set the upper limit for how many chunks are generated. (Default value is 400.)
5. Set the value of the properties to a value greater than 10,000. Tune these properties so that the number of files
per chunk is greater than the value set for the distcp.dynamic.min.records_per_chunk property. Set each
property to the same value.
For example:
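The safety valve content would look similar to the following sketch in XML form; the value 15000 is illustrative and
simply satisfies the guidance in step 5 (both properties greater than 10,000 and set to the same value):
<property>
  <name>distcp.dynamic.max.chunks.ideal</name>
  <value>15000</value>
</property>
<property>
  <name>distcp.dynamic.max.chunks.tolerable</name>
  <value>15000</value>
</property>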
Hive Replication
Minimum Required Role: BDR Administrator (also provided by Full Administrator)
Hive replication enables you to copy (replicate) your Hive metastore and data from one cluster to another and
synchronize the Hive metastore and data set on the destination cluster with the source, based on a specified replication
schedule. The destination cluster must be managed by the Cloudera Manager Server where the replication is being
set up, and the source cluster can be managed by that same server or by a peer Cloudera Manager Server.
Note:
If you configured Synchronizing HDFS ACLs and Sentry Permissions on the target cluster for the directory
where HDFS data is copied during Hive replication, the permissions that were copied during replication
are overwritten by the HDFS ACL synchronization and are not preserved.
Note: JARs for permanent UDFs are stored in HDFS at a user-defined location. If you are replicating
Hive data, you should also replicate this directory.
9. Use the Advanced Options section to specify an export location, modify the parameters of the MapReduce job
that will perform the replication, and set other options. You can select a MapReduce service (if there is more than
one in your cluster) and change the following parameters:
• Uncheck the Replicate HDFS Files checkbox to skip replicating the associated data files.
• Uncheck the Replicate Impala Metadata checkbox to skip replicating Impala metadata. (This option is checked
by default.) See Impala Metadata Replication on page 404.
• The Force Overwrite option, if checked, forces overwriting data in the destination metastore if incompatible
changes are detected. For example, if the destination metastore was modified, and a new partition was added
to a table, this option forces deletion of that partition, overwriting the table with the version found on the
source.
Important: If the Force Overwrite option is not set, and the Hive replication process detects
incompatible changes on the source cluster, Hive replication fails. This sometimes occurs
with recurring replications, where the metadata associated with an existing database or table
on the source cluster changes over time.
Note: In a Kerberized cluster, the HDFS principal on the source cluster must have read,
write, and execute access to the Export Path directory on the destination cluster.
• By default, Hive HDFS data files (for example, /user/hive/warehouse/db1/t1) are replicated to a location
relative to "/" (in this example, to /user/hive/warehouse/db1/t1). To override the default, enter a path
in the HDFS Destination Path field. For example, if you enter /ReplicatedData, the data files would be
replicated to /ReplicatedData/user/hive/warehouse/db1/t1.
• Select the MapReduce Service to use for this replication (if there is more than one in your cluster).
• To specify the user that should run the MapReduce job, use the Run As option. By default, MapReduce jobs
run as hdfs. To run the MapReduce job as a different user, enter the user name. If you are using Kerberos,
you must provide a user name here, and it must have an ID greater than 1000.
Note: If you are using different principals on the source and destination clusters, the user
running the MapReduce job should have read and execute permissions on the Hive
warehouse directory on the source cluster. If you configure the replication job to preserve
permissions, superuser privileges are required on the destination cluster.
• Scheduler Pool - The name of a resource pool. The value you enter is used by the MapReduce Service you
specified when Cloudera Manager executes the MapReduce job for the replication. The job specifies the
value using one of these properties:
– MapReduce - Fair scheduler: mapred.fairscheduler.pool
Note: You must be running as a superuser to preserve permissions. Use the "Run as" option
to ensure that is the case.
• Alerts - Whether to generate alerts for various state changes in the replication workflow. You can alert On
Failure, On Start, On Success, or On Abort (when the replication workflow is aborted).
10. Click Save Schedule.
The replication task appears as a row in the Replications Schedule table. See Viewing Replication Schedules on
page 400.
Note: If your replication job takes a long time to complete, and tables change before the replication
finishes, the replication may fail. Consider making the Hive Warehouse Directory and the directories
of any external tables snapshottable, so that the replication job creates snapshots of the directories
before copying the files. See Using Snapshots with Replication on page 405.
Only one job corresponding to a replication schedule can occur at a time; if another job associated with that same
replication schedule starts before the previous one has finished, the second one is canceled.
You can limit the replication jobs that are displayed by selecting filters on the left. If you do not see an expected
schedule, adjust or clear the filters. Use the search box to search the list of schedules for path, database, or table
names.
The Replication Schedules columns are described in the following table.
Column Description
ID An internally generated ID number that identifies the schedule. Provides a convenient way to
identify a schedule.
Click the ID column label to sort the replication schedule table by ID.
Last Run The date and time when the replication last ran. Displays None if the scheduled replication has
not yet been run. Click the date and time link to view the Replication History page for the
replication.
Displays one of the following icons:
• - Successful. Displays the date and time of the last run replication.
• - Failed. Displays the date and time of a failed replication.
• - None. This scheduled replication has not yet run.
• - Running. Displays a spinner and bar showing the progress of the replication.
Click the Last Run column label to sort the Replication Schedules table by the last run date.
Next Run The date and time when the next replication is scheduled, based on the schedule parameters
specified for the schedule. Hover over the date to view additional details about the scheduled
replication.
Click the Next Run column label to sort the Replication Schedules table by the next run date.
Actions The following items are available from the Action button:
• Show History - Opens the Replication History page for a replication. See Viewing Replication
History on page 394.
• Edit Configuration - Opens the Edit Replication Schedule page.
• Dry Run - Simulates a run of the replication task but does not actually copy any files or
tables. After a Dry Run, you can select Show History, which opens the Replication History
page where you can view any error messages and the number and size of files or tables
that would be copied in an actual replication.
• Click Collect Diagnostic Data to open the Send Diagnostic Data screen, which allows you
to collect replication-specific diagnostic data for the last 10 runs of the schedule:
1. Select Send Diagnostic Data to Cloudera to automatically send the bundle to Cloudera
Support. You can also enter a ticket number and comments when sending the bundle.
2. Click Collect and Send Diagnostic Data to generate the bundle and open the
Replications Diagnostics Command screen.
3. When the command finishes, click Download Result Data to download a zip file
containing the bundle.
• Run Now - Runs the replication task immediately.
• Disable | Enable - Disables or enables the replication schedule. No further replications are
scheduled for disabled replication schedules.
• Delete - Deletes the schedule. Deleting a replication schedule does not delete copied files
or tables.
• While a job is in progress, the Last Run column displays a spinner and progress bar, and each stage of the replication
task is indicated in the message beneath the job's row. Click the Command Details link to view details about the
execution of the command.
• If the job is successful, the number of files copied is indicated. If there have been no changes to a file at the source
since the previous job, then that file is not copied. As a result, after the initial job, only a subset of the files may
actually be copied, and this is indicated in the success message.
• If the job fails, the icon displays.
• To view more information about a completed job, select Actions > Show History. See Viewing Replication History
on page 394.
Enabling, Disabling, or Deleting A Replication Schedule
When you create a new replication schedule, it is automatically enabled. If you disable a replication schedule, it can
be re-enabled at a later time.
To enable, disable, or delete a replication schedule, do one of the following:
• 1. Click Actions > Enable|Disable|Delete in the row for a replication schedule.
-or-
• 1. Select one or more replication schedules in the table by clicking the check box in the left column of the
table.
2. Click Actions for Selected > Enable|Disable|Delete.
The Replication History page displays a table of previously run replication jobs with the following columns:
Column Description
Start Time Time when the replication job started.
Click to expand the display and show details of the replication. In this screen, you can:
• Click the View link to open the Command Details page, which displays details and
messages about each step in the execution of the command. Click to expand the display
for a Step to:
– View the actual command string.
– View the Start time and duration of the command.
– Click the Context link to view the service status page relevant to the command.
– Select one of the tabs to view the Role Log, stdout, and stderr for the command.
See Viewing Running and Recent Commands.
• Click Collect Diagnostic Data to open the Send Diagnostic Data screen, which allows
you to collect replication-specific diagnostic data for this run of the schedule:
1. Select Send Diagnostic Data to Cloudera to automatically send the bundle to
Cloudera Support. You can also enter a ticket number and comments when sending
the bundle.
2. Click Collect and Send Diagnostic Data to generate the bundle and open the
Replications Diagnostics Command screen.
3. When the command finishes, click Download Result Data to download a zip file
containing the bundle.
• (HDFS only) Link to view details on the MapReduce Job used for the replication. See
Viewing and Filtering MapReduce Activities.
• (Dry Run only) View the number of Replicable Files. Displays the number of files that
would be replicated during an actual replication.
• (Dry Run only) View the number of Replicable Bytes. Displays the number of bytes that
would be replicated during an actual replication.
• Link to download a CSV file containing a Replication Report. This file lists the databases
and tables that were replicated.
• View the number of Errors that occurred during the replication.
• View the number of Impala UDFs replicated. (Displays only for Hive replications where
Replicate Impala Metadata is selected.)
• Click the link to download a CSV file containing a Download Listing. This file lists the files
and directories that were replicated.
• Click the link to download a CSV file containing Download Status.
• If a user was specified in the Run as field when creating the replication job, the selected
user displays.
• View messages returned from the replication job.
When you select the Replicate Impala Metadata property, Impala UDFs (user-defined functions) will be available on
the target cluster, just as on the source cluster. As part of replicating the UDFs, the binaries in which they are defined
are also replicated.
If you are using external tables in Hive, also make the directories hosting any external tables not stored in the Hive
warehouse directory snapshottable.
Similarly, if you are using Impala and are replicating any Impala tables using Hive/Impala replication, ensure that the
storage locations for the tables and associated databases are also snapshottable. See Enabling and Disabling HDFS
Snapshots on page 429.
Note: If either the source or destination cluster is running Cloudera Manager 4.6 or higher, then both
clusters (source and destination) must be running 4.6 or higher. For example, cross-realm authentication
does not work if one cluster is running Cloudera Manager 4.5.x and one is running Cloudera Manager
4.6 or higher.
• You can use the same realm name if the clusters use the same KDC or different KDCs that are part of a unified
realm, for example where one KDC is the master and the other is a slave KDC.
•
Note: If you have multiple clusters that are used to segregate production and non-production
environments, this configuration could result in principals that have equal permissions in both
environments. Make sure that permissions are set appropriately for each type of environment.
Important: If the source and destination clusters are in the same realm but do not use the same KDC
or the KDCs are not part of a unified realm, the replication job will fail.
HDFS Replication
1. On the hosts in the destination cluster, ensure that the krb5.conf file (typically located at /etc/krb5.conf)
on each host has the following information:
• The kdc information for the source cluster's Kerberos realm. For example:
[realms]
SOURCE.MYCO.COM = {
kdc = src-kdc-1.src.myco.com:88
admin_server = src-kdc-1.src.myco.com:749
default_domain = src.myco.com
}
DEST.MYCO.COM = {
kdc = dest-kdc-1.dest.myco.com:88
admin_server = dest-kdc-1.dest.myco.com:749
default_domain = dest.myco.com
}
• Domain/host-to-realm mapping for the source cluster NameNode hosts. You configure these mappings in
the [domain_realm] section. For example, to map two realms named SRC.MYCO.COM and DEST.MYCO.COM,
to the domains of hosts named hostname.src.myco.com and hostname.dest.myco.com, make the
following mappings in the krb5.conf file:
[domain_realm]
.src.myco.com = SRC.MYCO.COM
src.myco.com = SRC.MYCO.COM
.dest.myco.com = DEST.MYCO.COM
dest.myco.com = DEST.MYCO.COM
2. On the destination cluster, use Cloudera Manager to add the realm of the source cluster to the Trusted Kerberos
Realms configuration property:
a. Go to the HDFS service.
b. Click the Configuration tab.
c. In the search field type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
d. Enter the source cluster realm.
e. Click Save Changes to commit the changes.
3. If your Cloudera Manager release is 5.0.1 or lower, restart the JobTracker to enable it to pick up the new Trusted
Kerberos Realm settings. Failure to restart the JobTracker prior to the first replication attempt may cause the
JobTracker to fail.
Hive Replication
1. Perform the procedure described in the previous section, including restarting the JobTracker.
2. On the hosts in the source cluster, ensure that the krb5.conf file on each host has the following information:
• The kdc information for the destination cluster's Kerberos realm.
• Domain/host-to-realm mapping for the destination cluster NameNode hosts.
3. On the source cluster, use Cloudera Manager to add the realm of the destination cluster to the Trusted Kerberos
Realms configuration property.
a. Go to the HDFS service.
b. Click the Configuration tab.
c. In the search field type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
d. Enter the destination cluster realm.
e. Click Save Changes to commit the changes.
It is not necessary to restart any services on the source cluster.
Note: Regardless of whether HDFS encryption is in use, you must always use TLS/SSL to encrypt data
during replication.
A source directory and destination directory may or may not be in an encryption zone. If the destination directory is
in an encryption zone, the data on the destination directory is encrypted. If the destination directory is not in an
encryption zone, the data on that directory is not encrypted, even if the source directory is in an encryption zone. For
more information about HDFS encryption zones, see HDFS Transparent Encryption. Encryption zones are not supported
in CDH versions 5.1 or lower.
Important: When you configure HDFS replication, you must select the Skip Checksum Checks property
to prevent replication failure in the following cases:
• Replications from an encrypted zone on the source cluster to an encrypted zone on a destination
cluster.
• Replications from an encryption zone on the source cluster to an unencrypted zone on the
destination cluster.
• Replications from an unencrypted zone on the source cluster to an encrypted zone on the
destination cluster.
Even when the source and destination directories are both in encryption zones, the data is decrypted as it is read from
the source cluster (using the key for the source encryption zone) and encrypted again when it is written to the destination
cluster (using the key for the destination encryption zone). By default, it is transmitted as plain text.
During replication, data travels from the source cluster to the destination cluster using distcp. To encrypt data
transmission between the source and destination using TLS/SSL:
• Enable TLS/SSL for HDFS clients on both the source and the destination clusters. For instructions, see Configuring
TLS/SSL for HDFS. You may also need to configure trust between the SSL certificates on the source and destination.
• Enable TLS/SSL for the two peer Cloudera Manager Servers as described here: Configuring TLS Encryption Only
for Cloudera Manager.
• Cloudera recommends you also enable TLS/SSL communication between the Cloudera Manager Server and Agents.
See Configuring TLS Security for Cloudera Manager for instructions.
HBase Replication
If your data is already in an HBase cluster, replication is useful for getting the data into additional HBase clusters. In
HBase, cluster replication refers to keeping one cluster state synchronized with that of another cluster, using the
write-ahead log (WAL) of the source cluster to propagate the changes. Replication is enabled at column family granularity.
Before enabling replication for a column family, create the table and all column families to be replicated, on the
destination cluster.
Cluster replication uses an active-push methodology. An HBase cluster can be a source (also called active, meaning
that it writes new data), a destination (also called passive, meaning that it receives data using replication), or can fulfill
both roles at once. Replication is asynchronous, and the goal of replication is eventual consistency.
When data is replicated from one cluster to another, the original source of the data is tracked with a cluster ID, which
is part of the metadata. In CDH 5, all clusters that have already consumed the data are also tracked. This prevents
replication loops.
At the top of the diagram, the San Jose and Tokyo clusters, shown in red, replicate changes to each other, and each
also replicates changes to a User Data and a Payment Data cluster.
Each cluster in the second row, shown in blue, replicates its changes to the All Data Backup 1 cluster, shown in
grey. The All Data Backup 1 cluster replicates changes to the All Data Backup 2 cluster (also shown in grey),
as well as the Data Analysis cluster (shown in green). All Data Backup 2 also propagates any of its own changes
back to All Data Backup 1.
The Data Analysis cluster runs MapReduce jobs on its data, and then pushes the processed data back to the San
Jose and Tokyo clusters.
Requirements
Before configuring replication, make sure your environment meets the following requirements:
• You must manage ZooKeeper yourself. It must not be managed by HBase, and must be available throughout the
deployment.
• Each host in both clusters must be able to reach every other host, including those in the ZooKeeper cluster.
• Both clusters must be running the same major version of CDH; for example CDH 4 or CDH 5.
• Every table that contains families that are scoped for replication must exist on each cluster and have exactly the
same name. If your tables do not yet exist on the destination cluster, see Creating the Empty Table On the
Destination Cluster on page 411.
• HBase version 0.92 or greater is required for complex replication topologies, such as active-active.
Important: To run replication-related HBase commands, your user must have HBase administrator
permissions. If ZooKeeper uses Kerberos, configure HBase Shell to authenticate to ZooKeeper using
Kerberos before attempting to run replication-related commands. There are currently no
replication-related ACLs.
$ kinit -k -t /etc/hbase/conf/hbase.keytab
hbase/[email protected]
5. On the source cluster, in HBase Shell, add the destination cluster as a peer, using the add_peer command. The
syntax is as follows:
add_peer 'ID', 'CLUSTER_KEY'
The ID must be a short integer. To compose the CLUSTER_KEY, use the following template:
hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
If both clusters use the same ZooKeeper cluster, you must use a different zookeeper.znode.parent, because they
cannot write in the same folder.
6. On the source cluster, configure each column family to be replicated by setting its REPLICATION_SCOPE to 1,
using HBase Shell commands such as those shown in the sketch after this procedure.
7. Verify that replication is occurring by examining the logs on the source cluster for messages such as the following.
8. To verify the validity of replicated data, use the included VerifyReplication MapReduce job on the source
cluster, providing it with the ID of the replication peer and table name to verify. Other options are available, such
as a time range or specific families to verify.
The command has the following form:
hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication
[--starttime=timestamp1] [--stoptime=timestamp] [--families=comma separated list of
families] <peerId> <tablename>
The VerifyReplication command prints GOODROWS and BADROWS counters to indicate rows that did and did
not replicate correctly.
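The following HBase Shell sketch combines steps 5 and 6; the peer ID, ZooKeeper quorum, table, and column family
names are illustrative placeholders:
add_peer '1', 'zk1.dest.example.com,zk2.dest.example.com,zk3.dest.example.com:2181:/hbase'
disable 'my_table'
alter 'my_table', {NAME => 'cf1', REPLICATION_SCOPE => 1}
enable 'my_table'
Disabling and re-enabling the table around the alter is shown for safety; depending on your HBase version, the column
family can also be altered online.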
Note: This log accumulation is a powerful side effect of the disable_peer command and can be
used to your advantage. See Initiating Replication When Data Already Exists on page 412.
To re-enable the peer, use the command enable_peer("<peerID>"). Replication resumes where it was stopped.
Examples:
• To disable peer 1:
disable_peer("1")
• To re-enable peer 1:
enable_peer("1")
If you disable replication, and then later decide to enable it again, you must manually remove the old replication data
from ZooKeeper by deleting the contents of the replication queue within the /hbase/replication/rs/ znode. If
you fail to do so, and you re-enable replication, the source cluster cannot reassign previously-replicated regions. Instead,
you will see logged errors such as the following:
Won't transfer the queue, another RS took care of it because of: KeeperErrorCode
= NoNode for
/hbase/replication/rs/c856fqz.example.com,60020,1426225601879/lock
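One way to do this is with the ZooKeeper CLI bundled with HBase; the following is only a sketch, the child znode name
is a placeholder taken from the ls output, and you should confirm the paths for your deployment before deleting anything:
$ hbase zkcli
ls /hbase/replication/rs
rmr /hbase/replication/rs/<regionserver-znode-from-ls-output>
quit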
hbase(main):001:0> stop_replication
Already queued edits will be replicated after you use the disable_table_replication command, but new entries
will not. See Understanding How WAL Rolling Affects Replication on page 412.
To start replication again, use the enable_peer command.
CREATE cme_users
3. On the destination cluster, paste the command from the previous step into HBase Shell to create the table.
Follow these instructions to recover HBase data from a replicated cluster in a disaster recovery scenario.
1. Change the value of the column family property REPLICATION_SCOPE on the sink to 0 for each column to be
restored, so that its data will not be replicated during the restore operation.
2. Change the value of the column family property REPLICATION_SCOPE on the source to 1 for each column to be
restored, so that its data will be replicated.
3. Use the CopyTable or distcp commands to import the data from the backup to the sink cluster, as outlined in
Initiating Replication When Data Already Exists on page 412.
4. Add the sink as a replication peer to the source, using the add_peer command as discussed in Deploying HBase
Replication on page 409. If you used distcp in the previous step, restart or rolling restart both clusters, so that
the RegionServers will pick up the new files. If you used CopyTable, you do not need to restart the clusters. New
data will be replicated as it is written.
5. When restoration is complete, change the REPLICATION_SCOPE values back to their values before initiating the
restoration.
Replication Caveats
• Two variables govern replication: hbase.replication as described above under Deploying HBase Replication
on page 409, and a replication znode. Stopping replication (using stop_replication as above) sets the znode
to false. Two problems can result:
– If you add a new RegionServer to the active cluster while replication is stopped, its current log will not be
added to the replication queue, because the replication znode is still set to false. If you restart replication
at this point (using enable_peer), entries in the log will not be replicated.
– Similarly, if a log rolls on an existing RegionServer on the active cluster while replication is stopped, the new
log will not be replicated, because the replication znode was set to false when the new log was created.
• In the case of a long-running, write-intensive workload, the destination cluster may become unresponsive if its
meta-handlers are blocked while performing the replication. CDH 5 provides three properties to deal with this
problem:
– hbase.regionserver.replication.handler.count - the number of replication handlers in the
destination cluster (default is 3). Replication is now handled by separate handlers in the destination cluster
to avoid the above-mentioned sluggishness. Increase it to a high value if the ratio of active to passive
RegionServers is high.
– replication.sink.client.retries.number - the number of times the HBase replication client at the
sink cluster should retry writing the WAL entries (default is 1).
– replication.sink.client.ops.timeout - the timeout for the HBase replication client at the sink cluster
(default is 20 seconds).
• For namespaces, tables, column families, or cells with associated ACLs, the ACLs themselves are not replicated.
The ACLs need to be re-created manually on the target table. This means that the ACLs could differ between the
source and destination clusters.
Snapshots
You can create HBase and HDFS snapshots using Cloudera Manager or by using the command line.
• HBase snapshots allow you to create point-in-time backups of tables without making data copies, and with minimal
impact on RegionServers. HBase snapshots are supported for clusters running CDH 4.2 or higher.
• HDFS snapshots allow you to create point-in-time backups of directories or the entire filesystem without actually
cloning the data. They can improve data replication performance and prevent errors caused by changes to a source
directory. These snapshots appear on the filesystem as read-only directories that can be accessed just like other
ordinary directories. HDFS snapshots are supported for clusters running CDH 5 or higher. CDH 4 does not support
snapshots for HDFS.
Note: You can improve the reliability of Data Replication on page 385 by also using snapshots. See
Using Snapshots with Replication on page 405.
Note: You must enable an HDFS directory for snapshots to allow snapshot policies to be created for
that directory. To designate an HDFS directory as snapshottable, follow the procedure in Enabling and
Disabling HDFS Snapshots on page 429.
Each time unit in the schedule information is shared with the time units of larger granularity. That is, the minute
value is shared by all the selected schedules, hour by all the schedules for which hour is applicable, and so on. For
example, if you specify that hourly snapshots are taken at the half hour, and daily snapshots taken at the hour
20, the daily snapshot will occur at 20:30.
To select an interval, check its box. Fields display where you can edit the time and number of snapshots to keep.
For example:
7. Specify whether Alerts should be generated for various state changes in the snapshot workflow. You can alert on
failure, on start, on success, or when the snapshot workflow is aborted.
8. Click Save Policy.
The new Policy displays on the Snapshot Policies page. See Snapshot Policies Page on page 415.
Column Description
Policy Name The name of the policy.
Cluster The cluster that hosts the service (HDFS or HBase).
Service The service from which the snapshot is taken.
Objects HDFS Snapshots: The directories included in the snapshot.
HBase Snapshots: The tables included in the snapshot.
Last Run The date and time the snapshot last ran. Click the link to view the Snapshots History page. Also
displays the status icon for the last run.
Snapshot Schedule The type of schedule defined for the snapshot: Hourly, Daily, Weekly, Monthly, or Yearly.
Actions A drop-down menu with the following options:
• Show History - Opens the Snapshots History page. See Snapshots History on page 416.
• Edit Configuration - Edit the snapshot policy.
• Delete - Deletes the snapshot policy.
• Enable - Enables running of scheduled snapshot jobs.
• Disable - Disables running of scheduled snapshot jobs.
Snapshots History
The Snapshots History page displays information about Snapshot jobs that have been run or attempted. The page
displays a table of Snapshot jobs with the following columns:
Column Description
Start Time Time when the snapshot job started execution.
Click to display details about the snapshot. For example:
Click the View link to open the Managed scheduled snapshots Command page, which displays details
and messages about each step in the execution of the command. For example:
Paths | Tables Unprocessed HDFS Snapshots: the number of Paths Unprocessed for the snapshot.
HBase Snapshots: the number of Tables Unprocessed for the snapshot.
Snapshots Created Number of snapshots created.
Snapshots Deleted Number of snapshots deleted.
Errors During Creation Errors that occurred when creating the snapshot. View details about these errors by expanding
the display to include details of the snapshot.
Errors During Deletion Errors that occurred when deleting snapshots. View details about these errors by expanding
the display to include details of the snapshot.
See Managing HDFS Snapshots on page 428 and Managing HBase Snapshots on page 417 for more information about
managing snapshots.
Orphaned Snapshots
When a snapshot policy includes a limit on the number of snapshots to keep, Cloudera Manager checks the total
number of stored snapshots each time a new snapshot is added, and automatically deletes the oldest existing snapshot
if necessary. When a snapshot policy is edited or deleted, files, directories, or tables that were removed from the policy
may leave "orphaned" snapshots behind that are not deleted automatically because they are no longer associated
with a current snapshot policy. Cloudera Manager never selects these snapshots for automatic deletion because
selection for deletion only occurs when the policy creates a new snapshot containing those files, directories, or tables.
You can delete snapshots manually through Cloudera Manager or by creating a command-line script that uses the
HDFS or HBase snapshot commands. Orphaned snapshots can be hard to locate for manual deletion. Snapshots created
by a snapshot policy automatically receive the prefix cm-auto followed by a globally unique identifier (GUID). You can
locate all snapshots for a specific policy by searching for the cm-auto-guid prefix that is unique to that policy.
To avoid orphaned snapshots, delete snapshots before editing or deleting the associated snapshot policy, or record
the identifying name for the snapshots you want to delete. This prefix is displayed in the summary of the policy in the
policy list and appears in the delete dialog box. Recording the snapshot names, including the associated policy prefix,
is necessary because the prefix associated with a policy cannot be determined after the policy has been deleted, and
snapshot names do not contain recognizable references to snapshot policies.
Warning: If you use coprocessors, the coprocessor must be available on the destination cluster before
restoring the snapshot.
To restore a snapshot to a new table, select Restore As from the menu associated with the snapshot, and provide a
name for the new table.
Warning: If you "Restore As" to an existing table (that is, specify a table name that already exists),
the existing table will be overwritten.
Note: When HBase snapshots are stored on, or restored from, Amazon S3, a MapReduce (MRv2) job
is created to copy the HBase table data and metadata. The YARN service must be running on your
Cloudera Manager cluster to use this feature.
To configure HBase to store snapshots on Amazon S3, you must have the following information:
1. The access key ID for your Amazon S3 account.
2. The secret access key for your Amazon S3 account.
3. The path to the directory in Amazon S3 where you want your HBase snapshots to be stored.
Configuring HBase in Cloudera Manager to Store Snapshots in Amazon S3
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Perform the following steps in Cloudera Manager:
1. Open the HBase service page.
2. Select Scope > HBASE (Service-Wide).
3. Select Category > Backup.
4. Type AWS in the Search box.
5. Enter your Amazon S3 access key ID in the field AWS S3 access key ID for remote snapshots.
6. Enter your Amazon S3 secret access key in the field AWS S3 secret access key for remote snapshots.
7. Enter the path to the location in Amazon S3 where your HBase snapshots will be stored in the field AWS S3 path
for remote snapshots.
Warning: Do not use the Amazon S3 location defined by the path entered in AWS S3 path for
remote snapshots for any other purpose, or directly add or delete content there. Doing so risks
corrupting the metadata associated with the HBase snapshots stored there. Use this path and
Amazon S3 location only through Cloudera Manager, and only for managing HBase snapshots.
8. In a terminal window, log in to your Cloudera Manager cluster at the command line and create a /user/hbase
directory in HDFS. Change the owner of the directory to hbase. For example:
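A sketch of the commands, assuming the hdfs superuser can create the directory (adjust for Kerberos as needed):
$ sudo -u hdfs hdfs dfs -mkdir /user/hbase
$ sudo -u hdfs hdfs dfs -chown hbase /user/hbase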
Configuring the Dynamic Resource Pool Used for Exporting and Importing Snapshots in Amazon S3
Dynamic resource pools are used to control the resources available for MapReduce jobs created for HBase snapshots
on Amazon S3. By default, MapReduce jobs run against the default dynamic resource pool. To choose a different
dynamic resource pool for HBase snapshots stored on Amazon S3, follow these steps:
1. Open the HBase service page.
2. Select Scope > HBASE (Service-Wide).
3. Select Category > Backup.
4. Type Scheduler in the Search box.
5. Enter name of a dynamic resource pool in the Scheduler pool for remote snapshots in AWS S3 property.
6. Click Save Changes.
HBase Snapshots on Amazon S3 with Kerberos Enabled
By default, when Kerberos is enabled, YARN does not allow the system user hbase to run MapReduce jobs. If Kerberos
is enabled on your cluster, perform the following steps:
1. Open the YARN service page in Cloudera Manager.
2. Select Scope > NodeManager.
3. Select Category > Security.
4. In the Allowed System Users property, click the + sign and add hbase to the list of allowed system users.
5. Click Save Changes.
6. Restart the YARN service.
Managing HBase Snapshots on Amazon S3 in Cloudera Manager
Minimum Required Role: BDR Administrator (also provided by Full Administrator)
To take HBase snapshots and store them on Amazon S3, perform the following steps:
1. On the HBase service page in Cloudera Manager, click the Table Browser tab.
2. Select a table in the Table Browser. If any recent local or remote snapshots already exist, they display on the right
side.
3. In the dropdown for the selected table, click Take Snapshot.
4. Enter a name in the Snapshot Name field of the Take Snapshot dialog box.
5. If Amazon S3 storage is configured as described above, the Take Snapshot dialog box Destination section shows
a choice of Local or Remote S3. Select Remote S3.
6. Click Take Snapshot.
While the Take Snapshot command is running, a local copy of the snapshot with a name beginning with cm-tmp
followed by an auto-generated filename is displayed in the Table Browser. This local copy is deleted as soon as
the remote snapshot has been stored in Amazon S3. If the command fails without being completed, the temporary
local snapshot might not be deleted. This copy can be manually deleted or kept as a valid local snapshot. To store
a current snapshot in Amazon S3, run the Take Snapshot command again, selecting Remote S3 as the Destination,
or use the HBase command-line tools to manually export the existing temporary local snapshot to Amazon S3.
3. Click Delete.
Restoring an HBase Snapshot from Amazon S3
To restore an HBase snapshot that is stored in Amazon S3:
1. Select the table in the Table Browser.
2. Click Restore Table.
3. Choose Remote S3 and select the table to restore.
4. Click Restore.
Cloudera Manager creates a local copy of the remote snapshot with a name beginning with cm-tmp followed by
an auto-generated filename, and uses that local copy to restore the table in HBase. Cloudera Manager then
automatically deletes the local copy. If the Restore command fails without completing, the temporary copy might
not be deleted and can be seen in the Table Browser. In that case, delete the local temporary copy manually and
re-run the Restore command to restore the table from Amazon S3.
Note: You can only configure a policy as Local or Remote S3 at the time the policy is created and
cannot change the setting later. If the setting is wrong, create a new policy.
When you create a snapshot based on a snapshot policy, a local copy of the snapshot is created with a name beginning
with cm-auto followed by an auto-generated filename. The temporary copy of the snapshot is displayed in the Table
Browser and is deleted as soon as the remote snapshot has been stored in Amazon S3. If the snapshot procedure fails
without being completed, the temporary local snapshot might not be deleted. This copy can be manually deleted or
kept as a valid local snapshot. To export the HBase snapshot to Amazon S3, use the HBase command-line tools to
manually export the existing temporary local snapshot to Amazon S3.
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
Use Cases
• Recovery from user or application errors
– Useful because it may be some time before the database administrator notices the error.
Note:
The database administrator needs to schedule the intervals at which to take and delete
snapshots. Use a script or management tool; HBase does not have this functionality.
– The database administrator may want to save a snapshot before a major application upgrade or change.
Note:
Snapshots are not primarily used for system upgrade protection because they do not roll
back binaries, and would not necessarily prevent bugs or errors in the system or the upgrade.
– Recovery cases:
– Roll back to previous snapshot and merge in reverted data.
– View previous snapshots and selectively merge them into production.
• Backup
– Capture a copy of the database and store it outside HBase for disaster recovery.
– Capture previous versions of data for compliance, regulation, and archiving.
– Export from a snapshot on a live system provides a more consistent view of HBase than CopyTable and
ExportTable.
• Offload work
– Capture, copy, and restore data to another site
– Export data to another cluster
Storage Considerations
Because hfiles are immutable, a snapshot consists of a reference to the files in the table at the moment the snapshot
is taken. No copies of the data are made during the snapshot operation, but copies may be made when a compaction
or deletion is triggered. In this case, if a snapshot has a reference to the files to be removed, the files are moved to an
archive folder, instead of being deleted. This allows the snapshot to be restored in full.
Because no copies are performed, multiple snapshots share the same hfiles, but for tables with many updates and
compactions, each snapshot could have a different set of hfiles.
Configuring and Enabling Snapshots
Snapshots are on by default; to disable them, set the hbase.snapshot.enabled property in hbase-site.xml to
false:
<property>
<name>hbase.snapshot.enabled</name>
<value>
false
</value>
</property>
To enable snapshots after you have disabled them, set hbase.snapshot.enabled to true.
Note:
If you have taken snapshots and then decide to disable snapshots, you must delete the snapshots
before restarting the HBase master; the HBase master will not start if snapshots are disabled and
snapshots exist.
Shell Commands
You can manage snapshots by using the HBase shell or the HBaseAdmin Java API.
The following table shows actions you can take from the shell.
#!/bin/bash
# Take a snapshot of the table passed as an argument
# Usage: snapshot_script.sh table_name
# Names the snapshot in the format snapshot-YYYYMMDD
DATE=`date +"%Y%m%d"`
# -n runs HBase Shell non-interactively so that a failed command produces a non-zero exit code
echo "snapshot '$1', 'snapshot-$DATE'" | hbase shell -n
exit $?
HBase Shell returns an exit code of 0 on success. A non-zero exit code indicates the possibility of failure, not a definite
failure. In the event of a reported failure, your script should check whether the snapshot was actually created before
taking the snapshot again.
Exporting a Snapshot to Another Cluster
You can export any snapshot from one cluster to another. Exporting the snapshot copies the table's hfiles, logs, and
the snapshot metadata, from the source cluster to the destination cluster. Specify the -copy-from option to copy
from a remote cluster to the local cluster or another remote cluster. If you do not specify the -copy-from option, the
hbase.rootdir in the HBase configuration is used, which means that you are exporting from the current cluster. You
must specify the -copy-to option, to specify the destination cluster.
Note: Snapshots must be enabled on the destination cluster. See Configuring and Enabling Snapshots
on page 422.
Warning: If you use coprocessors, the coprocessor must be available on the destination cluster before
restoring the snapshot.
The ExportSnapshot tool executes a MapReduce Job similar to distcp to copy files to the other cluster. It works
at file-system level, so the HBase cluster can be offline.
Run ExportSnapshot as the hbase user or the user that owns the files. If the user, group, or permissions need to
be different on the destination cluster than the source cluster, use the -chuser, -chgroup, or -chmod options as
in the second example below, or be sure the destination directory has the correct permissions. In the following examples,
replace the HDFS server path and port with the appropriate ones for your cluster.
To copy a snapshot called MySnapshot to an HBase cluster srv2 (hdfs://srv2:8020/hbase) using 16 mappers:
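The invocation would look similar to the following sketch, using the options described above:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to
hdfs://srv2:8020/hbase -mappers 16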
To export the snapshot and change the ownership of the files during the copy:
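Again a sketch; the user, group, and mode values are placeholders:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to
hdfs://srv2:8020/hbase -chuser MyUser -chgroup MyGroup -chmod 700 -mappers 16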
You can also use the Java -D option in many tools to specify MapReduce or other configuration properties. For example,
the following command copies MY_SNAPSHOT to hdfs://cluster2/hbase using groups of 10 hfiles per mapper:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot
-Dsnapshot.export.default.map.group=10 -snapshot MY_SNAPSHOT -copy-to
hdfs://cluster2/hbase
To specify a different name for the snapshot on the target cluster, use the -target option.
Restrictions
Warning:
Do not use merge in combination with snapshots. Merging two regions can cause data loss if
snapshots or cloned tables exist for this table.
The merge is likely to corrupt the snapshot and any tables cloned from the snapshot. If the table has
been restored from a snapshot, the merge may also corrupt the table. The snapshot may survive intact
if the regions being merged are not in the snapshot, and clones may survive if they do not share files
with the original table or snapshot. You can use the SnapshotInfo tool (see Information and Debugging
on page 427) to check the status of the snapshot. If the status is BROKEN, the snapshot is unusable.
Note: This restriction also applies to a rolling upgrade, which can be done only through Cloudera
Manager.
If you are using HBase Replication and you need to restore a snapshot:
Important:
Snapshot restore is an emergency tool; you need to disable the table and table replication to get to
an earlier state, and you may lose data in the process.
If you are using HBase Replication, the replicas will be out of sync when you restore a snapshot. If you need to restore
a snapshot, proceed as follows:
1. Disable the table that is the restore target, and stop the replication.
2. Remove the table from both the master and worker clusters.
3. Restore the snapshot on the master cluster.
4. Create the table on the worker cluster and use CopyTable to initialize it.
Note:
If this is not an emergency (for example, if you know exactly which rows you have lost), you can create
a clone from the snapshot and create a MapReduce job to copy the data that you have lost.
In this case, you do not need to stop replication or disable your main table.
Snapshot Failures
Region moves, splits, and other metadata actions that happen while a snapshot is in progress can cause the snapshot
to fail. The software detects and rejects corrupted snapshot attempts.
Information and Debugging
You can use the SnapshotInfo tool to get information about a snapshot, including status, files, disk usage, and
debugging information.
Examples:
Use the -h option to print usage instructions for the SnapshotInfo utility.
$ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -h
Usage: bin/hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo [options]
where [options] are:
-h|-help Show this help and exit.
-remote-dir Root directory that contains the snapshots.
-list-snapshots List all the available snapshots and exit.
-snapshot NAME Snapshot to examine.
-files Files and logs list.
-stats Files and logs stats.
-schema Describe the snapshotted table.
Use the -remote-dir option with the -list-snapshots option to list snapshots located on a remote system.
Use the -snapshot option with the -stats option to display additional statistics about a snapshot.
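A sketch of the invocation; the snapshot name is a placeholder:
$ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test-snapshot -stats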
1 HFiles (0 in archive), total size 1.0k (100.00% 1.0k shared with the source table)
Use the -schema option with the -snapshot option to display the schema of a snapshot.
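For example (the snapshot name is again a placeholder):
$ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test-snapshot -schema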
Table Descriptor
----------------------------------------
'test', {NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0',
COMPRESSION => 'GZ', VERSIONS => '1', TTL => 'FOREVER', MIN_VERSIONS => '0',
KEEP_DELETED_CELLS => 'false',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Use the -files option with the -snapshot option to list information about files contained in a snapshot.
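For example (illustrative snapshot name):
$ hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -snapshot test-table-snapshot -files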
Snapshot Files
----------------------------------------
52.4k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/bdf29c39da2a4f2b81889eb4f7b18107
(archive)
52.4k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/1e06029d0a2a4a709051b417aec88291
(archive)
86.8k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/506f601e14dc4c74a058be5843b99577
(archive)
52.4k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/5c7f6916ab724eacbcea218a713941c4
(archive)
293.4k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/aec5e33a6564441d9bd423e31fc93abb
(archive)
52.4k test-table/02ba3a0f8964669520cf96bb4e314c60/cf/97782b2fbf0743edaacd8fef06ba51e4
(archive)
6 HFiles (6 in archive), total size 589.7k (0.00% 0.0 shared with the source table)
0 Logs, total size 0.0
Note: Cloudera Manager does not support snapshot operations for HDFS paths with encryption-at-rest
enabled. This limitation applies only to Cloudera Manager and does not affect CDH command-line tools.
Note: Once you enable snapshots for a directory, you cannot enable snapshots on any of its
subdirectories. Snapshots can be taken only on directories that have snapshots enabled.
Taking Snapshots
Note: You can also schedule snapshots to occur regularly by creating a Snapshot Policy.
Deleting Snapshots
1. From the Clusters tab, select your CDH 5 HDFS service.
2. Go to the File Browser tab.
3. Go to the directory with the snapshot you want to delete.
4. In the list of snapshots, locate the snapshot you want to delete and click the drop-down menu next to it.
5. Select Delete.
Restoring Snapshots
1. From the Clusters tab, select your CDH 5 HDFS service.
2. Go to the File Browser tab.
3. Go to the directory you want to restore.
4. In the File Browser, click the drop-down menu next to the full file path (to the right of the file browser listings)
and select one of the following:
• Restore Directory From Snapshot
• Restore Directory From Snapshot As...
The Restore Snapshot screen displays.
5. If you selected Restore Directory From Snapshot As..., enter the username to apply when restoring the snapshot.
6. Select one of the following:
• Use HDFS 'copy' command - This option executes more slowly and does not require credentials in a secure
cluster. It copies the contents of the snapshot as a subdirectory or as files within the target directory.
• Use DistCp / MapReduce - This option executes more quickly and requires credentials (Run As) in secure
clusters. It merges the target directory with the contents of the source snapshot. When you select this option,
the following additional fields, which are similar to those available when configuring a replication, display
under More Options:
Important:
• If you use Cloudera Manager, do not use these command-line instructions.
• This information applies specifically to CDH 5.5.x. If you use a lower version of CDH, see the
documentation for that version located at Cloudera Documentation.
For information about managing snapshots using the command line, see HDFS Snapshots.
You can stop (for example, to perform maintenance on its host) or restart the Cloudera Manager Server without
affecting the other services running on your cluster. Statistics data used by activity monitoring and service monitoring
will continue to be collected during the time the server is down.
To stop the Cloudera Manager Server:
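A sketch assuming an init.d-managed installation (systemd-based systems use the equivalent systemctl command):
$ sudo service cloudera-scm-server stop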
Setting Description
HTTP Port for Admin Console Specify the HTTP port to use to access the Server using
the Admin Console.
HTTPS Port for Admin Console Specify the HTTPS port to use to access the Server using
the Admin Console.
Agent Port to connect to Server Specify the port for Agents to use to connect to the
Server.
Important:
• The Cloudera Manager version on the destination host must match the version on the source
host.
• Do not install the other components, such as CDH and databases.
3. Copy the entire content of /var/lib/cloudera-scm-server/ on the old host to that same path on the new
host. Ensure you preserve permissions and all file content.
4. If the database server is not available:
a. Install the database packages on the host that will host the restored database. This could be the same host
on which you have just installed Cloudera Manager or it could be a different host. If you used the embedded
PostgreSQL database, install the PostgreSQL package as described in Embedded PostgreSQL Database. If you
used an external MySQL, PostgreSQL, or Oracle database, reinstall following the instructions in Cloudera
Manager and Managed Service Datastores.
b. Restore the backed up databases to the new database installation.
5. Update /etc/cloudera-scm-server/db.properties with the database name, database instance name,
user name, and password.
6. In /etc/cloudera-scm-agent/config.ini on each host, update the server_host property to the new
hostname and restart the Agents (a sketch of this change follows these steps).
7. Start the Cloudera Manager Server. Cloudera Manager should resume functioning as it did before the failure.
Because you restored the database from the backup, the server should accept the running state of the Agents,
meaning it will not terminate any running processes.
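A minimal sketch of the Agent change described in step 6; the hostname is a placeholder:
# /etc/cloudera-scm-agent/config.ini
server_host=new-cm-server.example.com
$ sudo service cloudera-scm-agent restart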
The process is similar with secure clusters, though files in /etc/cloudera-scm-server must be restored in addition
to the database. See Cloudera Security.
Note: You can also view the Cloudera Manager Server log at
/var/log/cloudera-scm-server/cloudera-scm-server.log on the Server host.
2. Set the CMF_VAR environment variable in /etc/default/cloudera-scm-server to the new parent directory:
export CMF_VAR=/opt
3. Create log/cloudera-scm-server and run directories in the new parent directory and set the owner and
group of all directories to cloudera-scm. For example, if the new parent directory is /opt/, do the following:
$ sudo su
$ cd /opt
$ mkdir log
$ chown cloudera-scm:cloudera-scm log
$ mkdir /opt/log/cloudera-scm-server
$ chown cloudera-scm:cloudera-scm log/cloudera-scm-server
$ mkdir run
$ chown cloudera-scm:cloudera-scm run
cm_processes
To enable Cloudera Manager to run scripts in subdirectories of /var/run/cloudera-scm-agent (because /var/run
is mounted noexec in many Linux distributions), Cloudera Manager mounts a tmpfs, named cm_processes, for
process subdirectories.
A tmpfs defaults to a maximum size of 50% of physical RAM, but this space is not allocated until it is used, and tmpfs
is paged out to swap if there is memory pressure.
The lifecycle actions of cm_processes can be described by the following statements:
• Created when the Agent starts up for the first time with a new supervisord process.
• If it already exists without noexec, reused when the Agent is started using start and not recreated.
• Remounted if Agent is started using clean_restart.
• Unmounting and remounting cleans out the contents (since it is mounted as a tmpfs).
• Unmounted when the host is rebooted.
• Not unmounted when the Agent is stopped.
Starting Agents
To start Agents, the supervisord process, and all managed service processes, use one of the following commands:
• Start
• Restart
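The Start and Restart actions above typically correspond to the following commands, assuming the standard cloudera-scm-agent service name:
# RHEL-compatible 7 and higher (systemd):
sudo systemctl start cloudera-scm-agent
# RHEL-compatible 6 and lower, SLES, Ubuntu (SysV init):
sudo service cloudera-scm-agent start
# Substitute restart for start to restart the Agent instead.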
Warning: The hard_stop and hard_restart commands kill all running managed service processes
on the host(s) where the command is run.
To stop or restart Agents, the supervisord process, and all managed service processes, use one of the following
commands:
• Hard Stop
– RHEL-compatible 7 and higher:
• Hard Restart
– RHEL-compatible 7 and higher:
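On SysV init systems, the Hard Stop and Hard Restart actions typically map to the Agent init script's hard_stop and hard_restart commands; on systemd systems the same effect is usually achieved by stopping the supervisord unit before acting on the Agent. The cloudera-scm-supervisord unit name below is an assumption to verify against your release:
# SysV init (RHEL-compatible 6 and lower, SLES, Ubuntu):
sudo service cloudera-scm-agent hard_stop
sudo service cloudera-scm-agent hard_restart
# RHEL-compatible 7 and higher (systemd), assumed equivalent:
sudo systemctl stop cloudera-scm-supervisord.service
sudo systemctl restart cloudera-scm-agent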
Property Description
Send Agent Heartbeat Every The interval in seconds between each heartbeat that is sent from Cloudera
Manager Agents to the Cloudera Manager Server.
Default: 15 sec.
Set health status to Concerning if the Agent heartbeats fail The number of missed consecutive heartbeats
after which a Concerning health status is assigned to that Agent.
Default: 5.
Set health status to Bad if the Agent heartbeats fail The number of missed consecutive heartbeats after
which a Bad health status is assigned to that Agent.
Default: 10.
log_file The path to the Agent log file. If the Agent is started using the init.d script, early startup output
is also written to /var/log/cloudera-scm-agent/cloudera-scm-agent.out.
max_collection_wait_seconds Maximum time to wait for all metric collectors to finish collecting
data.
Default: 10 sec.
metrics_url_timeout_seconds Maximum time to wait when connecting to a local role's web server
to fetch metrics.
Default: 30 sec.
supervisord_port The supervisord port. A change takes effect the next time
supervisord is restarted (not when the Agent is restarted).
Default: 19001.
[JDBC] cloudera_mysql_connector_jar, cloudera_oracle_connector_jar, cloudera_postgresql_jdbc_jar
Location of JDBC drivers. See Cloudera Manager and Managed Service Datastores.
Default:
• MySQL - /usr/share/java/mysql-connector-java.jar
• Oracle - /usr/share/java/oracle-connector-java.jar
• PostgreSQL - /usr/share/cmf/lib/postgresql-version-build.jdbc4.jar
log_file=/opt/log/cloudera-scm-agent/cloudera-scm-agent.log
2. Create log/cloudera-scm-agent directories and set the owner and group to cloudera-scm. For example, if
the log is stored in /opt/log/cloudera-scm-agent, do the following:
$ sudo su
$ cd /opt
$ mkdir log
$ chown cloudera-scm:cloudera-scm log
$ mkdir /opt/log/cloudera-scm-agent
$ chown cloudera-scm:cloudera-scm log/cloudera-scm-agent
Changing Hostnames
Minimum Required Role: Full Administrator
Important:
• The process described here requires Cloudera Manager and cluster downtime.
• If any user-created scripts reference specific hostnames, those scripts must also be updated.
• Due to the length and complexity of the following procedure, changing cluster hostnames is not
recommended by Cloudera.
After you have installed Cloudera Manager and created a cluster, you may need to update the names of the hosts
running the Cloudera Manager Server or cluster services. To update a deployment with new hostnames, follow these
steps:
1. Verify if TLS/SSL certificates have been issued for any of the services and make sure to create new TLS/SSL certificates
in advance for services protected by TLS/SSL. See Configuring Encryption.
2. Export the Cloudera Manager configuration using one of the following methods:
• Open a browser and go to this URL http://cm_hostname:7180/api/api_version/cm/deployment.
Save the displayed configuration.
• From a terminal, type:
$ curl -u admin:admin http://cm_hostname:7180/api/api_version/cm/deployment >
cme-cm-export.json
where cm_hostname is the name of the Cloudera Manager host and api_version is the correct version of the API
for the version of Cloudera Manager you are using. For example,
http://tcdn5-1.ent.cloudera.com:7180/api/v11/cm/deployment.
3. Stop all services on the cluster.
4. Stop the Cloudera Management Service.
5. Stop the Cloudera Manager Server.
6. Stop the Cloudera Manager Agents on the hosts that will be having the hostname changed.
7. Back up the Cloudera Manager Server database using mysqldump, pg_dump, or another preferred backup utility.
Store the backup in a safe location.
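For example, a minimal backup sketch for the common database types, assuming the default database name scm and a database user with sufficient privileges (adjust host, port, and credentials to your deployment):
# MySQL/MariaDB
mysqldump -u scm -p --databases scm > cm-server-db-backup.sql
# PostgreSQL (the embedded PostgreSQL database typically listens on port 7432)
pg_dump -h localhost -p 7432 -U scm scm > cm-server-db-backup.sql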
8. Update names and principals:
a. Update the target hosts using standard per-OS and name service methods (/etc/hosts, DNS,
/etc/sysconfig/network, hostname, and so on). Ensure that you remove the old hostname.
b. If you are changing the hostname of the host running Cloudera Manager Server do the following:
a. Change the hostname per step 8.a.
b. Update the Cloudera Manager hostname in /etc/cloudera-scm-agent/config.ini on all Agents.
c. If the cluster is configured for Kerberos security, do the following:
a. Remove the old hostname cluster principals.
• If you are using an MIT KDC, remove old hostname cluster service principals from the KDC database
using one of the following:
– Use the delprinc command within kadmin.local interactive shell.
OR
– From the command line (see the sketch after this list):
Write the old-hostname cluster service principals to a file named cluster-princ.txt, then open
cluster-princ.txt and remove any non-cluster service principal entries. Make sure that the default
krbtgt principal, and any other principals you created or that Kerberos created by default, are not
in the file, because every principal remaining in the file is deleted when you run the following:
for i in `cat cluster-princ.txt`; do yes yes | kadmin.local -q "delprinc $i"; done
• For an Active Directory KDC, an AD administrator must manually delete the principals for the old
hostname from Active Directory.
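A minimal sketch of the MIT KDC command-line sequence referenced above, assuming kadmin.local is run as root on the KDC host and that cluster-princ.txt is populated with a listprincs query (review the file carefully before deleting anything):
# 1. Dump all principals to a file
kadmin.local -q "listprincs" > cluster-princ.txt
# 2. Edit cluster-princ.txt so that only the old-hostname cluster service principals remain;
#    remove krbtgt and any other principals that must be kept.
# 3. Delete every principal still listed in the file
for i in `cat cluster-princ.txt`; do yes yes | kadmin.local -q "delprinc $i"; done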
b. Start the Cloudera Manager database and Cloudera Manager Server.
c. Start the Cloudera Manager Agents on the newly renamed hosts. The Agents should show a current
heartbeat in Cloudera Manager.
d. Within the Cloudera Manager Admin Console click the Hosts tab.
e. Select the checkbox next to the host with the new name.
f. Select Actions > Regenerate Keytab.
9. If one of the hosts that was renamed has a NameNode configured with high availability and automatic failover
enabled, reconfigure the ZooKeeper Failover Controller znodes to reflect the new hostname.
a. Start ZooKeeper Servers.
Warning: All other services, most importantly HDFS and the ZooKeeper Failover
Controller (FC) role within the HDFS service, must not be running.
b. On one of the hosts that has a ZooKeeper Server role, run zookeeper-client.
a. If the cluster is configured for Kerberos security, configure ZooKeeper authorization as follows:
a. Go to the HDFS service.
b. Click the Instances tab.
c. Click the Failover Controller role.
d. Click the Process tab.
e. In the Configuration Files column of the hdfs/hdfs.sh ["zkfc"] program, expand Show.
f. Inspect core-site.xml in the displayed list of files and determine the value of the
ha.zookeeper.auth property, which will be something like:
digest:hdfs-fcs:TEbW2bgoODa96rO3ZTn7ND5fSOGx0h. The part after digest:hdfs-fcs:
is the password (in the example it is TEbW2bgoODa96rO3ZTn7ND5fSOGx0h)
g. Run the addauth command with the password:
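Using the password recovered from ha.zookeeper.auth, the addauth command in the zookeeper-client shell takes the form shown below; the example password from the step above is used here as a placeholder:
addauth digest hdfs-fcs:TEbW2bgoODa96rO3ZTn7ND5fSOGx0h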
Alerts
An alert is an event that is considered especially noteworthy and is triggered by a selected event. Alerts are shown
with a badge when they appear in a list of events. You can configure the Alert Publisher to send alert
notifications by email or by SNMP trap to a trap receiver.
Service instances of type HDFS, MapReduce, and HBase (and their associated roles) can generate alerts if so configured.
Alerts can also be configured for the monitoring roles that are a part of the Cloudera Management Service.
The settings to enable or disable specific alerts are found under the Configuration tab for the services to which they
pertain. See Configuring Alerts for more information on setting up alerting.
For information about configuring the Alert Publisher to send email or SNMP notifications for alerts, see Configuring
Alert Delivery.
Managing Alerts
Minimum Required Role: Full Administrator
The Administration > Alerts page provides a summary of the settings for alerts in your clusters.
Alert Type The left column lets you select by alert type (Health, Log, or Activity) and within that by service instance.
In the case of Health alerts, you can look at alerts for Hosts as well. You can select an individual service to see just the
alert settings for that service.
Health/Log/Activity Alert Settings Depending on your selection in the left column, the right-hand column shows you
the list of alerts that are enabled or disabled for the selected service type.
To change the alert settings for a service, click Edit next to the service name. This will take you to the Monitoring
section of the Configuration tab for the service. From here you can enable or disable alerts and configure thresholds
as needed.
Recipients You can also view the list of recipients configured for the enabled alerts.
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
You can configure the Alert Publisher to run a user-written script in response to an alert. The Alert Publisher passes a
single argument to the script that is a UTF-8 JSON file containing a list of alerts. The script runs on the host where the
Alert Publisher service is running and must have read and execute permissions for the cloudera-scm user. Only one
instance of a script runs at a time. The standard out and standard error messages from the script are logged to the
Alert Publisher log file.
You use the Alert Publisher: Maximum Batch Size and Alert Publisher: Maximum Batch Interval properties to configure
when the Alert Publisher delivers alerts. See Configuring Alerts.
To configure the Alert Publisher to deliver alerts using a script:
1. Save the script on the host where the Alert Publisher role is running.
2. Change the owner of the file to cloudera-scm and set its permissions to read and execute:
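For example, assuming the script is saved as /opt/cloudera/custom_alert.sh (the path is illustrative):
# Make the Alert Publisher's user the owner and allow it to read and execute the script
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/custom_alert.sh
sudo chmod u+rx /opt/cloudera/custom_alert.sh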
3. Open the Cloudera Manager Admin console and select Clusters > Cloudera Management Service.
4. Click the Configuration tab.
5. Select Scope > Alert Publisher.
6. Enter the path to the script in the Custom Alert Script property.
7. Click Save Changes to commit the changes.
Sample JSON Alert File
When a custom script runs, the Alert Publisher passes it a JSON file that contains the alerts. For example:
[ {
"body" : {
"alert" : {
"content" : "The health test result for MAPREDUCE_HA_JOB_TRACKER_HEALTH has become
bad: JobTracker summary: myCluster.com (Availability: Active, Health: Bad). This health
test reflects the health of the active JobTracker.",
"timestamp" : {
"iso8601" : "2015-06-11T03:52:56Z",
"epochMs" : 1433994776083
},
"source" :
"http://myCluster.com:7180/cmf/eventRedirect/89521139-0859-4bef-bf65-eb141e63dbba",
"attributes" : {
"__persist_timestamp" : [ "1433994776172" ],
"ALERT_SUPPRESSED" : [ "false" ],
"HEALTH_TEST_NAME" : [ "MAPREDUCE_HA_JOB_TRACKER_HEALTH" ],
"SEVERITY" : [ "CRITICAL" ],
"HEALTH_TEST_RESULTS" : [ {
"content" : "The health test result for MAPREDUCE_HA_JOB_TRACKER_HEALTH has
become bad: JobTracker summary: myCluster.com (Availability: Active, Health: Bad). This
health test reflects the health of the active JobTracker.",
"testName" : "MAPREDUCE_HA_JOB_TRACKER_HEALTH",
"eventCode" : "EV_SERVICE_HEALTH_CHECK_BAD",
"severity" : "CRITICAL"
} ],
"CLUSTER_DISPLAY_NAME" : [ "Cluster 1" ],
"ALERT" : [ "true" ],
"CATEGORY" : [ "HEALTH_CHECK" ],
"BAD_TEST_RESULTS" : [ "1" ],
"SERVICE_TYPE" : [ "MAPREDUCE" ],
"EVENTCODE" : [ "EV_SERVICE_HEALTH_CHECK_BAD", "EV_SERVICE_HEALTH_CHECK_GOOD"
],
"ALERT_SUMMARY" : [ "The health of service MAPREDUCE-1 has become bad." ],
"CLUSTER_ID" : [ "1" ],
"SERVICE" : [ "MAPREDUCE-1" ],
"__uuid" : [ "89521139-0859-4bef-bf65-eb141e63dbba" ],
"CLUSTER" : [ "Cluster 1" ],
"CURRENT_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test result
for MAPREDUCE_HA_JOB_TRACKER_HEALTH has become bad: JobTracker summary: myCluster.com
(Availability: Active, Health: Bad). This health test reflects the health of the active
JobTracker.\",\"testName\":\"MAPREDUCE_HA_JOB_TRACKER_HEALTH\",\"eventCode\":\"EV_SERVICE_HEALTH_CHECK_BAD\",\"severity\":\"CRITICAL\"}",
"{\"content\":\"The health test result for MAPREDUCE_TASK_TRACKERS_HEALTHY has become
good: Healthy TaskTracker: 3. Concerning TaskTracker: 0. Total TaskTracker: 3. Percent
healthy: 100.00%. Percent healthy or concerning:
100.00%.\",\"testName\":\"MAPREDUCE_TASK_TRACKERS_HEALTHY\",\"eventCode\":\"EV_SERVICE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}"
],
"PREVIOUS_HEALTH_SUMMARY" : [ "GREEN" ],
"CURRENT_HEALTH_SUMMARY" : [ "RED" ],
"MONITOR_STARTUP" : [ "false" ],
"PREVIOUS_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test
result for MAPREDUCE_HA_JOB_TRACKER_HEALTH has become good: JobTracker summary:
myCluster.com (Availability: Active, Health:
Good)\",\"testName\":\"MAPREDUCE_HA_JOB_TRACKER_HEALTH\",\"eventCode\":\"EV_SERVICE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for MAPREDUCE_TASK_TRACKERS_HEALTHY has become
good: Healthy TaskTracker: 3. Concerning TaskTracker: 0. Total TaskTracker: 3. Percent
healthy: 100.00%. Percent healthy or concerning:
100.00%.\",\"testName\":\"MAPREDUCE_TASK_TRACKERS_HEALTHY\",\"eventCode\":\"EV_SERVICE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}"
],
"SERVICE_DISPLAY_NAME" : [ "MAPREDUCE-1" ]
}
}
},
"header" : {
"type" : "alert",
"version" : 2
}
}, {
"body" : {
"alert" : {
"content" : "The health test result for JOB_TRACKER_SCM_HEALTH has become bad:
This role's process exited. This role is supposed to be started.",
"timestamp" : {
"iso8601" : "2015-06-11T03:52:56Z",
"epochMs" : 1433994776083
},
"source" :
"http://myCluster.com:7180/cmf/eventRedirect/67b4d1c4-791b-428e-a9ea-8a09d4885f5d",
"attributes" : {
"__persist_timestamp" : [ "1433994776173" ],
"ALERT_SUPPRESSED" : [ "false" ],
"HEALTH_TEST_NAME" : [ "JOB_TRACKER_SCM_HEALTH" ],
"SEVERITY" : [ "CRITICAL" ],
"ROLE" : [ "MAPREDUCE-1-JOBTRACKER-10624c438dee9f17211d3f33fa899957" ],
"HEALTH_TEST_RESULTS" : [ {
"content" : "The health test result for JOB_TRACKER_SCM_HEALTH has become bad:
This role's process exited. This role is supposed to be started.",
"testName" : "JOB_TRACKER_SCM_HEALTH",
"eventCode" : "EV_ROLE_HEALTH_CHECK_BAD",
"severity" : "CRITICAL"
} ],
"CLUSTER_DISPLAY_NAME" : [ "Cluster 1" ],
"HOST_IDS" : [ "75e763c2-8d22-47a1-8c80-501751ae0db7" ],
"ALERT" : [ "true" ],
"ROLE_TYPE" : [ "JOBTRACKER" ],
"CATEGORY" : [ "HEALTH_CHECK" ],
"BAD_TEST_RESULTS" : [ "1" ],
"SERVICE_TYPE" : [ "MAPREDUCE" ],
"EVENTCODE" : [ "EV_ROLE_HEALTH_CHECK_BAD", "EV_ROLE_HEALTH_CHECK_GOOD",
"EV_ROLE_HEALTH_CHECK_DISABLED" ],
"ALERT_SUMMARY" : [ "The health of role jobtracker (nightly-1) has become bad."
],
"CLUSTER_ID" : [ "1" ],
"SERVICE" : [ "MAPREDUCE-1" ],
"__uuid" : [ "67b4d1c4-791b-428e-a9ea-8a09d4885f5d" ],
"CLUSTER" : [ "Cluster 1" ],
"CURRENT_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test result
for JOB_TRACKER_SCM_HEALTH has become bad: This role's process exited. This role is
supposed to be
started.\",\"testName\":\"JOB_TRACKER_SCM_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_BAD\",\"severity\":\"CRITICAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become
good: This role encountered 0 unexpected exit(s) in the previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_UNEXPECTED_EXITS\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_FILE_DESCRIPTOR has become good:
Open file descriptors: 244. File descriptor limit: 32,768. Percentage in use:
0.74%.\",\"testName\":\"JOB_TRACKER_FILE_DESCRIPTOR\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_SWAP_MEMORY_USAGE has become
good: 0 B of swap memory is being used by this role's
process.\",\"testName\":\"JOB_TRACKER_SWAP_MEMORY_USAGE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE has
become good: This role's Log Directory (/var/log/hadoop-0.20-mapreduce) is on a filesystem
with more than 20.00% of its space
free.\",\"testName\":\"JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HOST_HEALTH has become good:
The health of this role's host is
good.\",\"testName\":\"JOB_TRACKER_HOST_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_WEB_METRIC_COLLECTION has become
good: The web server of this role is responding with metrics. The most recent collection
took 49
millisecond(s).\",\"testName\":\"JOB_TRACKER_WEB_METRIC_COLLECTION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_GC_DURATION has become good:
Average time spent in garbage collection was 0 second(s) (0.00%) per minute over the
previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_GC_DURATION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE
has become disabled: Test disabled because role is not configured to dump heap when
out of memory. Test of whether this role's heap dump directory has enough free
space.\",\"testName\":\"JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_DISABLED\",\"severity\":\"INFORMATIONAL\"}"
],
"CURRENT_HEALTH_SUMMARY" : [ "RED" ],
"PREVIOUS_HEALTH_SUMMARY" : [ "GREEN" ],
"MONITOR_STARTUP" : [ "false" ],
"ROLE_DISPLAY_NAME" : [ "jobtracker (nightly-1)" ],
"PREVIOUS_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test
result for JOB_TRACKER_SCM_HEALTH has become good: This role's status is as expected.
The role is
started.\",\"testName\":\"JOB_TRACKER_SCM_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become
good: This role encountered 0 unexpected exit(s) in the previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_UNEXPECTED_EXITS\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_FILE_DESCRIPTOR has become good:
Open file descriptors: 244. File descriptor limit: 32,768. Percentage in use:
0.74%.\",\"testName\":\"JOB_TRACKER_FILE_DESCRIPTOR\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_SWAP_MEMORY_USAGE has become
good: 0 B of swap memory is being used by this role's
process.\",\"testName\":\"JOB_TRACKER_SWAP_MEMORY_USAGE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE has
become good: This role's Log Directory (/var/log/hadoop-0.20-mapreduce) is on a filesystem
with more than 20.00% of its space
free.\",\"testName\":\"JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HOST_HEALTH has become good:
The health of this role's host is
good.\",\"testName\":\"JOB_TRACKER_HOST_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_WEB_METRIC_COLLECTION has become
good: The web server of this role is responding with metrics. The most recent collection
took 49
millisecond(s).\",\"testName\":\"JOB_TRACKER_WEB_METRIC_COLLECTION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_GC_DURATION has become good:
Average time spent in garbage collection was 0 second(s) (0.00%) per minute over the
previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_GC_DURATION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE
has become disabled: Test disabled because role is not configured to dump heap when
out of memory. Test of whether this role's heap dump directory has enough free
space.\",\"testName\":\"JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_DISABLED\",\"severity\":\"INFORMATIONAL\"}"
],
"SERVICE_DISPLAY_NAME" : [ "MAPREDUCE-1" ],
"HOSTS" : [ "myCluster.com" ]
}
}
},
"header" : {
"type" : "alert",
"version" : 2
}
}, {
"body" : {
"alert" : {
"content" : "The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become
bad: This role encountered 1 unexpected exit(s) in the previous 5 minute(s).This included
1 exit(s) due to OutOfMemory errors. Critical threshold: any.",
"timestamp" : {
"iso8601" : "2015-06-11T03:53:41Z",
"epochMs" : 1433994821940
},
"source" :
"http://myCluster.com:7180/cmf/eventRedirect/b8c4468d-08c2-4b5b-9bda-2bef892ba3f5",
"attributes" : {
"__persist_timestamp" : [ "1433994822027" ],
"ALERT_SUPPRESSED" : [ "false" ],
"HEALTH_TEST_NAME" : [ "JOB_TRACKER_UNEXPECTED_EXITS" ],
"SEVERITY" : [ "CRITICAL" ],
"ROLE" : [ "MAPREDUCE-1-JOBTRACKER-10624c438dee9f17211d3f33fa899957" ],
"HEALTH_TEST_RESULTS" : [ {
"content" : "The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become
bad: This role encountered 1 unexpected exit(s) in the previous 5 minute(s).This included
1 exit(s) due to OutOfMemory errors. Critical threshold: any.",
"testName" : "JOB_TRACKER_UNEXPECTED_EXITS",
"eventCode" : "EV_ROLE_HEALTH_CHECK_BAD",
"severity" : "CRITICAL"
} ],
"CLUSTER_DISPLAY_NAME" : [ "Cluster 1" ],
"HOST_IDS" : [ "75e763c2-8d22-47a1-8c80-501751ae0db7" ],
"ALERT" : [ "true" ],
"ROLE_TYPE" : [ "JOBTRACKER" ],
"CATEGORY" : [ "HEALTH_CHECK" ],
"BAD_TEST_RESULTS" : [ "1" ],
"SERVICE_TYPE" : [ "MAPREDUCE" ],
"EVENTCODE" : [ "EV_ROLE_HEALTH_CHECK_BAD", "EV_ROLE_HEALTH_CHECK_GOOD",
"EV_ROLE_HEALTH_CHECK_DISABLED" ],
"ALERT_SUMMARY" : [ "The health of role jobtracker (nightly-1) has become bad."
],
"CLUSTER_ID" : [ "1" ],
"SERVICE" : [ "MAPREDUCE-1" ],
"__uuid" : [ "b8c4468d-08c2-4b5b-9bda-2bef892ba3f5" ],
"CLUSTER" : [ "Cluster 1" ],
"CURRENT_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test result
for JOB_TRACKER_SCM_HEALTH has become bad: This role's process exited. This role is
supposed to be
started.\",\"testName\":\"JOB_TRACKER_SCM_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_BAD\",\"severity\":\"CRITICAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become bad:
This role encountered 1 unexpected exit(s) in the previous 5 minute(s).This included
1 exit(s) due to OutOfMemory errors. Critical threshold:
any.\",\"testName\":\"JOB_TRACKER_UNEXPECTED_EXITS\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_BAD\",\"severity\":\"CRITICAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_FILE_DESCRIPTOR has become good:
Open file descriptors: 244. File descriptor limit: 32,768. Percentage in use:
0.74%.\",\"testName\":\"JOB_TRACKER_FILE_DESCRIPTOR\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_SWAP_MEMORY_USAGE has become
good: 0 B of swap memory is being used by this role's
process.\",\"testName\":\"JOB_TRACKER_SWAP_MEMORY_USAGE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE has
become good: This role's Log Directory (/var/log/hadoop-0.20-mapreduce) is on a filesystem
with more than 20.00% of its space
free.\",\"testName\":\"JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HOST_HEALTH has become good:
The health of this role's host is
good.\",\"testName\":\"JOB_TRACKER_HOST_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_WEB_METRIC_COLLECTION has become
good: The web server of this role is responding with metrics. The most recent collection
took 49
millisecond(s).\",\"testName\":\"JOB_TRACKER_WEB_METRIC_COLLECTION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_GC_DURATION has become good:
Average time spent in garbage collection was 0 second(s) (0.00%) per minute over the
previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_GC_DURATION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE
has become disabled: Test disabled because role is not configured to dump heap when
out of memory. Test of whether this role's heap dump directory has enough free
space.\",\"testName\":\"JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_DISABLED\",\"severity\":\"INFORMATIONAL\"}"
],
"CURRENT_HEALTH_SUMMARY" : [ "RED" ],
"PREVIOUS_HEALTH_SUMMARY" : [ "RED" ],
"MONITOR_STARTUP" : [ "false" ],
"ROLE_DISPLAY_NAME" : [ "jobtracker (nightly-1)" ],
"PREVIOUS_COMPLETE_HEALTH_TEST_RESULTS" : [ "{\"content\":\"The health test
result for JOB_TRACKER_SCM_HEALTH has become bad: This role's process exited. This role
is supposed to be
started.\",\"testName\":\"JOB_TRACKER_SCM_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_BAD\",\"severity\":\"CRITICAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_UNEXPECTED_EXITS has become
good: This role encountered 0 unexpected exit(s) in the previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_UNEXPECTED_EXITS\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_FILE_DESCRIPTOR has become good:
Open file descriptors: 244. File descriptor limit: 32,768. Percentage in use:
0.74%.\",\"testName\":\"JOB_TRACKER_FILE_DESCRIPTOR\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_SWAP_MEMORY_USAGE has become
good: 0 B of swap memory is being used by this role's
process.\",\"testName\":\"JOB_TRACKER_SWAP_MEMORY_USAGE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE has
become good: This role's Log Directory (/var/log/hadoop-0.20-mapreduce) is on a filesystem
with more than 20.00% of its space
free.\",\"testName\":\"JOB_TRACKER_LOG_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HOST_HEALTH has become good:
The health of this role's host is
good.\",\"testName\":\"JOB_TRACKER_HOST_HEALTH\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_WEB_METRIC_COLLECTION has become
good: The web server of this role is responding with metrics. The most recent collection
took 49
millisecond(s).\",\"testName\":\"JOB_TRACKER_WEB_METRIC_COLLECTION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_GC_DURATION has become good:
Average time spent in garbage collection was 0 second(s) (0.00%) per minute over the
previous 5
minute(s).\",\"testName\":\"JOB_TRACKER_GC_DURATION\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_GOOD\",\"severity\":\"INFORMATIONAL\"}",
"{\"content\":\"The health test result for JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE
has become disabled: Test disabled because role is not configured to dump heap when
out of memory. Test of whether this role's heap dump directory has enough free
space.\",\"testName\":\"JOB_TRACKER_HEAP_DUMP_DIRECTORY_FREE_SPACE\",\"eventCode\":\"EV_ROLE_HEALTH_CHECK_DISABLED\",\"severity\":\"INFORMATIONAL\"}"
],
"SERVICE_DISPLAY_NAME" : [ "MAPREDUCE-1" ],
"HOSTS" : [ "myCluster.com" ]
}
}
},
"header" : {
"type" : "alert",
"version" : 2
}
} ]
Managing Licenses
Minimum Required Role: Full Administrator
When you install Cloudera Manager, you can select among the following editions: Cloudera Express (no license required),
a 60-day Cloudera Enterprise Data Hub Edition trial license, or Cloudera Enterprise (which requires a license). To obtain
a Cloudera Enterprise license, fill in this form or call 866-843-7207.
A Cloudera Enterprise license is required for the following features:
• LDAP and SAML authentication
• Configuration history
• Alerts delivered as SNMP traps and custom alert scripts
• Backup and disaster recovery
• Operational reports
• Cloudera Navigator
• Commands such as Rolling Restart, History and Rollback, and Send Diagnostic Data
For details see Cloudera Express and Cloudera Enterprise Features.
License Expiration
When a Cloudera Enterprise license expires, the following occurs:
• Cloudera Enterprise Data Hub Edition Trial - Enterprise features are no longer available.
• Cloudera Enterprise - Cloudera Manager Admin Console displays a banner indicating license expiration. Contact
Cloudera Support to receive an updated license. In the meantime, all enterprise features continue to be
available.
Trial Licenses
You can use a trial license only once; when the 60-day trial period expires or you have ended the trial, you cannot
restart the trial.
When a trial ends, enterprise features immediately become unavailable. However, data or configurations associated
with the disabled functions are not deleted, and become available again once you install a Cloudera Enterprise license.
Upgrading from Cloudera Express to a Cloudera Enterprise Data Hub Edition Trial
To start a trial, on the License page, click Try Cloudera Enterprise Data Hub Edition for 60 Days.
1. Cloudera Manager displays a pop-up describing the features enabled with Cloudera Enterprise Data Hub Edition.
Click OK to proceed. At this point, your installation is upgraded and the Customize Role Assignments page displays.
2. Under Reports Manager click Select a host. The pageable host selection dialog box displays.
The following shortcuts for specifying hostname patterns are supported:
• IP addresses
• Rack name
3. Select a host and click OK.
4. When you are satisfied with the assignments, click Continue.
5. Configure database settings:
a. Choose the database type:
• Keep the default setting of Use Embedded Database to have Cloudera Manager create and configure
required databases. Record the auto-generated passwords.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise, check and correct
the information you have provided for the database and then try the test again. (For some servers, if you are
using the embedded database, you will see a message saying the database will be created at a later step in
the installation process.) The Review Changes screen displays.
6. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths
required vary based on the services to be installed. If you chose to add the Sqoop service, indicate whether to use
the default Derby database or the embedded PostgreSQL database. If the latter, type the database name, host,
and user credentials that you specified when you created the database.
Warning: Do not place DataNode data directories on NAS devices. When resizing an NAS, block
replicas can be deleted, which will result in reports of missing blocks.
Upgrading from a Cloudera Enterprise Data Hub Edition Trial to Cloudera Enterprise
1. Purchase a Cloudera Enterprise license from Cloudera.
2. On the License page, click Upload License.
3. Click the document icon to the left of the Select a License File text field.
4. Go to the location of your license file, click the file, and click Open.
5. Click Upload.
Renewing a License
1. Download the license file and save it locally.
2. In Cloudera Manager, go to the Home page.
3. Select Administration > License.
4. Click Upload License.
5. Browse to the license file you downloaded.
6. Click Upload.
You do not need to restart Cloudera Manager for the new license to take effect.
Note:
• Automatically sending diagnostic data requires the Cloudera Manager Server host to have Internet
access, and be configured for sending data automatically. If your Cloudera Manager server does
not have Internet access, and you have a Cloudera Enterprise license, you can manually send the
diagnostic data as described in Manually Triggering Collection and Transfer of Diagnostic Data to
Cloudera on page 457.
• Automatically sending diagnostic data may sometimes fail and return the error message "Could
not send data to Cloudera." To work around this issue, you can manually send the data to Cloudera
Support.
Note: If you are using Cloudera Express, host metrics are not included.
• Whether there's an active trial, and if so, metadata about the trial.
• Metadata about the Cloudera Manager server, such as its JMX metrics, stack traces, and the database/host it's
running with.
• HDFS/Hive replication schedules (including command history) for the deployment.
• Impala query logs.
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
Disabling the Automatic Sending of Diagnostic Data from a Manually Triggered Collection
If you do not want data automatically sent to Cloudera after manually triggering data collection, you can disable this
feature. The data you collect will be saved and can be downloaded for sending to Cloudera Support at a later time.
1. Select Administration > Settings.
2. Under the Support category, uncheck the box for Send Diagnostic Data to Cloudera Automatically.
3. Click Save Changes to commit the changes.
Note: The Send Diagnostic Data form that displays when you collect data in one of the following
procedures indicates whether the data will be sent automatically.
2. Under the Support menu at the top right of the navigation bar, choose Send Diagnostic Data. The Send Diagnostic
Data form displays.
3. Fill in or change the information here as appropriate:
• Optionally, you can improve performance by reducing the size of the data bundle that is sent. Click Restrict
log and metrics collection to expand this section of the form. The three filters, Host, Service, and Role Type,
allow you to restrict the data that will be sent. Cloudera Manager will only collect logs and metrics for roles
that match all three filters.
• Cloudera Manager populates the End Time based on the setting of the Time Range selector. You should
change this to be a few minutes after you observed the problem or condition that you are trying to capture.
The time range is based on the timezone of the host where Cloudera Manager Server is running.
• If you have a support ticket open with Cloudera Support, include the support ticket number in the field
provided.
4. Depending on whether you have disabled automatic sending of data, do one of the following:
• Click Collect and Send Diagnostic Data. A Running Commands window shows you the progress of the data
collection steps. When these steps are complete, the collected data is sent to Cloudera.
• Click Collect Diagnostic Data. A Command Details window shows you the progress of the data collection
steps.
1. In the Command Details window, click Download Result Data to download and save a zip file of the
information.
2. Send the data to Cloudera Support by doing one of the following:
• Send the bundle using a Python script:
1. Download the phone_home script.
2. Copy the script and the downloaded data file to a host that has Internet access.
3. Run the following command on that host:
• Attach the bundle to the SFDC case. Do not rename the bundle as this can cause a delay in processing
the bundle.
• Contact Cloudera Support and arrange to send the data file.
2. Stop the Cloudera Manager Server by running the following command on the Cloudera Manager Server host:
3. If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database:
Important: If you are not running the embedded database service and you attempt to stop it,
you receive a message indicating that the service cannot be found. If instead you get a message
that the shutdown failed, the embedded database is still running, probably because services are
connected to the Hive metastore. If the database shutdown fails due to connected services, issue
the following command:
• RHEL-compatible 7 and higher:
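A typical command sequence for stopping the embedded database is shown below; the fast_stop action for a forced shutdown is an assumption to verify against your Cloudera Manager release:
# Stop the embedded PostgreSQL database
sudo systemctl stop cloudera-scm-server-db      # RHEL-compatible 7 and higher (systemd)
sudo service cloudera-scm-server-db stop        # RHEL-compatible 6 and lower, SLES, Ubuntu
# If the shutdown fails because services are still connected, force it:
sudo service cloudera-scm-server-db fast_stop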
Backing up Databases
Several steps in the backup procedures require you to back up various databases used in a CDH cluster. The steps for
backing up and restoring databases differ depending on the database vendor and version you select for your cluster
and are beyond the scope of this document.
See the following vendor resources for more information:
• MariaDB 5.5: http://mariadb.com/kb/en/mariadb/backup-and-restore-overview/
• MySQL 5.5: http://dev.mysql.com/doc/refman/5.5/en/backup-and-recovery.html
• MySQL 5.6: http://dev.mysql.com/doc/refman/5.6/en/backup-and-recovery.html
• PostgreSQL 8.4: https://www.postgresql.org/docs/8.4/static/backup.html
• PostgreSQL 9.2: https://www.postgresql.org/docs/9.2/static/backup.html
• PostgreSQL 9.3: https://www.postgresql.org/docs/9.3/static/backup.html
• Oracle 11gR2: http://docs.oracle.com/cd/E11882_01/backup.112/e10642/toc.htm
Settings
The Settings page provides a number of categories as follows:
• Performance - Set the Cloudera Manager Agent heartbeat interval. See Configuring Agent Heartbeat and Health
Status Options on page 437.
• Advanced - Enable API debugging and other advanced options.
• Monitoring - Set Agent health status parameters. For configuration instructions, see Configuring Cloudera Manager
Agents on page 437.
• Security - Set TLS encryption settings to enable TLS encryption between the Cloudera Manager Server, Agents,
and clients. For configuration instructions, see Configuring TLS Security for Cloudera Manager. You can also:
– Set the realm for Kerberos security and point to a custom keytab retrieval script. For configuration instructions,
see Cloudera Security.
– Specify session timeout and a "Remember Me" option.
• Ports and Addresses - Set ports for the Cloudera Manager Admin Console and Server. For configuration instructions,
see Configuring Cloudera Manager Server Ports on page 433.
• Other
– Enable Cloudera usage data collection. For configuration instructions, see Managing Anonymous Usage Data
Collection on page 455.
– Set a custom header color and banner text for the Admin console.
– Set an "Information Assurance Policy" statement – this statement will be presented to every user before they
are allowed to access the login dialog box. The user must click "I Agree" in order to proceed to the login dialog
box.
– Disable/enable the auto-search for the Events panel at the bottom of a page.
• Support
– Configure diagnostic data collection properties. See Diagnostic Data Collection on page 456.
– Configure how to access Cloudera Manager help files.
• External Authentication - Specify the configuration to use LDAP, Active Directory, or an external program for
authentication. See Configuring External Authentication for Cloudera Manager for instructions.
• Parcels - Configure settings for parcels, including the location of remote repositories that should be made available
for download, and other settings such as the frequency with which Cloudera Manager checks for new parcels and
limits on the number of downloads or concurrent distribution uploads. See Parcels for more information.
• Network - Configure proxy server settings. See Configuring Network Settings on page 443.
• Custom Service Descriptors - Configure custom service descriptor properties for Add-on Services on page 38.
Alerts
See Managing Alerts on page 443.
Users
See Cloudera Manager User Accounts.
Kerberos
See Enabling Kerberos Authentication Using the Wizard.
License
See Managing Licenses on page 450.
Peers
See Designating a Replication Source on page 387.
2. Click Start to confirm. The Command Details window shows the progress of starting the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
2. Click Stop to confirm. The Command Details window shows the progress of stopping the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
3. Check the checkboxes next to the Navigator Audit Server and Navigator Metadata Server roles.
4. Do one of the following depending on your role:
• Minimum Required Role: Full Administrator
1. Check the checkboxes next to the Navigator Audit Server and Navigator Metadata Server roles.
2. Select Actions for Selected > Stop and click Stop to confirm.
• Minimum Required Role: Navigator Administrator (also provided by Full Administrator)
1. Click the Navigator Audit Server role link.
2. Select Actions > Stop this Navigator Audit Server and click Stop this Navigator Audit Server to confirm.
3. Click the Navigator Metadata Server role link.
4. Select Actions > Stop this Navigator Metadata Server and click Stop this Navigator Metadata Server to
confirm.
5. Check the checkboxes next to the Navigator Audit Server and Navigator Metadata Server roles.
6. Select Actions for Selected > Delete. Click Delete to confirm the deletion.
Related Information
• Cloudera Navigator 2 Overview
• Installing the Cloudera Navigator Data Management Component
• Upgrading the Cloudera Navigator Data Management Component
• Cloudera Data Management
• Configuring Authentication in the Cloudera Navigator Data Management Component
• Configuring TLS/SSL for the Cloudera Navigator Data Management Component
• Cloudera Navigator Data Management Component User Roles
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise, check and correct
the information you have provided for the database and then try the test again. (For some servers, if you use
the embedded database, you will see a message stating that the database will be created at a later step in
the installation process.) The Review Changes screen displays.
8. Click Finish.
log4j.logger.kafkaAuditStream=TRACE,KAFKA
log4j.appender.KAFKA=kafka.producer.KafkaLog4jAppender
log4j.additivity.com.cloudera.navigator.kafkaAuditStream=false
log4j.appender.KAFKA.layout=org.apache.log4j.PatternLayout
log4j.appender.KAFKA.layout.ConversionPattern=%m%n
log4j.appender.KAFKA.SyncSend=false
log4j.appender.KAFKA.BrokerList=broker_host:broker_port
log4j.appender.KAFKA.Topic=NavigatorAuditEvents
Where broker_host and broker_port are the host and port of the Kafka service.
5. Click Save Changes to commit the changes.
6. Restart the role.
Changes to the Kafka service broker host and port are not handled automatically; you must manually modify those
properties in the advanced configuration snippet and restart the role.
The Log4j SyslogAppender supports only UDP. An example rsyslog configuration that enables UDP reception would be:
$ModLoad imudp
$UDPServerRun 514
It is also possible to attach other appenders to the auditStream to provide other integration behaviors.
You can publish audit events to syslog in two formats: JSON and RSA EnVision. To configure audit logging to syslog, do
the following:
1. Do one of the following:
• Select Clusters > Cloudera Management Service > Cloudera Management Service.
• On the Status tab of the Home > Status tab, in Cloudera Management Service table, click the Cloudera
Management Service link.
2. Click the Configuration tab.
3. Locate the Navigator Audit Server Logging Advanced Configuration Snippet property by typing its name in the
Search box.
4. Depending on the format type, enter the following. First define the syslog appender:
log4j.appender.SYSLOG = org.apache.log4j.net.SyslogAppender
log4j.appender.SYSLOG.SyslogHost = hostname
log4j.appender.SYSLOG.Facility = Local2
log4j.appender.SYSLOG.FacilityPrinting = true
Then add the logger properties for your format. For the JSON format:
log4j.logger.auditStream = TRACE,SYSLOG
log4j.additivity.auditStream = false
Important: This feature is available only with a Cloudera Enterprise license; it is not available in
Cloudera Express. For information on Cloudera Enterprise licenses, see Managing Licenses on page
450.
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
To compute the memory required by the Metadata Server during normal operation, multiply the number of documents
in nav_elements by 200 bytes. For the above example, the recommended amount of memory would be (68813088 *
200) bytes, or about 14 GB.
For upgrade, use the total number of documents in nav_elements plus nav_relations. Using the numbers in the
above example, for upgrade you would need ((68813088 + 78813930) * 200) bytes, or about 30 GB.
By default, during the Cloudera Manager Installation wizard the Navigator Audit Server and Navigator Metadata Server
are assigned to the same host as the Cloudera Management Service monitoring roles. This configuration works for a
small cluster, but should be updated before the cluster grows. You can either change the configuration at installation
time or move the Navigator Metadata Server if necessary.
You can configure the location of the Navigator Metadata Server logs (by default,
/var/log/cloudera-scm-navigator) as follows:
nav.max_inactive_interval=period (s)
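For example, assuming period is specified in seconds (as the "(s)" above indicates), a 30-minute value would be:
nav.max_inactive_interval=1800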
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
5. Click Save Changes to commit the changes.
6. Restart the role.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
5. Click Save Changes to commit the changes.
6. Restart the role.
nav.spark.extraction.enable=true
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
5. Click Save Changes to commit the changes.
6. Restart the role.
Property Description
JMS URL The URL of the JMS server to which notifications of changes to entities affected by policies
are sent.
Default: tcp://localhost:61616.
JMS User The JMS user to which notifications of changes to entities affected by policies are sent.
Default: admin.
JMS Password The password of the JMS user to which notifications of changes to entities affected by
policies are sent.
Default: admin.
JMS Queue The JMS queue to which notifications of changes to entities affected by policies are sent.
Default: Navigator.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See
Modifying Configuration Properties Using Cloudera Manager on page 10.
5. Click Save Changes to commit the changes.
6. Restart the role.