Cloudera ODBC Driver for Apache Hive Install Guide
Cloudera, the Cloudera logo, and any other product or service names or slogans contained in this
document, except as otherwise disclaimed, are trademarks of Cloudera and its suppliers or
licensors, and may not be copied, imitated or used, in whole or in part, without the prior written
permission of Cloudera or the applicable trademark holder.
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation. All
other trademarks, registered trademarks, product names and company names or logos
mentioned in this document are the property of their respective owners. Reference to any
products, services, processes or other information, by trade name, trademark, manufacturer,
supplier or otherwise does not constitute or imply endorsement, sponsorship or
recommendation thereof by us.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the
rights under copyright, no part of this document may be reproduced, stored in or introduced into
a retrieval system, or transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the express written
permission of Cloudera.
Cloudera may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Cloudera, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
The information in this document is subject to change without notice. Cloudera shall not be liable
for any damages resulting from technical errors or omissions which may be present in this
document, or from use of this document.
Cloudera, Inc.
1001 Page Mill Road, Building 2
Palo Alto, CA 94304-1008
[email protected]
US: 1-888-789-1488
Intl: 1-650-843-0595
www.cloudera.com
Release Information
Version: 2.6.4
The Cloudera ODBC Driver for Apache Hive complies with the ODBC 3.80 data standard and adds
important functionality such as Unicode and 32- and 64-bit support for high-performance
computing environments.
ODBC is one of the most established and widely supported APIs for connecting to and working
with databases. At the heart of the technology is the ODBC driver, which connects an application
to the database. For more information about ODBC, see Data Access Standards on the Simba
Technologies website: https://www.simba.com/resources/data-access-standards-glossary. For
complete information about the ODBC specification, see the ODBC API Reference from the
Microsoft documentation:
https://docs.microsoft.com/en-us/sql/odbc/reference/syntax/odbc-api-reference.
The Installation and Configuration Guide is suitable for users who are looking to access data
residing within Hive from their desktop environment. Application developers might also find the
information helpful. Refer to your application for details on connecting via ODBC.
Windows Driver
Windows System Requirements
The Cloudera ODBC Driver for Apache Hive supports Apache Hive versions 0.11 through 3.1, and
CDH versions 5.0 through 6.2.
Install the driver on client machines where the application is installed. Before installing the driver,
make sure that you have the following:
• Administrator rights on your machine.
• A machine that meets the following system requirements:
  • One of the following operating systems:
    • Windows 10, 8.1, or 7 SP1
    • Windows Server 2016, 2012, or 2008 R2 SP1
  • 100 MB of available disk space
  • Visual C++ Redistributable for Visual Studio 2013 installed (with the same bitness as
    the driver that you are installing).
    You can download the installation packages at
    https://www.microsoft.com/en-ca/download/details.aspx?id=40784.
You can install both the 32-bit and 64-bit versions of the driver on the same machine.
Alternatively, you can specify connection settings in a connection string or as driver-wide settings.
Settings in the connection string take precedence over settings in the DSN, and settings in the DSN
take precedence over driver-wide settings.
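The precedence rule above can be pictured as a layered lookup. The following Python sketch is illustrative only: the function name resolve_settings and the sample keys and values are assumptions made for this example, not names defined by the driver.

```python
def resolve_settings(driver_wide, dsn, connection_string):
    """Merge three setting layers; later layers override earlier ones."""
    effective = {}
    # Lowest precedence first, highest precedence last.
    for layer in (driver_wide, dsn, connection_string):
        effective.update(layer)
    return effective


# Hypothetical keys and values, chosen only to show which layer wins.
driver_wide = {"Port": "10000", "Schema": "default"}
dsn = {"Port": "10001", "Host": "hive.example.com"}
connection_string = {"Port": "10002"}

effective = resolve_settings(driver_wide, dsn, connection_string)
print(effective)
```

In this sketch the connection string supplies the port, the DSN supplies the host, and the driver-wide layer supplies only the value that neither of the other two sets.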
The following instructions describe how to create a DSN. For information about specifying settings
in a connection string, see "Using a Connection String" on page 54. For information about driver-
wide settings, see "Configuring a DSN-less Connection on Windows" on page 10.
Note:
Make sure to select the ODBC Data Source Administrator that has the same bitness as the
client application that you are using to connect to Hive.
2. In the ODBC Data Source Administrator, click the Drivers tab, and then scroll down as
needed to confirm that the Cloudera ODBC Driver for Apache Hive appears in the
alphabetical list of ODBC drivers that are installed on your system.
3. Choose one:
• To create a DSN that only the user currently logged into Windows can use, click the
User DSN tab.
• Or, to create a DSN that all users who log into Windows can use, click the System
DSN tab.
Note:
It is recommended that you create a System DSN instead of a User DSN. Some
applications load the data using a different user account, and might not be able to detect
User DSNs that are created under another user account.
4. Click Add.
5. In the Create New Data Source dialog box, select Cloudera ODBC Driver for Apache Hive
and then click Finish. The Cloudera ODBC Driver for Apache Hive DSN Setup dialog box
opens.
6. In the Data Source Name field, type a name for your DSN.
7. Optionally, in the Description field, type relevant details about the DSN.
8. In the Hive Server Type drop-down list, select Hive Server 1 or Hive Server 2.
Note:
If you are connecting through Apache ZooKeeper, then Hive Server 1 is not supported.
9. Specify whether the driver uses the ZooKeeper service when connecting to Hive, and
provide the necessary connection information:
• To connect to Hive without using the Apache ZooKeeper service, do the following:
a. From the Service Discovery Mode drop-down list, select No Service
Discovery.
b. In the Host(s) field, type the IP address or host name of the Hive server.
c. In the Port field, type the number of the TCP port that the Hive server uses to
listen for client connections.
• Or, to discover Hive Server 2 services via the ZooKeeper service, do the following:
a. From the Service Discovery Mode drop-down list, select ZooKeeper.
b. In the Host(s) field, type a comma-separated list of ZooKeeper servers. Use the
following format, where [ZK_Host] is the IP address or host name of the
ZooKeeper server and [ZK_Port] is the number of the TCP port that the
ZooKeeper server uses to listen for client connections:
[ZK_Host1]:[ZK_Port1],[ZK_Host2]:[ZK_Port2]
Note:
You can still issue queries on other schemas by explicitly specifying the schema in the
query. To inspect your databases and determine the appropriate schema to use, type the
show databases command at the Hive command prompt.
11. In the Authentication area, configure authentication as needed. For more information, see
"Configuring Authentication on Windows" on page 12.
Note:
Hive Server 1 does not support authentication. Most default configurations of Hive Server
2 require User Name authentication. To verify the authentication mechanism that you
need to use for your connection, check the configuration of your Hadoop / Hive
distribution. For more information, see "Authentication Mechanisms" on page 52.
12. Optionally, if the operations against Hive are to be done on behalf of a user that is different
than the authenticated user for the connection, type the name of the user to be delegated
in the Delegation UID field. For more information, see "Delegating Authentication to a
Specific User" on page 16.
Note:
This option is applicable only when connecting to a Hive Server 2 instance that supports
this feature.
13. In the Thrift Transport drop-down list, select the transport protocol to use in the Thrift
layer.
Note:
For information about how to determine which Thrift transport protocols your Hive
server supports, see "Authentication Mechanisms" on page 52.
14. If the Thrift Transport option is set to HTTP, then to configure HTTP options such as custom
headers, click HTTP Options. For more information, see "Configuring HTTP Options on
Windows" on page 18.
15. To configure client-server verification over SSL, click SSL Options. For more information, see
"Configuring SSL Verification on Windows" on page 19.
Note:
If you selected User Name as the authentication mechanism, SSL is not available.
16. To configure advanced driver options, click Advanced Options. For more information, see
"Configuring Advanced Options on Windows" on page 16.
17. To configure server-side properties, click Advanced Options and then click Server Side
Properties. For more information, see "Configuring Server-Side Properties on Windows" on
page 21.
18. To configure the Temporary Table feature, click Advanced Options and then click
Temporary Table Configuration. For more information, see "Configuring the Temporary
Table Feature" on page 20 and "Temporary Tables" on page 60.
Important:
When connecting to Hive 0.14 or later, the Temporary Tables feature is always enabled
and you do not need to configure it in the driver.
19. To configure logging behavior for the driver, click Logging Options. For more information,
see "Configuring Logging Options on Windows" on page 22.
20. To test the connection, click Test. Review the results as needed, and then click OK.
Note:
If the connection fails, then confirm that the settings in the Cloudera ODBC Driver for
Apache Hive DSN Setup dialog box are correct. Contact your Hive server administrator as
needed.
21. To save your settings and close the Cloudera ODBC Driver for Apache Hive DSN Setup dialog
box, click OK.
22. To close the ODBC Data Source Administrator, click OK.
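The ZooKeeper host list format from step 9 can be built and checked programmatically. The sketch below is a minimal illustration, assuming hypothetical host names and port numbers; the helper names are invented for this example and are not part of the driver.

```python
def format_zookeeper_hosts(servers):
    """Join (host, port) pairs into host1:port1,host2:port2 form."""
    return ",".join(f"{host}:{port}" for host, port in servers)


def parse_zookeeper_hosts(value):
    """Split a host1:port1,host2:port2 string back into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        # Split on the last colon so the port is always the final component.
        host, _, port = entry.strip().rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


# Hypothetical servers used only for illustration.
hosts_field = format_zookeeper_hosts(
    [("zk1.example.com", 2181), ("zk2.example.com", 2181)]
)
print(hosts_field)
```

The resulting string is the kind of value that would be typed in the Host(s) field when Service Discovery Mode is set to ZooKeeper.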
The following section explains how to use the driver configuration tool. For information about
using connection strings, see "Using a Connection String" on page 54.
Note:
• Settings in the connection string take precedence over settings in the DSN, and settings in
the DSN take precedence over driver-wide settings.
• The drop-down lists in the driver configuration tool only display one option at a time. Use
the scroll arrows on the right side of the drop-down list to view and select other options.
Note:
Make sure to select the Driver Configuration Tool that has the same bitness as the client
application that you are using to connect to Hive.
2. If you are prompted for administrator permission to make modifications to the machine,
click OK.
Note:
You must have administrator access to the machine to run this application because it
makes changes to the registry.
3. In the Hive Server Type drop-down list, select Hive Server 1 or Hive Server 2.
Note:
If you are connecting through Apache ZooKeeper, then Hive Server 1 is not supported.
4. Specify whether the driver uses the ZooKeeper service when connecting to Hive:
• To connect to Hive without using the Apache ZooKeeper service, from the Service
Discovery Mode drop-down list, select No Service Discovery.
• Or, to discover Hive Server 2 services via the ZooKeeper service, do the following:
a. From the Service Discovery Mode drop-down list, select ZooKeeper.
b. In the Host(s) field, type a comma-separated list of ZooKeeper servers. Use the
following format, where [ZK_Host] is the IP address or host name of the
ZooKeeper server and [ZK_Port] is the number of the TCP port that the
ZooKeeper server uses to listen for client connections:
[ZK_Host1]:[ZK_Port1],[ZK_Host2]:[ZK_Port2]
Note:
Hive Server 1 does not support authentication. Most default configurations of Hive Server
2 require User Name authentication. To verify the authentication mechanism that you
need to use for your connection, check the configuration of your Hadoop / Hive
distribution. For more information, see "Authentication Mechanisms" on page 52.
6. Optionally, if the operations against Hive are to be done on behalf of a user that is different
than the authenticated user for the connection, then in the Delegation UID field, type the
name of the user to be delegated. For more information, see "Delegating Authentication to
a Specific User" on page 16.
Note:
This option is applicable only when connecting to a Hive Server 2 instance that supports
this feature.
7. In the Thrift Transport drop-down list, select the transport protocol to use in the Thrift
layer.
Note:
For information about how to determine which Thrift transport protocols your Hive
server supports, see "Authentication Mechanisms" on page 52.
8. If the Thrift Transport option is set to HTTP, then to configure HTTP options such as custom
headers, click HTTP Options. For more information, see "Configuring HTTP Options on
Windows" on page 18.
9. To configure client-server verification over SSL, click SSL Options. For more information, see
"Configuring SSL Verification on Windows" on page 19.
Note:
If you selected User Name as the authentication mechanism, SSL is not available.
10. To configure advanced options, click Advanced Options. For more information, see
"Configuring Advanced Options on Windows" on page 16.
11. To configure server-side properties, click Advanced Options and then click Server Side
Properties. For more information, see "Configuring Server-Side Properties on Windows" on
page 21.
12. To configure the Temporary Table feature, click Advanced Options and then click
Temporary Table Configuration. For more information, see "Temporary Tables" on page 60
and "Configuring the Temporary Table Feature" on page 20.
Important:
When connecting to Hive 0.14 or later, the Temporary Tables feature is always enabled
and you do not need to configure it in the driver.
13. To save your settings and close the Cloudera Hive ODBC Driver Configuration tool, click OK.
For information about how to determine the type of authentication your Hive server requires, see
"Authentication Mechanisms" on page 52.
You can specify authentication settings in a DSN, in a connection string, or as driver-wide settings.
Settings in the connection string take precedence over settings in the DSN, and settings in the DSN
take precedence over driver-wide settings.
If cookie-based authentication is enabled in your Hive Server 2 database, you can specify a list of
authentication cookies in the HTTPAuthCookies connection property. In this case, the driver
authenticates the connection once based on the provided authentication credentials. It then uses
the cookie generated by the server for each subsequent request in the same connection. For
more information, see "HTTPAuthCookies" on page 87.
Using No Authentication
When connecting to a Hive server of type Hive Server 1, you must use No Authentication. When
you use No Authentication, Binary is the only Thrift transport protocol that is supported.
Using Kerberos
This authentication mechanism is available only for Hive Server 2. When you use Kerberos
authentication, the Binary transport protocol is not supported.
If the Use Only SSPI advanced option is disabled, then Kerberos must be installed and configured
before you can use this authentication mechanism. For information about configuring Kerberos
on your machine, see "Configuring Kerberos Authentication for Windows" on page 24. For
information about setting the Use Only SSPI advanced option, see "Configuring Advanced Options
on Windows" on page 16.
• To use the default realm defined in your Kerberos setup, leave the Realm field empty.
• Or, if your Kerberos setup does not define a default realm or if the realm of your Hive
Server 2 host is not the default, then, in the Realm field, type the Kerberos realm of
the Hive Server 2.
4. In the Host FQDN field, type the fully qualified domain name of the Hive Server 2 host.
Note:
To use the Hive server host name as the fully qualified domain name for Kerberos
authentication, in the Host FQDN field, type _HOST.
5. In the Service Name field, type the service name of the Hive server.
6. Optionally, if you are using MIT Kerberos and a Kerberos realm is specified in the Realm
field, then choose one:
• To have the Kerberos layer canonicalize the server's service principal name, leave the
Canonicalize Principal FQDN check box selected.
• Or, to prevent the Kerberos layer from canonicalizing the server's service principal
name, clear the Canonicalize Principal FQDN check box.
7. To allow the driver to pass your credentials directly to the server for use in authentication,
select Delegate Kerberos Credentials.
8. From the Thrift Transport drop-down list, select the transport protocol to use in the Thrift
layer.
Important:
When using this authentication mechanism, the Binary transport protocol is not
supported.
9. If the Hive server is configured to use SSL, then click SSL Options to configure SSL for the
connection. For more information, see "Configuring SSL Verification on Windows" on page
19.
10. To save your settings and close the dialog box, click OK.
This authentication mechanism requires a user name but not a password. The user name labels
the session, facilitating database tracking.
This authentication mechanism is available only for Hive Server 2. Most default configurations of
Hive Server 2 require User Name authentication. When you use User Name authentication, SSL is
not supported and SASL is the only Thrift transport protocol available.
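The transport restrictions stated for each authentication mechanism can be collected into a small lookup table. The Python sketch below covers only the three mechanisms and three Thrift transports named in the text above; the table and function names are assumptions for illustration, and other mechanisms the driver offers are not modeled.

```python
# Transports named in this guide for the Thrift layer.
ALL_TRANSPORTS = {"Binary", "SASL", "HTTP"}

# Restrictions as stated in the text: No Authentication allows Binary only,
# Kerberos excludes Binary, and User Name allows SASL only.
ALLOWED_TRANSPORTS = {
    "No Authentication": {"Binary"},
    "Kerberos": ALL_TRANSPORTS - {"Binary"},
    "User Name": {"SASL"},
}


def is_supported(auth_mechanism, transport):
    """Return True if the guide describes this pairing as supported."""
    return transport in ALLOWED_TRANSPORTS[auth_mechanism]


print(is_supported("Kerberos", "Binary"))
print(is_supported("User Name", "SASL"))
```

A check like this can catch an unsupported pairing before a connection attempt is made.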
Important:
The password is obscured, that is, not saved in plain text. However, it is still possible for
the encrypted password to be copied and used.
6. From the Thrift Transport drop-down list, select the transport protocol to use in the Thrift
layer.
7. If the Hive server is configured to use SSL, then click SSL Options to configure SSL for the
connection. For more information, see "Configuring SSL Verification on Windows" on page
19.
8. To save your settings and close the dialog box, click OK.
Some Hive Server 2 instances support the ability to delegate all operations against Hive to the
specified user, rather than to the authenticated user for the connection.
If the server returns an error message such as Failed to validate proxy privilege of [RealUser]
for [DelegationUID], you may need to modify the server's core-site.xml configuration file,
as follows:
1. In the server's core-site.xml configuration file, add the following properties:
hadoop.proxyuser.[RealUser].groups=*
hadoop.proxyuser.[RealUser].hosts=*
Where [RealUser] is the primary Kerberos principal user. For example, if the primary
Kerberos principal user is [email protected], replace [RealUser] with kerbuser.
For more information on resolving this error, see your Hive Server documentation.
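Hadoop configuration files such as core-site.xml store each setting as a property element with name and value children, so the two proxy-user settings above are entered in that form. The following Python sketch renders them for the user kerbuser from the example; the function name is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(real_user):
    """Build <property> elements for the two proxy-user settings."""
    elements = []
    for suffix in ("groups", "hosts"):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{real_user}.{suffix}"
        # "*" matches the wildcard value shown in the instructions above.
        ET.SubElement(prop, "value").text = "*"
        elements.append(prop)
    return elements


for prop in proxyuser_properties("kerbuser"):
    print(ET.tostring(prop, encoding="unicode"))
```

Each printed element can be placed inside the configuration element of the server's core-site.xml file.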
The following instructions describe how to configure advanced options in a DSN and in the driver
configuration tool. You can specify the connection settings described below in a DSN, in a
connection string, or as driver-wide settings. Settings in the connection string take precedence
over settings in the DSN, and settings in the DSN take precedence over driver-wide settings.
Important:
• When this option is enabled, the driver cannot execute parameterized queries.
• By default, the driver applies transformations to the queries emitted by an
application to convert the queries into an equivalent form in HiveQL. If the
application is Hive-aware and already emits HiveQL, then turning off the translation
avoids the additional overhead of query transformation.
3. To defer query execution to SQLExecute, select the Fast SQLPrepare check box.
4. To allow driver-wide configurations to take precedence over connection and DSN settings,
select the Driver Config Take Precedence check box.
5. To use the asynchronous version of the API call against Hive for executing a query, select
the Use Async Exec check box.
Note:
This option is applicable only when connecting to a Hive cluster running Hive 0.12.0 or
later.
6. To retrieve table names from the database by using the SHOW TABLES query, select the Get
Tables With Query check box.
7. To enable the driver to return SQL_WVARCHAR instead of SQL_VARCHAR for STRING and
VARCHAR columns, and SQL_WCHAR instead of SQL_CHAR for CHAR columns, select the
Unicode SQL Character Types check box.
8. To enable the driver to return the hive_system table for catalog function calls such as
SQLTables and SQLColumns, select the Show System Table check box.
9. To specify which mechanism the driver uses by default to handle Kerberos authentication,
do one of the following:
• To use the SSPI plugin by default, select the Use Only SSPI check box.
• To use MIT Kerberos by default and only use the SSPI plugin if the GSSAPI library is
not available, clear the Use Only SSPI check box.
10. To enable the driver to automatically open a new session when the existing session is no
longer valid, select the Invalid Session Auto Recover check box.
11. To have the driver automatically attempt to reconnect to the server if communications are
lost, select Enable Auto Reconnect.
12. In the Rows Fetched Per Block field, type the number of rows to be fetched per block.
13. In the Default String Column Length field, type the maximum data length for STRING
columns.
14. In the Binary Column Length field, type the maximum data length for BINARY columns.
15. In the Decimal Column Scale field, type the maximum number of digits to the right of the
decimal point for numeric data types.
16. In the Socket Timeout field, type the number of seconds that an operation can remain idle
before it is closed.
Note:
This option is applicable only when asynchronous query execution is being used against
Hive Server 2 instances.
17. To save your settings and close the Advanced Options dialog box, click OK.
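The effect of the Unicode SQL Character Types check box in step 7 can be summarized as a type mapping. The sketch below restates that mapping in Python; the function name is hypothetical, and only the three Hive types named in step 7 are covered.

```python
# SQL types reported when the check box is cleared.
NARROW_TYPES = {"STRING": "SQL_VARCHAR", "VARCHAR": "SQL_VARCHAR", "CHAR": "SQL_CHAR"}
# SQL types reported when the check box is selected.
WIDE_TYPES = {"STRING": "SQL_WVARCHAR", "VARCHAR": "SQL_WVARCHAR", "CHAR": "SQL_WCHAR"}


def reported_sql_type(hive_type, unicode_sql_character_types):
    """Return the SQL type reported for a Hive character column."""
    table = WIDE_TYPES if unicode_sql_character_types else NARROW_TYPES
    return table[hive_type.upper()]


print(reported_sql_type("STRING", True))
print(reported_sql_type("CHAR", False))
```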
The following instructions describe how to configure HTTP options in a DSN and in the driver
configuration tool. You can specify the connection settings described below in a DSN, in a
connection string, or as driver-wide settings. Settings in the connection string take precedence
over settings in the DSN, and settings in the DSN take precedence over driver-wide settings.
Note:
The HTTP options are available only when the Thrift Transport option is set to HTTP.
3. In the HTTP Path field, type the partial URL corresponding to the Hive server.
4. To create a custom HTTP header, click Add, then type appropriate values in the Key and
Value fields, and then click OK.
5. To edit a custom HTTP header, select the header from the list, then click Edit, then update
the Key and Value fields as needed, and then click OK.
6. To delete a custom HTTP header, select the header from the list, and then click Remove. In
the confirmation dialog box, click Yes.
7. To save your settings and close the HTTP Properties dialog box, click OK.
The following instructions describe how to configure SSL in a DSN and in the driver configuration
tool. You can specify the connection settings described below in a DSN, in a connection string, or
as driver-wide settings. Settings in the connection string take precedence over settings in the DSN,
and settings in the DSN take precedence over driver-wide settings.
Note:
If you selected User Name as the authentication mechanism, SSL is not available.
Important:
• If you are using the Windows trust store, make sure to import the trusted CA
certificates into the trust store.
• If the trusted CA supports certificate revocation, select the Check Certificate
Revocation check box.
6. From the Minimum TLS Version drop-down list, select the minimum version of TLS to use
when connecting to your data store.
7. To configure two-way SSL verification, select the Two-Way SSL check box and then do the
following:
a. In the Client Certificate File field, specify the full path of the PEM file containing the
client's certificate.
b. In the Client Private Key File field, specify the full path of the file containing the
client's private key.
c. If the private key file is protected with a password, type the password in the Client
Private Key Password field. To save the password, select the Save Password
(Encrypted) check box.
Important:
The password is obscured, that is, not saved in plain text. However, it is still
possible for the encrypted password to be copied and used.
8. To save your settings and close the SSL Options dialog box, click OK.
Important:
When connecting to Hive 0.14 or later, the Temporary Tables feature is always enabled and you
do not need to configure it in the driver.
1. Choose one:
• To configure the temporary table feature for a DSN, open the ODBC Data Source
Administrator where you created the DSN, then select the DSN and click Configure,
then click Advanced Options, and then click Temporary Table Configuration.
• Or, to configure the temporary table feature for a DSN-less connection, open the
Cloudera Hive ODBC Driver Configuration tool, then click Advanced Options, and
then click Temporary Table Configuration.
2. To enable the Temporary Table feature, select the Enable Temporary Table check box.
3. In the Web HDFS Host field, type the host name or IP address of the machine hosting both
the namenode of your Hadoop cluster and the WebHDFS service. If this field is left blank,
then the host name of the Hive server is used.
4. In the Web HDFS Port field, type the WebHDFS port for the namenode.
5. In the HDFS User field, type the name of the HDFS user that the driver uses to create the
necessary files for supporting the Temporary Table feature.
6. In the Data File HDFS Dir field, type the HDFS directory that the driver uses to store the
necessary files for supporting the Temporary Table feature.
7. In the Temp Table TTL field, type the number of minutes that a temporary table is
guaranteed to exist in Hive after it is created.
8. To save your settings and close the Temporary Table Configuration dialog box, click OK.
The following instructions describe how to configure server-side properties in a DSN and in the
driver configuration tool. You can specify the connection settings described below in a DSN, in a
connection string, or as driver-wide settings. Settings in the connection string take precedence
over settings in the DSN, and settings in the DSN take precedence over driver-wide settings.
2. To create a server-side property, click Add, then type appropriate values in the Key and
Value fields, and then click OK. For example, to set the value of the
mapreduce.job.queuename property to myQueue, type mapreduce.job.queuename
in the Key field and then type myQueue in the Value field.
Note:
For a list of all Hadoop and Hive server-side properties that your implementation
supports, type set -v at the Hive CLI command line or Beeline. You can also execute the
set -v query after connecting using the driver.
3. To edit a server-side property, select the property from the list, then click Edit, then update
the Key and Value fields as needed, and then click OK.
4. To delete a server-side property, select the property from the list, and then click Remove. In
the confirmation dialog box, click Yes.
5. To configure the driver to convert server-side property key names to all lower-case
characters, select the Convert Key Name To Lower Case check box.
6. To change the method that the driver uses to apply server-side properties, do one of the
following:
• To configure the driver to apply each server-side property by executing a query when
opening a session to the Hive server, select the Apply Server Side Properties With
Queries check box.
• Or, to configure the driver to use a more efficient method for applying server-side
properties that does not involve additional network round-tripping, clear the Apply
Server Side Properties With Queries check box.
Note:
The more efficient method is not available for Hive Server 1, and it might not be
compatible with some Hive Server 2 builds. If the server-side properties do not take effect
when the check box is clear, then select the check box.
7. To save your settings and close the Server Side Properties dialog box, click OK.
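To make the query-based method in step 6 concrete, the following Python sketch turns a set of server-side properties into one statement per property. It assumes the statements take the form of HiveQL SET commands; the guide does not spell out the exact statements the driver issues, and the function name is invented for this example.

```python
def server_side_property_queries(properties, convert_keys_to_lower_case=False):
    """Return one SET statement per server-side property."""
    queries = []
    for key, value in properties.items():
        if convert_keys_to_lower_case:
            # Mirrors the Convert Key Name To Lower Case option in step 5.
            key = key.lower()
        queries.append(f"SET {key}={value}")
    return queries


print(server_side_property_queries({"mapreduce.job.queuename": "myQueue"}))
```

One statement per property is also why this method costs extra network round trips compared with the alternative described in step 6.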
Important:
Only enable logging or tracing long enough to capture an issue. Logging or tracing decreases
performance and can consume a large quantity of disk space.
The settings for logging apply to every connection that uses the Cloudera ODBC Driver for
Apache Hive, so make sure to disable the feature after you are done using it.
FATAL    Logs severe error events that lead the driver to abort.
ERROR    Logs error events that might allow the driver to continue running.
WARNING  Logs events that might result in an error if action is not taken.
DEBUG    Logs detailed information that is useful for debugging the driver.
3. In the Log Path field, specify the full path to the folder where you want to save log files.
4. If requested by Technical Support, type the name of the component for which to log
messages in the Log Namespace field. Otherwise, do not type a value in the field.
5. In the Max Number Files field, type the maximum number of log files to keep.
Note:
After the maximum number of log files is reached, each time an additional file is created,
the driver deletes the oldest log file.
6. In the Max File Size field, type the maximum size of each log file in megabytes (MB).
Note:
After the maximum file size is reached, the driver creates a new file and continues logging.
7. Click OK.
8. Restart your ODBC application to make sure that the new settings take effect.
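The interaction between Max File Size and Max Number Files in steps 5 and 6 can be modeled with a short simulation. This Python sketch is a model of the documented behavior, not the driver's implementation; in particular, checking the size limit before each write is an assumption, and sizes are whole megabytes for simplicity.

```python
from collections import deque


def simulate_log_rotation(write_sizes_mb, max_file_size_mb, max_number_files):
    """Return the sizes of the log files that remain, oldest first."""
    files = deque([0])
    for size in write_sizes_mb:
        # Start a new file once the current one has reached the size limit.
        if files[-1] >= max_file_size_mb:
            files.append(0)
            # Creating a file beyond the limit deletes the oldest one.
            if len(files) > max_number_files:
                files.popleft()
        files[-1] += size
    return list(files)


# Nine 1 MB writes with a 2 MB file limit and at most three files.
print(simulate_log_rotation([1] * 9, max_file_size_mb=2, max_number_files=3))
```

The total disk space used stays bounded by roughly the maximum file size multiplied by the maximum number of files, which is the reason to set both fields when logging is enabled.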
The Cloudera ODBC Driver for Apache Hive produces the following log files at the location you
specify in the Log Path field:
The Cloudera ODBC Driver for Apache Hive supports Active Directory Kerberos on Windows. There
are two prerequisites for using Active Directory Kerberos on Windows:
• MIT Kerberos is not installed on the client Windows machine.
• The MIT Kerberos Hadoop realm has been configured to trust the Active Directory realm so
that users in the Active Directory realm can access services in the MIT Kerberos Hadoop
realm.
MIT Kerberos
For information about Kerberos and download links for the installer, see the MIT Kerberos
website: http://web.mit.edu/kerberos/.
Note:
The 64-bit installer includes both 32-bit and 64-bit libraries. The 32-bit installer includes 32-
bit libraries only.
2. To run the installer, double-click the .msi file that you downloaded above.
Settings for Kerberos are specified through a configuration file. You can set up the configuration
file as an .ini file in the default location, which is the C:\ProgramData\MIT\Kerberos5
directory, or as a .conf file in a custom location.
Note:
For more information on configuring Kerberos, refer to the MIT Kerberos documentation.
11. Click OK to close the Environment Variables dialog box, and then click OK to close the
System Properties dialog box.
Note:
krb5cache is a file (not a directory) that is managed by the Kerberos software, and it
should not be created by the user. If you receive a permission error when you first use
Kerberos, make sure that the krb5cache file does not already exist as a file or a
directory.
If the authentication succeeds, then your ticket information appears in MIT Kerberos Ticket
Manager.
[Principal] is the Kerberos user principal to use for authentication. For example:
[email protected].
3. If the cache location KRB5CCNAME is not set or used, then use the -c option of the kinit
command to specify the location of the credential cache. In the command, the -c
argument must appear last. For example:
kinit -k -t C:\mykeytabs\myUser.keytab [email protected] -c
C:\ProgramData\MIT\krbcache
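When the kinit call is scripted, the argument order described in step 3 can be enforced in code. The sketch below builds the argument list with -c last; the function name and the principal myUser@EXAMPLE.COM are hypothetical, and the paths are taken from the example above.

```python
def build_kinit_command(keytab_path, principal, cache_path=None):
    """Build a kinit argument list that authenticates with a keytab."""
    command = ["kinit", "-k", "-t", keytab_path, principal]
    if cache_path is not None:
        # The guide requires the -c argument to appear last.
        command += ["-c", cache_path]
    return command


# Raw strings keep the Windows backslashes intact.
command = build_kinit_command(
    r"C:\mykeytabs\myUser.keytab",
    "myUser@EXAMPLE.COM",  # hypothetical principal
    r"C:\ProgramData\MIT\krbcache",
)
print(command)
```

On a machine where kinit is available on the PATH, a list like this can be passed to subprocess.run to obtain the ticket.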
To obtain a ticket for a Kerberos principal using the default keytab file:
Note:
For information about configuring a default keytab file for your Kerberos configuration, refer to
the MIT Kerberos documentation.
[principal] is the Kerberos user principal to use for authentication. For example:
[email protected].
3. If the cache location KRB5CCNAME is not set or used, then use the -c option of the kinit
command to specify the location of the credential cache. In the command, the -c
argument must appear last. For example:
kinit -k -t C:\mykeytabs\myUser.keytab [email protected] -c C:\ProgramData\MIT\krbcache
Note:
Make sure to select the ODBC Data Source Administrator that has the same bitness as the
client application that you are using to connect to Hive.
2. Click the Drivers tab and then find the Cloudera ODBC Driver for Apache Hive in the list of
ODBC drivers that are installed on your system. The version number is displayed in the
Version column.
macOS Driver
macOS System Requirements
The Cloudera ODBC Driver for Apache Hive supports Apache Hive versions 0.11 through 3.1, and
CDH versions 5.0 through 6.2.
Install the driver on client machines where the application is installed. Each client machine that
you install the driver on must meet the following minimum system requirements:
• macOS version 10.12, 10.13, or 10.14
• 100 MB of available disk space
• iODBC 3.52.9, 3.52.10, 3.52.11, or 3.52.12
Note:
6. To accept the installation location and begin the installation, click Install.
7. When the installation completes, click Close.
Next, configure the environment variables on your machine to make sure that the ODBC driver
manager can work with the driver. For more information, see "Configuring the ODBC Driver
Manager on Non-Windows Machines" on page 36.
The command returns information about the Cloudera ODBC Driver for Apache Hive that is
installed on your machine, including the version number.
Linux Driver
For most Linux distributions, you can install the driver using the RPM file. If you are installing the
driver on a Debian machine, you must use the Debian package.
Install the driver on client machines where the application is installed. Each client machine that
you install the driver on must meet the following minimum system requirements:
• One of the following distributions:
  ◦ Red Hat® Enterprise Linux® (RHEL) 6 or 7
  ◦ CentOS 6 or 7
  ◦ SUSE Linux Enterprise Server (SLES) 11 or 12
  ◦ Debian 6 or 7
  ◦ Ubuntu 14.04
  ◦ Oracle Linux 7.5 or 7.6
• 150 MB of available disk space
• One of the following ODBC driver managers installed:
  ◦ iODBC 3.52.9, 3.52.10, 3.52.11, or 3.52.12
  ◦ unixODBC 2.3.2, 2.3.3, or 2.3.4
• All of the following libsasl libraries installed:
  ◦ cyrus-sasl-2.1.22-7 or later
  ◦ cyrus-sasl-gssapi-2.1.22-7 or later
  ◦ cyrus-sasl-plain-2.1.22-7 or later
Note:
If the package manager in your Linux distribution cannot resolve the dependencies
automatically when installing the driver, then download and manually install the
packages.
To install the driver, you must have root access on the machine.
You can install both the 32-bit and 64-bit versions of the driver on the same machine.
To install the Cloudera ODBC Driver for Apache Hive using the RPM File:
1. Log in as the root user.
2. Navigate to the folder containing the RPM package for the driver.
3. Depending on the Linux distribution that you are using, run one of the following commands
from the command line, where [RPMFileName] is the file name of the RPM package:
• If you are using Red Hat Enterprise Linux or CentOS, run the following command:
  yum --nogpgcheck localinstall [RPMFileName]
• Or, if you are using SUSE Linux Enterprise Server, run the following command:
  zypper install [RPMFileName]
The Cloudera ODBC Driver for Apache Hive files are installed in the
/opt/cloudera/hiveodbc directory.
Note:
If the package manager in your Linux distribution cannot resolve the libsasl
dependencies automatically when installing the driver, then download and manually
install the packages.
Next, configure the environment variables on your machine to make sure that the ODBC driver
manager can work with the driver. For more information, see "Configuring the ODBC Driver
Manager on Non-Windows Machines" on page 36.
On 64-bit editions of Debian, you can execute both 32- and 64-bit applications. However, 64-bit
applications must use 64-bit drivers, and 32-bit applications must use 32-bit drivers. Make sure
that you use the version of the driver that matches the bitness of the client application:
• ClouderaHiveODBC-32bit-[Version]-[Release]_i386.deb for the 32-bit driver
• ClouderaHiveODBC-[Version]-[Release]_amd64.deb for the 64-bit driver
[Version] is the version number of the driver, and [Release] is the release number for this
version of the driver.
You can install both versions of the driver on the same machine.
The Cloudera ODBC Driver for Apache Hive files are installed in the
/opt/cloudera/hiveodbc directory.
Note:
If the package manager in your Ubuntu distribution cannot resolve the libsasl
dependencies automatically when installing the driver, then download and manually
install the packages required by the version of the driver that you want to install.
Next, configure the environment variables on your machine to make sure that the ODBC driver
manager can work with the driver. For more information, see "Configuring the ODBC Driver
Manager on Non-Windows Machines" on page 36.
• rpm -qa | grep ClouderaHiveODBC
• dpkg -l | grep clouderahiveodbc
The command returns information about the Cloudera ODBC Driver for Apache Hive that is
installed on your machine, including the version number.
AIX Driver
AIX System Requirements
The Cloudera ODBC Driver for Apache Hive supports Apache Hive versions 0.11 through 3.1, and
CDH versions 5.0 through 6.2.
Install the driver on client machines where the application is installed. Each machine that you
install the driver on must meet the following minimum system requirements:
• IBM AIX 5.3, 6.1, or 7.1
• 150 MB of available disk space
• One of the following ODBC driver managers installed:
  ◦ iODBC 3.52.9, 3.52.10, 3.52.11, or 3.52.12
  ◦ unixODBC 2.3.2, 2.3.3, or 2.3.4
To install the driver, you must have root access on the machine.
[Version] is the version number of the driver, and [Release] is the release number for this version
of the driver.
You can install both versions of the driver on the same machine.
The Cloudera ODBC Driver for Apache Hive files are installed in the
/opt/cloudera/hiveodbc directory.
Next, configure the environment variables on your machine to make sure that the ODBC driver
manager can work with the driver. For more information, see "Configuring the ODBC Driver
Manager on Non-Windows Machines" on page 36.
The command returns information about the Cloudera ODBC Driver for Apache Hive that is
installed on your machine, including the version number.
After configuring the ODBC driver manager, you can configure a connection and access your data
store through the driver.
macOS
If you are using a macOS machine, then set the DYLD_LIBRARY_PATH environment variable to
include the paths to the ODBC driver manager libraries. For example, if the libraries are installed in
/usr/local/lib, then run the following command to set DYLD_LIBRARY_PATH for the current
user session:
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:/usr/local/lib
For information about setting an environment variable permanently, refer to the macOS shell
documentation.
Linux or AIX
If you are using a Linux or AIX machine, then set the LD_LIBRARY_PATH environment variable to
include the paths to the ODBC driver manager libraries. For example, if the libraries are installed in
/usr/local/lib, then run the following command to set LD_LIBRARY_PATH for the current
user session:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
For information about setting an environment variable permanently, refer to the Linux or AIX shell
documentation.
driver installation directory. If you store these configuration files elsewhere, then you must set the
environment variables described below so that the driver manager can locate the files.
For example, if your odbc.ini and odbcinst.ini files are located in /usr/local/odbc
and your cloudera.hiveodbc.ini file is located in /etc, then set the environment
variables as follows:
For iODBC:
export ODBCINI=/usr/local/odbc/odbc.ini
export ODBCINSTINI=/usr/local/odbc/odbcinst.ini
export CLOUDERAHIVEINI=/etc/cloudera.hiveodbc.ini
For unixODBC:
export ODBCINI=/usr/local/odbc/odbc.ini
export ODBCSYSINI=/usr/local/odbc
export CLOUDERAHIVEINI=/etc/cloudera.hiveodbc.ini
To locate the cloudera.hiveodbc.ini file, the driver uses the following search order:
1. If the CLOUDERAHIVEINI environment variable is defined, then the driver searches for the
file specified by the environment variable.
2. The driver searches the directory that contains the driver library files for a file named
cloudera.hiveodbc.ini.
3. The driver searches the current working directory of the application for a file named
cloudera.hiveodbc.ini.
4. The driver searches the home directory for a hidden file named
.cloudera.hiveodbc.ini (prefixed with a period).
5. The driver searches the /etc directory for a file named cloudera.hiveodbc.ini.
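The search order above can be reproduced in a short script, which is useful for checking which copy of cloudera.hiveodbc.ini a given environment is likely to pick up. The following Python sketch only illustrates the documented order and is not the driver's own code. The function names are hypothetical, the driver library directory must be supplied by the caller, and the sketch assumes that the search continues to the next location when the file named by CLOUDERAHIVEINI does not exist, which the list above does not state.

```python
import os
from pathlib import Path
from typing import List, Optional

INI_NAME = "cloudera.hiveodbc.ini"


def candidate_ini_paths(driver_lib_dir: str) -> List[Path]:
    """List the candidate locations in the documented search order."""
    candidates = []
    env_value = os.environ.get("CLOUDERAHIVEINI")
    if env_value:
        candidates.append(Path(env_value))             # 1. environment variable
    candidates.append(Path(driver_lib_dir) / INI_NAME)  # 2. driver library directory
    candidates.append(Path.cwd() / INI_NAME)            # 3. current working directory
    candidates.append(Path.home() / ("." + INI_NAME))   # 4. hidden file in home directory
    candidates.append(Path("/etc") / INI_NAME)          # 5. /etc
    return candidates


def find_ini(driver_lib_dir: str) -> Optional[Path]:
    """Return the first candidate that exists, or None if none is found."""
    for path in candidate_ini_paths(driver_lib_dir):
        if path.is_file():
            return path
    return None


print(find_ini("/opt/cloudera/hiveodbc/lib/64"))
```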
You can specify connection settings in a DSN (in the odbc.ini file), in a connection string, or as
driver-wide settings (in the cloudera.hiveodbc.ini file). Settings in the connection string
take precedence over settings in the DSN, and settings in the DSN take precedence over driver-
wide settings.
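The precedence rule can be pictured as three dictionaries merged from lowest to highest priority. The following Python sketch is an illustration only: the function name effective_settings is hypothetical, port 10001 is an invented override value, and key names are compared exactly as written, which is a simplification.

```python
from typing import Dict


def effective_settings(driver_wide: Dict[str, str],
                       dsn: Dict[str, str],
                       connection_string: Dict[str, str]) -> Dict[str, str]:
    """Merge settings so that higher-priority sources override lower ones."""
    merged = dict(driver_wide)        # lowest priority: cloudera.hiveodbc.ini
    merged.update(dsn)                # DSN settings override driver-wide settings
    merged.update(connection_string)  # connection string has the final say
    return merged


settings = effective_settings(
    driver_wide={"AuthMech": "2", "UID": "cloudera"},
    dsn={"Host": "192.168.222.160", "Port": "10000", "UID": "username"},
    connection_string={"Port": "10001"},  # illustrative override
)
print(settings)
```

In this example, UID comes from the DSN rather than from the driver-wide file, and Port comes from the connection string rather than from the DSN.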
The following instructions describe how to create a DSN by specifying connection settings in the
odbc.ini file. If your machine is already configured to use an existing odbc.ini file, then
update that file by adding the settings described below. Otherwise, copy the odbc.ini file from
the Setup subfolder in the driver installation directory to the home directory, and then update
the file as described below.
For information about specifying settings in a connection string, see "Configuring a DSN-less
Connection on a Non-Windows Machine" on page 41 and "Using a Connection String" on page 54.
For information about driver-wide settings, see "Setting Driver-Wide Configuration Options on a
Non-Windows Machine" on page 50.
Note:
If you are using a hidden copy of the odbc.ini file, you can remove the period (.) from
the start of the file name to make the file visible while you are editing it.
2. In the [ODBC Data Sources] section, add a new entry by typing a name for the DSN,
an equal sign (=), and then the name of the driver.
3. Create a section that has the same name as your DSN, and then specify configuration
options as key-value pairs in the section:
a. Set the Driver property to the full path of the driver library file that matches the
bitness of the application.
For example:
HiveServerType=1
c. Specify whether the driver uses the ZooKeeper service when connecting to Hive, and
provide the necessary connection information:
• To connect to Hive without using the ZooKeeper service, do the following:
a. Set the ServiceDiscoveryMode property to 0.
b. Set the Host property to the IP address or host name of the server.
c. Set the Port property to the number of the TCP port that the server
uses to listen for client connections.
For example:
ServiceDiscoveryMode=0
Host=192.168.222.160
Port=10000
• Or, to discover Hive Server 2 services through the ZooKeeper service, set
properties as described in "Configuring Service Discovery Mode on a
Non-Windows Machine" on page 43.
d. If authentication is required to access the server, then specify the authentication
mechanism and your credentials. For more information, see "Configuring
Authentication on a Non-Windows Machine" on page 43.
e. If you want to connect to the server through SSL, then enable SSL and specify the
certificate information. For more information, see "Configuring SSL Verification on a
Non-Windows Machine" on page 46.
Note:
f. If you want to configure server-side properties, then set them as key-value pairs
using a special syntax. For more information, see "Configuring Server-Side Properties
on a Non-Windows Machine" on page 47.
g. Optionally, set additional key-value pairs to specify other connection
settings. For detailed information about all the configuration options
supported by the Cloudera ODBC Driver for Apache Hive, see "Driver Configuration
Options" on page 64.
4. Save the odbc.ini configuration file.
Note:
If you are storing this file in its default location in the home directory, then prefix the file
name with a period (.) so that the file becomes hidden. If you are storing this file in
another location, then save it as a non-hidden file (without the prefix), and make sure
that the ODBCINI environment variable specifies the location. For more information, see
"Specifying the Locations of the Driver Configuration Files" on page 36.
For example, the following is an odbc.ini configuration file for macOS containing a DSN that
connects to a Hive Thrift Server instance directly and authenticates the connection using a user
name and password:
[ODBC Data Sources]
Sample DSN=Cloudera Hive ODBC Driver
[Sample DSN]
Driver=/opt/cloudera/hiveodbc/lib/universal/libclouderahiveodbc.dylib
HiveServerType=2
ServiceDiscoveryMode=0
Host=192.168.222.160
Port=10000
UID=username
PWD=userpassword
As another example, the following is an odbc.ini configuration file for a 32-bit driver on a
Linux/AIX/Debian machine, containing a DSN that connects to a Hive Thrift Server instance directly
and authenticates the connection using a user name and password:
[ODBC Data Sources]
Sample DSN=Cloudera Hive ODBC Driver 32-bit
[Sample DSN]
Driver=/opt/cloudera/hiveodbc/lib/32/libclouderahiveodbc32.so
HiveServerType=2
ServiceDiscoveryMode=0
Host=192.168.222.160
Port=10000
UID=username
PWD=userpassword
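Before pointing an application at a new DSN, it can help to confirm that every entry in [ODBC Data Sources] has a matching section and that the section sets the Driver property. The following Python sketch does this with the standard configparser module. The function name check_odbc_ini is hypothetical, and the check covers only the structure described in the steps above, not the validity of the individual values.

```python
import configparser
from typing import List

SAMPLE_ODBC_INI = """\
[ODBC Data Sources]
Sample DSN=Cloudera Hive ODBC Driver 32-bit

[Sample DSN]
Driver=/opt/cloudera/hiveodbc/lib/32/libclouderahiveodbc32.so
HiveServerType=2
ServiceDiscoveryMode=0
Host=192.168.222.160
Port=10000
"""


def check_odbc_ini(text: str) -> List[str]:
    """Return a list of structural problems found in the DSN definitions."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key names such as "Sample DSN" as written
    parser.read_string(text)
    if not parser.has_section("ODBC Data Sources"):
        return ["missing [ODBC Data Sources] section"]
    problems = []
    for dsn in parser["ODBC Data Sources"]:
        if not parser.has_section(dsn):
            problems.append(f"no [{dsn}] section")
        elif "Driver" not in parser[dsn]:
            problems.append(f"[{dsn}] has no Driver setting")
    return problems


print(check_odbc_ini(SAMPLE_ODBC_INI))
```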
You can now use the DSN in an application to connect to the data store.
If your machine is already configured to use an existing odbcinst.ini file, then update that file
by adding the settings described below. Otherwise, copy the odbcinst.ini file from the
Setup subfolder in the driver installation directory to the home directory, and then update the
file as described below.
Note:
If you are using a hidden copy of the odbcinst.ini file, you can remove the period (.)
from the start of the file name to make the file visible while you are editing it.
2. In the [ODBC Drivers] section, add a new entry by typing a name for the driver, an
equal sign (=), and then Installed.
For example:
[ODBC Drivers]
Cloudera ODBC Driver for Apache Hive=Installed
3. Create a section that has the same name as the driver (as specified in the previous step),
and then specify the following configuration options as key-value pairs in the section:
a. Set the Driver property to the full path of the driver library file that matches the
bitness of the application.
Driver=/opt/cloudera/hiveodbc/lib/universal/libclouderahiveodbc.dylib
For example:
Description=Cloudera ODBC Driver for Apache Hive
Note:
If you are storing this file in its default location in the home directory, then prefix the file
name with a period (.) so that the file becomes hidden. If you are storing this file in
another location, then save it as a non-hidden file (without the prefix), and make sure
that the ODBCINSTINI or ODBCSYSINI environment variable specifies the location. For
more information, see "Specifying the Locations of the Driver Configuration Files" on page
36.
As another example, the following is an odbcinst.ini configuration file for both the 32- and
64-bit drivers on Linux/AIX/Debian:
[ODBC Drivers]
Cloudera Hive ODBC Driver 32-bit=Installed
Cloudera Hive ODBC Driver 64-bit=Installed
[Cloudera Hive ODBC Driver 32-bit]
Description=Cloudera ODBC Driver for Apache Hive (32-bit)
Driver=/opt/cloudera/hiveodbc/lib/32/libclouderahiveodbc32.so
[Cloudera Hive ODBC Driver 64-bit]
Description=Cloudera ODBC Driver for Apache Hive (64-bit)
Driver=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
You can now connect to your data store by providing your application with a connection string
where the Driver property is set to the driver name specified in the odbcinst.ini file, and all
the other necessary connection properties are also set. For more information, see "DSN-less
Connection String Examples" in "Using a Connection String" on page 54.
For instructions about configuring specific connection features, see the following:
• "Configuring Service Discovery Mode on a Non-Windows Machine" on page 43
• "Configuring Authentication on a Non-Windows Machine" on page 43
• "Configuring SSL Verification on a Non-Windows Machine" on page 46
• "Configuring Server-Side Properties on a Non-Windows Machine" on page 47
For detailed information about all the connection properties that the driver supports, see "Driver
Configuration Options" on page 64.
You can set the connection properties described below in a connection string, in a DSN (in the
odbc.ini file), or as a driver-wide setting (in the cloudera.hiveodbc.ini file). Settings in
the connection string take precedence over settings in the DSN, and settings in the DSN take
precedence over driver-wide settings.
Important:
For information about how to determine the type of authentication your Hive server requires, see
"Authentication Mechanisms" on page 52.
You can set the connection properties for authentication in a connection string, in a DSN (in the
odbc.ini file), or as a driver-wide setting (in the cloudera.hiveodbc.ini file). Settings in
the connection string take precedence over settings in the DSN, and settings in the DSN take
precedence over driver-wide settings.
Depending on the authentication mechanism you use, there might be additional connection
attributes that you must define. For more information about the attributes involved in
configuring authentication, see "Driver Configuration Options" on page 64.
If cookie-based authentication is enabled in your Hive Server 2 database, you can specify a list of
authentication cookies in the HTTPAuthCookies connection property. In this case, the driver
authenticates the connection once based on the provided authentication credentials. It then uses
the cookie generated by the server for each subsequent request in the same connection. For
more information, see "HTTPAuthCookies" on page 87.
Using No Authentication
When connecting to a Hive server of type Hive Server 1, you must use No Authentication. When
you use No Authentication, Binary is the only Thrift transport protocol that is supported.
Using Kerberos
Kerberos must be installed and configured before you can use this authentication mechanism. For
more information, refer to the MIT Kerberos Documentation:
http://web.mit.edu/kerberos/krb5-latest/doc/.
This authentication mechanism is available only for Hive Server 2. When you use Kerberos
authentication, the Binary transport protocol is not supported.
• Or, to prevent the Kerberos layer from canonicalizing the server's service principal
name, set the ServicePrincipalCanonicalization attribute to 0.
4. Set the KrbHostFQDN attribute to the fully qualified domain name of the Hive Server 2
host.
Note:
To use the Hive server host name as the fully qualified domain name for Kerberos
authentication, set KrbHostFQDN to _HOST.
Important:
8. If the Hive server is configured to use SSL, then configure SSL for the connection. For more
information, see "Configuring SSL Verification on a Non-Windows Machine" on page 46.
This authentication mechanism requires a user name but does not require a password. The user
name labels the session, facilitating database tracking.
This authentication mechanism is available only for Hive Server 2. Most default configurations of
Hive Server 2 require User Name authentication. When you use User Name authentication, SSL is
not supported and SASL is the only Thrift transport protocol available.
3. Set the PWD attribute to the password corresponding to the user name you provided
above.
4. Set the ThriftTransport connection attribute to the transport protocol to use in the
Thrift layer.
5. If the Hive server is configured to use SSL, then configure SSL for the connection. For more
information, see "Configuring SSL Verification on a Non-Windows Machine" on page 46.
Some Hive Server 2 instances support the ability to delegate all operations against Hive to the
specified user, rather than to the authenticated user for the connection.
If the server returns an error message such as Failed to validate proxy privilege of [RealUser]
for [DelegationUID], you may need to modify the server's core-site.xml configuration file,
as follows:
1. In the server's core-site.xml configuration file, add the following properties:
hadoop.proxyuser.[RealUser].groups=*
hadoop.proxyuser.[RealUser].hosts=*
Where [RealUser] is the primary Kerberos principal user. For example, if the primary
Kerberos principal user is [email protected], replace [RealUser] with kerbuser.
For more information on resolving this error, see your Hive Server documentation.
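The two settings are shown above in key=value form, but core-site.xml is an XML file in which each setting is a property element with name and value children. The following Python sketch renders the two proxy-user settings in that form. The function name proxyuser_properties is hypothetical, and kerbuser is the example user name from the text above; the output still needs to be placed inside the configuration element of the server's core-site.xml file.

```python
from xml.etree import ElementTree as ET


def proxyuser_properties(real_user: str) -> str:
    """Render the two proxy-user settings as core-site.xml property elements."""
    chunks = []
    for suffix in ("groups", "hosts"):
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{real_user}.{suffix}"
        ET.SubElement(prop, "value").text = "*"
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


xml_snippet = proxyuser_properties("kerbuser")  # example user name
print(xml_snippet)
```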
Note:
You can set the connection properties described below in a connection string, in a DSN (in the
odbc.ini file), or as a driver-wide setting (in the cloudera.hiveodbc.ini file). Settings in
the connection string take precedence over settings in the DSN, and settings in the DSN take
precedence over driver-wide settings.
Note:
2. To change the method that the driver uses to apply server-side properties, do one of the
following:
• To configure the driver to apply each server-side property by executing a query when
opening a session to the Hive server, set the ApplySSPWithQueries property to 1.
• Or, to configure the driver to use a more efficient method for applying server-side
properties that does not involve additional network round-tripping, set the
ApplySSPWithQueries property to 0.
Note:
The more efficient method is not available for Hive Server 1, and it might not be
compatible with some Hive Server builds. If the server-side properties do not take effect
when the ApplySSPWithQueries property is set to 0, then set it to 1.
3. To disable the driver's default behavior of converting server-side property key names to
lowercase, set the LCaseSspKeyName property to 0.
Important:
Only enable logging long enough to capture an issue. Logging decreases performance and can
consume a large quantity of disk space.
The settings for logging apply to every connection that uses the Cloudera ODBC Driver for
Apache Hive, so make sure to disable the feature after you are done using it.
To enable logging:
1. Open the cloudera.hiveodbc.ini configuration file in a text editor.
2. To specify the level of information to include in log files, set the LogLevel property to one
of the following numbers:
3. Set the LogPath key to the full path to the folder where you want to save log files.
4. Set the LogFileCount key to the maximum number of log files to keep.
Note:
After the maximum number of log files is reached, each time an additional file is created,
the driver deletes the oldest log file.
5. Set the LogFileSize key to the maximum size of each log file in bytes.
Note:
After the maximum file size is reached, the driver creates a new file and continues logging.
The Cloudera ODBC Driver for Apache Hive produces the following log files at the location you
specify using the LogPath key:
• A clouderahiveodbcdriver.log file that logs driver activity that is not specific to a
connection.
• A clouderahiveodbcdriver_connection_[Number].log file for each
connection made to the database, where [Number] is a number that identifies each log file.
This file logs driver activity that is specific to the connection.
To disable logging:
1. Open the cloudera.hiveodbc.ini configuration file in a text editor.
2. Set the LogLevel key to 0.
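Because logging should stay enabled only for as long as it takes to capture an issue, it can be convenient to toggle the keys from a script. The following Python sketch edits the text of a cloudera.hiveodbc.ini file with the standard configparser module. It rests on several assumptions: the function name set_logging is hypothetical, the section name Driver is assumed rather than taken from this guide, the level 4 and the path /tmp/hiveodbc-logs are placeholders, and configparser discards any comments in the file when it rewrites it.

```python
import configparser
import io


def set_logging(ini_text: str, level: int, log_path: str = "",
                section: str = "Driver") -> str:
    """Return ini_text with the logging keys updated in the given section."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # preserve the case of key names
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser[section]["LogLevel"] = str(level)  # 0 disables logging
    if level > 0 and log_path:
        parser[section]["LogPath"] = log_path
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Placeholder level and path; choose real values for your environment.
enabled = set_logging("[Driver]\nLogLevel=0\n", level=4,
                      log_path="/tmp/hiveodbc-logs")
disabled = set_logging(enabled, level=0)
print(disabled)
```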
Note:
Settings in the connection string take precedence over settings in the DSN, and settings in the
DSN take precedence over driver-wide settings.
For example, to enable User Name authentication using "cloudera" as the user name, type
the following:
AuthMech=2
UID=cloudera
For detailed information about all the configuration options supported by the driver, see
"Driver Configuration Options" on page 64.
3. Save the cloudera.hiveodbc.ini configuration file.
You can use the iodbctest and iodbctestw utilities to establish a test connection with your driver.
Use iodbctest to test how your driver works with an ANSI application, or use iodbctestw to test
how your driver works with a Unicode application.
Note:
There are 32-bit and 64-bit installations of the iODBC driver manager available. If you have only
one or the other installed, then the appropriate version of iodbctest (or iodbctestw) is available.
However, if you have both 32- and 64-bit versions installed, then you need to make sure that
you are running the version from the correct installation directory.
For more information about using the iODBC driver manager, see http://www.iodbc.org.
You can use the isql and iusql utilities to establish a test connection with your driver and your
DSN. isql and iusql can only be used to test connections that use a DSN. Use isql to test how your
driver works with an ANSI application, or use iusql to test how your driver works with a Unicode
application.
Note:
There are 32-bit and 64-bit installations of the unixODBC driver manager available. If you have
only one or the other installed, then the appropriate version of isql (or iusql) is available.
However, if you have both 32- and 64-bit versions installed, then you need to make sure that
you are running the version from the correct installation directory.
For more information about using the unixODBC driver manager, see http://www.unixodbc.org.
[DataSourceName] is the DSN that you are using for the connection.
Note:
For information about the available options, run isql or iusql without providing a DSN.
Authentication Mechanisms
To connect to a Hive server, you must configure the Cloudera ODBC Driver for Apache Hive to use
the authentication mechanism that matches the access requirements of the server and provides
the necessary credentials. To determine the authentication settings that your Hive server
requires, check the server configuration and then refer to the corresponding section below.
Hive Server 1
You must use No Authentication as the authentication mechanism. Hive Server 1 instances do not
support authentication.
Hive Server 2
Note:
Configuring authentication for a connection to a Hive Server 2 instance involves setting the
authentication mechanism, the Thrift transport protocol, and SSL support. To determine the
settings that you need to use, check the following three properties in the hive-site.xml file in
the Hive server that you are connecting to:
• hive.server2.authentication
• hive.server2.transport.mode
• hive.server2.use.SSL
Use the following table to determine the authentication mechanism that you need to configure,
based on the hive.server2.authentication value in the hive-site.xml file:
hive.server2.authentication    Authentication Mechanism
NOSASL                         No Authentication
KERBEROS                       Kerberos
Use the following table to determine the Thrift transport protocol that you need to configure,
based on the hive.server2.authentication and hive.server2.transport.mode
values in the hive-site.xml file:
hive.server2.authentication hive.server2.transport.mode Thrift Transport Protocol
To determine whether SSL should be enabled or disabled for your connection, check the
hive.server2.use.SSL value in the hive-site.xml file. If the value is true, then you
must enable and configure SSL in your connection. If the value is false, then you must disable SSL
in your connection.
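The three properties can be read from hive-site.xml with a few lines of script. The following Python sketch extracts them and suggests an AuthMech value for the two hive.server2.authentication values listed in the table above, using the AuthMech numbers that appear in the connection string examples in this guide (0 for No Authentication, 1 for Kerberos). The function names and the sample XML are hypothetical. A property that is absent from the file is reported as missing here, although the server would then use its own default.

```python
from typing import Dict
from xml.etree import ElementTree as ET

# Only the two rows that appear in the authentication table above.
AUTH_MECH_BY_SERVER_VALUE = {"NOSASL": 0, "KERBEROS": 1}

SAMPLE_HIVE_SITE = """<configuration>
  <property><name>hive.server2.authentication</name><value>KERBEROS</value></property>
  <property><name>hive.server2.transport.mode</name><value>http</value></property>
  <property><name>hive.server2.use.SSL</name><value>false</value></property>
</configuration>"""


def read_hive_site(xml_text: str) -> Dict[str, str]:
    """Return the hive.server2.* properties found in hive-site.xml text."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith("hive.server2."):
            found[name] = (prop.findtext("value") or "").strip()
    return found


def suggest_settings(props: Dict[str, str]) -> Dict[str, object]:
    """Summarize the values that drive the authentication configuration."""
    auth = props.get("hive.server2.authentication", "").upper()
    return {
        "AuthMech": AUTH_MECH_BY_SERVER_VALUE.get(auth),  # None if not in the table
        "transport_mode": props.get("hive.server2.transport.mode"),
        "use_ssl": props.get("hive.server2.use.SSL", "").lower() == "true",
    }


print(suggest_settings(read_hive_site(SAMPLE_HIVE_SITE)))
```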
For detailed instructions on how to configure authentication when using the Windows driver, see
"Configuring Authentication on Windows" on page 12.
For detailed instructions on how to configure authentication when using a non-Windows driver,
see "Configuring Authentication on a Non-Windows Machine" on page 43.
The connection strings in the following sections are examples showing the minimum set of
connection attributes that you must specify to successfully connect to the data source.
Depending on the configuration of the data source and the type of connection you are working
with, you might need to specify additional connection attributes. For detailed information about
all the attributes that you can use in the connection string, see "Driver Configuration Options" on
page 64.
[DataSourceName] is the DSN that you are using for the connection.
You can set additional configuration options by appending key-value pairs to the connection
string. Configuration options that are passed in using a connection string take precedence over
configuration options that are set in the DSN.
The following is the format of a DSN-less connection string that connects to a Hive Server 1
instance:
Driver=Cloudera Hive ODBC Driver;HiveServerType=1;
Host=[Server];Port=[PortNumber];
For example:
Driver=Cloudera Hive ODBC Driver;HiveServerType=1;
Host=192.168.222.160;Port=10000;
The following is the format of a DSN-less connection string for a standard connection to a Hive
Server 2 instance. By default, the driver is configured to connect to a Hive Server 2 instance that
requires User Name authentication, and the driver uses anonymous as the user name.
Driver=Cloudera Hive ODBC Driver;Host=[Server];
Port=[PortNumber];
For example:
Driver=Cloudera Hive ODBC Driver;Host=192.168.222.160;
Port=10000;
The following is the format of a DSN-less connection string that discovers Hive Server 2 services via
the ZooKeeper service.
Driver=Cloudera Hive ODBC Driver;ServiceDiscoveryMode=1;
Host=[Server1]:[PortNumber1], [Server2]:[PortNumber2],
[Server3]:[PortNumber3];ZKNamespace=[Namespace];
For example:
Driver=Cloudera Hive ODBC Driver;ServiceDiscoveryMode=1;
Host=192.168.222.160:10000, 192.168.222.165:10000,
192.168.222.231:10000;ZKNamespace=hiveserver;
The following is the format of a DSN-less connection string for a Hive Server 2 instance that does
not require authentication.
Driver=Cloudera Hive ODBC Driver;Host=[Server];
Port=[PortNumber];AuthMech=0;
For example:
Driver=Cloudera Hive ODBC Driver;Host=192.168.222.160;
Port=10000;AuthMech=0;
The following is the format of a DSN-less connection string that connects to a Hive Server 2
instance requiring Kerberos authentication:
Driver=Cloudera Hive ODBC Driver;Host=[Server];
Port=[PortNumber];AuthMech=1;KrbRealm=[Realm];
KrbHostFQDN=[DomainName];KrbServiceName=[ServiceName];
For example:
Driver=Cloudera Hive ODBC Driver;Host=192.168.222.160;
Port=10000;AuthMech=1;KrbRealm=CLOUDERA;
KrbHostFQDN=localhost.localdomain;KrbServiceName=hive;
Connecting to a Hive Server that Requires User Name And Password Authentication (LDAP)
The following is the format of a DSN-less connection string that connects to a Hive Server 2
instance requiring User Name And Password authentication:
Driver=Cloudera Hive ODBC Driver;Host=[Server];
Port=[PortNumber];AuthMech=3;UID=[YourUserName];
PWD=[YourPassword];
For example:
Driver=Cloudera Hive ODBC Driver;Host=192.168.222.160;
Port=10000;AuthMech=3;UID=cloudera;PWD=cloudera;
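In application code, connection strings such as the ones above are often assembled from individual settings rather than typed by hand. The following Python sketch joins key-value pairs into the Key=Value; format used in this section. The function name build_connection_string is hypothetical, and the sketch rejects values that contain a semicolon instead of implementing any escaping rules.

```python
from typing import Mapping


def build_connection_string(options: Mapping[str, object]) -> str:
    """Join key-value pairs into a 'Key=Value;' connection string."""
    parts = []
    for key, value in options.items():
        text = str(value)
        if ";" in text:
            # Simplification: this sketch does not implement value escaping.
            raise ValueError(f"value for {key} contains ';'")
        parts.append(f"{key}={text};")
    return "".join(parts)


conn_str = build_connection_string({
    "Driver": "Cloudera Hive ODBC Driver",
    "Host": "192.168.222.160",
    "Port": 10000,
    "AuthMech": 3,
    "UID": "cloudera",
    "PWD": "cloudera",
})
print(conn_str)
```

The resulting string matches the User Name And Password example above and could be passed to an ODBC library, for example pyodbc.connect(conn_str), provided that the driver name is registered in the odbcinst.ini file.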
Features
For more information on the features of the Cloudera ODBC Driver for Apache Hive, see the
following:
• "SQL Connector for HiveQL" on page 57
• "Data Types" on page 57
• "Catalog and Schema Support" on page 59
• "hive_system Table" on page 59
• "Server-Side Properties" on page 59
• "Get Tables With Query" on page 61
• "Active Directory" on page 61
• "Write-back" on page 61
• "Timestamp Function Support" on page 62
• "Dynamic Service Discovery using ZooKeeper" on page 62
• "Security and Authentication" on page 62
SQL Connector for HiveQL
To bridge the difference between SQL and HiveQL, the SQL Connector feature translates standard
SQL-92 queries into equivalent HiveQL queries. The SQL Connector performs syntactical
translations and structural transformations. For example:
• Quoted Identifiers: The double quotes (") that SQL uses to quote identifiers are translated
into back quotes (`) to match HiveQL syntax. The SQL Connector needs to handle this
translation because even when a driver reports the back quote as the quote character,
some applications still generate double-quoted identifiers.
• Table Aliases: Support is provided for the AS keyword between a table reference and its
alias, which HiveQL normally does not support.
• JOIN, INNER JOIN, and CROSS JOIN: SQL JOIN, INNER JOIN, and CROSS JOIN syntax is
translated to HiveQL JOIN syntax.
• TOP N/LIMIT: SQL TOP N queries are transformed to HiveQL LIMIT queries.
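To make two of these translations concrete, the following Python sketch rewrites double-quoted identifiers as back-quoted ones and turns a leading TOP N into a trailing LIMIT N. It is a deliberately simplified illustration built on regular expressions, not the driver's SQL Connector: the function name toy_translate is hypothetical, and the sketch does not handle string literals that contain double quotes, subqueries, or any of the other transformations listed above.

```python
import re


def toy_translate(sql):
    """Illustrative only: mimic two of the SQL-to-HiveQL rewrites."""
    # Quoted identifiers: "name" becomes `name`.
    sql = re.sub(r'"([^"]*)"', r'`\1`', sql)
    # TOP N: SELECT TOP n <rest> becomes SELECT <rest> LIMIT n.
    match = re.match(r'(?is)^\s*SELECT\s+TOP\s+(\d+)\s+(.*?);?\s*$', sql)
    if match:
        sql = f"SELECT {match.group(2)} LIMIT {match.group(1)}"
    return sql


print(toy_translate('SELECT TOP 5 "name" FROM "emp"'))
```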
Data Types
The Cloudera ODBC Driver for Apache Hive supports many common data formats, converting
between Hive data types and SQL data types.
Hive Type       SQL Type
BIGINT          SQL_BIGINT
BINARY          SQL_VARBINARY
BOOLEAN         SQL_BIT
CHAR(n)         SQL_CHAR
DATE            SQL_TYPE_DATE
DECIMAL         SQL_DECIMAL
DECIMAL(p,s)    SQL_DECIMAL
DOUBLE          SQL_DOUBLE
FLOAT           SQL_REAL
INT             SQL_INTEGER
SMALLINT        SQL_SMALLINT
STRING          SQL_VARCHAR
TIMESTAMP       SQL_TYPE_TIMESTAMP
TINYINT         SQL_TINYINT
VARCHAR(n)      SQL_VARCHAR
Note:
The aggregate types (ARRAY, MAP, and STRUCT) are not supported. Columns of aggregate types
are treated as STRING columns.
The interval types (YEAR TO MONTH and DAY TIME) are supported only in query expressions and
predicates. Interval types are not supported as column data types in tables.
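The mapping above, together with the rule that aggregate types are treated as STRING columns, can be expressed as a small lookup. The following Python sketch is illustrative only: the names HIVE_TO_SQL and sql_type_for are hypothetical, and the lookup reflects the default mapping listed in this section rather than any driver option that changes the returned types.

```python
# Default Hive-to-SQL type mapping, as listed in this section.
HIVE_TO_SQL = {
    "BIGINT": "SQL_BIGINT", "BINARY": "SQL_VARBINARY", "BOOLEAN": "SQL_BIT",
    "CHAR": "SQL_CHAR", "DATE": "SQL_TYPE_DATE", "DECIMAL": "SQL_DECIMAL",
    "DOUBLE": "SQL_DOUBLE", "FLOAT": "SQL_REAL", "INT": "SQL_INTEGER",
    "SMALLINT": "SQL_SMALLINT", "STRING": "SQL_VARCHAR",
    "TIMESTAMP": "SQL_TYPE_TIMESTAMP", "TINYINT": "SQL_TINYINT",
    "VARCHAR": "SQL_VARCHAR",
}
AGGREGATE_TYPES = {"ARRAY", "MAP", "STRUCT"}


def sql_type_for(hive_type):
    """Return the SQL type name for a Hive column type such as 'DECIMAL(10,2)'."""
    # Drop any length/precision or element-type suffix: CHAR(10), ARRAY<INT>.
    base = hive_type.strip().upper().split("(")[0].split("<")[0]
    if base in AGGREGATE_TYPES:
        base = "STRING"  # aggregate columns are treated as STRING columns
    return HIVE_TO_SQL[base]


print(sql_type_for("DECIMAL(10,2)"), sql_type_for("array<int>"))
```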
hive_system Table
A pseudo-table called hive_system can be used to query for Hive cluster system environment
information. The pseudo-table is under the pseudo-schema called hive_system. The table has two
STRING type columns, envkey and envvalue. Standard SQL can be executed against the
hive_system table. For example:
SELECT * FROM HIVE.hive_system.hive_system WHERE envkey LIKE '%hive%'
The above query returns all of the Hive system environment entries whose key contains the word
"hive". A special query, set -v, is executed to fetch system environment information. Some
versions of Hive do not support this query. For versions of Hive that do not support querying
system environment information, the driver returns an empty result set.
Server-Side Properties
The Cloudera ODBC Driver for Apache Hive allows you to set server-side properties via a DSN.
Server-side properties specified in a DSN affect only the connection that is established using the
DSN.
You can also specify server-side properties for connections that do not use a DSN. To do this, use
the Cloudera Hive ODBC Driver Configuration tool that is installed with the Windows version of the
driver, or set the appropriate configuration options in your connection string or the
cloudera.hiveodbc.ini file. Properties specified in the driver configuration tool or the
cloudera.hiveodbc.ini file apply to all connections that use the Cloudera ODBC Driver for
Apache Hive.
For more information about setting server-side properties when using the Windows driver, see
"Configuring Server-Side Properties on Windows" on page 21. For information about setting
server-side properties when using the driver on a non-Windows platform, see "Configuring Server-
Side Properties on a Non-Windows Machine" on page 47.
Temporary Tables
The driver supports the creation of temporary tables and lets you insert literal values into
temporary tables. Temporary tables are only accessible by the ODBC connection that created
them, and are dropped when the connection is closed.
The driver supports the following DDL syntax for creating temporary tables:
<create table statement> := CREATE TABLE <temporary table name>
    <left paren><column definition list><right paren>
<column definition list> := <column definition>[, <column definition>]*
<column definition> := <column name> <data type>
<temporary table name> := <double quote><number sign><table name><double quote>
<left paren> := (
<right paren> := )
<double quote> := "
<number sign> := #
The temporary table name in a SQL query must be surrounded by double quotes ("), and the
name must begin with a number sign (#).
Note:
You can only use data types that are supported by Hive.
The driver supports the following syntax for inserting data into temporary tables:
<insert statement> := INSERT INTO <temporary table name>
    <left paren><column name list><right paren>
    VALUES <left paren><literal value list><right paren>
<column name list> := <column name>[, <column name>]*
<literal value list> := <literal value>[, <literal value>]*
<temporary table name> := <double quote><number sign><table name><double quote>
<left paren> := (
<right paren> := )
<double quote> := "
<number sign> := #
The following is an example of a SQL statement for inserting data into temporary tables:
INSERT INTO "#TEMPTABLE1" values (VAL(C1), VAL(C2) ... VAL(Cn) )
VAL(C1) is the literal value for the first column in the table, and VAL(Cn) is the literal value for the
nth column in the table.
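The two grammars above can be exercised with a short Python sketch that generates the CREATE TABLE and INSERT statements for a temporary table. The helper names are hypothetical, and the way string literals are quoted (single quotes, with embedded single quotes doubled) is an assumption borrowed from standard SQL rather than something this guide specifies.

```python
def temp_table_name(table_name):
    # The name must begin with a number sign and be wrapped in double quotes.
    return f'"#{table_name}"'


def create_temp_table_sql(table_name, columns):
    """columns is a list of (column name, Hive data type) pairs."""
    column_list = ", ".join(f"{name} {data_type}" for name, data_type in columns)
    return f"CREATE TABLE {temp_table_name(table_name)} ({column_list})"


def sql_literal(value):
    # Assumption: standard SQL quoting for strings; numbers are passed through.
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def insert_temp_table_sql(table_name, column_names, values):
    if len(column_names) != len(values):
        raise ValueError("one literal value is needed per column name")
    names = ", ".join(column_names)
    literals = ", ".join(sql_literal(value) for value in values)
    return f"INSERT INTO {temp_table_name(table_name)} ({names}) VALUES ({literals})"


print(create_temp_table_sql("TEMPTABLE1", [("C1", "INT"), ("C2", "STRING")]))
print(insert_temp_table_sql("TEMPTABLE1", ["C1", "C2"], [1, "a"]))
```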
Get Tables With Query
Note:
Hive Server 2 has a limit on the number of tables that can be in a database when handling the
GetTables API call. When the number of tables in a database is above the limit, the API call
returns a stack overflow error or a timeout error. The exact limit and the error that appears
depend on the JVM settings.
As a workaround for this issue, enable the Get Tables with Query configuration option or the
GetTablesWithQuery key to use the query instead of the API call.
Active Directory
The Cloudera ODBC Driver for Apache Hive supports Active Directory Kerberos on Windows. There
are two prerequisites for using Active Directory Kerberos on Windows:
• MIT Kerberos is not installed on the client Windows machine.
• The MIT Kerberos Hadoop realm has been configured to trust the Active Directory realm so
that users in the Active Directory realm can access services in the MIT Kerberos Hadoop
realm.
Write-back
The Cloudera ODBC Driver for Apache Hive supports translation for the following syntax when
connecting to a Hive Server 2 instance that is running Hive 0.14 or later:
• INSERT
• UPDATE
• DELETE
• CREATE
• DROP
If a statement contains syntax that is not standard SQL-92, the driver cannot translate the
statement and instead falls back to treating it as HiveQL.
Timestamp Function Support
The types of time intervals that are supported for the TIMESTAMPADD and TIMESTAMPDIFF
functions might vary depending on the Hive server version that you are connecting to. To return
a list of the intervals supported for TIMESTAMPADD, call the SQLGetInfo function using
SQL_TIMEDATE_ADD_INTERVALS as the argument. Similarly, to return a list of the intervals
supported for TIMESTAMPDIFF, call SQLGetInfo using SQL_TIMEDATE_DIFF_INTERVALS as the argument.
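In the ODBC API, SQLGetInfo returns both of these information types as a bitmask in which each bit stands for one interval. The following Python sketch decodes such a bitmask. The numeric constants are the values defined in the standard ODBC header sqlext.h; the function name decode_intervals is hypothetical, and how you obtain the bitmask depends on your ODBC binding (for example, pyodbc exposes SQLGetInfo through Connection.getinfo).

```python
# Information types passed to SQLGetInfo (values from the ODBC header sqlext.h).
SQL_TIMEDATE_ADD_INTERVALS = 109
SQL_TIMEDATE_DIFF_INTERVALS = 110

# Bit values used in the returned bitmask (also from sqlext.h).
SQL_FN_TSI = {
    0x001: "SQL_FN_TSI_FRAC_SECOND",
    0x002: "SQL_FN_TSI_SECOND",
    0x004: "SQL_FN_TSI_MINUTE",
    0x008: "SQL_FN_TSI_HOUR",
    0x010: "SQL_FN_TSI_DAY",
    0x020: "SQL_FN_TSI_WEEK",
    0x040: "SQL_FN_TSI_MONTH",
    0x080: "SQL_FN_TSI_QUARTER",
    0x100: "SQL_FN_TSI_YEAR",
}


def decode_intervals(bitmask):
    """Return the names of the intervals whose bits are set in bitmask."""
    return [name for bit, name in SQL_FN_TSI.items() if bitmask & bit]


# Example with a made-up bitmask; a real value would come from SQLGetInfo.
print(decode_intervals(0x002 | 0x010 | 0x100))
```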
Dynamic Service Discovery using ZooKeeper
Note:
For information about configuring this feature in the Windows driver, see "Creating a Data Source
Name on Windows" on page 7 or "Configuring a DSN-less Connection on Windows" on page 10.
For information about configuring this feature when using the driver on a non-Windows platform,
see "Configuring Service Discovery Mode on a Non-Windows Machine" on page 43.
Security and Authentication
Note:
In this documentation, "SSL" refers to both TLS (Transport Layer Security) and SSL (Secure
Sockets Layer). The driver supports TLS 1.0, 1.1, and 1.2. The SSL version used for the
connection is the highest version that is supported by both the driver and the server.
The driver provides mechanisms that enable you to authenticate your connection using the
Kerberos protocol, your Hive user name only, or your Hive user name and password. You must
use the authentication mechanism that matches the security requirements of the Hive server. For
information about determining the appropriate authentication mechanism to use based on the
Hive server configuration, see "Authentication Mechanisms" on page 52. For detailed driver
configuration instructions, see "Configuring Authentication on Windows" on page 12 or
"Configuring Authentication on a Non-Windows Machine" on page 43.
It is recommended that you enable SSL whenever you connect to a server that is configured to
support it. SSL encryption protects data and credentials when they are transferred over the
network, and provides stronger security than authentication alone. For detailed configuration
instructions, see "Configuring SSL Verification on Windows" on page 19 or "Configuring SSL
Verification on a Non-Windows Machine" on page 46.
When creating or configuring a connection from a Windows machine, the fields and buttons are
available in the Cloudera Hive ODBC Driver Configuration tool and the following dialog boxes:
• Cloudera ODBC Driver for Apache Hive DSN Setup
• Advanced Options
• Server Side Properties
• SSL Options
• HTTP Properties
Note:
If you are using the driver on a non-Windows machine, you can set driver configuration
properties in a connection string, in a DSN (in the odbc.ini file), or as a driver-wide setting (in
the cloudera.hiveodbc.ini file). Settings in the connection string take precedence over
settings in the DSN, and settings in the DSN take precedence over driver-wide settings.
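The precedence order described in this note can be sketched as a layered merge in Python. This is an illustration only: the function name effective_settings is hypothetical, the three dictionaries stand in for settings already read from the connection string, the odbc.ini DSN entry, and cloudera.hiveodbc.ini, and keys are compared exactly as written.

```python
def effective_settings(connection_string, dsn, driver_wide):
    """Merge three setting layers; later updates win over earlier ones."""
    merged = dict(driver_wide)        # lowest precedence
    merged.update(dsn)                # DSN overrides driver-wide settings
    merged.update(connection_string)  # connection string overrides both
    return merged


settings = effective_settings(
    connection_string={"Port": "10001"},
    dsn={"Port": "10000", "Host": "192.168.222.160"},
    driver_wide={"SocketTimeout": "60", "Host": "localhost"},
)
print(settings)
```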
• "Allow Common Name Host Name Mismatch" on page 65
• "Allow Self-Signed Server Certificate" on page 66
• "Apply Properties with Queries" on page 66
• "Binary Column Length" on page 67
• "Canonicalize Principal FQDN" on page 67
• "Check Certificate Revocation" on page 67
• "Client Certificate File" on page 68
• "Client Private Key File" on page 68
• "Invalid Session Auto Recover" on page 75
• "Log Level" on page 75
• "Log Path" on page 76
• "Max File Size" on page 77
• "Max Number Files" on page 77
• "Mechanism" on page 78
• "Minimum TLS" on page 78
• "Password" on page 78
• "Port" on page 79
• "Realm" on page 79
• "Rows Fetched Per Block" on page 79
Allow Common Name Host Name Mismatch
Description
This option specifies whether a CA-issued SSL certificate name must match the host name of the
Hive server.
• Enabled (1): The driver allows a CA-issued SSL certificate name to not match the host name
of the Hive server.
• Disabled (0): The CA-issued SSL certificate name must match the host name of the Hive
server.
Allow Self-Signed Server Certificate
Description
This option specifies whether the driver allows a connection to a Hive server that uses a self-signed
certificate, even if this certificate is not in the list of trusted certificates. This list is contained in the
Trusted Certificates file, or in the system Trust Store if the system Trust Store is used instead of a
file.
• Enabled (1): The driver authenticates the Hive server even if the server is using a self-signed
certificate that has not been added to the list of trusted certificates.
• Disabled (0): The driver does not allow self-signed certificates from the server unless they
have already been added to the list of trusted certificates.
Note:
Description
Note:
Binary Column Length
Key Name Default Value Required
BinaryColumnLength 32767 No
Description
By default, the columns metadata for Hive does not specify a maximum data length for BINARY
columns.
Canonicalize Principal FQDN
Description
This option specifies whether the Kerberos layer canonicalizes the host FQDN in the server’s
service principal name.
• Enabled (1): The Kerberos layer canonicalizes the host FQDN in the server’s service principal
name.
• Disabled (0): The Kerberos layer does not canonicalize the host FQDN in the server’s service
principal name.
Note:
• This option only affects MIT Kerberos, and is ignored when using Active Directory
Kerberos.
• This option can only be disabled if the Kerberos Realm or KrbRealm key is specified.
CheckCertRevocation No
Description
This option specifies whether the driver checks to see if a certificate has been revoked while
retrieving a certificate chain from the Windows Trust Store.
This option is only applicable if you are using a CA certificate from the Windows Trust Store (see
"Use System Trust Store" on page 84).
• Enabled (1): The driver checks for certificate revocation while retrieving a certificate chain
from the Windows Trust Store.
• Disabled (0): The driver does not check for certificate revocation while retrieving a certificate
chain from the Windows Trust Store.
Note:
ClientCert None No
Description
The full path to the .pem file containing the client's SSL certificate.
Client Private Key File
Description
The full path to the .pem file containing the client's SSL private key.
If the private key file is protected with a password, then provide the password using the driver
configuration option "Client Private Key Password" on page 68.
Client Private Key Password
Required: [...] verification is enabled and the client's private key file is protected with a password.
Description
The password of the private key file that is specified in the Client Private Key File field
(ClientPrivateKey).
Description
This option specifies whether the driver converts server-side property key names to all lower-case
characters.
• Enabled (1): The driver converts server-side property key names to all lower-case
characters.
• Disabled (0): The driver does not modify the server-side property key names.
HDFSTempTableDir /tmp/simba No
Description
The HDFS directory that the driver uses to store the necessary files for supporting the Temporary
Table feature.
Note:
Database
Key Name Default Value Required
Schema default No
Description
The name of the database schema to use when a schema is not explicitly specified in a query. You
can still issue queries on other schemas by explicitly specifying the schema in the query.
Note:
To inspect your databases and determine the appropriate schema to use, at the Hive command
prompt, type show databases.
DecimalColumnScale 10 No
Description
The maximum number of digits to the right of the decimal point for numeric data types.
DefaultStringColumnLength 255 No
Description
By default, the columns metadata for Hive does not specify a maximum length for STRING
columns.
DelegateKrbCreds 0 No
Description
This option specifies whether your Kerberos credentials are forwarded to the server and used for
authentication.
Note:
Delegation UID
Key Name Default Value Required
DelegationUID None No
Description
If a value is specified for this setting, the driver delegates all operations against Hive to the
specified user, rather than to the authenticated user for the connection.
Note:
This option is applicable only when connecting to a Hive Server 2 instance that supports this
feature.
Description
This option specifies whether driver-wide configuration settings take precedence over connection
and DSN settings.
• Enabled (1): Driver-wide configurations take precedence over connection and DSN settings.
• Disabled (0): Connection and DSN settings take precedence instead.
Description
This option specifies whether the driver attempts to automatically reconnect to the server when a
communication link error occurs.
• Enabled (1): The driver attempts to reconnect.
• Disabled (0): The driver does not attempt to reconnect.
Enable SSL
Key Name Default Value Required
Description
This option specifies whether the client uses an SSL encrypted connection to communicate with
the Hive server.
• Enabled (1): The client communicates with the Hive server using SSL.
• Disabled (0): SSL is disabled.
SSL is configured independently of authentication. When authentication and SSL are both
enabled, the driver performs the specified authentication method over an SSL connection.
Note:
• This option is applicable only when connecting to a Hive server that supports SSL.
• If you selected User Name as the authentication mechanism, SSL is not available.
Description
This option specifies whether the driver supports the creation and use of temporary tables.
• Enabled (1): The driver supports the creation and use of temporary tables.
• Disabled (0): The driver does not support temporary tables.
Important:
When connecting to Hive 0.14 or later, the Temporary Tables feature is always enabled and you
do not need to configure it in the driver.
Fast SQLPrepare
Key Name Default Value Required
Description
This option specifies whether the driver defers query execution to SQLExecute.
• Enabled (1): The driver defers query execution to SQLExecute.
• Disabled (0): The driver does not defer query execution to SQLExecute.
Note:
When using Native Query mode, the driver executes the HiveQL query to retrieve the result set
metadata for SQLPrepare. As a result, SQLPrepare might be slow. If the result set metadata is
not required after calling SQLPrepare, then enable Fast SQLPrepare.
Get Tables With Query
Description
This option specifies whether the driver uses the SHOW TABLES query or the GetTables Thrift API
call to retrieve table names from the database.
• Enabled (1): The driver uses the SHOW TABLES query to retrieve table names.
• Disabled (0): The driver uses the GetTables Thrift API call to retrieve table names.
Note:
HDFS User
Key Name Default Value Required
HDFSUser hdfs No
Description
The name of the HDFS user that the driver uses to create the necessary files for supporting the
Temporary Tables feature.
Description
Note:
If Service Discovery Mode is enabled, then connections to Hive Server 1 are not supported.
Host(s)
Key Name Default Value Required
Description
If Service Discovery Mode is disabled, the IP address or host name of the Hive server.
If Service Discovery Mode is enabled, specify a comma-separated list of ZooKeeper servers in the
following format, where [ZK_Host] is the IP address or host name of the ZooKeeper server and
[ZK_Port] is the number of the TCP port that the ZooKeeper server uses to listen for client
connections:
[ZK_Host1]:[ZK_Port1],[ZK_Host2]:[ZK_Port2]
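As a minimal sketch of the format above, the following Python helpers build and parse a comma-separated ZooKeeper server list. The function names and the example host names are hypothetical; the parser assumes host names or IPv4 addresses, so that the last colon in each entry separates the host from the port.

```python
def format_zk_hosts(servers):
    """servers is a list of (host, port) pairs."""
    return ",".join(f"{host}:{port}" for host, port in servers)


def parse_zk_hosts(value):
    """Split '[ZK_Host1]:[ZK_Port1],[ZK_Host2]:[ZK_Port2]' into (host, port) pairs."""
    servers = []
    for entry in value.split(","):
        host, _, port = entry.strip().rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers


hosts = format_zk_hosts([("zk1.example.com", 2181), ("zk2.example.com", 2181)])
print(hosts)
print(parse_zk_hosts(hosts))
```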
Host FQDN
Key Name Default Value Required
KrbHostFQDN _HOST No
Description
When the value of Host FQDN is _HOST, the driver uses the Hive server host name as the fully
qualified domain name for Kerberos authentication. If Service Discovery Mode is disabled, then
the driver uses the value specified in the Host connection attribute. If Service Discovery Mode is
enabled, then the driver uses the Hive Server 2 host name returned by ZooKeeper.
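The _HOST substitution described above can be summarized in a few lines of Python. This is a sketch of the documented rule, not driver code: the function and parameter names are hypothetical, and zk_returned_host stands in for whatever Hive Server 2 host name ZooKeeper returns.

```python
def kerberos_host_fqdn(krb_host_fqdn, host, service_discovery=False,
                       zk_returned_host=None):
    """Resolve the FQDN used for Kerberos authentication."""
    if krb_host_fqdn != "_HOST":
        return krb_host_fqdn              # an explicit FQDN is used as given
    if service_discovery:
        if zk_returned_host is None:
            raise ValueError("no Hive Server 2 host name returned by ZooKeeper")
        return zk_returned_host           # host name supplied by ZooKeeper
    return host                           # value of the Host connection attribute


print(kerberos_host_fqdn("_HOST", "hive1.example.com"))
```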
HTTP Path
Key Name Default Value Required
Description
The driver forms the HTTP address to connect to by appending the HTTP Path value to the host
and port specified in the DSN or connection string. For example, to connect to the HTTP address
http://localhost:10002/gateway/sandbox/hive/version, you would set HTTP
Path to /gateway/sandbox/hive/version.
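The way the address is formed can be sketched in Python as follows. The function name http_address is hypothetical, the scheme is passed in explicitly because this section does not say how it is chosen, and adding a missing leading slash is an assumption of the sketch rather than documented driver behavior.

```python
def http_address(host, port, http_path, scheme="http"):
    """Append the HTTP Path value to the host and port."""
    if not http_path.startswith("/"):
        http_path = "/" + http_path  # assumption: normalize a missing slash
    return f"{scheme}://{host}:{port}{http_path}"


print(http_address("localhost", 10002, "/gateway/sandbox/hive/version"))
```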
Invalid Session Auto Recover
Description
This option specifies whether the driver automatically opens a new session when the existing
session is no longer valid.
• Enabled (1): The driver automatically opens a new session when the existing session is no
longer valid.
• Disabled (0): The driver does not automatically open new sessions.
Note:
Log Level
Key Name Default Value Required
Description
Use this property to enable or disable logging in the driver and to specify the amount of detail
included in log files.
Important:
• Only enable logging long enough to capture an issue. Logging decreases performance and
can consume a large quantity of disk space.
• The settings for logging apply to every connection that uses the Cloudera ODBC Driver for
Apache Hive, so make sure to disable the feature after you are done using it.
• This option is not supported in connection strings. To configure logging for the Windows
driver, you must use the Logging Options dialog box. To configure logging for a non-
Windows driver, you must use the cloudera.hiveodbc.ini file.
When logging is enabled, the driver produces the following log files at the location you specify in
the Log Path (LogPath) property:
• A clouderahiveodbcdriver.log file that logs driver activity that is not specific to a
connection.
• A clouderahiveodbcdriver_connection_[Number].log file for each
connection made to the database, where [Number] is a number that identifies each log file.
This file logs driver activity that is specific to the connection.
Log Path
Key Name Default Value Required
Description
The full path to the folder where the driver saves log files when logging is enabled.
Important:
This option is not supported in connection strings. To configure logging for the Windows driver,
you must use the Logging Options dialog box. To configure logging for a non-Windows driver,
you must use the cloudera.hiveodbc.ini file.
Max File Size
Key Name Default Value Required
LogFileSize 20971520 No
Description
The maximum size of each log file in bytes. After the maximum file size is reached, the driver
creates a new file and continues logging.
If this property is set using the Windows UI, the entered value is converted from megabytes (MB)
to bytes before being set.
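The default LogFileSize value of 20971520 bytes is exactly 20 MB when 1 MB is taken as 1,048,576 bytes. The following Python sketch shows that conversion. The function name is hypothetical, and the 1,048,576 factor is an assumption inferred from the default value rather than a conversion rule stated in this guide.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def megabytes_to_bytes(megabytes):
    """Convert a size entered in MB to the byte value stored in LogFileSize."""
    return megabytes * BYTES_PER_MB


print(megabytes_to_bytes(20))
```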
Important:
This option is not supported in connection strings. To configure logging for the Windows driver,
you must use the Logging Options dialog box. To configure logging for a non-Windows driver,
you must use the cloudera.hiveodbc.ini file.
Max Number Files
Key Name Default Value Required
LogFileCount 50 No
Description
The maximum number of log files to keep. After the maximum number of log files is reached, each
time an additional file is created, the driver deletes the oldest log file.
Important:
This option is not supported in connection strings. To configure logging for the Windows driver,
you must use the Logging Options dialog box. To configure logging for a non-Windows driver,
you must use the cloudera.hiveodbc.ini file.
Mechanism
Key Name Default Value Required
Description
Select one of the following settings, or set the key to the corresponding number:
• No Authentication (0)
• Kerberos (1)
• User Name (2)
• User Name And Password (3)
Minimum TLS
Key Name Default Value Required
Description
The minimum version of TLS/SSL that the driver allows the data store to use for encrypting
connections. For example, if TLS 1.1 is specified, TLS 1.0 cannot be used to encrypt connections.
• TLS 1.0 (1.0): The connection must use at least TLS 1.0.
• TLS 1.1 (1.1): The connection must use at least TLS 1.1.
• TLS 1.2 (1.2): The connection must use at least TLS 1.2.
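The minimum-version rule can be read as a simple comparison between the version offered for a connection and the configured minimum. The following Python sketch illustrates that reading only; the function name is hypothetical, and the actual version selection is carried out by the driver and the server.

```python
def tls_version_allowed(version, minimum):
    """Return True if version (such as '1.1') meets the configured minimum."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) >= as_tuple(minimum)


# With a minimum of TLS 1.1, TLS 1.0 cannot be used but TLS 1.2 can.
print(tls_version_allowed("1.0", "1.1"), tls_version_allowed("1.2", "1.1"))
```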
Password
Key Name Default Value Required
Description
The password corresponding to the user name that you provided in the User Name field (the UID
key).
Port
Key Name Default Value Required
Description
The number of the TCP port that the Hive server uses to listen for client connections.
Realm
Key Name Default Value Required
Description
If your Kerberos configuration already defines the realm of the Hive Server 2 host as the default
realm, then you do not need to configure this option.
Rows Fetched Per Block
Key Name Default Value Required
RowsFetchedPerBlock 10000 No
Description
Valid values for this setting include any positive 32-bit integer. However, testing has shown that
performance gains are marginal beyond the default value of 10000 rows.
N/A Selected No
Description
This option is available only in the Windows driver. It appears in the Cloudera ODBC Driver for
Apache Hive DSN Setup dialog box and the SSL Options dialog box.
Important:
The password is obscured (not saved in plain text). However, it is still possible for the encrypted
password to be copied and used.
Service Discovery Mode
Description
This option specifies whether the driver uses the ZooKeeper service.
• Enabled (1): The driver discovers Hive Server 2 services via the ZooKeeper service.
• Disabled (0): The driver connects to Hive without using a discovery service.
Service Name
Key Name Default Value Required
KrbServiceName None No
Description
Description
This option specifies whether the driver returns the hive_system table for catalog function calls
such as SQLTables and SQLColumns.
• Enabled (1): The driver returns the hive_system table for catalog function calls such as
SQLTables and SQLColumns.
• Disabled (0): The driver does not return the hive_system table for catalog function calls.
Socket Timeout
Key Name Default Value Required
SocketTimeout 60 No
Description
The number of seconds that an operation can remain idle before it is closed.
Note:
This option is applicable only when asynchronous query execution is being used against Hive
Server 2 instances.
TempTableTTL 10 No
Description
The number of minutes a temporary table is guaranteed to exist in Hive after it is created.
Thrift Transport
Key Name Default Value Required
Description
Select one of the following settings, or set the key to the number corresponding to the desired
setting:
• Binary (0)
• SASL (1)
• HTTP (2)
Note:
For information about how to determine which Thrift transport protocols your Hive server
supports, see "Authentication Mechanisms" on page 52.
Trusted Certificates
Key Name Default Value Required
Description
The full path of the .pem file containing trusted CA certificates, for verifying the server when using
SSL.
If this option is not set, then the driver defaults to using the trusted CA certificates .pem file
installed by the driver.
Important:
If you are connecting from a Windows machine and the Use System Trust Store option is
enabled, the driver uses the certificates from the Windows trust store instead of your specified
.pem file. For more information, see "Use System Trust Store" on page 84.
Two-Way SSL
Key Name Default Value Required
Description
• Disabled (0): The server does not verify the client. Depending on whether one-way SSL is
enabled, the client might verify the server. For more information, see "Enable SSL" on page
72.
Note:
This option is applicable only when connecting to a Hive server that supports SSL. You must
enable SSL before Two Way SSL can be configured. For more information, see "Enable SSL" on
page 72.
Description
This option specifies the SQL types to be returned for string data types.
• Enabled (1): The driver returns SQL_WVARCHAR for STRING and VARCHAR columns, and
returns SQL_WCHAR for CHAR columns.
• Disabled (0): The driver returns SQL_VARCHAR for STRING and VARCHAR columns, and
returns SQL_CHAR for CHAR columns.
Description
Note:
This option only takes effect when connecting to a Hive cluster running Hive 0.12.0 or higher.
Description
This option specifies whether the driver uses native HiveQL queries, or converts the queries
emitted by an application into an equivalent form in HiveQL. If the application is Hive-aware and
already emits HiveQL, then enable this option to avoid the extra overhead of query
transformation.
• Enabled (1): The driver does not transform the queries emitted by an application, and
executes HiveQL queries directly.
• Disabled (0): The driver transforms the queries emitted by an application and converts
them into an equivalent form in HiveQL.
Important:
When this option is enabled, the driver cannot execute parameterized queries.
Description
This option specifies how the driver handles Kerberos authentication: either with the SSPI plugin
or with MIT Kerberos.
• Enabled (1): The driver handles Kerberos authentication by using the SSPI plugin instead of
MIT Kerberos by default.
• Disabled (0): The driver uses MIT Kerberos to handle Kerberos authentication, and only
uses the SSPI plugin if the GSSAPI library is not available.
Use System Trust Store
Description
This option specifies whether to use a CA certificate from the system trust store, or from a
specified .pem file.
• Enabled (1): The driver verifies the connection using a certificate in the system trust store.
• Disabled (0): The driver verifies the connection using a specified .pem file. For information
about specifying a .pem file, see "Trusted Certificates" on page 82.
Note:
User Name
Key Name Default Value Required
Description
Description
The host name or IP address of the machine hosting both the namenode of your Hadoop cluster
and the WebHDFS service.
WebHDFSPort 50070 No
Description
ZooKeeper Namespace
Key Name Default Value Required
Description
The namespace on ZooKeeper under which Hive Server 2 znodes are added.
DelegationUserIDCase
Key Name Default Value Required
DelegationUserIDCase Unchanged No
Description
This option specifies whether the driver changes the Delegation UID (or DelegationUID) value
to all upper-case or all lower-case. The following values are supported:
• Upper: Change the delegated user name to all upper-case.
• Lower: Change the delegated user name to all lower-case.
• Unchanged: Do not modify the delegated user name.
For more information about delegating a user name, see "Delegation UID" on page 71.
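The three DelegationUserIDCase values can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch matches the setting values exactly as they are spelled in this section.

```python
def apply_delegation_case(delegation_uid, case_setting="Unchanged"):
    """Return the delegated user name after applying DelegationUserIDCase."""
    transforms = {
        "Upper": str.upper,          # all upper-case
        "Lower": str.lower,          # all lower-case
        "Unchanged": lambda s: s,    # leave the name as it is
    }
    if case_setting not in transforms:
        raise ValueError(f"unknown DelegationUserIDCase value: {case_setting}")
    return transforms[case_setting](delegation_uid)


print(apply_delegation_case("JohnDoe", "Lower"))
```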
Driver
Key Name Default Value Required
Description
On Windows, the name of the installed driver (Cloudera ODBC Driver for Apache
Hive).
On other platforms, the name of the installed driver as specified in odbcinst.ini, or the
absolute path of the driver shared object file.
HTTPAuthCookies
Key Name Default Value Required
HTTPAuthCookies hive.server2.auth, No
JSessionID
Description
If cookie-based authentication is enabled in your server, the driver authenticates the connection
once based on the provided authentication credentials. It then uses the cookie generated by the
server for each subsequent request in the same connection.
http.header.
Key Name Default Value Required
http.header None No
Description
Set a custom HTTP header by using the following syntax, where [HeaderKey] is the name of the
header to set and [HeaderValue] is the value to assign to the header:
http.header.[HeaderKey]=[HeaderValue]
For example:
http.header.AUTHENTICATED_USER=john
After the driver applies the header, the http.header. prefix is removed from the DSN entry, leaving
an entry of [HeaderKey]=[HeaderValue].
The example above would create the following custom HTTP header:
AUTHENTICATED_USER: john
Note:
The http.header. prefix is case-sensitive. This option is applicable only when you are using
HTTP as the Thrift transport protocol. For more information, see "Thrift Transport" on page 81.
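The prefix handling described above can be sketched in Python as follows. The function name extract_http_headers is hypothetical, and the dictionary stands in for DSN or connection string entries that have already been split into key and value. Because the prefix is case-sensitive, the sketch uses an exact, case-sensitive match.

```python
HTTP_HEADER_PREFIX = "http.header."


def extract_http_headers(entries):
    """Return {HeaderKey: HeaderValue} for entries that carry the prefix."""
    headers = {}
    for key, value in entries.items():
        if key.startswith(HTTP_HEADER_PREFIX):  # case-sensitive match
            headers[key[len(HTTP_HEADER_PREFIX):]] = value
    return headers


entries = {
    "Host": "192.168.222.160",
    "http.header.AUTHENTICATED_USER": "john",
    "HTTP.HEADER.IGNORED": "x",  # wrong case, so not treated as a header
}
for name, value in extract_http_headers(entries).items():
    print(f"{name}: {value}")
```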
SSP_
Key Name Default Value Required
SSP_ None No
Description
Set a server-side property by using the following syntax, where [SSPKey] is the name of the server-
side property and [SSPValue] is the value for that property:
SSP_[SSPKey]=[SSPValue]
For example:
SSP_mapred.queue.names=myQueue
After the driver applies the server-side property, the SSP_ prefix is removed from the DSN entry,
leaving an entry of [SSPKey]=[SSPValue].
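Going the other way, the following Python sketch builds SSP_ entries from a set of server-side properties so that they can be appended to a connection string. The function name is hypothetical; the lowercase_keys parameter only imitates the lower-case key conversion that the driver can apply to server-side property names, and its default here is arbitrary.

```python
SSP_PREFIX = "SSP_"


def ssp_entries(properties, lowercase_keys=False):
    """Return 'SSP_[SSPKey]=[SSPValue];' entries for a connection string."""
    parts = []
    for key, value in properties.items():
        name = key.lower() if lowercase_keys else key
        parts.append(f"{SSP_PREFIX}{name}={value};")
    return "".join(parts)


print(ssp_entries({"mapred.queue.names": "myQueue"}))
```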
Note:
ODBC compliance levels are Core, Level 1, and Level 2. These compliance levels are defined in the
ODBC Specification published with the Interface SDK from Microsoft.
Interfaces include both the Unicode and non-Unicode versions. For more information, see
"Unicode Function Arguments" in the ODBC Programmer's Reference:
http://msdn.microsoft.com/en-us/library/ms716246%28VS.85%29.aspx.
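[Table: ODBC API conformance level by interface. Interfaces recoverable from this part of the
table: SQLGetConnectAttr, SQLGetCursorName, SQLGetData, SQLGetFunctions, SQLGetInfo,
SQLPrimaryKeys, SQLProcedureColumns. Conformance levels shown: Core, Level 1.]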
Contact Us
If you are having difficulties using the driver, our Community Forum may have your solution. In
addition to providing user-to-user support, our forums are a great place to share your questions,
comments, and feature requests with us.
If you are a Subscription customer, you may also use the Cloudera Support Portal to search the
Knowledge Base or file a Case.
Important:
To help us assist you, prior to contacting Cloudera Support, please prepare a detailed summary
of the client and server environment, including operating system version, patch level, and
configuration.