Unity Catalog API Specification (ext) (go_uc_api-docs)
Table of Contents
Document History
Table of Contents
Terminology
User Types
Metastore Admins
Account Admins
Client Types
API Conventions
Public APIs
Metastore CRUD API
Object Models
RPC Endpoints
Endpoint Behavior Notes
Authorization
Metastore Summary API
Object Model
RPC Endpoints
Metastore Assignment API
Object Models
RPC Endpoints
Inputs
Endpoint Behavior Notes
Authorization
Permissions API
Terminology and Permissions Management Model
SQL Object Privileges
Object Models
Changing Ownership
RPC Endpoints
Inputs
Authorization, Error Responses
User-Info API
Object Models
RPC Endpoints
Inputs
Temporary Credential API
Table Operations
Path Operations
Object Models
RPC Endpoints
Terminology
User Types
All users that access Unity Catalog APIs must be account-level users. They must also be
added to the relevant Databricks Workspace (in order to obtain a PAT token used to access
the UC API server).
Metastore Admins
Metastore Admins can manage the privileges for all securable objects inside a metastore,
such as who can create catalogs or query a table. The Metastore Admins for a given
Metastore are users who are either:
● an Account Admin
● the configured owner of the Metastore in question
Note that a Metastore Admin may or may not be a Workspace Admin for a given workspace
(i.e., being a Workspace Admin does not automatically make the user a Metastore Admin).
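The rule above can be sketched as a small predicate. This is only an illustration of the stated rule, not the server's implementation; the Metastore class and the account_admins set are hypothetical stand-ins introduced for the example.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Metastore:
    """Hypothetical stand-in for a metastore record; only `owner` matters here."""
    name: str
    owner: str


def is_metastore_admin(user: str, metastore: Metastore, account_admins: Set[str]) -> bool:
    """Return True if the user is an Account Admin or the metastore's configured owner.

    Workspace Admin status is deliberately not consulted: it does not
    confer Metastore Admin rights.
    """
    return user in account_admins or user == metastore.owner
```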
Account Admins
An Account Admin is an account-level user with the “Account Owner” role configured in the
Accounts Console. Creating and updating a Metastore can only be done by an Account
Admin. An Account Admin can specify other users to be Metastore Admins by changing the
Metastore’s owner (using the updateMetastore endpoint).
Client Types
The Unity Catalog’s API server is accessed by three types of clients:
● NoPE clusters: clients emanating from DBR clusters that support UC and are
Non-Permissions-Enforcing. Currently, the only DBR clusters of this type are those
with Security Mode = “Single User”. These clients authenticate with an
internally-generated token that contains a unity_catalog:cluster scope
(though not the unity_catalog:permission_enforcing scope). For these
clients, the Unity Catalog’s API service enforces the access control requirements of
the Unity Catalog Data Governance Model and sends results filtered by the
client user’s permissions.
● External clients: all other clients that are not PE clusters or NoPE clusters. These
clients authenticate with ‘external’ tokens (e.g., PAT tokens obtained from a
Workspace) rather than tokens generated internally for DBR clusters. This includes
clients using the databricks-cli’s unity-catalog commands to access the UC API.
As with NoPE cluster clients, the UC API endpoints available to these clients also
enforce access control requirements on the server side.
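As a rough sketch of how the client types might be told apart from token scopes: the two scope strings come from the text above, but the assumptions that PE cluster tokens carry both scopes and that external tokens carry neither are mine, as is the function name.

```python
from typing import Iterable

CLUSTER_SCOPE = "unity_catalog:cluster"
PE_SCOPE = "unity_catalog:permission_enforcing"


def classify_client(scopes: Iterable[str]) -> str:
    """Classify a client as "PE", "NoPE" or "External" from its token scopes.

    Assumption: a PE cluster token carries both scopes, a NoPE cluster token
    carries only the cluster scope, and an external token carries neither.
    """
    scope_set = set(scopes)
    if CLUSTER_SCOPE in scope_set:
        return "PE" if PE_SCOPE in scope_set else "NoPE"
    return "External"
```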
API Conventions
● To simplify management of API message types, the *Info messages are used for
both input (to create*, update* and delete* endpoints) and output (from
create*, get*, list* and update* endpoints). The fields are marked with
REQ/OPT/IGN labels to specify whether they are REQuired, OPTional or IGNored for
each operation (organized by table column in the specification).
● All name fields are UTF-8 strings, initially created by users and visible to users
thereafter. These object names are supplied by users in SQL commands (e.g.,
“CREATE TABLE something ...”) or via direct access to the UC API.
Names supplied by users are converted to lower-case by DBR clients (before they
are sent to the UC API). Also, input names (for all object types except Table Column
Names) are converted to lower-case by the UC server, to handle the case that UC
objects are created via directly accessing the UC API. With this conversion to
SQL objects are referenced by their full name in the RESTful API URIs, and since
these names are UTF-8 they must be URL-encoded. For example, the request URI
for a table with full name “SomeCÄt.SømeSchëma.テーブル” will be:
/api/2.0/unity-catalog/tables/SomeC%C3%84t.S%C3%B8meSch%C3%ABma.%E3%83%86%E3%83%BC%E3%83%96%E3%83%AB
● All principals (users and groups) are referenced by their user/group name strings,
not by the User IDs (int64s) used internally by Databricks control plane services.
This means that in the UC API, users are referenced by their email address (e.g.,
“[email protected]”) while groups are referenced by their group names
(e.g., “account users”).
● There is no list of child objects within the *Info structures (e.g., SchemaInfo does
not include a field containing the list of tables within the schema). Getting a list of
child objects requires performing a list* operation on the child object type with
the query arguments specifying the parent identifier (e.g., GET
<prefix>/tables?schema_name=<some_parent_schema_name>)
● The details of error responses are to be specified, but the general form of the error
response body is:
{
  "error_code": "<error_code>",
  "message": "<brief user-readable description>"
}
The specific error_code values used by each endpoint will be detailed later.
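A few of these conventions can be illustrated with standard-library Python. This is a minimal client-side sketch, not an official client: the function names are hypothetical, and the /api/2.0/unity-catalog prefix is taken from the example URI above.

```python
import json
from typing import Tuple
from urllib.parse import quote, urlencode

API_PREFIX = "/api/2.0/unity-catalog"  # prefix as shown in the example URI


def table_uri(full_name: str) -> str:
    """Build the request URI for a table, percent-encoding its UTF-8 full name.

    quote() never encodes ".", so the catalog.schema.table separators survive.
    """
    return f"{API_PREFIX}/tables/{quote(full_name, safe='')}"


def list_tables_uri(catalog_name: str, schema_name: str) -> str:
    """Build a list* URI that identifies the parent objects via query arguments."""
    query = urlencode({"catalog_name": catalog_name, "schema_name": schema_name})
    return f"{API_PREFIX}/tables?{query}"


def parse_error(body: str) -> Tuple[str, str]:
    """Extract (error_code, message) from an error response body."""
    payload = json.loads(body)
    return payload["error_code"], payload["message"]
```

Applied to the table name from the example, table_uri reproduces the encoded URI shown in the convention above.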
Properties
The Catalog, Schema and Table objects each have a properties field, which is an
opaque list of key-value pairs. This list allows for future extension or customization of the
object’s configuration.
Field Name Type Description
Public APIs
The API endpoints in this section are for use by NoPE and External clients; that is, they are
not limited to PE clients. These API endpoints enforce permissions on Unity Catalog
objects so that the client user only has access to objects to which they have permission.
● "INTERNAL"
● "EXTERNAL"
Internal and External Delta Sharing enabled on metastore. This allows all flavors of
Delta Sharing.
MetastoreInfo:
storage_root string (url) REQ ERR Metastore storage root path. On creation, the
new metastore’s ID (UUID) is appended to the
provided storage_root, so the output
storage_root is not the same as the input
storage_root.
storage_root_credential_id string (uuid) IGN OPT Unique identifier of the Storage Credential used
by default to access the storage_root area of
cloud storage.
data_sharing_enabled bool IGN OPT Whether delta sharing is enabled for this
Metastore (default: false)
delta_sharing_recipient_token_lifetime_in_seconds int32 IGN OPT The lifetime of delta sharing recipient token in
seconds (no default; must be specified when
data_sharing_enabled is set to true).
delta_sharing_organization_name string IGN OPT The organization name of a Delta Sharing entity.
The name will be used in
Databricks-to-Databricks Delta Sharing as the
official name.
1 On Create, the new object’s owner field is set to the username of the user performing the
operation.
Output-only:
cloud string IGN IGN Cloud vendor of Metastore home shard, e.g. “aws”,
“azure”
region string IGN IGN Cloud region of the Metastore home shard, e.g.
“us-west-2”, “westus”
global_metastore_id string IGN IGN Globally unique metastore ID across clouds and
regions. E.g.,
“aws:us-east-1:8dd1e334-c7df-44c9-a359-f86f9aae8919”
updated_by string IGN IGN Username of user who last modified metastore
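The global_metastore_id example suggests a cloud:region:uuid layout. Assuming that layout holds in general (the specification shows only the single example), a client could split the identifier as follows; the GlobalMetastoreId type and function name are hypothetical.

```python
import uuid
from typing import NamedTuple


class GlobalMetastoreId(NamedTuple):
    cloud: str
    region: str
    metastore_id: uuid.UUID


def parse_global_metastore_id(value: str) -> GlobalMetastoreId:
    """Split "<cloud>:<region>:<uuid>" into its parts.

    Assumption: the format is inferred from the one example in the table.
    Raises ValueError if the string has fewer than three parts or the last
    part is not a valid UUID.
    """
    cloud, region, raw_id = value.split(":", 2)
    return GlobalMetastoreId(cloud, region, uuid.UUID(raw_id))
```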
DeleteMetastoreOpts:
Field Name Type Req? Description
force bool OPT Default: false. When false, the deletion fails when the specified Metastore
is non-empty (contains non-deleted Catalogs, DataAccessConfigurations,
Shares or Recipients). When set to true, the specified Metastore is deleted
regardless of its contents.
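The effect of the force option can be sketched as a pre-deletion check. The exception type, function name and the shape of the child_counts argument are hypothetical; only the rule itself comes from the table above.

```python
from typing import Dict


class MetastoreNotEmptyError(Exception):
    """Hypothetical error raised when a non-empty Metastore is deleted without force."""


def check_delete_allowed(child_counts: Dict[str, int], force: bool = False) -> None:
    """Raise unless the Metastore is empty or force is set.

    child_counts maps a child object type (e.g. "catalogs", "shares") to the
    number of non-deleted objects of that type.
    """
    remaining = {kind: n for kind, n in child_counts.items() if n > 0}
    if remaining and not force:
        raise MetastoreNotEmptyError(f"metastore still contains: {remaining}")
```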
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
All Metastore Admin CRUD API endpoints are restricted to Metastore Admins.
listMetastores Output
The listMetastores endpoint does not list all Metastores that exist in the customer
account. Instead it restricts the list by what the Workspace (as determined by the client’s
PAT token) can access. Effectively, this means that the output will either be an empty list (if
no Metastore is assigned to the Workspace) or a list containing a single Metastore (the one
assigned to the Workspace).
Object Model
MetastoreSummaryInfo
default_data_access_config_id [DEPRECATED] string (uuid) Unique identifier of the DAC for accessing table
data in cloud storage
RPC Endpoints
HTTP Method URI Endpoint Name Output
metastore_id string (uuid) REQ OPT REQ Unique identifier for metastore
default_catalog_name string REQ OPT IGN Default catalog used for this assignment
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
Inputs
The workspace_id path parameter is an int64 number, the unique identifier of the
workspace.
● updateMetastoreAssignment
This endpoint can be used to update metastore_id and/or
default_catalog_name for a specified workspace, if the workspace has already
been assigned a Metastore.
● There are no UC API endpoints for reading or listing Metastore Assignments (per
workspace) currently.
Authorization
role_arn string REQ OPT The Amazon Resource Name (ARN) of the AWS IAM
role for S3 data access
Output-only:
unity_catalog_iam_arn string IGN IGN The Amazon Resource Name (ARN) of the AWS IAM
user managed by Databricks. This is the identity that
is going to assume the AWS IAM role.
external_id string IGN IGN The external ID used in role assumption to prevent
confused deputy problems.
AzureServicePrincipal
directory_id string REQ OPT The directory ID corresponding to the Azure Active
Directory (AAD) tenant of the application
client_secret string REQ OPT The client secret generated for the above app ID in
AAD. This field is redacted on output.
GcpServiceAccountKey
private_key_id string REQ OPT The ID of the service account's private key
private_key string REQ OPT The service account's RSA private key. This field is
redacted on output.
DeleteStorageCredentialOpts:
Field Name Type Req? Description
force bool OPT Default: false. When false, the deletion fails when the specified Storage
Credential has dependent External Locations or external tables. When set to
true, the specified Storage Credential is deleted regardless of its
dependencies.
StorageCredentialInfo
2 On Create, the new object’s owner field is set to the username of the user performing the
operation.
Output-only:
metastore_id string (uuid) IGN IGN Unique identifier of the parent Metastore
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
The deleteStorageCredential endpoint requires that the user is an owner of the Storage
Credential.
force bool OPT Default: false. When false, the deletion fails when the specified External
Location has dependent external tables. When set to true, the specified
External Location is deleted regardless of its dependencies.
ListFilesReq
Field Name Type Req? Description
credential_name string OPT Name of Storage Credential to use for accessing the
URL
FileInfo
Field Name Type Description
ListFilesResp
ExternalLocationInfo
url string (url) REQ OPT Path URL in cloud storage, of the form:
AWS: "s3://bucket-host/[bucket-dir]"
Azure: "abfss://host/[path]"
GCP: "gs://bucket-host/[path]"
credential_name string REQ OPT Name of the Storage Credential to use with
this External Location
3 On Create, the new object’s owner field is set to the username of the user performing the
operation.
Output-only:
credential_id string (uuid) IGN IGN Unique identifier of the External Location
[DEPRECATED]
metastore_id string (uuid) IGN IGN Unique identifier of the parent Metastore
updated_by string IGN IGN Username of user who last updated External
Location
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
The storage url for an External Location must not conflict with other External Locations
or external Tables. Specifically,
● The External Location’s url cannot overlap with (be a child of, a parent of, or the
same as) the url of another External Location
● The External Location’s url cannot be within (a child of or the same as) the url of
an external Table
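The two conflict rules above can be approximated with a segment-wise path comparison. This is a sketch under stated assumptions rather than the server's validation logic: it assumes URLs are compared by scheme, host and path segments, ignores trailing slashes, and uses hypothetical function names.

```python
from typing import List, Tuple
from urllib.parse import urlparse


def _parts(url: str) -> Tuple[str, str, List[str]]:
    """Split a storage URL into (scheme, host, non-empty path segments)."""
    parsed = urlparse(url)
    return parsed.scheme, parsed.netloc, [s for s in parsed.path.split("/") if s]


def is_same_or_child(url: str, other: str) -> bool:
    """True if `url` equals `other` or lies beneath it."""
    scheme, host, segments = _parts(url)
    o_scheme, o_host, o_segments = _parts(other)
    # Comparing whole segments keeps "s3://b/data2" from matching "s3://b/data".
    return (scheme, host) == (o_scheme, o_host) and segments[: len(o_segments)] == o_segments


def conflicts_with_location(url: str, other_location_url: str) -> bool:
    """Rule 1: overlap in either direction with another External Location."""
    return is_same_or_child(url, other_location_url) or is_same_or_child(other_location_url, url)


def conflicts_with_external_table(url: str, table_url: str) -> bool:
    """Rule 2: the location may contain an external Table but not sit within one."""
    return is_same_or_child(url, table_url)
```

Note the asymmetry: an External Location that is a parent of an external Table's url is accepted by the second check, whereas any overlap with another External Location is rejected by the first.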
Authorization
provider_name string OPT IGN For Delta Sharing Catalogs: the name of
the delta sharing provider
share_name string OPT IGN For Delta Sharing Catalogs: the name of
the share under the share provider
Output-only:
metastore_id string (uuid) IGN IGN Unique identifier of the parent Metastore
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
4 On Create, the new object’s owner field is set to the username of the user performing the
operation.
The deleteCatalog endpoint requires that the user is an owner of the Catalog.
Output-only:
metastore_id string (uuid) IGN IGN Unique identifier of the parent Metastore
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
Inputs
q_args:
5 On Create, the new object’s owner field is set to the username of the user performing the
operation.
All *Schema endpoints require that the user have access to the parent Catalog. This
means the user either
1. is a Metastore admin
2. is the owner of the parent Catalog
3. has the USAGE privilege on the parent Catalog
All of the requirements below are in addition to this requirement of access to the parent
Catalog.
The deleteSchema endpoint requires that the user is an owner of the Schema or an owner
of the parent Catalog.
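The layering of these checks can be sketched as below. The example reads the text literally and simplifies heavily: group membership is ignored, and the Securable class with its grants mapping is a hypothetical stand-in for the server's data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Securable:
    """Hypothetical securable: an owner plus privileges granted per principal."""
    owner: str
    grants: Dict[str, Set[str]] = field(default_factory=dict)


def has_catalog_access(user: str, is_metastore_admin: bool, catalog: Securable) -> bool:
    """Baseline requirement shared by all *Schema endpoints."""
    return (
        is_metastore_admin
        or user == catalog.owner
        or "USAGE" in catalog.grants.get(user, set())
    )


def can_delete_schema(user: str, is_metastore_admin: bool,
                      catalog: Securable, schema: Securable) -> bool:
    """deleteSchema: catalog access AND ownership of the Schema or its parent Catalog."""
    return has_catalog_access(user, is_metastore_admin, catalog) and user in (
        schema.owner,
        catalog.owner,
    )
```

Under this reading, a Schema owner who has lost access to the parent Catalog cannot delete the Schema, because the ownership requirement applies in addition to the Catalog access requirement.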
Object Models
StagingTableInfo:
Output-only:
id string (uuid) IGN Unique identifier for staging table which would be
promoted to be actual table id
staging_location string IGN Storage root URL generated for the staging table
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
storage_location string REQ* REQ* URL of storage location for Table data
(* REQ for EXTERNAL Tables. For
Managed Tables, if the path is
provided it needs to be a Staging Table
path that has been generated through
the Staging Table API; otherwise it
should be empty)
storage_credential_name string OPT IGN For EXTERNAL Tables only: the name
of storage credential to use (may not
be changed via UpdateTable
endpoint).
view_definition string REQ for View OPT SQL text defining the view (for
table_type == "VIEW")
Output-only:
6 On Create, the new object’s owner field is set to the username of the user performing the
operation.
Table Type
The supported values of the table_type field (within a TableInfo) are the following
strings:
● "MANAGED"
● "EXTERNAL"
● "VIEW"
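Column Type Name
The supported values of the type_name field (within a ColumnInfo) include the following
strings:
● "INTERVAL"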
● "ARRAY"
● "STRUCT"
● "MAP"
● "CHAR"
● "NULL"
ColumnInfo
type_name string REQ OPT Name of (outer) type; see Column Type Name above
type_text string REQ OPT Column type spec (with metadata) as SQL text
type_json string REQ OPT Column type spec (with metadata) as JSON string
type_scale int32 OPT OPT Digits to right of decimal; applies to DECIMAL columns
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
Inputs
q_args:
schema_name string REQ Name of the parent schema relative to its parent
catalog
Both the catalog_name and schema_name arguments to the listTables endpoint are
required. To list Tables in multiple Schemas (within the same Catalog) in a paginated, “bulk”
fashion, see the listTableSummaries API below.
The createTable endpoint requires that the user meets all of the following requirements:
1. has ownership or the USAGE privilege on both the parent Catalog and Schema
(regardless of Metastore admin status)
2. has ownership or the CREATE privilege on the parent Schema
If the new table has table_type of “EXTERNAL” the user must either be a Metastore admin
or meet the permissions requirement of the Storage Credential and/or External Location
The getTable endpoint requires that the user either is a Metastore admin or meets all of
the following requirements:
1. has ownership or the USAGE privilege on both the parent Catalog and Schema
2. has ownership or the SELECT privilege on the requested Table
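The getTable requirements can be written as a single predicate. This is an illustrative sketch only: securables are modeled as plain dicts with hypothetical "owner" and "grants" keys, and group membership is ignored.

```python
from typing import Any, Dict


def owns_or_has(user: str, securable: Dict[str, Any], privilege: str) -> bool:
    """True if the user owns the securable or holds the given privilege on it."""
    return user == securable["owner"] or privilege in securable["grants"].get(user, ())


def can_get_table(user: str, is_metastore_admin: bool,
                  catalog: Dict[str, Any], schema: Dict[str, Any],
                  table: Dict[str, Any]) -> bool:
    """getTable: Metastore admin, or USAGE on both parents plus SELECT on the Table."""
    if is_metastore_admin:
        return True
    return (
        owns_or_has(user, catalog, "USAGE")
        and owns_or_has(user, schema, "USAGE")
        and owns_or_has(user, table, "SELECT")
    )
```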
In the case that the Table name is changed, updateTable also requires that the user have
the CREATE privilege on the parent Schema (even if the user is a Metastore admin).
In the case that the Table has table_type of “VIEW” and the owner field is being changed,
the updateTable endpoint requires that the user is a member of the new owner.
ListTableSummaries API
Object Models
TableSummariesReq
catalog_name string REQ Name of parent Catalog for Schemas and Tables of
interest
max_results int32 OPT Maximum number of tables to return (i.e., the page
length); defaults to 1000
page_token string OPT Opaque token to send for the next page of results
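A client would typically drive max_results and page_token in a loop. The sketch below assumes a hypothetical fetch_page callable that performs the HTTP request, and assumes the response carries its items under "tables" and the continuation token under "next_page_token"; neither response field name is given in this section.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_table_summaries(fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
                         catalog_name: str,
                         max_results: int = 1000) -> Iterator[Any]:
    """Yield table summaries page by page until no continuation token is returned.

    Assumption: "tables" and "next_page_token" are illustrative response field names.
    """
    page_token: Optional[str] = None
    while True:
        request: Dict[str, Any] = {"catalog_name": catalog_name, "max_results": max_results}
        if page_token:
            request["page_token"] = page_token
        response = fetch_page(request)
        yield from response.get("tables", [])
        page_token = response.get("next_page_token")
        if not page_token:
            return
```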
TableSummary
TableSummariesResp
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
1. TableSummary objects for all Tables (within the current Metastore and parent Catalog and
Schema), when the user is a Metastore admin
2. TableSummary objects for all Tables and Schemas (within the current Metastore and
parent Catalog) for which the user has ownership or the SELECT privilege on the
Table and ownership or USAGE privilege on the Schema, provided that the user also
has ownership or the USAGE privilege on the parent Catalog
Permissions API
Terminology and Permissions Management Model
The Databricks Permissions API manages the Permission Level (e.g., "CAN_USE",
"CAN_MANAGE"), a scalar value that users have for the various object types (Notebooks,
Jobs, Tokens, etc.). For the objects managed by Unity Catalog, principals (users or groups)
may have a collection of permissions that do not organize consistently into levels, as they
are independent abilities. For example, a given user may have the ability to MODIFY a
Schema but that ability does not imply the user’s ability to CREATE Tables within that
Schema, nor vice-versa.
Though the nomenclature may not be industry-standard, we define the following terms:
● Principal: a username (email address) or group name (including the special group
“account users”, to which all users belong)
The supported privilege values on Metastore SQL Objects (Catalogs, Schemas, Tables) are
the following strings:
● "USAGE"
● "SELECT"
● "MODIFY"
● "CREATE"
External Locations and Storage Credentials support the following privileges:
● "READ_FILES"
● "WRITE_FILES"
● "CREATE_TABLE"
Note there is no "ALL" privilege. It is the responsibility of the API client to translate the set
of all privileges to/from the "ALL" alias.
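Client-side translation of the "ALL" alias might look like the following sketch. The function names are hypothetical, and the default privilege set is the SQL object list above; a client would pass the External Location privilege set where appropriate.

```python
from typing import FrozenSet, Iterable, List

SQL_OBJECT_PRIVILEGES: FrozenSet[str] = frozenset({"USAGE", "SELECT", "MODIFY", "CREATE"})


def expand_all(privileges: Iterable[str],
               supported: FrozenSet[str] = SQL_OBJECT_PRIVILEGES) -> List[str]:
    """Replace the client-side "ALL" alias with the full supported set before calling the API."""
    expanded = set()
    for privilege in privileges:
        if privilege == "ALL":
            expanded |= supported
        else:
            expanded.add(privilege)
    return sorted(expanded)


def collapse_all(privileges: Iterable[str],
                 supported: FrozenSet[str] = SQL_OBJECT_PRIVILEGES) -> List[str]:
    """Present the full supported set as ["ALL"]; otherwise return the privileges sorted."""
    held = set(privileges)
    return ["ALL"] if held >= supported else sorted(held)
```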
Object Models
PrivilegeAssignment
The PrivilegeAssignment type maps a single principal to the privileges assigned to
that principal.
Field Name Type Description
PermissionsList
The PermissionsList message type is used to list all permissions on a given securable. It
maps each principal to their assigned privileges.
Field Name Type Description
An example PermissionsList:
{
  "privilege_assignments": [
    {
      "principal": "[email protected]",
      "privileges": ["SELECT"]
    },
    {
      "principal": "eng-data-security",
      "privileges": ["SELECT","MODIFY","CREATE"]
    },
    {
      "principal": "users",
      "privileges": ["USAGE"]
    }
  ]
}
PermissionsChange
The PermissionsChange type specifies the privileges to add to and/or remove from a
single principal.
Field Name Type Description
PermissionsDiff
The PermissionsDiff message type specifies a list of changes to make to a securable’s
permissions.
Field Name Type Description
An example PermissionsDiff:
{
  "changes": [
    {
      "principal": "[email protected]",
      "add": ["SELECT"],
      "remove": ["MODIFY"]
    },
    {
      "principal": "eng-data-security",
      "remove": ["CREATE"]
    },
    {
      "principal": "users",
      "add": ["USAGE"]
    }
  ]
}
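To make the relationship between the two message types concrete, the sketch below applies a PermissionsDiff to a PermissionsList in memory. It is an illustration, not the server's algorithm: the function name is hypothetical, and applying "add" before "remove" within one change and dropping principals left with no privileges are assumptions.

```python
from typing import Any, Dict, Set


def apply_diff(permissions_list: Dict[str, Any],
               permissions_diff: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new PermissionsList with the changes from a PermissionsDiff applied."""
    current: Dict[str, Set[str]] = {
        assignment["principal"]: set(assignment["privileges"])
        for assignment in permissions_list.get("privilege_assignments", [])
    }
    for change in permissions_diff.get("changes", []):
        privileges = current.setdefault(change["principal"], set())
        privileges |= set(change.get("add", []))      # assumption: adds are applied first
        privileges -= set(change.get("remove", []))   # then removals
    return {
        "privilege_assignments": [
            {"principal": principal, "privileges": sorted(privileges)}
            for principal, privileges in current.items()
            if privileges  # assumption: principals with no privileges are omitted
        ]
    }
```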
Changing Ownership
A special case of a permissions change is a change of ownership. This corresponds to the
SQL command “ALTER <securable_type> <securable_name> OWNER to
<principal>” and is subject to the restrictions described in the Governance Model.
Changing ownership is done by invoking the update<Securable> endpoint with input that
includes the “owner” field containing the username/groupname of the new owner. To be
clear, this ownership change does not involve calling the Permissions API.
In the near future, there may be an “OWN” privilege added to the privileges supported by
UC.
RPC Endpoints
The Unity Catalog Permissions API applies to multiple securable types, with the following
securable identifier (sec_full_name) fields:
Examples:
GET /api/2.0/unity-catalog/permissions/catalog/some_cat
PUT /api/2.0/unity-catalog/permissions/table/some_cat.other_schema.my_table
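The URI pattern in these examples can be generated with a small helper. The function name is hypothetical; percent-encoding the full name follows the UTF-8 name convention stated under API Conventions.

```python
from urllib.parse import quote


def permissions_uri(securable_type: str, sec_full_name: str) -> str:
    """Build the Permissions API URI for a securable type and its full name."""
    return (
        "/api/2.0/unity-catalog/permissions/"
        f"{quote(securable_type, safe='')}/{quote(sec_full_name, safe='')}"
    )
```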
Inputs
q_args:
principal string OPT Principal of interest (only return permissions for this
user/group)
● updatePermissions, replacePermissions:
○ If the client user is not the owner of the securable or a Metastore admin, the
endpoint will return a 403 with the error body:
{
  "error_code": "UNAUTHORIZED",
  "message": "Users can only grant or revoke schema and table permissions."
}
User-Info API
Object Models
GetMyInfoResp:
GetMyGroupsResp:
RPC Endpoints
HTTP Method URI Endpoint Name Input Output
Inputs
q_args:
Path Operations
The supported values for the operation field of the
GenerateTemporaryPathCredentialReq message are:
● "PATH_READ" – for read-only access to data in cloud storage path
● "PATH_READ_WRITE" – for read and write access to data in cloud storage path
● "PATH_CREATE_TABLE" – for table creation with cloud storage path
Object Models
AwsCredentials:
access_key_id string The access key ID that identifies the temporary credentials
secret_access_key string The secret access key that can be used to sign AWS API requests
session_token string The token that users must pass to AWS API to use the temporary
credentials
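As a hedged illustration of how a client might consume these fields, the helper below maps an AwsCredentials message (modeled as a dict) onto the keyword arguments accepted by boto3.Session. The helper name is hypothetical, and boto3 itself is not imported, so the sketch stays dependency-free.

```python
from typing import Dict


def to_boto3_session_kwargs(aws_credentials: Dict[str, str]) -> Dict[str, str]:
    """Map AwsCredentials fields to boto3.Session keyword arguments.

    Usage (requires boto3): boto3.Session(**to_boto3_session_kwargs(creds))
    """
    return {
        "aws_access_key_id": aws_credentials["access_key_id"],
        "aws_secret_access_key": aws_credentials["secret_access_key"],
        "aws_session_token": aws_credentials["session_token"],
    }
```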
AzureUserDelegationSAS:
sas_token string The signed URI (SAS Token) used to access blob services for a
given path
GcpOauthToken:
TemporaryCredentials:
GenerateTemporaryTableCredentialReq:
GenerateTemporaryPathCredentialReq:
credential_id string (uuid) Unique ID of the Storage Credential to use to obtain the
temporary credential
RPC Endpoints
HTTP Method URI Endpoint Name Input Output