Metadata Retrieval with Unity Catalog REST API - Quick Start (ext)
Introduction
This document is a quick start guide for ISV partners (enterprise catalog vendors or similar) who
want to use the Unity Catalog REST APIs to retrieve metadata from a Databricks environment. The
examples were created and tested with Postman.
Unity Catalog’s security model is based on standard ANSI SQL, allowing administrators to grant
permissions at the level of databases, tables, views, rows and columns in their existing data lake
using familiar syntax. Unity Catalog enforces these permissions in every programming language
and every part of the Databricks platform (notebooks, jobs, SQL endpoints, etc.). Moreover, the
same Unity Catalog metastore can be attached to multiple Databricks workspaces so that they
share table definitions and permissions. You can also use Unity Catalog alongside an existing
Apache Hive metastore, so you do not need to move all your metadata to it right away.
Unity Catalog also includes the ability to share data securely across organizations through Delta
Sharing, an open source standard for data exchange. Delta Sharing allows a data provider to share
data with recipients regardless of the platform they are using -- for example, the recipients can
query the shared data in pandas or Apache Spark, even if they are not using Databricks.
Administrators can configure Delta Sharing as described later in this document.
Within a metastore, Unity Catalog provides a 3-level namespace for organizing data: catalogs,
schemas (also called databases), and tables / views. Tables can either be created in a default
“managed” location for the metastore, or they can be “external” tables that refer to data already
present in a cloud storage system like S3.
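As a rough illustration of the three-level namespace, the sketch below models a fully qualified table name as catalog.schema.table. The TableName class and the example name "main.sales.orders" are hypothetical and not part of any Databricks library. The sketch also assumes plain identifiers that contain no dots and are not backtick-quoted.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """Hypothetical helper modelling a three-level Unity Catalog name."""

    catalog: str
    schema: str
    table: str

    @classmethod
    def parse(cls, full_name: str) -> "TableName":
        # Assumption: identifiers contain no dots and no backtick quoting.
        parts = full_name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
        return cls(*parts)

    @property
    def full_name(self) -> str:
        # Join the three levels back into the dotted form.
        return ".".join(self)


name = TableName.parse("main.sales.orders")  # illustrative name only
print(name.catalog, name.schema, name.table)
```

Keeping the three levels as separate fields makes it straightforward to pass them as the catalog_name and schema_name query parameters used by the REST calls in this guide.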
Administrators can grant permissions on objects in the metastore to users and groups in their
organization using SQL GRANT statements or the Unity Catalog REST API, allowing different users
to read, modify, or administer parts of the namespace. Groups can be synchronized from an
identity provider such as Active Directory. All access to data in Unity Catalog is audited and
attributed to the corresponding user.
For Databricks users who already use the Apache Hive metastore available in each workspace, or
an external Hive metastore, Unity Catalog is additive: the workspace’s Hive metastore becomes one
catalog within the 3-level namespace (called “hive_metastore”), and other catalogs use Unity
Catalog.
Finally, each metastore is configured with an AWS IAM role to access the underlying cloud
storage for its data. User code in Databricks never gets access to the raw credentials; Unity
Catalog gives user code only scoped access tokens for the parts of the data that each user is
allowed to access.
Instructions
Assumptions
● Postman is installed
● A metastore exists, with metastoreId = XXXXXX (supplied as a variable in the collection)
● You are familiar with the key concepts of Unity Catalog
Setup
1. Obtain a personal access token (PAT) from an existing workspace that has Unity
Catalog enabled. See
https://docs.databricks.com/dev-tools/api/latest/authentication.html for
instructions.
2. Add the PAT obtained in Step 1 as the value for ‘Token’ on the Authorization tab of the
Postman collection
○ Postman collection details (...) -> Edit -> Authorization tab
3. Ensure the environment variables are set appropriately (e.g. Workspace URL,
metastore_id)
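Outside Postman, the same setup can be sketched in Python using only the standard library. The example below builds an authenticated request without sending it. Databricks personal access tokens are passed as a bearer token in the Authorization header. The helper name build_request, the host example.cloud.databricks.com, and the token string are placeholders chosen for illustration.

```python
import urllib.request

# Path prefix shared by the Unity Catalog calls in this guide.
API_ROOT = "/api/2.0/unity-catalog"


def build_request(workspace_url: str, token: str, endpoint: str) -> urllib.request.Request:
    """Build a GET request for a Unity Catalog endpoint (hypothetical helper)."""
    # Accept the workspace URL with or without a scheme or trailing slash.
    host = workspace_url
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    url = f"https://{host}{API_ROOT}/{endpoint.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; substitute your own workspace URL and PAT.
req = build_request("example.cloud.databricks.com", "my-personal-access-token", "metastores")
print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON body of the response. Reading the token from an environment variable or secret store, rather than hard-coding it, is generally the safer choice.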
1. List UC Metastores
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/metastores
2. Get UC Metastore Summary Information
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/metastore_summary
3. List UC Catalogs
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/catalogs
4. List UC Schemas
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/schemas?catalog_name={{catalog name}}
5. List UC Tables
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/tables?catalog_name={{catalog name}}&schema_name={{schema name}}
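The catalog, schema, and table calls above can be chained to walk the namespace from catalogs down to tables. The sketch below builds the URLs listed in this guide and takes the HTTP call as an injected fetch_json function, so it can be run without a workspace. The response keys (catalogs, schemas, tables) and the name field are assumptions about the response shape and should be confirmed against the Unity Catalog API Specification. The helper names and the canned data are hypothetical.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode

API_ROOT = "/api/2.0/unity-catalog"


def endpoint_url(host: str, resource: str, params: Optional[Dict[str, str]] = None) -> str:
    """Build the URL for one Unity Catalog list call."""
    url = f"https://{host}{API_ROOT}/{resource}"
    if params:
        url += "?" + urlencode(params)
    return url


def iter_table_names(host: str, fetch_json: Callable[[str], dict]) -> Iterator[str]:
    """Yield catalog.schema.table for every table visible to the caller."""
    # Assumed response shape: {"catalogs": [{"name": ...}, ...]} and similar.
    catalogs = fetch_json(endpoint_url(host, "catalogs"))
    for catalog in catalogs.get("catalogs", []):
        c = catalog["name"]
        schemas = fetch_json(endpoint_url(host, "schemas", {"catalog_name": c}))
        for schema in schemas.get("schemas", []):
            s = schema["name"]
            tables = fetch_json(
                endpoint_url(host, "tables", {"catalog_name": c, "schema_name": s})
            )
            for table in tables.get("tables", []):
                yield f"{c}.{s}.{table['name']}"


def fake_fetch(url: str) -> dict:
    """Canned responses standing in for a real workspace (illustrative data)."""
    if url.endswith("/catalogs"):
        return {"catalogs": [{"name": "main"}]}
    if "/schemas?" in url:
        return {"schemas": [{"name": "sales"}]}
    return {"tables": [{"name": "orders"}, {"name": "customers"}]}


print(list(iter_table_names("example.cloud.databricks.com", fake_fetch)))
```

In real use, fetch_json would send each URL with the bearer token from the Setup steps and return the parsed JSON body. Injecting it keeps the traversal logic testable and independent of any particular HTTP client.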
References
The Unity Catalog API Specification is the key reference for building and testing metadata
retrieval REST API calls. Please ask your Databricks ISV Partner SA for access to this
document.