Metadata Retrieval with Unity Catalog REST API - Quick Start (ext)

This document serves as a quick start guide for ISV partners to utilize Unity Catalog REST APIs for metadata retrieval in Databricks. It outlines key concepts of Unity Catalog, including its security model, metastore organization, and the use of Postman for testing API calls. The document also provides instructions for setting up and retrieving metadata using various REST API endpoints.

Uploaded by

sureshpola
Copyright
© All Rights Reserved

Metadata Retrieval with Unity Catalog REST API - Quick Start

Note: this document is shared externally with Databricks partner contacts.

Introduction

What is Unity Catalog?

Key Concepts of Unity Catalog

Using Unity Catalog REST API


Instructions
Assumptions
Setup
Retrieve Metadata using REST APIs

Postman Collection Example

References

Introduction
This document is a quick start guide for ISV partners (enterprise catalog vendors or similar)
who want to use the Unity Catalog REST APIs to retrieve metadata from a Databricks environment.
Postman is used to test the Unity Catalog APIs and to build the examples.

What is Unity Catalog?


Unity Catalog is a central hub for administering and securing an organization’s data. It enables
fine-grained security and auditing across the Databricks platform. This document gives an
overview of Unity Catalog and its security model. To learn more about Unity Catalog, please
refer to the user guide.

Unity Catalog’s security model is based on standard ANSI SQL, allowing administrators to grant
permissions at the level of databases, tables, views, rows and columns in their existing data lake
using familiar syntax. Unity Catalog enforces these permissions in every programming language
and every part of the Databricks platform (notebooks, jobs, SQL endpoints, etc.). Moreover, the
same Unity Catalog metastore can be attached to multiple Databricks workspaces so that they
share table definitions and permissions. You can also use Unity Catalog alongside an existing
Apache Hive metastore, so you do not need to move all your metadata to it right away.

Finally, Unity Catalog includes the ability to share data securely across organizations through Delta
Sharing, an open source standard for data exchange. Delta Sharing allows a data provider to share
data with recipients regardless of the platform they are using -- for example, the recipients can
query the shared data in pandas or Apache Spark, even if they are not using Databricks.
Administrators can configure Delta Sharing as described later in this document.

Key Concepts of Unity Catalog


The top-level container for data in Unity Catalog is a metastore. A metastore typically organizes all
the data in an enterprise, across many departments. A Databricks account can have multiple
metastores, each representing a fully isolated data environment (e.g., development vs. production).
Account administrators can assign each metastore to one or more Databricks workspaces,
allowing those workspaces to access its data.

Within a metastore, Unity Catalog provides a 3-level namespace for organizing data: catalogs,
schemas (also called databases), and tables / views. Tables can either be created in a default
“managed” location for the metastore, or they can be “external” tables that refer to data already
present in a cloud storage system like S3.

Administrators can grant permissions on objects in the metastore to users and groups in their
organization using SQL GRANT statements or the Unity Catalog REST API, allowing different users
to read, modify, or administer parts of the namespace. Groups can be synchronized from an
identity provider such as Active Directory. All access to data in Unity Catalog is audited and
attributed to the corresponding user.

For Databricks users who already use the Apache Hive metastore available in each workspace, or
an external Hive metastore, Unity Catalog is additive: the workspace’s Hive metastore becomes one
catalog within the 3-level namespace (called “hive_metastore”), and other catalogs use Unity
Catalog.
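
The 3-level namespace described above can be illustrated with a small Python sketch. It is not part of any Databricks library: the class name `TableName`, its methods, and the example names `main.sales.orders` are hypothetical and exist only for illustration. The sketch also assumes simple identifiers without dots or backtick quoting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableName:
    """A fully qualified table or view name: catalog.schema.table."""

    catalog: str
    schema: str
    table: str

    @classmethod
    def parse(cls, full_name: str) -> "TableName":
        # Assumption: identifiers contain no dots and are not backtick-quoted.
        parts = full_name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
        return cls(*parts)

    def __str__(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.table}"

    @property
    def is_legacy_hive(self) -> bool:
        # The workspace Hive metastore appears as the catalog "hive_metastore".
        return self.catalog == "hive_metastore"
```

For example, `TableName.parse("hive_metastore.default.events").is_legacy_hive` is true, while a table in any other catalog (such as the hypothetical `main.sales.orders`) is governed by Unity Catalog.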

Finally, each metastore is configured with an AWS IAM role to access the underlying cloud
storage for its data. User code in Databricks never gets access to the raw access credentials --
Unity Catalog gives user code only scoped access tokens for the parts of the data that each user is
allowed to access.

Using Unity Catalog REST API

Instructions

Assumptions
● Postman installed
● Metastore exists, with metastoreId = XXXXXX (supplied as variable in collection)
● Familiar with the key concepts of Unity Catalog

Setup
1. Obtain a personal access token (PAT) from an existing workspace that has Unity
Catalog enabled. See
https://docs.databricks.com/dev-tools/api/latest/authentication.html for
instructions.
2. Add the PAT obtained in Step 1 as the value for ‘Token’ in the Postman collection
Authorization tab
○ Postman collection details (...) -> Edit -> navigate to Authorization tab

3. Ensure the environment variables are set appropriately (e.g. Workspace URL,
metastore_id)
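
Outside Postman, the same setup can be approximated in a script. The following is a minimal Python sketch, assuming the PAT is sent as a standard `Authorization: Bearer` header. The function name `build_request` and the host `ws.example.com` are hypothetical and used only for illustration.

```python
import urllib.request


def build_request(workspace_url: str, token: str, path: str) -> urllib.request.Request:
    """Build an authenticated GET request for a workspace REST API path.

    workspace_url may be given with or without the https:// scheme.
    """
    base = workspace_url.rstrip("/")
    if not base.startswith("https://"):
        base = "https://" + base
    return urllib.request.Request(
        base + path,
        # The PAT from Step 1 is passed as a bearer token.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

A request built this way, for example `build_request("ws.example.com", token, "/api/2.0/unity-catalog/metastores")`, can be passed to `urllib.request.urlopen` to check that the token and workspace URL are set correctly.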

Retrieve Metadata using REST APIs

1. List UC Metastores
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/metastores
2. Get UC Metastore Summary Information
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/metastore_summary
3. List UC Catalogs
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/catalogs
4. List UC Schemas
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/schemas?catalog_name={{catalog name}}
5. List UC Tables
○ GET https://{{WorkspaceURL}}/api/2.0/unity-catalog/tables?catalog_name={{catalog name}}&schema_name={{schema name}}
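
The list calls in steps 3 to 5 can be chained to walk the whole namespace. The Python sketch below is one way to do this under stated assumptions: it assumes each list response is a JSON object whose `catalogs`, `schemas`, or `tables` key holds an array of objects with a `name` field (please confirm against the Unity Catalog API Specification), and it does not handle pagination or errors. The helper names `endpoint_url`, `http_get_json`, and `iter_tables` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

API_PREFIX = "/api/2.0/unity-catalog"


def endpoint_url(workspace_url: str, resource: str, **params: str) -> str:
    """Build a Unity Catalog URL such as .../tables?catalog_name=...&schema_name=..."""
    url = f"https://{workspace_url}{API_PREFIX}/{resource}"
    query = urllib.parse.urlencode(params)
    return f"{url}?{query}" if query else url


def http_get_json(url: str, token: str) -> dict:
    """Send an authenticated GET and decode the JSON response body."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_tables(
    workspace_url: str,
    token: str,
    get_json: Callable[[str, str], dict] = http_get_json,
) -> Iterator[str]:
    """Yield catalog.schema.table names by walking catalogs, schemas, and tables."""
    # Assumed response shape: {"catalogs": [{"name": ...}, ...]} and similar.
    catalogs = get_json(endpoint_url(workspace_url, "catalogs"), token)
    for cat in catalogs.get("catalogs", []):
        schemas = get_json(
            endpoint_url(workspace_url, "schemas", catalog_name=cat["name"]), token
        )
        for sch in schemas.get("schemas", []):
            tables = get_json(
                endpoint_url(
                    workspace_url,
                    "tables",
                    catalog_name=cat["name"],
                    schema_name=sch["name"],
                ),
                token,
            )
            for tbl in tables.get("tables", []):
                yield f'{cat["name"]}.{sch["name"]}.{tbl["name"]}'
```

The `get_json` parameter is injectable so that the traversal can be exercised against canned responses before it is pointed at a real workspace.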

Postman Collection Example


● UC Metadata Retrieve Postman Collection and UC Test Environment (request access
from your Databricks ISV Partner SA)

References
The Unity Catalog API Specification is the primary reference for building and testing all metadata
retrieval REST API calls. Please ask your Databricks ISV Partner SA for access to this
document.
