Unity Catalog in Databricks: unified governance for data and AI

João Barros · 1 January 2025 · 2 min read

Unity Catalog is the Databricks data governance solution that unifies access control, lineage and auditing in a single metadata plane shared across all workspaces attached to the same metastore. It replaces the local, per-workspace Hive metastore with a centralized, multi-workspace catalog.

Object hierarchy

Metastore (1 per region)
  └─ Catalog (e.g. prod, dev, raw)
       └─ Schema / Database
            └─ Table / View / Volume / Function / Model
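
In practice, this hierarchy shows up as a three-level name, catalog.schema.object. The sketch below assumes the prod.sales.fact_orders table created in the next section, and shows both the fully qualified form and session defaults that allow shorter names.

```sql
-- Fully qualified three-level name: catalog.schema.table
SELECT COUNT(*) FROM prod.sales.fact_orders;

-- Alternatively, set session defaults and use shorter names
USE CATALOG prod;
USE SCHEMA sales;

-- Confirm which catalog and schema unqualified names resolve against
SELECT current_catalog(), current_schema();

-- Resolves to prod.sales.fact_orders because of the defaults set above
SELECT COUNT(*) FROM fact_orders;
```

Fully qualified names are usually the safer choice in shared notebooks and jobs, since they do not depend on the session state.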

Create a basic structure

-- SQL in Databricks
CREATE CATALOG IF NOT EXISTS prod;
CREATE SCHEMA IF NOT EXISTS prod.sales;
CREATE TABLE prod.sales.fact_orders
USING DELTA AS SELECT * FROM hive_metastore.legacy.orders;
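
After copying a table out of the legacy metastore, a quick sanity check is worthwhile. The following is a minimal sketch, assuming the statements above ran successfully and that you can still read hive_metastore.legacy.orders.

```sql
-- Confirm the new table is registered under the prod.sales schema
SHOW TABLES IN prod.sales;

-- Inspect columns, owner, storage location and table format
DESCRIBE TABLE EXTENDED prod.sales.fact_orders;

-- Compare row counts between the legacy source and the migrated copy
SELECT
  (SELECT COUNT(*) FROM hive_metastore.legacy.orders) AS legacy_rows,
  (SELECT COUNT(*) FROM prod.sales.fact_orders)       AS migrated_rows;
```

Matching counts do not prove the data is identical, but a mismatch is a cheap early signal that the source changed while the copy was running.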

Granular access control

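-- Grant read access to a group
-- (the group also needs USE CATALOG and USE SCHEMA on the parent objects)
GRANT USE CATALOG ON CATALOG prod TO `analysts`;
GRANT USE SCHEMA ON SCHEMA prod.sales TO `analysts`;
GRANT SELECT ON TABLE prod.sales.fact_orders TO `analysts`;

-- Access to a full schema (SELECT is inherited by every table and view in it)
GRANT USE CATALOG ON CATALOG prod TO `data_team`;
GRANT USE SCHEMA, SELECT ON SCHEMA prod.sales TO `data_team`;

-- Mask a sensitive column
-- mask_pii must already exist as a SQL UDF (assumed here to live in prod.sales).
-- USING COLUMNS accepts only other columns or literals, so identity checks
-- such as current_user() belong inside the function body.
ALTER TABLE prod.sales.customers
  ALTER COLUMN tax_id SET MASK prod.sales.mask_pii;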
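
The column mask above relies on a mask_pii function that has to be created before the ALTER TABLE statement runs. A minimal sketch of what it could look like follows. The group name pii_readers, the redaction string, the prod.sales location and the assumption that tax_id is a STRING column are all illustrative, not taken from the original setup.

```sql
-- Hypothetical mask function: members of the (assumed) account group
-- `pii_readers` see the real value, everyone else sees a redacted string.
CREATE OR REPLACE FUNCTION prod.sales.mask_pii(tax_id STRING)
  RETURNS STRING
  RETURN CASE
    WHEN is_account_group_member('pii_readers') THEN tax_id
    ELSE '*********'
  END;

-- Once the mask is attached to the column, the same query returns
-- different values depending on who runs it
SELECT tax_id FROM prod.sales.customers LIMIT 5;

-- Review the grants that are in effect on a table
SHOW GRANTS ON TABLE prod.sales.fact_orders;
```

Putting the identity check inside the function keeps the policy in one place: changing who can see the column means changing group membership, not the table definition.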

Automatic lineage

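Unity Catalog automatically captures lineage between tables, down to column level, for queries run on Databricks, including SQL and Delta Live Tables pipelines. View it in Catalog Explorer (formerly Data Explorer): open the table and select the Lineage tab to see the lineage graph.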
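
Lineage can also be queried with SQL through the lineage system tables. The sketch below assumes that system tables are enabled for your account and that you have SELECT access to the system.access schema. The target table name is the example table used earlier in this article.

```sql
-- Which tables fed prod.sales.fact_orders, and through what kind of workload?
SELECT
  source_table_full_name,
  target_table_full_name,
  entity_type,   -- kind of workload that produced the event (notebook, job, pipeline, ...)
  created_by,    -- user or service principal that ran it
  event_time
FROM system.access.table_lineage
WHERE target_table_full_name = 'prod.sales.fact_orders'
ORDER BY event_time DESC
LIMIT 20;
```

An empty result only means that no lineage events have been recorded for that table so far.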

External Locations and Volumes

-- Register external storage
CREATE EXTERNAL LOCATION my_adls
URL 'abfss://container@account.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL my_credential);

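-- External volume for access to non-tabular files
-- (schema prod.raw must already exist; the path must fall under a registered external location)
CREATE EXTERNAL VOLUME prod.raw.incoming_files
LOCATION 'abfss://container@account.dfs.core.windows.net/incoming/';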
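
Once the volume exists, files are addressed through a /Volumes/<catalog>/<schema>/<volume>/ path instead of the raw cloud URL. The following sketch assumes the volume above was created and that it holds CSV files with a header row. The file format and the data_team group are assumptions made for illustration.

```sql
-- Allow a group to read files in the volume
-- (the group also needs USE CATALOG on prod and USE SCHEMA on prod.raw)
GRANT READ VOLUME ON VOLUME prod.raw.incoming_files TO `data_team`;

-- List the files currently stored under the volume path
LIST '/Volumes/prod/raw/incoming_files/';

-- Read the files as a table; format and header are assumptions about the data
SELECT *
FROM read_files(
  '/Volumes/prod/raw/incoming_files/',
  format => 'csv',
  header => true
)
LIMIT 10;
```

Because access goes through the volume, permissions are managed with GRANT statements in Unity Catalog rather than with storage account keys handed to each user.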

Conclusion

Unity Catalog gives Databricks the centralized governance that enterprise deployments need. With a single catalog for the whole organization, it eliminates permission silos between workspaces and lets data teams see who accesses what and where data comes from.
