(+351) 21 24 10006  ·  info@bconcepts.pt
Carnaxide, Lisbon
ELT

Medallion Architecture in ELT: Bronze, Silver, Gold

João Barros 04 de July de 2026 4 min read

The Medallion architecture organizes an ELT pipeline into three layers — bronze, silver and gold — to turn raw data into analysis-ready information. It is a simple, widely used pattern in modern Lakehouses and data warehouses because it makes every stage clear, testable and easy to maintain. By separating responsibilities, you avoid the chaos of scripts that do everything at once and gain end-to-end traceability.

Prerequisites

  • Access to a data warehouse or Lakehouse (for example, Microsoft Fabric, Synapse, Databricks or any SQL database).
  • A source table or file with untreated data.
  • Basic SQL knowledge (SELECT, CREATE TABLE, GROUP BY).
  • Permissions to create tables in your schema.

Step 1: Understand the three layers

In the ELT pattern, you first load the data (Load) and only then transform it (Transform), directly inside the warehouse. The Medallion architecture splits that transformation into three levels of increasing quality:

Medallion Architecture in ELT: Bronze, Silver, Gold
  • Bronze — raw data, exactly as it arrives from the source, with no changes.
  • Silver — clean data: no duplicates, correct data types and business rules applied.
  • Gold — aggregated data, ready to be consumed in reports and dashboards.
Tip: think of bronze as the "warehouse", silver as the "assembly line" and gold as the "shop window".

Step 2: Create the Bronze layer

The bronze layer keeps a faithful copy of the source. Don't clean anything here: the goal is to have a history of what you received, useful for auditing and reprocessing. Keeping bronze intact lets you rerun the pipeline whenever you fix a rule, without asking the source for the data again.

-- Bronze layer: raw data, exactly as it arrives
CREATE TABLE bronze_vendas AS
SELECT * FROM origem_vendas;

If the data comes from files (CSV, Parquet), load them into this table first, with no transformations.

Step 3: Build the Silver layer

This is the heart of the work: cleaning and normalizing. You fix data types, remove duplicates and filter out invalid records so the data becomes reliable. The silver layer is also where you apply time zone conversions, name standardization and joins with reference tables.

-- Silver layer: clean and normalized
CREATE TABLE silver_vendas AS
SELECT
    CAST(id AS INT)              AS venda_id,
    CAST(data AS DATE)           AS data_venda,
    UPPER(TRIM(produto))         AS produto,
    CAST(total AS DECIMAL(10,2)) AS total
FROM bronze_vendas
WHERE id IS NOT NULL;

Every business rule (for example, ignoring sales with no identifier) is documented and centralized in this layer.

Step 4: Build the Gold layer

The gold layer answers business questions. You usually aggregate the data to the level your report or Power BI model needs — for example, sales by month and by product. You can create several gold tables, one per business area or per dashboard.

-- Gold layer: aggregated and ready for BI
CREATE TABLE gold_vendas_mensais AS
SELECT
    YEAR(data_venda)  AS ano,
    MONTH(data_venda) AS mes,
    produto,
    SUM(total)        AS total_vendas,
    COUNT(*)       AS numero_vendas
FROM silver_vendas
GROUP BY YEAR(data_venda), MONTH(data_venda), produto;

Date function syntax may vary slightly across engines; adapt it to your warehouse.

Verify the result

Confirm each layer holds the expected data by comparing row counts. Bronze should have as many rows as the source, and gold should be aggregated (fewer rows):

SELECT 'bronze' AS camada, COUNT(*) AS linhas FROM bronze_vendas
UNION ALL
SELECT 'silver', COUNT(*) FROM silver_vendas
UNION ALL
SELECT 'gold',   COUNT(*) FROM gold_vendas_mensais;

If silver has fewer rows than bronze, the cleaning is working. If gold returns totals consistent with the business, the pipeline is ready.

Conclusion

With the bronze, silver and gold layers you have an organized ELT pipeline that is easy to test and to explain to anyone on the team. The next step is to automate the refresh of each layer — for example, with a scheduled pipeline in Fabric or Data Factory — and add quality tests to silver. Which of the three layers do you think your team most needs to improve?

Share: