(+351) 21 24 10006  ·  info@bconcepts.pt
Carnaxide, Lisbon
Modelação de Dados (Kimball)
Modelação de Dados (Kimball) 2 min

Slowly Changing Dimensions Type 2: preserving history in the data warehouse

João Barros 20 de March de 2025 2 min read

Slowly Changing Dimensions (SCD) solve a fundamental problem in data warehousing: what to do when a dimension attribute changes? The answer depends on the business need to preserve historical state.

The 3 main types

TYPE 1 — Overwrite (no history):
  Before: Customer X — Segment: SMB
  After the change: Customer X — Segment: Enterprise
  → The past always looks Enterprise. Good for fixing errors.

TYPE 2 — New record (full history):
  Record 1: Customer X, Segment SMB,        valid_from: 2020-01-01, valid_to: 2023-12-31, is_current: 0
  Record 2: Customer X, Segment Enterprise, valid_from: 2024-01-01, valid_to: 9999-12-31, is_current: 1
  → Historical analysis shows SMB on sales until 2023. The most used.

TYPE 3 — Additional column (only latest + previous):
  Customer X — Current_Segment: Enterprise, Previous_Segment: SMB
  → Preserves only 1 change. Rare but useful for reorganizations.

Implement SCD Type 2 with MERGE

-- ETL process: apply source changes to the dimension
MERGE dbo.DimCustomer AS tgt
USING (SELECT nk_customer, name, segment, country FROM stg.Customers) AS src
    ON tgt.nk_customer = src.nk_customer AND tgt.is_current = 1

-- Type 1: update attributes that keep no history (name - error fix)
WHEN MATCHED AND tgt.name <> src.name
    THEN UPDATE SET tgt.name = src.name

-- Type 2: close the current record and insert a new one
WHEN MATCHED AND tgt.segment <> src.segment
    THEN UPDATE SET tgt.valid_to = CAST(GETDATE() AS DATE), tgt.is_current = 0;

-- Insert new records: new customers + new version of SCD2
INSERT INTO dbo.DimCustomer (nk_customer, name, segment, country, valid_from, valid_to, is_current)
SELECT src.nk_customer, src.name, src.segment, src.country,
       CAST(GETDATE() AS DATE), '9999-12-31', 1
FROM stg.Customers src
LEFT JOIN dbo.DimCustomer tgt ON src.nk_customer = tgt.nk_customer AND tgt.is_current = 1
WHERE tgt.sk_customer IS NULL
   OR (tgt.sk_customer IS NOT NULL AND tgt.segment <> src.segment);

Correct join in reports

-- For correct historical analysis, use the surrogate key from the transaction date
SELECT
    s.revenue,
    c.segment,  -- the customer's segment AT THE TIME of the sale
    d.year
FROM dbo.FactSales    s
JOIN dbo.DimCustomer  c ON s.sk_customer = c.sk_customer  -- SK preserves historical context
JOIN dbo.DimDate      d ON s.sk_date     = d.sk_date

Conclusion

SCD Type 2 is the backbone of correct historical analysis. Without it, historical reports show customers with their current segment, not what they had when they bought. The surrogate key is the mechanism that preserves context — never join on the natural key in historical facts.

Share: