(+351) 21 24 10006  ·  info@bconcepts.pt
Carnaxide, Lisbon
Modelação de Dados (Kimball)
Modelação de Dados (Kimball) 2 min

Kimball dimensional modeling: a practical guide to facts and dimensions

João Barros 25 de October de 2024 2 min read

Ralph Kimball's dimensional methodology has been the reference standard for data warehousing for over 30 years. Its principles — facts, dimensions and Bus Architecture — apply equally well in traditional SQL Server, Azure Synapse and Microsoft Fabric.

Facts vs Dimensions

FACT TABLE:
  - Contains numeric measures (revenue, quantity, cost)
  - Many rows, few columns
  - Foreign keys to all dimensions
  - Defined granularity: 1 row = 1 sales transaction

DIMENSION TABLE:
  - Contains descriptive attributes (name, category, region)
  - Few rows, many columns
  - Surrogate key (INT) as PK — never the natural key
  - Denormalized (Kimball dislikes snow-flake)

Define granularity — the most important decision

Question: "What does one row of the fact table represent?"

Wrong granularity: "sales data"
Correct granularity: "one row per sales line item on each invoice"

Impact: if granularity is "per invoice" and the user wants
analysis by product, it is impossible. Always pick the most
atomic granularity available in the source.

Typical structure — star schema

-- Customer dimension (denormalized)
CREATE TABLE dbo.DimCustomer (
    sk_customer     INT IDENTITY PRIMARY KEY,   -- surrogate key
    nk_customer     INT NOT NULL,               -- natural key from source
    name            NVARCHAR(200),
    segment         VARCHAR(50),
    country         VARCHAR(100),
    region          VARCHAR(100),               -- denormalized (do not normalize into DimCountry)
    valid_from      DATE NOT NULL,
    valid_to        DATE NOT NULL,
    is_current      BIT NOT NULL DEFAULT 1
);

-- Fact table
CREATE TABLE dbo.FactSales (
    sk_date         INT NOT NULL REFERENCES dbo.DimDate(sk_date),
    sk_customer     INT NOT NULL REFERENCES dbo.DimCustomer(sk_customer),
    sk_product      INT NOT NULL REFERENCES dbo.DimProduct(sk_product),
    sk_store        INT NOT NULL REFERENCES dbo.DimStore(sk_store),
    revenue         DECIMAL(18,2) NOT NULL,
    cost            DECIMAL(18,2) NOT NULL,
    quantity        INT NOT NULL
);

Date table — mandatory

-- Generate a date table with all required attributes
CREATE TABLE dbo.DimDate (
    sk_date         INT PRIMARY KEY,    -- YYYYMMDD format as INT
    full_date       DATE NOT NULL,
    year            SMALLINT,
    quarter         TINYINT,
    month           TINYINT,
    month_name      VARCHAR(20),
    week            TINYINT,
    day_of_week     TINYINT,
    day_name        VARCHAR(20),
    is_weekend      BIT,
    is_holiday      BIT
);

Conclusion

Kimball works because it aligns the model with the way business users think about data. Each dimension is an analysis perspective, each fact is a business event. This alignment makes reports more intuitive and queries simpler to write and optimize.

Share: