Indexes in SQL Server: columnstore vs rowstore for analytics and OLTP
João Barros
16 de August de 2024
2 min read
Indexes are the main performance lever in SQL Server. The choice between rowstore (traditional B-tree) and columnstore can mean the difference between a 30-second query and 300 milliseconds.
Rowstore vs Columnstore
Rowstore (B-tree):
- Stores data row by row
- Ideal for OLTP: individual inserts/updates, PK lookups
- Poor for aggregations on large tables
Columnstore:
- Stores data column by column (typical 10x compression)
- Ideal for OLAP/DW: aggregations over millions of rows
- Batch mode processing — processes 900 rows at a time
Create a Clustered Columnstore Index
-- Fact table in the DW — columnstore is the right choice
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;
-- Check the compression achieved
SELECT
i.name,
p.rows,
SUM(a.total_pages) * 8 / 1024.0 AS size_mb,
SUM(a.data_pages) * 8 / 1024.0 AS data_mb
FROM sys.indexes i
JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.allocation_units a ON p.partition_id = a.container_id
WHERE i.object_id = OBJECT_ID('dbo.FactSales')
GROUP BY i.name, p.rows;
Nonclustered Columnstore for OLTP
-- Add columnstore to an OLTP table without removing the clustered rowstore
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Analytics
ON dbo.Sales (sale_date, product_id, amount, quantity)
WHERE sale_date >= '2023-01-01'; -- filtered index
Filtered indexes
-- Active rows only — smaller and faster index
CREATE NONCLUSTERED INDEX IX_Customers_Active
ON dbo.Customers (name, email)
WHERE status = 'Active';
Identify missing indexes
SELECT
migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) AS improvement_measure,
'CREATE INDEX IX_' + OBJECT_NAME(mid.object_id) + '_' + REPLACE(mid.equality_columns, ', ', '_') AS suggested_index,
mid.equality_columns, mid.inequality_columns, mid.included_columns
FROM sys.dm_db_missing_index_group_stats migs
JOIN sys.dm_db_missing_index_groups mig ON migs.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle
ORDER BY improvement_measure DESC;
Conclusion
Use clustered columnstore on all data warehouse fact tables. For pure OLTP, use rowstore with carefully chosen indexes. sys.dm_db_missing_index_* is your best ally for finding quick optimization opportunities.