What a Lakehouse is and how it combines a data lake and a warehouse
The term Lakehouse describes an architecture that joins the flexibility of a data lake with the reliability of a data warehouse. Instead of choosing between the two, you get the benefits of both in one place.
Prerequisites
- An understanding of a data lake (files in storage) and a data warehouse (SQL tables).
- Familiarity with file formats such as Parquet.
- A platform with Delta tables (Databricks, Microsoft Fabric or Spark with Delta Lake).
Step 1: Understand the problem it solves
A data lake stores everything cheaply but offers no guarantees: there are no transactions and no enforced schema. A warehouse is reliable but rigid and expensive. The Lakehouse adds a table layer (Delta Lake) on top of the files, which brings transactions and schema enforcement to the lake.
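To make the idea of a table layer concrete, here is a minimal toy sketch in plain Python. It is not Delta Lake's implementation. It only assumes the general pattern of data files plus a commit log. The `_log` directory name, the `SCHEMA` dictionary, and the `commit` and `read` helpers are hypothetical names invented for this illustration.

```python
import json
import os
import tempfile

# Hypothetical schema for a sales table; column names mirror the queries in this article.
SCHEMA = {"categoria": str, "valor": float}


def validate(rows, schema):
    """Reject rows whose columns or types do not match the declared schema."""
    for row in rows:
        if set(row) != set(schema):
            raise ValueError(f"unexpected columns: {sorted(row)}")
        for column, expected in schema.items():
            if not isinstance(row[column], expected):
                raise ValueError(f"{column} must be {expected.__name__}")


def commit(table_dir, rows, schema=SCHEMA):
    """Write a data file, then publish it by creating the next log entry."""
    validate(rows, schema)  # schema enforcement happens before anything is written
    log_dir = os.path.join(table_dir, "_log")
    os.makedirs(log_dir, exist_ok=True)
    version = len(os.listdir(log_dir))  # next version number (toy assumption)
    data_name = f"part-{version:05d}.json"
    with open(os.path.join(table_dir, data_name), "w") as f:
        json.dump(rows, f)
    # Mode "x" fails if the entry already exists, so two writers cannot
    # both claim the same version number.
    with open(os.path.join(log_dir, f"{version:05d}.json"), "x") as f:
        json.dump({"add": data_name}, f)
    return version


def read(table_dir):
    """Return only the rows from data files that the log has published."""
    log_dir = os.path.join(table_dir, "_log")
    rows = []
    for entry in sorted(os.listdir(log_dir)):
        with open(os.path.join(log_dir, entry)) as f:
            data_name = json.load(f)["add"]
        with open(os.path.join(table_dir, data_name)) as f:
            rows.extend(json.load(f))
    return rows
```

In this sketch, readers only see a data file once its log entry exists, and a row that violates the schema is rejected before any file is written. That is the kind of guarantee a bare folder of files does not give you.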

Step 2: Store data in table format
Instead of a loose CSV, write in Delta:
# df is an existing Spark DataFrame, for example one loaded from the CSV
df.write.format("delta").save("/dados/vendas")
Step 3: Query with SQL
SELECT categoria, SUM(valor) AS total
FROM delta.`/dados/vendas`
GROUP BY categoria;
Step 4: Take advantage of transactions and history
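Delta tables support ACID transactions and time travel, which means you can query an earlier version of the data. Versions are numbered from 0, so the query below assumes the table has at least four commits: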
SELECT categoria, valor
FROM delta.`/dados/vendas` VERSION AS OF 3;
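One way to picture time travel is as a replay of the commit log up to the requested version. The following is a simplified toy model in plain Python, not Delta Lake's actual code. The `LOG` contents, the file names, and the `files_as_of` helper are all made up for illustration.

```python
# Toy commit log: each entry lists the files added and removed by one commit.
# File names are invented for this illustration.
LOG = [
    {"add": ["a.parquet"], "remove": []},             # version 0
    {"add": ["b.parquet"], "remove": []},             # version 1
    {"add": ["c.parquet"], "remove": ["a.parquet"]},  # version 2 rewrites a
]


def files_as_of(log, version):
    """Replay commits 0..version and return the set of live data files."""
    if not 0 <= version < len(log):
        raise ValueError(f"version {version} does not exist")
    live = set()
    for entry in log[: version + 1]:
        live |= set(entry["add"])      # files this commit introduced
        live -= set(entry["remove"])   # files this commit retired
    return live
```

Under this model, reading version 1 means reading `a.parquet` and `b.parquet`, while the latest version reads `b.parquet` and `c.parquet`. Old files remain available until they are cleaned up, which is why an earlier version stays queryable.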
Verify the result
Describe the table history and confirm you can see the versions:
DESCRIBE HISTORY delta.`/dados/vendas`;
Conclusion
With the Lakehouse you no longer need to copy data between a lake and a warehouse: the same table layer serves engineering, analytics and BI. Which part of your architecture would you simplify with a Lakehouse?