MLflow in Databricks: traceability of Machine Learning experiments
João Barros
13 March 2026
MLflow is an open-source MLOps platform, integrated into Databricks, that addresses one of the biggest problems in Machine Learning: tracking what was tested, comparing results, and reproducing the best model in production.
MLflow structure
- Experiment — groups related runs (e.g. "Churn_Model_v2").
- Run — a training execution with its parameters and metrics.
- Artifact — generated files (serialized model, charts, datasets).
- Model Registry — versioning and model promotion (Staging → Production).
Log an experiment
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# X_train, y_train, X_test and y_test are assumed to be defined earlier in the notebook
mlflow.set_experiment("/Experiments/Churn_Prediction")

with mlflow.start_run(run_name="RF_100trees"):
    params = {"n_estimators": 100, "max_depth": 8}
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)

    # Record the hyperparameters, the evaluation metric and the serialized model
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "random_forest_model")
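Once several runs are logged under the same experiment, they can be compared programmatically as well as in the UI. The sketch below is one possible way to pick the best run by a metric, using `mlflow.search_runs`. The helper names `order_clause` and `find_best_run` are hypothetical, and the code assumes the metric was logged under a simple name such as "accuracy".

```python
def order_clause(metric: str, higher_is_better: bool = True) -> str:
    """Build an order_by clause for mlflow.search_runs, e.g. 'metrics.accuracy DESC'."""
    direction = "DESC" if higher_is_better else "ASC"
    return f"metrics.{metric} {direction}"


def find_best_run(experiment_name: str, metric: str = "accuracy",
                  higher_is_better: bool = True):
    """Return the top-ranked run as a pandas Series, or None if no runs exist."""
    # Imported lazily so order_clause can be used without MLflow installed
    import mlflow

    runs = mlflow.search_runs(
        experiment_names=[experiment_name],
        order_by=[order_clause(metric, higher_is_better)],
        max_results=1,
    )
    return None if runs.empty else runs.iloc[0]
```

A call such as `find_best_run("/Experiments/Churn_Prediction")` returns a row whose `run_id` field can then be used to build a `runs:/` URI for registration.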
Autologging
# MLflow automatically logs sklearn parameters and metrics
mlflow.sklearn.autolog()
model.fit(X_train, y_train)  # parameters, training metrics and the fitted model are logged automatically
Model Registry and deployment
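# Register the best model (run_id is the ID of the run that logged it)
registered = mlflow.register_model(
    model_uri=f"runs:/{run_id}/random_forest_model",
    name="ChurnPrediction"
)
# Promote to Production via UI or API
# Note: stage transitions are deprecated in recent MLflow releases in favor of aliases
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage("ChurnPrediction", version=registered.version, stage="Production")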
# Load the production model in any notebook
model = mlflow.sklearn.load_model("models:/ChurnPrediction/Production")
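The `models:/` URI can point at a registered model by version number, by stage (as in the snippet above), or by alias. Recent MLflow releases deprecate stages in favor of aliases, and models registered in Unity Catalog do not support stages, so an alias-based variant may be worth having. The sketch below is a minimal illustration under those assumptions: `model_uri` and `promote_with_alias` are hypothetical helper names, and "champion" is just an example alias.

```python
from typing import Optional


def model_uri(name: str, *, version: Optional[int] = None,
              stage: Optional[str] = None, alias: Optional[str] = None) -> str:
    """Build a models:/ URI from exactly one of version, stage or alias."""
    selectors = [s for s in (version, stage, alias) if s is not None]
    if len(selectors) != 1:
        raise ValueError("pass exactly one of version, stage or alias")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version if version is not None else stage}"


def promote_with_alias(name: str, version: int, alias: str = "champion") -> str:
    """Point an alias at a model version and return the URI that loads it."""
    # Imported lazily so model_uri can be used without MLflow installed
    from mlflow import MlflowClient

    MlflowClient().set_registered_model_alias(name, alias, str(version))
    return model_uri(name, alias=alias)
```

With this in place, `mlflow.sklearn.load_model(model_uri("ChurnPrediction", alias="champion"))` would load whichever version currently carries the alias, so consumers do not need to change when a new version is promoted.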
Conclusion
MLflow is indispensable for data science teams that want reproducibility and governance. In Databricks, it is integrated by default — each notebook has an associated experiment and the best models can be promoted to production with a click.