MLflow in Databricks: traceability of Machine Learning experiments

João Barros, 13 March 2026, 2 min read

MLflow is an open-source MLOps platform, integrated into Databricks, that addresses one of the biggest problems in Machine Learning: tracking what was tested, comparing results, and reproducing the best model in production.

MLflow structure

Experiment — groups related runs (e.g. "Churn_Model_v2").
Run — a training execution with its parameters and metrics.
Artifact — generated files (serialized model, charts, datasets).
Model Registry — versioning and model promotion (Staging → Production).

Log an experiment

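import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# X_train, X_test, y_train and y_test are assumed to be prepared beforehand
# Select (or create) the experiment that will group the runs
mlflow.set_experiment("/Experiments/Churn_Prediction")

with mlflow.start_run(run_name="RF_100trees"):
    params = {"n_estimators": 100, "max_depth": 8}
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)

    # Mean accuracy on the held-out test set
    accuracy = model.score(X_test, y_test)

    # Record hyperparameters, the metric and the serialized model in the run
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "random_forest_model")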

Autologging

# Enable autologging before training: MLflow then records the estimator's
# parameters, training metrics and the fitted model without explicit log calls
mlflow.sklearn.autolog()
model.fit(X_train, y_train)  # logged automatically

Model Registry and deployment

# Register the best model (run_id is the ID of the run that logged it)
mlflow.register_model(
    model_uri=f"runs:/{run_id}/random_forest_model",
    name="ChurnPrediction"
)

# Promote to Production via UI or API (version 3 is an example version number)
# Note: stage transitions are deprecated since MLflow 2.9 in favor of model version aliases
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage("ChurnPrediction", version=3, stage="Production")

# Load the production model in any notebook
model = mlflow.sklearn.load_model("models:/ChurnPrediction/Production")

Conclusion

MLflow is indispensable for data science teams that want reproducibility and governance. In Databricks it is integrated by default: each notebook has an associated experiment, and the best models can be promoted to production with a click.