Azure DevOps Pipelines for data: CI/CD of notebooks and ADF pipelines

João Barros, 1 September 2025, 1 min read

Azure DevOps Pipelines bring CI/CD practices to data work: every commit to a notebook, ADF pipeline or Power BI report is validated automatically before it is deployed to production.

CI Pipeline — validating Python/PySpark notebooks

trigger:
  - main
  - develop

pool: { vmImage: ubuntu-latest }

stages:
  - stage: CI
    jobs:
      - job: ValidateNotebooks
        steps:
          - task: UsePythonVersion@0
            inputs: { versionSpec: "3.11" }

          - script: pip install pytest nbconvert flake8 pyspark
            displayName: Install dependencies

          # Lint with a 120-character line limit (flake8 only inspects .py files)
          - script: flake8 notebooks/ --max-line-length=120
            displayName: Lint PySpark

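          # Write a JUnit XML report so the publish step has something to pick up
          - script: pytest tests/ -v --tb=short --junitxml=tests/results/junit.xml
            displayName: Unit tests

          - task: PublishTestResults@2
            condition: succeededOrFailed()   # publish results even when tests fail
            inputs:
              testResultsFormat: JUnit
              testResultsFiles: "tests/results/*.xml"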
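
Two gaps are worth closing in a CI job like this one. First, flake8 only inspects `.py` files, so notebooks stored as `.ipynb` need converting before they can be linted. Second, the CD stage expects an artifact named `adf-arm`, which something has to publish. The sketch below assumes `.ipynb` notebooks directly under `notebooks/` and ARM templates exported to a hypothetical `adf/arm/` folder in the repository; adjust both paths to your layout.

```yaml
# Extra steps for the ValidateNotebooks job (both paths are assumptions).
steps:
  # Convert top-level .ipynb notebooks to plain scripts so flake8 can lint them.
  - script: |
      jupyter nbconvert --to script notebooks/*.ipynb --output-dir build/notebooks_py
      flake8 build/notebooks_py --max-line-length=120
    displayName: Lint converted notebooks

  # Publish the ADF ARM templates as the 'adf-arm' artifact.
  # Deployment jobs in later stages download it to $(Pipeline.Workspace)/adf-arm.
  - publish: $(Build.SourcesDirectory)/adf/arm
    artifact: adf-arm
    displayName: Publish ADF ARM templates
```

Converted notebooks may trip style checks that hand-written modules would not, so expect to tune the flake8 rule set for them.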

CD Pipeline — deploying ADF ARM templates

stages:
  - stage: DeployADF_Test
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployADF
        environment: test
        strategy:
          runOnce:
            deploy:
              steps:
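                - task: AzureResourceManagerTemplateDeployment@3
                  inputs:
                    deploymentScope: 'Resource Group'
                    azureResourceManagerConnection: 'AzureServiceConnection_Test'
                    # Assumed variable name; set it to the target subscription
                    subscriptionId: '$(AZURE_SUBSCRIPTION_ID)'
                    resourceGroupName: 'rg-adf-test'
                    location: 'westeurope'
                    templateLocation: 'Linked artifact'
                    csmFile: '$(Pipeline.Workspace)/adf-arm/ARMTemplateForFactory.json'
                    # Environment-specific values replace the template defaults
                    overrideParameters: >
                      -factoryName adf-bconcepts-test
                      -KeyVault_properties_typeProperties_baseUrl https://kv-bconcepts-test.vault.azure.net/
                    # Incremental mode leaves resources outside the template untouched
                    deploymentMode: 'Incremental'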
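
Before the real deployment it can help to run the same template in validation mode, which checks the template and parameters against Azure Resource Manager without deploying the factory's resources (the task may still create the resource group if it is missing). A minimal sketch, reusing the inputs from the deployment step; `$(AZURE_SUBSCRIPTION_ID)` is a hypothetical pipeline variable:

```yaml
steps:
  # Validation-only run: place it before the actual deployment step.
  - task: AzureResourceManagerTemplateDeployment@3
    displayName: Validate ADF ARM template
    inputs:
      deploymentScope: 'Resource Group'
      azureResourceManagerConnection: 'AzureServiceConnection_Test'
      subscriptionId: '$(AZURE_SUBSCRIPTION_ID)'   # assumed variable name
      resourceGroupName: 'rg-adf-test'
      location: 'westeurope'
      templateLocation: 'Linked artifact'
      csmFile: '$(Pipeline.Workspace)/adf-arm/ARMTemplateForFactory.json'
      overrideParameters: >
        -factoryName adf-bconcepts-test
        -KeyVault_properties_typeProperties_baseUrl https://kv-bconcepts-test.vault.azure.net/
      deploymentMode: 'Validation'
```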

Deploying Databricks notebooks

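          - task: DatabricksDeployScripts@0
            inputs:
              authMethod: ServicePrincipal
              # Credentials are read from pipeline variables; keep the secret masked
              SpId: $(DATABRICKS_SP_ID)
              SpSecret: $(DATABRICKS_SP_SECRET)
              TenantId: $(TENANT_ID)
              DatabricksUrl: https://adb-xxx.azuredatabricks.net
              # Copy the repository's notebooks/ folder into the workspace
              LocalPath: notebooks/
              RemotePath: /Shared/Production/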
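
The `$(DATABRICKS_SP_ID)`, `$(DATABRICKS_SP_SECRET)` and `$(TENANT_ID)` references have to be defined somewhere. One common option is a variable group from the pipeline Library, optionally linked to Azure Key Vault. The group name `databricks-deploy` below is hypothetical:

```yaml
# Pipeline-level variables. 'databricks-deploy' is a hypothetical Library
# variable group that defines DATABRICKS_SP_ID, DATABRICKS_SP_SECRET
# and TENANT_ID.
variables:
  - group: databricks-deploy
```

Marking `DATABRICKS_SP_SECRET` as a secret in the group keeps its value masked in pipeline logs.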

Conclusion

CI/CD for data replaces error-prone manual deployments and ensures that only tested code reaches production. Start simple: linting and unit tests in CI, ARM template deployment in CD. Then add manual approvals before the production stage.
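
Approvals are not written in the YAML itself: they are configured as checks on the environment in Azure DevOps, and any deployment job that targets that environment waits for them. A sketch of a production stage under that assumption follows. The `production` environment, the service connection, the `$(AZURE_SUBSCRIPTION_ID)` variable, and the resource group, factory and Key Vault names are all hypothetical:

```yaml
stages:
  - stage: DeployADF_Prod
    dependsOn: DeployADF_Test        # run only after the test deployment
    condition: succeeded()
    jobs:
      - deployment: DeployADF
        environment: production      # approval check is configured on this environment
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureResourceManagerTemplateDeployment@3
                  inputs:
                    deploymentScope: 'Resource Group'
                    azureResourceManagerConnection: 'AzureServiceConnection_Prod'
                    subscriptionId: '$(AZURE_SUBSCRIPTION_ID)'
                    resourceGroupName: 'rg-adf-prod'
                    location: 'westeurope'
                    templateLocation: 'Linked artifact'
                    csmFile: '$(Pipeline.Workspace)/adf-arm/ARMTemplateForFactory.json'
                    overrideParameters: >
                      -factoryName adf-bconcepts-prod
                      -KeyVault_properties_typeProperties_baseUrl https://kv-bconcepts-prod.vault.azure.net/
                    deploymentMode: 'Incremental'
```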