Azure | 11 min read

Cross-Tenant Authentication in Azure: A Practical Guide for Monitoring Scenarios

#azure #authentication #service-principal #multi-tenant #security

You need to collect metrics from customer Azure tenants. Maybe it's Fabric capacity utilization. Maybe it's VM health checks. Maybe it's cost monitoring. The problem is the same: how do you access resources in someone else's Azure subscription without them giving you admin credentials?

This came up when building Fabric Capacity Monitor. I needed to pull metrics from 20+ customer tenants on a schedule. Asking each customer for their admin password wasn't going to work. Neither was logging in with my own Microsoft account to each tenant.

The solution is service principals. But the docs make this way more complicated than it needs to be. Here's how it actually works.

The authentication problem

Let's be specific about what we're trying to do:

  1. Your monitoring app runs in your Azure subscription
  2. Customer resources live in their Azure subscription (different tenant)
  3. You need to call Azure APIs to read their metrics
  4. You can't use interactive login (scheduled jobs run unattended)
  5. Customer needs to control and revoke access at any time

The answer is the OAuth2 client credentials flow with a service principal that the customer creates in their tenant.

What is a service principal?

A service principal is basically a user account for applications. Instead of a username and password, it uses a client ID and client secret. Instead of belonging to a person, it belongs to an app registration.

When a customer creates an app registration in the Azure Portal, Azure automatically creates a matching service principal in their tenant. (With PowerShell or the Graph API, the service principal is a separate step, which is why the script below creates it explicitly.) They can then grant this service principal permissions to specific resources.

Why is this better than user accounts?

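Revocable: Customer deletes the app registration and no new tokens can be issued. Any token already issued runs out on its own, typically within an hour. No password to change.

Auditable: Every token request shows up in their Azure AD sign-in logs under the service principal, and any write operation would show up in their Azure Activity Log. They can see when you connect and confirm you change nothing.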

Least privilege: You request only Reader permissions. Customer can verify you can't modify anything.

No MFA issues: Service principals don't trigger MFA prompts. Perfect for automated jobs.

Scoped: Permissions are per-resource. Customer can grant access to one capacity without exposing others.

How customers set this up

Here's the exact process a customer follows to grant you access. I usually send this as a doc they can follow.

Step 1: Create the app registration

Customer runs this in their Azure tenant. PowerShell is usually easiest:

# Connect to Azure AD (customer runs this)
Connect-AzAccount

# Create the app registration
$app = New-AzADApplication -DisplayName "YourCompany-CapacityMonitor"

# Create the service principal
$sp = New-AzADServicePrincipal -ApplicationId $app.AppId

# Generate a client secret (valid for 1 year)
$secret = New-AzADAppCredential -ObjectId $app.Id -EndDate (Get-Date).AddYears(1)

# Output the values you need
Write-Host "Tenant ID: $((Get-AzContext).Tenant.Id)"
Write-Host "Client ID: $($app.AppId)"
Write-Host "Client Secret: $($secret.SecretText)"

Or via Bicep if they prefer infrastructure as code:

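// app-registration.bicep
// Note: App registrations live in Microsoft Graph, not Azure Resource Manager.
// This needs the Microsoft Graph Bicep extension enabled, plus Graph permissions
// for whoever deploys it. This is a reference - usually easier via Portal or PowerShell

resource appRegistration 'Microsoft.Graph/applications@v1.0' = {
  uniqueName: 'yourcompany-capacitymonitor' // required key that identifies the app across deployments
  displayName: 'YourCompany-CapacityMonitor'
  signInAudience: 'AzureADMyOrg'
}

// Unlike the Portal, Bicep does not create the service principal automatically
resource servicePrincipal 'Microsoft.Graph/servicePrincipals@v1.0' = {
  appId: appRegistration.appId
}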

The Portal UI works too. Azure AD > App registrations > New registration. Then Certificates & secrets > New client secret.

Step 2: Assign permissions to the Fabric capacity

Now they grant the service principal read access to their Fabric capacity:

# Get the service principal object ID
$spObjectId = (Get-AzADServicePrincipal -ApplicationId $app.AppId).Id

# Get the Fabric capacity resource ID
# Format: /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Fabric/capacities/{name}
$capacityId = "/subscriptions/12345678-1234-1234-1234-123456789012/resourceGroups/fabric-rg/providers/Microsoft.Fabric/capacities/fabriccapacity1"

# Assign Reader role
New-AzRoleAssignment -ObjectId $spObjectId -RoleDefinitionName "Reader" -Scope $capacityId

The Reader role is enough for monitoring. It allows:

  • Reading capacity properties and SKU
  • Reading metrics via Azure Monitor
  • Reading health status

It does NOT allow:

  • Modifying capacity settings
  • Pausing/resuming capacity
  • Changing SKU
  • Deleting the capacity

Always use Reader for monitoring. Never request Contributor or Owner unless you actually need to make changes.
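On your side, it's worth validating the resource ID a customer sends before you store it, since a typo here only surfaces later as a confusing API error. A minimal sketch, assuming the ID follows the format shown in Step 2. `parse_capacity_id` and `CapacityRef` are hypothetical names, not part of any Azure SDK:

```python
import re
from typing import NamedTuple

# Pattern for the resource ID format shown in Step 2.
# Azure resource IDs are case-insensitive, so match without regard to case.
_CAPACITY_ID = re.compile(
    r"^/subscriptions/(?P<sub>[^/]+)"
    r"/resourceGroups/(?P<rg>[^/]+)"
    r"/providers/Microsoft\.Fabric/capacities/(?P<name>[^/]+)$",
    re.IGNORECASE,
)

class CapacityRef(NamedTuple):
    subscription_id: str
    resource_group: str
    capacity_name: str

def parse_capacity_id(resource_id: str) -> CapacityRef:
    """Split a Fabric capacity resource ID into its parts (hypothetical helper)."""
    match = _CAPACITY_ID.match(resource_id.strip())
    if match is None:
        raise ValueError(f"Not a Fabric capacity resource ID: {resource_id!r}")
    return CapacityRef(match["sub"], match["rg"], match["name"])

ref = parse_capacity_id(
    "/subscriptions/12345678-1234-1234-1234-123456789012"
    "/resourceGroups/fabric-rg/providers/Microsoft.Fabric/capacities/fabriccapacity1"
)
print(ref.subscription_id)  # 12345678-1234-1234-1234-123456789012
```

A side benefit: the subscription ID falls out of the resource ID, so the customer doesn't have to send it separately.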

Step 3: Share credentials securely

Customer now has three values to share with you:

  • Tenant ID (their Azure AD tenant)
  • Client ID (the app registration ID)
  • Client Secret (the password)

How they share this matters. Never ask them to email the secret. Options:

Key Vault shared access: They add the secret to your Key Vault directly (requires cross-tenant setup, more complex)

Secure file sharing: They upload to a shared encrypted location, you retrieve and delete

Portal direct entry: They enter credentials directly into your monitoring portal (my preferred approach)

Once you have the credentials, store them in Key Vault immediately. More on that below.
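Here's roughly what "store them immediately" can look like. This is a sketch, not the repo's implementation: `secret_name_for` and `store_customer_secret` are hypothetical helpers, the `customer_id` tag is an assumed convention, and `kv_client` is expected to be an `azure.keyvault.secrets.SecretClient` (anything with a compatible `set_secret` method works, which keeps the function easy to test).

```python
import re
from datetime import datetime

def secret_name_for(customer_id: str) -> str:
    """Build the secret name used in this post: customer-{id}-secret.

    Key Vault secret names allow only letters, digits, and dashes,
    so any other character in the customer ID is replaced with a dash.
    """
    safe_id = re.sub(r"[^0-9A-Za-z-]", "-", customer_id)
    return f"customer-{safe_id}-secret"

def store_customer_secret(kv_client, customer_id: str, client_secret: str,
                          expires_on: datetime) -> str:
    """Store the client secret and record when it expires.

    Returns the Key Vault secret name so the caller can save it
    alongside the tenant ID and client ID.
    """
    name = secret_name_for(customer_id)
    kv_client.set_secret(
        name,
        client_secret,
        expires_on=expires_on,              # mirror the expiry the customer chose
        tags={"customer_id": customer_id},  # assumed tagging convention
    )
    return name
```

Recording the expiry date at this point pays off later, because it lets you warn customers before their secret runs out.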

The authentication flow

Now your app needs to use these credentials to get an access token. Here's the OAuth2 client credentials flow:

Your App                                             Azure AD (Customer's Tenant)
   |                                                  |
   |  POST /{tenant_id}/oauth2/v2.0/token             |
   |  - grant_type: client_credentials                |
   |  - client_id                                     |
   |  - client_secret                                 |
   |  - scope: https://management.azure.com/.default  |
   |------------------------------------------------->|
   |                                                  |
   |  200 OK                                          |
   |  { access_token: "eyJ...", expires_in }          |
   |<-------------------------------------------------|

Your App                                             Azure Resource Manager
   |                                                  |
   |  GET /subscriptions/.../metrics                  |
   |  Authorization: Bearer eyJ...                    |
   |------------------------------------------------->|
   |                                                  |
   |  200 OK { metrics data }                         |
   |<-------------------------------------------------|

In Python with the Azure SDK:

from datetime import datetime, timedelta, timezone

from azure.identity import ClientSecretCredential
from azure.mgmt.monitor import MonitorManagementClient

def get_capacity_metrics(tenant_id: str, client_id: str, client_secret: str,
                         subscription_id: str, capacity_resource_id: str):
    # Create credential for customer's tenant
    credential = ClientSecretCredential(
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret
    )

    # Create monitor client for their subscription
    monitor_client = MonitorManagementClient(
        credential=credential,
        subscription_id=subscription_id
    )

    # The metrics API expects the timespan as "start/end" in ISO 8601
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=1)
    iso = "%Y-%m-%dT%H:%M:%SZ"

    # Query capacity metrics
    metrics = monitor_client.metrics.list(
        resource_uri=capacity_resource_id,
        timespan=f"{start.strftime(iso)}/{end.strftime(iso)}",  # Last hour
        interval="PT5M",  # 5-minute granularity
        metricnames="CapacityUtilization",
        aggregation="Average"
    )

    return metrics

The key part: you pass the customer's tenant ID to ClientSecretCredential. This tells Azure to authenticate against their tenant, not yours.
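The response that comes back is nested: metrics contain time series, and time series contain data points. A small sketch of flattening it into rows for storage, assuming the attribute names of the `azure-mgmt-monitor` response models (`value`, `timeseries`, `data`, `time_stamp`, `average`). `flatten_metrics` is a hypothetical helper, and the demo uses stand-in objects so it runs without Azure access:

```python
from types import SimpleNamespace

def flatten_metrics(response) -> list:
    """Turn a metrics response into flat rows, skipping empty data points."""
    rows = []
    for metric in response.value:
        for series in metric.timeseries:
            for point in series.data:
                if point.average is None:  # no data reported for this interval
                    continue
                rows.append({
                    "metric": metric.name.value,
                    "timestamp": point.time_stamp,
                    "average": point.average,
                })
    return rows

# Stand-in for a real response, shaped like the SDK models
demo_response = SimpleNamespace(value=[
    SimpleNamespace(
        name=SimpleNamespace(value="CapacityUtilization"),
        timeseries=[SimpleNamespace(data=[
            SimpleNamespace(time_stamp="2024-01-01T00:00:00Z", average=41.5),
            SimpleNamespace(time_stamp="2024-01-01T00:05:00Z", average=None),
        ])],
    )
])
demo_rows = flatten_metrics(demo_response)
print(len(demo_rows))  # 1
```

Intervals with no data come back with `average` set to `None`, so skipping them keeps gaps out of your averages.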

Token caching and refresh

The Azure SDK handles token caching automatically. But if you're making raw HTTP calls, you need to manage this yourself:

import requests
from datetime import datetime, timedelta, timezone

class TokenManager:
    def __init__(self, tenant_id: str, client_id: str, client_secret: str):
        self.tenant_id = tenant_id
        self.client_id = client_id
        self.client_secret = client_secret
        self.token = None
        self.expires_at = None

    def get_token(self) -> str:
        # Timezone-aware "now" (datetime.utcnow() is deprecated since Python 3.12)
        now = datetime.now(timezone.utc)

        # Return cached token if still valid (with 5 min buffer)
        if self.token and self.expires_at > now + timedelta(minutes=5):
            return self.token

        # Request new token
        url = f"https://login.microsoftonline.com/{self.tenant_id}/oauth2/v2.0/token"
        data = {
            "grant_type": "client_credentials",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "scope": "https://management.azure.com/.default"
        }

        # Always set a timeout so a hung connection can't stall the scheduled job
        response = requests.post(url, data=data, timeout=30)
        response.raise_for_status()

        token_data = response.json()
        self.token = token_data["access_token"]
        self.expires_at = now + timedelta(seconds=int(token_data["expires_in"]))

        return self.token

Tokens typically expire after 1 hour. Always check expiration before using and refresh proactively.
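When a cross-tenant setup misbehaves, it can help to look inside the token and confirm which tenant issued it and when it expires. Microsoft's guidance is to treat access tokens as opaque, so take this as a debugging aid only. It's a sketch that assumes the token is a JWT carrying `tid` (tenant ID) and `exp` (expiry) claims. `decode_token_claims` is a hypothetical helper, and the demo builds its own unsigned token:

```python
import base64
import json

def decode_token_claims(access_token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature.

    For debugging only: never base authorization decisions on unverified claims.
    """
    parts = access_token.split(".")
    if len(parts) != 3:
        raise ValueError("Not a JWT: expected three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

def _segment(data: dict) -> str:
    """Encode a dict the way JWT segments are encoded (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# Demo with a locally built, unsigned token
demo_token = ".".join([
    _segment({"alg": "none"}),
    _segment({"tid": "customer-tenant-id", "exp": 1700000000}),
    "signature",
])
claims = decode_token_claims(demo_token)
print(claims["tid"])  # customer-tenant-id
```

If `tid` shows your own tenant instead of the customer's, the wrong tenant ID was passed when requesting the token.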

Storing credentials securely

The client secret is the only true secret here. Tenant IDs and client IDs are identifiers, not secrets. They can be stored in configuration files or database tables.

Never store client secrets in:

  • Environment variables (visible in process listings)
  • Config files (end up in git, backups, logs)
  • Database tables (unless encrypted at rest with proper key management)

Use Azure Key Vault for client secrets only. See the security architecture for how this fits into the overall system.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def get_customer_credentials(customer_id: str, tenant_id: str, client_id: str) -> dict:
    """Retrieve customer service principal credentials.
    
    Tenant ID and client ID come from config/database (not secrets).
    Only client_secret comes from Key Vault (actual secret).
    """
    credential = DefaultAzureCredential()
    kv_client = SecretClient(
        vault_url="https://your-keyvault.vault.azure.net",
        credential=credential
    )
    
    # Only the secret goes in Key Vault
    client_secret = kv_client.get_secret(f"customer-{customer_id}-secret").value
    
    return {
        "tenant_id": tenant_id,      # From config/database
        "client_id": client_id,       # From config/database
        "client_secret": client_secret # From Key Vault
    }

Your monitoring app uses managed identity to access Key Vault. No credentials in code at all.

Error handling

Things will break. Secrets expire. Customers revoke access. Network hiccups happen. Here's how to handle common errors.
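Network hiccups are the one category worth retrying automatically. The errors covered in this section need a human to fix something, so retrying them just adds noise. A minimal retry-with-backoff sketch for raw HTTP calls (the Azure SDK clients have their own retry handling). `with_retries` is a hypothetical helper, and the default `retryable` tuple uses Python's built-in exceptions. With `requests` you would pass `retryable=(requests.ConnectionError, requests.Timeout)` instead:

```python
import random
import time

def with_retries(fn, *, retryable=(ConnectionError, TimeoutError),
                 attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
```

The jitter matters when you poll 20+ tenants on the same schedule: without it, every failed call retries at the same moment.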

Expired client secret

Error response:

{
  "error": "invalid_client",
  "error_description": "AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID."
}

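AADSTS7000215 means the secret value is wrong, usually because it was regenerated or because the secret ID was sent instead of the secret value. A secret that has simply expired returns AADSTS7000222 instead. Either way, the customer needs to create a new secret and share it with you. Set up monitoring to catch these before they cause outages.

from azure.core.exceptions import ClientAuthenticationError
from azure.identity import ClientSecretCredential

# AADSTS7000215 = wrong secret value, AADSTS7000222 = secret expired
SECRET_ERROR_CODES = ("AADSTS7000215", "AADSTS7000222")

class CredentialExpiredError(Exception):
    """Raised when a customer's client secret is invalid or expired."""

def validate_credentials(customer_id, tenant_id, client_id, client_secret):
    try:
        credential = ClientSecretCredential(tenant_id, client_id, client_secret)
        # Force token acquisition to test credentials
        credential.get_token("https://management.azure.com/.default")
    except ClientAuthenticationError as e:
        if any(code in str(e) for code in SECRET_ERROR_CODES):
            # notify_customer_secret_expired is your own alerting hook, defined elsewhere
            notify_customer_secret_expired(customer_id)
            raise CredentialExpiredError(f"Customer {customer_id} secret expired") from e
        raise
    return credential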
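
Catching expiry before it causes an outage is mostly a date comparison. This sketch assumes you recorded each secret's expiry date at onboarding (the setup script above creates secrets valid for one year). `secrets_expiring_soon` is a hypothetical helper, and the 30-day warning window is an arbitrary default:

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple

def secrets_expiring_soon(expiry_by_customer: Dict[str, datetime],
                          warn_days: int = 30,
                          now: Optional[datetime] = None) -> List[Tuple[str, int]]:
    """Return (customer_id, days_left) for secrets expiring within warn_days.

    Already-expired secrets are included with a negative days_left.
    Sorted so the most urgent customer comes first.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now + timedelta(days=warn_days)
    due = [
        (customer_id, (expires_on - now).days)
        for customer_id, expires_on in expiry_by_customer.items()
        if expires_on <= cutoff
    ]
    return sorted(due, key=lambda item: item[1])
```

Run it on a daily schedule and email the customer when they appear in the list. That gives them weeks of notice instead of a failed job.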

Wrong tenant ID

Error response:

{
  "error": "invalid_request",
  "error_description": "AADSTS90002: Tenant '12345678-...' not found. Check to make sure you have the correct tenant ID."
}

Typo in the tenant ID, or the customer gave you the wrong one. Have them verify by running:

(Get-AzContext).Tenant.Id

Insufficient permissions

The auth succeeds but the API call fails:

{
  "error": {
    "code": "AuthorizationFailed",
    "message": "The client '...' with object id '...' does not have authorization to perform action 'Microsoft.Insights/metrics/read' over scope '...'."
  }
}

The service principal exists but doesn't have Reader role on the capacity. Customer needs to run the role assignment again:

# Verify current role assignments
Get-AzRoleAssignment -ObjectId $spObjectId -Scope $capacityId

# If empty, add the Reader role
New-AzRoleAssignment -ObjectId $spObjectId -RoleDefinitionName "Reader" -Scope $capacityId

App registration deleted

Error response:

{
  "error": "unauthorized_client",
  "error_description": "AADSTS700016: Application with identifier '...' was not found in the directory '...'."
}

Customer deleted the app registration (intentionally or accidentally). They need to create a new one and share new credentials.

Service principal disabled

{
  "error": "invalid_client", 
  "error_description": "AADSTS7000112: Application '...' is disabled."
}

Customer disabled the service principal in Azure AD > Enterprise applications. They can re-enable it or create a new one.
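
Since each of these failures carries an AADSTS code, you can map codes to a next step instead of string-matching in several places. A sketch under that assumption: `classify_auth_error` and the action texts are hypothetical, and the table covers only the codes discussed in this section plus the expired-secret code:

```python
import re
from typing import Optional, Tuple

# The first four codes appear in the error responses in this section;
# AADSTS7000222 is Azure AD's code for an expired client secret.
AADSTS_ACTIONS = {
    "AADSTS7000215": "Client secret is invalid: ask the customer for a new secret",
    "AADSTS7000222": "Client secret expired: ask the customer for a new secret",
    "AADSTS90002": "Tenant not found: verify the tenant ID with the customer",
    "AADSTS700016": "App registration not found: customer must recreate it",
    "AADSTS7000112": "Application is disabled: customer must re-enable it",
}

def classify_auth_error(message: str) -> Tuple[Optional[str], str]:
    """Extract the AADSTS code from an error message and suggest a next step."""
    match = re.search(r"AADSTS\d+", message)
    if match is None:
        return None, "No AADSTS code found: inspect the full error"
    code = match.group(0)
    return code, AADSTS_ACTIONS.get(code, "Unrecognized code: look it up in the Azure AD docs")

code, action = classify_auth_error(
    "AADSTS90002: Tenant '12345678-...' not found. "
    "Check to make sure you have the correct tenant ID."
)
print(code)  # AADSTS90002
```

The regex matches the whole digit run, so AADSTS7000215 and AADSTS700016 can't be confused even though one starts like the other.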

Testing the connection

Before going live, validate the credentials work. Here's a simple test script:

#!/usr/bin/env python3
"""Test cross-tenant authentication for a customer."""

import os
import sys
import time
from datetime import datetime, timedelta, timezone

from azure.core.exceptions import ClientAuthenticationError, HttpResponseError
from azure.identity import ClientSecretCredential
from azure.mgmt.monitor import MonitorManagementClient

def test_connection(tenant_id: str, client_id: str, client_secret: str,
                    subscription_id: str, capacity_resource_id: str) -> bool:
    print(f"Testing connection to tenant {tenant_id[:8]}...")

    # Step 1: Test authentication
    try:
        credential = ClientSecretCredential(tenant_id, client_id, client_secret)
        token = credential.get_token("https://management.azure.com/.default")
        print(f"  Auth: OK (token expires in {token.expires_on - int(time.time())}s)")
    except ClientAuthenticationError as e:
        print(f"  Auth: FAILED - {e}")
        return False

    # Step 2: Test API access
    try:
        # The metrics API expects the timespan as "start/end" in ISO 8601
        end = datetime.now(timezone.utc)
        start = end - timedelta(minutes=5)
        iso = "%Y-%m-%dT%H:%M:%SZ"

        monitor = MonitorManagementClient(credential, subscription_id)
        metrics = monitor.metrics.list(
            resource_uri=capacity_resource_id,
            timespan=f"{start.strftime(iso)}/{end.strftime(iso)}",
            metricnames="CapacityUtilization"
        )
        # The response holds the metrics in its .value list
        metric_list = list(metrics.value)
        print(f"  API: OK (retrieved {len(metric_list)} metrics)")
    except HttpResponseError as e:
        print(f"  API: FAILED - {e.message}")
        return False

    print("  Connection test PASSED")
    return True

if __name__ == "__main__":
    # Load from Key Vault or env vars for testing
    ok = test_connection(
        tenant_id=os.environ["TEST_TENANT_ID"],
        client_id=os.environ["TEST_CLIENT_ID"],
        client_secret=os.environ["TEST_CLIENT_SECRET"],
        subscription_id=os.environ["TEST_SUBSCRIPTION_ID"],
        capacity_resource_id=os.environ["TEST_CAPACITY_ID"]
    )
    # Non-zero exit code lets a pipeline or scheduler detect the failure
    sys.exit(0 if ok else 1)

Run this after onboarding each customer to catch permission issues early.

Customer audit trail

One thing customers appreciate: they can see exactly what you're doing.

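Two logs cover this. Their Azure AD sign-in logs (under service principal sign-ins) record every token request, and their Activity Log records every write operation against their resources. Between the two, they can see:

  • When your app requested tokens, and for which API
  • The source IP address
  • Any write operations attempted, and whether they succeeded

The Activity Log does not record read (GET) operations, so a read-only monitor should leave it empty. That is exactly what the customer wants to confirm.

# Customer can run this to confirm your app made no changes
# (for service principals, Caller is usually an ID rather than the display name)
Get-AzActivityLog -ResourceId $capacityId -StartTime (Get-Date).AddDays(-7) |
    Where-Object { $_.Caller -in @($spObjectId, $app.AppId) } |
    Select-Object EventTimestamp, OperationName, Status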

This builds trust. They're not blindly giving you access. They can verify you're only reading metrics, not making changes.

Revoking access

Customer needs to cut off your access immediately? Two options:

Option 1: Delete the app registration

Remove-AzADApplication -ObjectId $app.Id -Force

No new tokens can be issued from that point. A token issued just before the deletion can remain valid until it expires, typically within an hour.

Option 2: Remove the role assignment

Remove-AzRoleAssignment -ObjectId $spObjectId -RoleDefinitionName "Reader" -Scope $capacityId

Auth still works but API calls fail with authorization errors. Useful if they want to temporarily pause access.

Putting it all together

The full implementation is in the Fabric Capacity Monitor repo. Key files:

  • src/auth/cross_tenant.py - Token management and credential handling
  • src/services/customer_onboarding.py - Adding new customers
  • infrastructure/keyvault.bicep - Key Vault setup for credential storage
  • docs/customer-setup.md - Instructions to send customers

Cross-tenant auth looks complicated at first. But once you understand the service principal model, it's actually cleaner than alternatives. Customer controls everything. You get exactly the access you need. Both sides have full visibility.

The setup takes about 10 minutes per customer. After that, the monitoring just works.

Written by Yari Bouwman

Data Engineer and Solution Designer specializing in scalable data platforms and modern cloud solutions.
