Azure Monitor Overview
Azure Monitor is Microsoft's unified observability platform that collects, analyses, and acts on telemetry from cloud and on-premises environments. It integrates natively with all Azure services and supports hybrid scenarios.
Azure Monitor Architecture
┌──────────────────────────────────────────────────────────────────┐
│                    AZURE MONITOR ARCHITECTURE                    │
└──────────────────────────────────────────────────────────────────┘

   Data Sources                     Azure Monitor                 Consumers
┌─────────────────┐            ┌──────────────────────┐      ┌─────────────────┐
│  Applications   │───────────▶│                      │      │   Dashboards    │
│  (App Insights) │            │    ┌────────────┐    │      │   (Workbooks)   │
├─────────────────┤            │    │  Metrics   │    │      ├─────────────────┤
│  Infrastructure │───────────▶│    └────────────┘    │─────▶│     Alerts      │
│ (VMs, AKS, etc.)│            │                      │      │  (Action Groups)│
├─────────────────┤            │    ┌────────────┐    │      ├─────────────────┤
│  Azure Platform │───────────▶│    │    Logs    │    │─────▶│    Autoscale    │
│  (Activity Logs)│            │    │(Log Analytics)  │      │                 │
├─────────────────┤            │    └────────────┘    │      ├─────────────────┤
│  Custom Sources │───────────▶│                      │─────▶│   Power BI /    │
│ (REST API, SDK) │            │    ┌────────────┐    │      │   Logic Apps    │
└─────────────────┘            │    │   Traces   │    │      └─────────────────┘
                               │    └────────────┘    │
                               └──────────────────────┘
Data Platform
- Metrics: Time-series numerical data, stored in an optimised time-series database
- Logs: Structured and unstructured data in Log Analytics workspaces
- Traces: Distributed traces from Application Insights
Application Insights Setup
Application Insights is an APM service that provides deep visibility into application performance, including request rates, response times, failure rates, and dependency tracking.
Creating Application Insights
# Azure CLI
# Get connection string
az monitor app-insights component show \
  --app my-app-insights \
  --resource-group my-rg \
  --query connectionString
SDK Integration (Node.js)
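The connection string returned by the CLI is a semicolon-separated list of `key=value` pairs. As a minimal sketch, the helper below splits one into its parts and flags obvious problems before it is handed to an SDK. The function names and the sample value are illustrative placeholders, not part of any Azure library.

```javascript
// Split a connection string of the form "Key1=value1;Key2=value2" into an object.
function parseConnectionString(connectionString) {
  const parts = {};
  for (const segment of connectionString.split(';')) {
    const idx = segment.indexOf('=');
    if (idx === -1) continue; // skip empty or malformed segments
    const key = segment.slice(0, idx).trim();
    const value = segment.slice(idx + 1).trim();
    if (key) parts[key] = value;
  }
  return parts;
}

// Return a list of problems; an empty list means the string looks usable.
function validateConnectionString(connectionString) {
  const problems = [];
  const parts = parseConnectionString(connectionString || '');
  if (!parts.InstrumentationKey) {
    problems.push('missing InstrumentationKey');
  }
  if (parts.IngestionEndpoint && !parts.IngestionEndpoint.startsWith('https://')) {
    problems.push('IngestionEndpoint should be an https URL');
  }
  return problems;
}

// Placeholder value for illustration only (not a real key or endpoint).
const sample =
  'InstrumentationKey=00000000-0000-0000-0000-000000000000;' +
  'IngestionEndpoint=https://example.in.applicationinsights.azure.com/';

console.log(parseConnectionString(sample));
console.log(validateConnectionString(sample));
```

Running a check like this at startup can catch an empty or truncated environment variable before telemetry silently goes missing.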
// Application Insights SDK setup
// Enable Azure Monitor with OpenTelemetry
// Traditional SDK approach (still supported)
// Track custom event
// Track custom metric
client.trackMetric({
name: 'OrderProcessingTime',
value: 1234
});
Auto-instrumentation
For many platforms, Azure provides auto-instrumentation that requires minimal code changes:
- .NET: Enable via Azure portal or ApplicationInsights NuGet
- Java: Attach agent JAR file
- Node.js: Use applicationinsights package
- Python: Use the azure-monitor-opentelemetry package (the older opencensus-ext-azure package is deprecated)
Log Analytics Workspace
Log Analytics is the central log repository for Azure Monitor. It stores logs from Azure resources, applications, and custom sources, enabling powerful queries using Kusto Query Language (KQL).
Creating a Workspace
# Azure CLI
az monitor log-analytics workspace create \
--resource-group my-rg \
--workspace-name my-workspace \
--location uksouth \
--retention-time 30 \
--sku PerGB2018
KQL Query Basics
// Basic query - filter and project
// Aggregate - requests per hour
// Join traces with requests
// Percentiles
requests
| where timestamp > ago(24h)
| summarize
p50 = percentile(duration, 50),
p95 = percentile(duration, 95),
p99 = percentile(duration, 99)
by bin(timestamp, 1h)
| render timechart
Common KQL Patterns
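Most of the queries in this guide follow one pattern: group rows into fixed time bins with `bin(timestamp, 1h)`, then aggregate each bin with functions such as `count()` or `percentile()`. The sketch below reproduces that pattern in plain JavaScript to show what the query computes. It uses an exact nearest-rank percentile, whereas KQL's `percentile()` returns an estimate, so results on real data may differ slightly. The function names and sample rows are illustrative.

```javascript
// Nearest-rank percentile over an array of numbers (exact, not approximate).
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Group rows into fixed-width time bins, similar to `by bin(timestamp, 1h)`,
// and compute a count plus two percentiles per bin.
function summarizeByBin(rows, binMs) {
  const bins = new Map();
  for (const row of rows) {
    const binStart = Math.floor(row.timestamp / binMs) * binMs;
    if (!bins.has(binStart)) bins.set(binStart, []);
    bins.get(binStart).push(row.duration);
  }
  return [...bins.entries()]
    .sort(([a], [b]) => a - b)
    .map(([binStart, durations]) => ({
      binStart,
      count: durations.length,
      p50: percentile(durations, 50),
      p95: percentile(durations, 95),
    }));
}

const HOUR = 60 * 60 * 1000;
// Hypothetical request rows: timestamp in ms, duration in ms.
const rows = [
  { timestamp: 0 * HOUR + 1000, duration: 120 },
  { timestamp: 0 * HOUR + 2000, duration: 80 },
  { timestamp: 0 * HOUR + 3000, duration: 950 },
  { timestamp: 1 * HOUR + 500, duration: 200 },
];

console.log(summarizeByBin(rows, HOUR));
```

With only three samples in the first bin, the p95 equals the slowest request, which is a reminder that high percentiles are only meaningful when each bin holds enough data.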
// Error rate
// Slow requests
// Dependency failures
// User sessions
// Application map data
requests
| where timestamp > ago(1h)
| summarize count() by cloud_RoleName
| join kind=inner (
dependencies
| where timestamp > ago(1h)
| summarize count() by cloud_RoleName, target
) on cloud_RoleName
Log Retention
Retention Best Practice
Set retention per table based on compliance and operational needs. Archive older data to a Storage Account to reduce cost while still meeting compliance requirements.
Container Insights for AKS
Container Insights provides performance monitoring for AKS clusters, collecting metrics from nodes, pods, and containers.
Enable Container Insights
# Enable for existing AKS cluster
# Enable Prometheus metrics collection
az aks update \
--resource-group my-rg \
--name my-aks-cluster \
--enable-azure-monitor-metrics
Container Insights KQL Queries
// Pod CPU usage
// Pod memory usage
// Container restarts
// Node status
KubeNodeInventory
| where TimeGenerated > ago(1h)
| summarize arg_max(TimeGenerated, *) by Computer
| project Computer, Status, Labels, KubeletVersion
Prometheus Integration
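The scrape configuration in this section rewrites each pod's `__address__` using the port from its `prometheus.io/port` annotation. Prometheus joins the source label values with `;`, anchors the regex at both ends, and applies the replacement only when the whole string matches. The sketch below imitates that one relabel step in JavaScript so the regex can be tried in isolation. The function name is illustrative, and this is a model of the behaviour, not Prometheus code.

```javascript
// Model of the "replace" relabel rule that rewrites __address__.
// `address` stands for __address__, `portAnnotation` for the value of
// __meta_kubernetes_pod_annotation_prometheus_io_port ('' when unset).
function relabelAddress(address, portAnnotation) {
  const joined = `${address};${portAnnotation}`; // source labels joined with ';'
  const anchored = /^(?:([^:]+)(?::\d+)?;(\d+))$/; // regex from the config, fully anchored
  const match = anchored.exec(joined);
  if (!match) return address; // no match: the target label is left unchanged
  return `${match[1]}:${match[2]}`; // replacement: $1:$2
}

console.log(relabelAddress('10.0.0.5:8080', '9102')); // port replaced by the annotation
console.log(relabelAddress('10.0.0.5', '9102'));      // port appended
console.log(relabelAddress('10.0.0.5:8080', ''));     // no annotation, address kept
```

The third call shows why pods without the port annotation keep their discovered address: the regex requires at least one digit after the `;`, so the rule does not fire.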
# ConfigMap for Prometheus scraping
apiVersion: v1
kind: ConfigMap
metadata:
  name: ama-metrics-prometheus-config
  namespace: kube-system
data:
  prometheus-config: |
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
Distributed Tracing
Application Insights provides end-to-end distributed tracing, allowing you to follow requests across services and identify performance bottlenecks.
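Following a request across services depends on each service passing trace context to the next, typically through the W3C Trace Context `traceparent` header. As a minimal sketch of what that header carries, the code below parses one and builds the header for an outgoing call. The function names are illustrative, and in practice the SDK or OpenTelemetry instrumentation normally does this for you.

```javascript
// Parse a W3C Trace Context `traceparent` header:
// version (2 hex) - trace ID (32 hex) - parent/span ID (16 hex) - flags (2 hex).
function parseTraceparent(header) {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(
    String(header).trim()
  );
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version ff and all-zero IDs are invalid per the specification.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Build the header for an outgoing call: same trace ID, new span ID as parent.
function childTraceparent(parent, newSpanId) {
  const flags = parent.sampled ? '01' : '00';
  return `00-${parent.traceId}-${newSpanId}-${flags}`;
}

// Example values in the format used by the W3C specification.
const incoming = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01';
const ctx = parseTraceparent(incoming);

console.log(ctx);
console.log(childTraceparent(ctx, 'b7ad6b7169203331'));
```

If a proxy or service in the call chain drops this header, the downstream service starts a new trace ID and the end-to-end view breaks at that hop.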
Application Map
The Application Map automatically discovers dependencies between services and visualises request flow, showing latency and failure rates.
Custom Telemetry Correlation
// Node.js - Correlating custom telemetry
// Track operation with correlation
async function processOrder(orderId) {
const correlationContext = appInsights.getCorrelationContext();
client.trackDependency({
name: 'ProcessPayment',
data: `/api/payments/${orderId}`,
duration: 1234,
success: true,
dependencyTypeName: 'HTTP',
target: 'payment-service',
// Correlation is automatic when using the SDK
});
}
Sampling Configuration
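Sampling keeps only a share of telemetry to control ingestion volume. Two properties matter in practice: items from the same operation should be kept or dropped together so that traces stay complete, and important items such as errors should bypass sampling. The sketch below illustrates both ideas with a simple hash of the operation ID. It is a standalone illustration with hypothetical names and a generic hash, not the algorithm the Application Insights SDK uses.

```javascript
// Map an operation ID to a stable value in [0, 100) using a djb2-style hash.
function hashToPercent(operationId) {
  let hash = 5381;
  for (let i = 0; i < operationId.length; i++) {
    hash = ((hash << 5) + hash + operationId.charCodeAt(i)) | 0; // keep 32-bit
  }
  return (Math.abs(hash) % 10000) / 100;
}

// Decide whether to keep a telemetry item at the given sampling percentage.
function shouldKeep(item, samplingPercentage) {
  if (item.type === 'exception') return true; // always keep errors
  if (item.type === 'request' && item.durationMs > 5000) return true; // and slow requests
  // Same operation ID gives the same decision for every item in the trace.
  return hashToPercent(item.operationId) < samplingPercentage;
}

// Hypothetical telemetry items.
const items = [
  { type: 'request', operationId: 'op-1', durationMs: 120 },
  { type: 'dependency', operationId: 'op-1', durationMs: 30 },
  { type: 'exception', operationId: 'op-2', durationMs: 0 },
  { type: 'request', operationId: 'op-3', durationMs: 7200 },
];

for (const item of items) {
  console.log(item.type, item.operationId, shouldKeep(item, 25));
}
```

Hashing the operation ID rather than drawing a random number per item is what keeps a request and its dependencies on the same side of the decision.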
// Configure adaptive sampling
// Adjust sampling
// Exclude specific telemetry from sampling
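appInsights.defaultClient.addTelemetryProcessor((envelope) => {
  // Returning true only passes the item on; it does not exempt it from sampling.
  // Assumption: the classic Node.js SDK honours envelope.sampleRate, so setting
  // it to 100 keeps the item. Verify against the SDK version you use.
  // Don't sample errors
  if (envelope.data.baseType === 'ExceptionData') {
    envelope.sampleRate = 100;
    return true;
  }
  // Don't sample slow requests
  // (assumes duration is numeric milliseconds; if your SDK version serialises
  // it as a timespan string, convert it before comparing)
  if (envelope.data.baseType === 'RequestData' &&
      envelope.data.baseData.duration > 5000) {
    envelope.sampleRate = 100;
    return true;
  }
  return true; // Let adaptive sampling handle the rest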
});
Infrastructure as Code Setup
Terraform Configuration
# main.tf - Azure Monitor Infrastructure
# Log Analytics Workspace
# Application Insights
# Action Group for alerts
# Metric Alert - High Error Rate
# Log Alert - Exception spike
# Diagnostic settings for Azure resources
resource "azurerm_monitor_diagnostic_setting" "aks" {
name = "aks-diagnostics"
target_resource_id = azurerm_kubernetes_cluster.main.id
log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id
enabled_log {
category = "kube-apiserver"
}
enabled_log {
category = "kube-controller-manager"
}
enabled_log {
category = "kube-scheduler"
}
metric {
category = "AllMetrics"
enabled = true
}
}
Workbooks and Dashboards
Azure Workbooks provide interactive reports combining metrics, logs, and visualisations into shareable documents.
Creating Custom Workbooks
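Because a workbook template is plain JSON, it can be generated from code when several services need the same dashboard. The sketch below builds a template with the same fields as the template in this section. The helper names `kqlItem` and `buildWorkbook` are hypothetical, and the second query is only an example.

```javascript
// Build one query item, mirroring the fields of a "type": 3 entry.
function kqlItem(title, query, visualization) {
  return {
    type: 3,
    content: {
      version: 'KqlItem/1.0',
      query,
      size: 0,
      title,
      queryType: 0,
      resourceType: 'microsoft.insights/components',
      visualization,
    },
  };
}

// Build a workbook with a Markdown heading item followed by query items.
function buildWorkbook(heading, items) {
  return {
    version: 'Notebook/1.0',
    items: [{ type: 1, content: { json: `## ${heading}` } }, ...items],
  };
}

const workbook = buildWorkbook('Application Health Dashboard', [
  kqlItem(
    'Requests Over Time',
    'requests | where timestamp > ago(24h) | summarize count() by bin(timestamp, 1h)',
    'areachart'
  ),
  kqlItem(
    'Average Duration',
    'requests | where timestamp > ago(24h) | summarize avg(duration) by bin(timestamp, 1h)',
    'linechart'
  ),
]);

console.log(JSON.stringify(workbook, null, 2));
```

The printed JSON can be pasted into the workbook's advanced editor or stored alongside other Infrastructure as Code files.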
// Workbook template (JSON)
{
"version": "Notebook/1.0",
"items": [
{
"type": 1,
"content": {
"json": "## Application Health Dashboard"
}
},
{
"type": 3,
"content": {
"version": "KqlItem/1.0",
"query": "requests | where timestamp > ago(24h) | summarize count() by bin(timestamp, 1h)",
"size": 0,
"title": "Requests Over Time",
"timeContext": {
"durationMs": 86400000
},
"queryType": 0,
"resourceType": "microsoft.insights/components",
"visualization": "areachart"
}
},
{
"type": 3,
"content": {
"version": "KqlItem/1.0",
"query": "requests | where timestamp > ago(24h) | summarize percentile(duration, 95) by bin(timestamp, 1h)",
"size": 0,
"title": "P95 Latency",
"queryType": 0,
"resourceType": "microsoft.insights/components",
"visualization": "linechart"
}
}
]
}
OpenTelemetry with Azure Monitor
Azure Monitor now supports OpenTelemetry, allowing vendor-neutral instrumentation while using Azure as the backend.
Azure Monitor OpenTelemetry Exporter
// Node.js - OpenTelemetry with Azure Monitor
// Your application code - traces are automatically collected
import express from 'express';
const app = express();
app.get('/api/data', async (req, res) => {
// HTTP calls, database queries, etc. are automatically traced
const data = await fetchData();
res.json(data);
});
Migration Tip
When migrating from the classic Application Insights SDK to OpenTelemetry, start with auto-instrumentation and gradually add custom spans. Both can run side-by-side during migration.
Alerting and Action Groups
Alert Types
- Metric alerts: Based on numeric thresholds (CPU, memory, error count)
- Log alerts: Based on KQL query results
- Activity log alerts: Based on Azure resource events
- Smart detection: AI-powered anomaly detection
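To make the metric alert type concrete, the sketch below evaluates a static-threshold rule: it aggregates the data points inside a look-back window and compares the result with a threshold. This is a deliberately simplified model with hypothetical names and sample data. Real metric alerts add evaluation frequency, dimensions, and dynamic thresholds.

```javascript
// Supported aggregations and comparison operators for this sketch.
const aggregations = {
  Average: (v) => v.reduce((a, b) => a + b, 0) / v.length,
  Maximum: (v) => Math.max(...v),
  Minimum: (v) => Math.min(...v),
  Total: (v) => v.reduce((a, b) => a + b, 0),
};
const operators = {
  GreaterThan: (a, b) => a > b,
  LessThan: (a, b) => a < b,
};

// Evaluate one rule at time `now` over points shaped as { timestamp, value }.
function evaluateMetricAlert(points, rule, now) {
  const windowStart = now - rule.windowMs;
  const values = points
    .filter((p) => p.timestamp > windowStart && p.timestamp <= now)
    .map((p) => p.value);
  if (values.length === 0) return { fired: false, value: null }; // no data in window
  const value = aggregations[rule.aggregation](values);
  return { fired: operators[rule.operator](value, rule.threshold), value };
}

const MINUTE = 60 * 1000;
// Hypothetical CPU percentage samples.
const cpu = [
  { timestamp: 1 * MINUTE, value: 20 },
  { timestamp: 6 * MINUTE, value: 85 },
  { timestamp: 8 * MINUTE, value: 95 },
  { timestamp: 10 * MINUTE, value: 90 },
];
const rule = {
  aggregation: 'Average',
  operator: 'GreaterThan',
  threshold: 80,
  windowMs: 5 * MINUTE,
};

console.log(evaluateMetricAlert(cpu, rule, 10 * MINUTE)); // { fired: true, value: 90 }
```

Only the three samples inside the five-minute window count, so the early low reading does not pull the average below the threshold.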
Action Groups
# Terraform - Action Group with multiple receivers
resource "azurerm_monitor_action_group" "critical" {
name = "critical-alerts"
resource_group_name = azurerm_resource_group.main.name
short_name = "critical"
email_receiver {
name = "platform-team"
email_address = "platform@example.com"
use_common_alert_schema = true
}
sms_receiver {
name = "oncall"
country_code = "44"
phone_number = "7123456789"
}
webhook_receiver {
name = "pagerduty"
service_uri = "https://events.pagerduty.com/integration/{key}/enqueue"
use_common_alert_schema = true
}
logic_app_receiver {
name = "incident-automation"
resource_id = azurerm_logic_app_workflow.incident.id
callback_url = azurerm_logic_app_trigger_http_request.incident.callback_url
use_common_alert_schema = true
}
}
Cost Management
Azure Monitor Pricing
- Log Analytics: Per GB ingested (commitment tiers available)
- Application Insights: Per GB ingested + data retention
- Metrics: Custom metrics charged per time series
- Alerts: Per metric/log alert rule
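Since log ingestion is billed per GB, a rough comparison of pay-as-you-go and a commitment tier is simple arithmetic. The sketch below shows the calculation. Every price in it is a made-up placeholder, and it assumes that ingestion above the committed volume is billed at the tier's effective per-GB rate, so check the current Azure pricing page before relying on the result.

```javascript
// Compare a month of pay-as-you-go ingestion with a commitment tier.
// tier = { gbPerDay, pricePerDay }: fixed daily price for a committed volume.
function estimateMonthlyLogCost({ gbPerDay, payAsYouGoPerGb, tier, days = 30 }) {
  const payAsYouGo = gbPerDay * days * payAsYouGoPerGb;
  const effectiveRate = tier.pricePerDay / tier.gbPerDay; // per-GB price within the tier
  const overageGb = Math.max(0, gbPerDay - tier.gbPerDay); // assumed billed at effectiveRate
  const commitment = (tier.pricePerDay + overageGb * effectiveRate) * days;
  return {
    payAsYouGo,
    commitment,
    cheaper: commitment < payAsYouGo ? 'commitment' : 'payAsYouGo',
  };
}

// Hypothetical prices, for illustration only.
const tier = { gbPerDay: 100, pricePerDay: 200 };

console.log(estimateMonthlyLogCost({ gbPerDay: 120, payAsYouGoPerGb: 2.5, tier }));
console.log(estimateMonthlyLogCost({ gbPerDay: 50, payAsYouGoPerGb: 2.5, tier }));
```

With these placeholder numbers the tier wins at 120 GB/day and loses at 50 GB/day, which is why commitment tiers suit steady, predictable workloads.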
Cost Reduction Strategies
# 1. Use commitment tiers for predictable workloads
# 100 GB/day tier provides significant discount
# 2. Configure data collection rules to reduce ingestion
# 3. Set per-table retention
# 4. Export cold data to storage
resource "azurerm_log_analytics_data_export_rule" "archive" {
name = "archive-to-storage"
resource_group_name = azurerm_resource_group.main.name
workspace_resource_id = azurerm_log_analytics_workspace.main.id
destination_resource_id = azurerm_storage_account.archive.id
table_names = ["ContainerLog", "Perf"]
enabled = true
}
Troubleshooting
Common issues and solutions when setting up Azure observability.
Application Insights Not Receiving Telemetry
Symptom: Application deployed but no data appearing in Application Insights.
Common causes:
- Incorrect connection string or instrumentation key
- SDK not properly initialised
- Network blocking outbound connections to Azure
- Sampling rate set too low
Solution:
# Verify connection string is set correctly
echo $APPLICATIONINSIGHTS_CONNECTION_STRING
# Test connectivity to ingestion endpoint
curl -v https://dc.services.visualstudio.com/v2/track
# Enable verbose logging in SDK (.NET example)
TelemetryConfiguration.Active.TelemetryChannel.DeveloperMode = true;
# For Node.js, enable debug logging
APPLICATIONINSIGHTS_DEBUG=true node app.js
# Check Live Metrics Stream for real-time validation
# Azure Portal > Application Insights > Live Metrics
Log Analytics Queries Returning No Results
Symptom: KQL queries return empty results despite logs being ingested.
Common causes:
- Wrong workspace selected
- Time range not covering log ingestion period
- Table name or column name typos
- Data collection rules not forwarding to correct workspace
Solution:
// Check which tables have data in the workspace
search *
| summarize count() by $table
| order by count_ desc
// Verify ingestion is happening (check last 24 hours)
Usage
| where TimeGenerated > ago(24h)
| summarize TotalGB = sum(Quantity)/1024 by DataType
| order by TotalGB desc
// Check for data collection rule issues
Heartbeat
| where TimeGenerated > ago(1h)
| distinct Computer
// Verify Application Insights data is flowing
AppRequests
| where TimeGenerated > ago(1h)
| summarize count() by bin(TimeGenerated, 5m)
Container Insights Not Showing AKS Metrics
Symptom: AKS cluster enabled for monitoring but container metrics missing.
Common causes:
- Azure Monitor agent pods not running
- Workspace not linked correctly to cluster
- Data collection rules misconfigured
- RBAC permissions for agent ServiceAccount
Solution:
# Check if monitoring addon is enabled
az aks show -g myResourceGroup -n myCluster \
  --query addonProfiles.omsagent
# Verify Azure Monitor agent pods are running
kubectl get pods -n kube-system | grep ama-
# Check agent logs for errors
kubectl logs -n kube-system -l component=ama-logs --tail=100
# Re-enable monitoring if needed
az aks enable-addons -a monitoring \
  -g myResourceGroup -n myCluster \
  --workspace-resource-id /subscriptions/.../workspaces/myWorkspace
Alert Rules Not Firing
Symptom: Alert conditions are met but no notifications received.
Common causes:
- Action group not configured or linked
- Alert rule disabled or in “Fired” state waiting for resolution
- Query evaluation frequency too long
- Notification channel (email/SMS) not confirmed
Solution:
# Check alert rule status
az monitor scheduled-query list -g myResourceGroup \
--query "[].{name:name, enabled:enabled, severity:severity}"
# Verify action group is properly configured
az monitor action-group show -n myActionGroup -g myResourceGroup
# Test action group to verify notifications work
az monitor action-group test-notifications create \
-g myResourceGroup --action-group myActionGroup \
--alert-type budget
# List activity log alert rules (fired alerts appear in the portal under Monitor > Alerts)
az monitor activity-log alert list -g myResourceGroup
Distributed Tracing Gaps
Symptom: Trace map shows broken connections or missing service dependencies.
Common causes:
- Services using different Application Insights resources
- Trace context headers not propagated between services
- Sampling dropping related spans
- Async operations not correlated properly
Solution:
# Ensure all services use same connection string
# Or configure cross-component correlation
# Verify W3C Trace Context headers are forwarded:
# traceparent, tracestate
# Query for orphan requests (no parent)
AppDependencies
| where TimeGenerated > ago(1h)
| where isempty(ParentId)
| summarize count() by Target, DependencyType
# Enable auto-instrumentation for complete traces
# .NET: Add Application Insights NuGet package
# Node.js: require('applicationinsights').start()
# Python: configure_azure_monitor()
Conclusion
Azure Monitor provides a comprehensive observability platform that integrates seamlessly with Azure services. Application Insights delivers powerful APM capabilities with distributed tracing, while Log Analytics with KQL enables sophisticated log analysis.
For organisations standardising on OpenTelemetry, Azure now provides native support through the Azure Monitor OpenTelemetry exporter, enabling vendor-neutral instrumentation while leveraging Azure's powerful analytics capabilities.
Start with the fundamentals: Application Insights for your applications, Log Analytics for centralised logging, and Container Insights for AKS. Use Infrastructure as Code to keep your observability setup reproducible, and manage costs with commitment tiers and data collection rules.

