A comprehensive infrastructure-as-code project for implementing disaster recovery and backup protection for Azure VMs using Bicep, Azure Backup, and Azure Site Recovery.

         ┌──────────────────────────┐
         │      PRIMARY REGION      │
         │        (East US)         │
         │                          │
         │  ┌────────────────────┐  │
         │  │   Production VM    │  │
         │  │                    │  │
         │  │  • Applications    │  │
         │  │  • Databases       │  │
         │  │  • Services        │  │
         │  └──────────┬─────────┘  │
         └─────────────┼────────────┘
                       │
           ┌───────────┴───────────┐
           │                       │
           ▼                       ▼
    DAILY SNAPSHOTS     CONTINUOUS REPLICATION
     (Azure Backup)         (Site Recovery)
           │                       │
           ▼                       ▼
     RECOVERY VAULT     REPLICA VM (Secondary)
           │                       │
           │                       ▼
           │         ┌──────────────────────────┐
           │         │     SECONDARY REGION     │
           │         │        (West US)         │
           │         │                          │
           │         │  ┌────────────────────┐  │
           │         │  │     Standby VM     │  │
           │         │  │                    │  │
           │         │  │ • Synchronized     │  │
           │         │  │ • Ready to activate│  │
           │         │  │ • In recovery vault│  │
           │         │  └────────────────────┘  │
           │         └─────────────┬────────────┘
           │                       │
           └───────────┬───────────┘
                       │
                       ▼
                  [FAILOVER]
                 On Demand or
                   Automatic

Backup Flow:
- VM generates data changes throughout the day
- The Azure Backup VM extension captures snapshots
- Snapshots stored in Recovery Services Vault
- Daily backup runs at configured time (2 AM UTC)
- Recovery points maintained per retention policy
- 30 days of recovery points available
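
The 30-day retention in the flow above comes down to simple date arithmetic. The following is a minimal sketch, not part of this project: the helper name `is_retained` is hypothetical, it assumes GNU `date` (the Linux default), and it treats a recovery point as available through day 30. The exact cutoff Azure Backup applies is not modelled.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of this repository.
# Assumes GNU date and a daily retention window expressed in days.

RETENTION_DAYS=30

# is_retained <backup-date YYYY-MM-DD> <today YYYY-MM-DD>
# Succeeds (exit 0) if the recovery point is between 0 and RETENTION_DAYS days old.
is_retained() {
  local backup_epoch today_epoch age_days
  backup_epoch=$(date -u -d "$1" +%s) || return 2
  today_epoch=$(date -u -d "$2" +%s) || return 2
  age_days=$(( (today_epoch - backup_epoch) / 86400 ))
  [ "$age_days" -ge 0 ] && [ "$age_days" -le "$RETENTION_DAYS" ]
}

# Sample dates chosen for illustration only.
if is_retained "2024-03-01" "2024-03-20"; then
  echo "recovery point still available"
else
  echo "recovery point expired"
fi
```

A check like this can help decide whether a restore request still falls inside the retention window before opening the vault.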
Replication Flow:
- VM runs in primary region (East US)
- Azure Site Recovery continuously replicates disks
- Replica VMs maintained in secondary region (West US)
- Replication syncs every 5-15 minutes
- On failover, replica VM is activated
- Traffic redirects to recovery region
Recovery Flow:
- Detect outage/failure in primary region
- Initiate failover (planned or unplanned)
- Replica VM starts in secondary region
- Applications launch and connect to users
- RTO: 10-15 minutes, RPO: 5-15 minutes
- When the primary region recovers, fail back to the original region
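
To make the RTO and RPO figures concrete, the sketch below computes both from three timestamps and compares them with the upper bounds of the stated goals. It is illustrative only: the timestamps are made-up sample values, the helper name `minutes_between` is hypothetical, and GNU `date` is assumed. RPO is taken as the gap between the last replicated recovery point and the outage, and RTO as the gap between the outage and service being restored.

```bash
#!/usr/bin/env bash
# Hypothetical drill-report helper; all timestamps are made-up sample values.
# Assumes GNU date for timestamp parsing.

RTO_TARGET_MIN=15   # upper bound of the 10-15 minute RTO goal
RPO_TARGET_MIN=15   # upper bound of the 5-15 minute RPO goal

# minutes_between <earlier> <later>: whole minutes between two UTC timestamps.
minutes_between() {
  local start end
  start=$(date -u -d "$1" +%s) || return 2
  end=$(date -u -d "$2" +%s) || return 2
  echo $(( (end - start) / 60 ))
}

last_recovery_point="2024-03-01 09:52:00 UTC"  # newest replicated point before the outage
outage_start="2024-03-01 10:00:00 UTC"         # primary region fails
service_restored="2024-03-01 10:12:00 UTC"     # replica VM is serving users

rpo=$(minutes_between "$last_recovery_point" "$outage_start")
rto=$(minutes_between "$outage_start" "$service_restored")

echo "Achieved RPO: ${rpo} min (target <= ${RPO_TARGET_MIN})"
echo "Achieved RTO: ${rto} min (target <= ${RTO_TARGET_MIN})"

if [ "$rpo" -le "$RPO_TARGET_MIN" ] && [ "$rto" -le "$RTO_TARGET_MIN" ]; then
  echo "Objectives met"
else
  echo "Objectives missed"
fi
```

Recording these two numbers after every test failover gives a simple trend to compare against the goals.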
| Component | Role | Purpose |
|---|---|---|
| Azure Backup | Point-in-time snapshots | Protect against data loss, ransomware |
| Azure Site Recovery | Continuous replication | Failover to secondary region |
| Recovery Services Vault | Centralized management | Store backup/replication config |
| Recovery Point | Point-in-time copy | Restore state from specific time |
| Replica VM | Secondary copy | Standby VM in recovery region |
| RTO | Recovery Time Objective | Target time to restore service (goal: 10-15 min) |
| RPO | Recovery Point Objective | Maximum acceptable data loss (goal: 5-15 min) |

    cd cloudshell/
    bash deploy.sh

Creates:
- Resource Group
- Virtual Network (10.0.0.0/16)
- Subnet (10.0.1.0/24)
- Windows Server 2019 VM
- Public IP & Network Interface
- Network Security Group (RDP/SSH enabled)
- Recovery Services Vault
- Daily Backup Policy
- Log Analytics workspace (optional)
- Key Vault for encryption
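
The contents of `deploy.sh` are not shown here, so the following is only a sketch of how a script of this kind might wrap the Azure CLI to deploy a Bicep template. The resource group name, template path, and the `DRY_RUN` switch are all placeholders introduced for illustration, not values from this project. By default it prints the commands instead of running them.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; the real deploy.sh in this project may differ.
# Resource group and template names are placeholders, not taken from the repo.

RESOURCE_GROUP="${RESOURCE_GROUP:-rg-dr-demo}"   # placeholder name
LOCATION="${LOCATION:-eastus}"                   # primary region (East US)
TEMPLATE_FILE="${TEMPLATE_FILE:-main.bicep}"     # placeholder template path
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands only

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# deploy: create the resource group, then deploy the Bicep template into it.
deploy() {
  run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
  run az deployment group create \
    --resource-group "$RESOURCE_GROUP" \
    --template-file "$TEMPLATE_FILE"
}

deploy
```

Setting `DRY_RUN=0` would execute the commands against the logged-in subscription; keeping the dry run as the default makes it harder to create billable resources by accident.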
Follow portal/backupconfig.md:
- Understand backup jobs
- View recovery points
- Trigger on-demand backup
- Monitor backup health
- Configure retention policies
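
The same portal tasks can also be approximated from the command line. The sketch below wraps the relevant `az backup` commands in small functions. The vault, resource group, and VM names are placeholders, the function names are hypothetical, and GNU `date` is assumed for building the `--retain-until` value in DD-MM-YYYY form. Only the date helper runs when the script is loaded; the `az` calls need a logged-in Azure CLI.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; vault, resource group, and VM names are placeholders.
# The az functions are defined but not called here.

RESOURCE_GROUP="rg-dr-demo"     # placeholder
VAULT_NAME="rsv-dr-demo"        # placeholder
VM_NAME="vm-prod-01"            # placeholder

# retain_until <days>: date <days> from now, formatted DD-MM-YYYY (GNU date assumed).
retain_until() {
  date -u -d "+$1 days" +%d-%m-%Y
}

# View recovery points for the protected VM.
list_recovery_points() {
  az backup recoverypoint list \
    --resource-group "$RESOURCE_GROUP" --vault-name "$VAULT_NAME" \
    --container-name "$VM_NAME" --item-name "$VM_NAME" \
    --backup-management-type AzureIaasVM --output table
}

# Trigger an on-demand backup that is kept for 30 days.
backup_now() {
  az backup protection backup-now \
    --resource-group "$RESOURCE_GROUP" --vault-name "$VAULT_NAME" \
    --container-name "$VM_NAME" --item-name "$VM_NAME" \
    --backup-management-type AzureIaasVM \
    --retain-until "$(retain_until 30)"
}

# Monitor backup health by listing recent backup jobs.
list_backup_jobs() {
  az backup job list \
    --resource-group "$RESOURCE_GROUP" --vault-name "$VAULT_NAME" --output table
}

echo "On-demand backups would be retained until $(retain_until 30)"
```

After replacing the placeholder names, calling `backup_now` followed by `list_backup_jobs` mirrors the trigger-and-monitor steps from the portal guide.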
Follow portal/siterecoveryconfig.md:
- Enable Site Recovery on vault
- Select VM for protection
- Configure secondary region
- Set up replication policy
- Enable continuous replication

    bash cloudshell/failovertest.sh

Performs:
- Creates test failover VM
- Launches in recovery region
- Validates connectivity
- Tests application functionality
- Cleans up test VM
| Technology | Purpose | Version |
|---|---|---|
| Bicep | Infrastructure as Code | Latest |
| Azure CLI | Deployment & management | 2.40+ |
| Azure Backup | Snapshots & point-in-time recovery | Current |
| Azure Site Recovery | Continuous replication & failover | Current |
| Recovery Services Vault | Centralized management | Current |
| Azure Key Vault | Encryption key management | Current |
- Single File Loss:
  - Use Backup → File Recovery
  - No need to fail over the entire VM
- VM Corrupted:
  - Use Backup → Restore to new VM
  - Creates a clean copy from a snapshot
- Regional Outage:
  - Use Site Recovery → Failover
  - VM activates in the secondary region
- Ransomware Attack:
  - Use Backup → Restore from a clean recovery point
  - Or fail over to a pre-infection recovery point
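
The scenario list above amounts to a lookup from incident type to recovery path. As a rough illustration, a runbook script could encode it like this. The function name and scenario keywords are hypothetical and exist only for this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical lookup helper; scenario keywords are illustrative only.

# recovery_action <scenario>: print the recovery path suggested for a scenario.
recovery_action() {
  case "$1" in
    file-loss)        echo "Azure Backup: File Recovery" ;;
    vm-corrupted)     echo "Azure Backup: Restore to new VM" ;;
    regional-outage)  echo "Site Recovery: Failover to secondary region" ;;
    ransomware)       echo "Azure Backup: Restore from a clean recovery point" ;;
    *)                echo "Unknown scenario: $1" >&2; return 1 ;;
  esac
}

# Print the full mapping as a quick reference.
for scenario in file-loss vm-corrupted regional-outage ransomware; do
  printf '%-16s -> %s\n' "$scenario" "$(recovery_action "$scenario")"
done
```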