
Agent Skills for Claude Code | Cloud Architect

Domain: Infrastructure & Cloud
Role: architect
Scope: infrastructure
Output: architecture

Triggers: AWS, Azure, GCP, Google Cloud, cloud migration, cloud architecture, multi-cloud, cloud cost, Well-Architected, landing zone, cloud security, disaster recovery, cloud native, serverless architecture

Related Skills: DevOps Engineer · Kubernetes Specialist · Terraform Engineer · Security Reviewer · Microservices Architect · Monitoring Expert

  1. Discovery — Assess current state, requirements, constraints, compliance needs
  2. Design — Select services, design topology, plan data architecture
  3. Security — Implement zero-trust, identity federation, encryption
  4. Cost Model — Right-size resources, reserved capacity, auto-scaling
  5. Migration — Apply 6Rs framework, define waves, validate connectivity before cutover
  6. Operate — Set up monitoring, automation, continuous optimization

After Design: Confirm every component has a redundancy strategy and no single points of failure exist in the topology.
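
One way to make this check mechanical is to keep a simple inventory and flag anything that runs as a single instance. The sketch below assumes a hypothetical plain-text inventory with one `component instance_count` pair per line; neither that format nor the `find_single_points` name comes from any provider tooling. Instance count is only a rough proxy for redundancy, since it says nothing about Availability Zone spread.

```shell
# find_single_points: reads "component instance_count" lines from stdin and
# prints the name of every component that runs fewer than two instances.
# The inventory format is an assumption made for this sketch.
find_single_points() {
  awk 'NF >= 2 && $2 < 2 { print $1 }'
}

# Example usage:
#   printf '%s\n' 'alb 2' 'app 3' 'db 1' | find_single_points
```

Any name it prints is a candidate single point of failure to revisit before moving on to the Security step.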

Before Migration cutover: Validate that VPC peering (or equivalent connectivity) is fully established:

# AWS: list peering connections whose status is "active"
# (an empty result means no connection is active yet, so do not proceed)
aws ec2 describe-vpc-peering-connections \
  --filters "Name=status-code,Values=active"
# Azure: confirm VNet peering state (expect "Connected")
az network vnet peering list \
  --resource-group myRG --vnet-name myVNet \
  --query "[].{Name:name,State:peeringState}"

After Migration: Verify application health and routing:

# AWS: check the health of targets registered behind the ALB
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:...
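
The health check above can be turned into a gate that blocks until every target reports healthy. This is a minimal sketch, not a definitive implementation: `unhealthy_targets` and `wait_until_healthy` are hypothetical helper names, and the attempt count and delay are arbitrary defaults. A target group with no registered targets also returns an empty list and would pass this check, so confirm registration separately.

```shell
# unhealthy_targets TARGET_GROUP_ARN
# Prints the IDs of targets whose state is anything other than "healthy".
unhealthy_targets() {
  aws elbv2 describe-target-health \
    --target-group-arn "$1" \
    --query "TargetHealthDescriptions[?TargetHealth.State!='healthy'].Target.Id" \
    --output text
}

# wait_until_healthy TARGET_GROUP_ARN [ATTEMPTS] [DELAY_SECONDS]
# Exit status: 0 = all healthy, 1 = still unhealthy, 2 = CLI call failed.
# The body runs in a subshell, so "exit" only ends this function.
wait_until_healthy() (
  arn=$1 attempts=${2:-10} delay=${3:-15}
  i=1
  while [ "$i" -le "$attempts" ]; do
    bad=$(unhealthy_targets "$arn") || exit 2
    if [ -z "$bad" ] || [ "$bad" = "None" ]; then
      echo "all targets healthy"
      exit 0
    fi
    echo "attempt $i/$attempts: unhealthy targets: $bad"
    i=$((i + 1))
    sleep "$delay"
  done
  exit 1
)

# Example usage (ARN held in a variable you set yourself):
#   wait_until_healthy "$TARGET_GROUP_ARN" 20 15 || echo "do not shift traffic yet"
```

The AWS CLI also ships a built-in waiter, `aws elbv2 wait target-in-service`, which may be enough when custom reporting is not needed.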

After DR test: Confirm RTO/RPO targets were met; document actual recovery times.
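
Comparing measured recovery times against the targets is simple arithmetic, and scripting it keeps the DR report consistent between tests. The sketch below assumes all four values are whole seconds supplied by whoever ran the test; `meets_target` and `report_dr_test` are hypothetical names.

```shell
# meets_target ACTUAL_SECONDS TARGET_SECONDS
# Succeeds when the measured value is within the target.
meets_target() {
  [ "$1" -le "$2" ]
}

# report_dr_test ACTUAL_RTO TARGET_RTO ACTUAL_RPO TARGET_RPO
# Prints one line per objective and fails if either objective was missed.
# The body runs in a subshell, so "exit" only ends this function.
report_dr_test() (
  actual_rto=$1 target_rto=$2 actual_rpo=$3 target_rpo=$4
  status=0
  if meets_target "$actual_rto" "$target_rto"; then
    echo "RTO met: ${actual_rto}s <= ${target_rto}s"
  else
    echo "RTO missed: ${actual_rto}s > ${target_rto}s"
    status=1
  fi
  if meets_target "$actual_rpo" "$target_rpo"; then
    echo "RPO met: ${actual_rpo}s <= ${target_rpo}s"
  else
    echo "RPO missed: ${actual_rpo}s > ${target_rpo}s"
    status=1
  fi
  exit "$status"
)

# Example usage: recovery took 30 minutes against a 1-hour RTO,
# and 10 minutes of data were lost against a 5-minute RPO.
#   report_dr_test 1800 3600 600 300
```

Redirecting the output into the DR test record covers the "document actual recovery times" part of this checkpoint.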

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| AWS Services | references/aws.md | EC2, S3, Lambda, RDS, Well-Architected Framework |
| Azure Services | references/azure.md | VMs, Storage, Functions, SQL, Cloud Adoption Framework |
| GCP Services | references/gcp.md | Compute Engine, Cloud Storage, Cloud Functions, BigQuery |
| Multi-Cloud | references/multi-cloud.md | Abstraction layers, portability, vendor lock-in mitigation |
| Cost Optimization | references/cost.md | Reserved instances, spot, right-sizing, FinOps practices |

Must do:

  • Design for high availability (99.9%+)
  • Implement security by design (zero-trust)
  • Use infrastructure as code (Terraform, CloudFormation)
  • Enable cost allocation tags and monitoring
  • Plan disaster recovery with defined RTO/RPO
  • Implement multi-region for critical workloads
  • Use managed services when possible
  • Document architectural decisions

Must not do:

  • Store credentials in code or public repos
  • Skip encryption (at rest and in transit)
  • Create single points of failure
  • Ignore cost optimization opportunities
  • Deploy without proper monitoring
  • Use overly complex architectures
  • Ignore compliance requirements
  • Skip disaster recovery testing
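
The rule against credentials in code can be partly enforced with a quick scan before each commit. The sketch below looks only for strings shaped like AWS access key IDs (`AKIA` or `ASIA` followed by 16 uppercase letters or digits); `scan_for_aws_keys` is a hypothetical helper, and a clean result does not prove that no other kind of secret is present.

```shell
# scan_for_aws_keys DIR
# Lists files under DIR containing text shaped like an AWS access key ID.
# Only file names are printed, so the key itself is not echoed to the terminal.
# Exit status follows grep: 0 = at least one match, 1 = no match.
scan_for_aws_keys() {
  grep -rlE '(AKIA|ASIA)[0-9A-Z]{16}' "$1"
}

# Example usage:
#   if scan_for_aws_keys ./src; then echo "possible credentials found"; fi
```

A dedicated secret scanner covers far more patterns; this only illustrates the idea.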

Rather than broad policies, scope permissions to specific resources and actions:

# AWS: create a scoped role for an application
aws iam create-role \
  --role-name AppRole \
  --assume-role-policy-document file://trust-policy.json
aws iam put-role-policy \
  --role-name AppRole \
  --policy-name AppInlinePolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-app-bucket/*"
    }]
  }'
# Terraform equivalent
resource "aws_iam_role" "app_role" {
  name               = "AppRole"
  assume_role_policy = data.aws_iam_policy_document.trust.json
}
resource "aws_iam_role_policy" "app_policy" {
  role = aws_iam_role.app_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject"]
      Resource = "${aws_s3_bucket.app.arn}/*"
    }]
  })
}
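
To catch the broad policies this section warns against, a policy document can be screened for bare wildcards before it is attached. The sketch below is a text-level heuristic under stated assumptions: `has_wildcard` is a hypothetical helper, it only detects `"*"` on the same line as `"Action"` or `"Resource"`, and it does not flag service-wide patterns such as `s3:*`.

```shell
# has_wildcard POLICY_FILE
# Succeeds when the file grants a bare "*" as an Action or Resource,
# written either as a string ("Action": "*") or as the first array
# element on the same line ("Action": ["*", ...]).
has_wildcard() {
  grep -Eq '"(Action|Resource)"[[:space:]]*:[[:space:]]*\[?[[:space:]]*"\*"' "$1"
}

# Example usage:
#   if has_wildcard policy.json; then echo "policy is too broad"; fi
```

The scoped policy shown above passes this check, because its `Resource` is a bucket ARN ending in `/*` rather than a bare `*`.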

VPC with Public/Private Subnets (Terraform)

# Availability Zones referenced by the subnets below
data "aws_availability_zones" "available" {
  state = "available"
}
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags                 = { Name = "main", CostCenter = var.cost_center }
}
# Private subnets: one per Availability Zone
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet("10.0.0.0/16", 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]
}
# Public subnets: offset by 10 so they do not overlap the private ranges
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet("10.0.0.0/16", 8, count.index + 10)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
}
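
To see which address ranges the `cidrsubnet` calls above produce, the same arithmetic can be reproduced outside Terraform. The function below is a hypothetical shell re-implementation for IPv4 only, written for illustration; it assumes the base prefix is already an aligned network address and does no input validation.

```shell
# cidrsubnet PREFIX NEWBITS NETNUM   (IPv4 only, illustrative)
# Extends PREFIX by NEWBITS bits and returns subnet number NETNUM,
# mirroring the arguments of Terraform's cidrsubnet().
cidrsubnet() (
  prefix=$1 newbits=$2 netnum=$3
  addr=${prefix%/*}
  len=${prefix#*/}
  newlen=$((len + newbits))
  # Split the dotted address into its four octets.
  old_ifs=$IFS
  IFS=.
  set -- $addr
  IFS=$old_ifs
  base=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  net=$(( base + (netnum << (32 - newlen)) ))
  printf '%d.%d.%d.%d/%d\n' \
    $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
    $(( (net >> 8) & 255 )) $(( net & 255 )) "$newlen"
)

# Example usage: the two private and two public subnets from the config
#   for n in 0 1 10 11; do cidrsubnet 10.0.0.0/16 8 "$n"; done
```

With the values used in the configuration, the private subnets come out as 10.0.0.0/24 and 10.0.1.0/24, and the public subnets as 10.0.10.0/24 and 10.0.11.0/24.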

Auto Scaling Group with Target Tracking (Terraform)

resource "aws_autoscaling_group" "app" {
  desired_capacity    = 2
  min_size            = 1
  max_size            = 10
  vpc_zone_identifier = aws_subnet.private[*].id
  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
  tag {
    key                 = "CostCenter"
    value               = var.cost_center
    propagate_at_launch = true
  }
}
resource "aws_autoscaling_policy" "cpu_target" {
  autoscaling_group_name = aws_autoscaling_group.app.name
  policy_type            = "TargetTrackingScaling"
  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}
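
For intuition about how a 60% CPU target interacts with `min_size` and `max_size`, the sketch below applies a simple proportional rule: scale the current capacity by the ratio of observed metric to target, round up, and clamp to the group limits. This rule is an assumption made for illustration, not AWS's published algorithm, and `estimate_capacity` is a hypothetical helper that takes whole-number percentages.

```shell
# estimate_capacity CURRENT METRIC TARGET MIN MAX
# Proportional estimate: ceil(CURRENT * METRIC / TARGET), clamped to MIN..MAX.
# Illustrative only; real target tracking also applies cooldowns and warm-up.
estimate_capacity() (
  current=$1 metric=$2 target=$3 min=$4 max=$5
  # Integer ceiling division.
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

# Example usage: 2 instances at 90% average CPU against a 60% target,
# with the group limits from the configuration above (min 1, max 10)
#   estimate_capacity 2 90 60 1 10
```

Under this rule the example asks for 3 instances, and no load level can push the group beyond its `max_size` of 10.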

Cost Analysis (CLI)

# AWS: identify top cost drivers for the last 30 days
# Note: `date -d` is GNU date; on macOS/BSD use `date -v-30d +%Y-%m-%d` instead.
# MONTHLY granularity splits the range at calendar-month boundaries, and
# ResultsByTime[0] shows only the first of the resulting periods.
aws ce get-cost-and-usage \
  --time-period "Start=$(date -d '30 days ago' +%Y-%m-%d),End=$(date +%Y-%m-%d)" \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --group-by Type=DIMENSION,Key=SERVICE \
  --query 'ResultsByTime[0].Groups[*].{Service:Keys[0],Cost:Metrics.UnblendedCost.Amount}' \
  --output table
# Azure: review spend by resource group
az consumption usage list \
  --start-date "$(date -d '30 days ago' +%Y-%m-%d)" \
  --end-date "$(date +%Y-%m-%d)" \
  --query "[].{ResourceGroup:resourceGroup,Cost:pretaxCost,Currency:currency}" \
  --output table

When designing cloud architecture, provide:

  1. Architecture diagram with services and data flow
  2. Service selection rationale (compute, storage, database, networking)
  3. Security architecture (IAM, network segmentation, encryption)
  4. Cost estimation and optimization strategy
  5. Deployment approach and rollback plan