Building a Multi-Account AWS Infrastructure with CDK

Tags: aws, cdk, infrastructure, typescript

Why Multi-Account

Running everything in a single AWS account works until it doesn't. Separate accounts give you hard isolation between environments, blast radius containment when things go wrong, and clean cost tracking per workload. AWS Organizations makes this manageable.

Organization Structure

The organization has all features enabled (FeatureSet ALL). This unlocks Service Control Policies (SCPs) and trusted access for AWS services, on top of the consolidated billing that every organization gets.

The OU hierarchy is straightforward:

  • Root (management account)
    • portfolio-dev — dev/staging workloads
    • portfolio-prod — production workloads

Each member account is provisioned through a lightweight AccountFactory construct that wraps CfnAccount with sensible defaults for the cross-account access role.
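The AccountFactory source is not shown here, so the following is only a sketch of the idea: take a small account spec and fill in defaults to produce the properties of an AWS::Organizations::Account resource, which is what CfnAccount synthesizes to. The AccountSpec shape, the toAccountResource helper, the email address, and the OU ID are all hypothetical. The default role name is the one AWS Organizations uses when none is given.

```typescript
// Hypothetical input shape; the real AccountFactory props are not shown in the post.
interface AccountSpec {
  name: string;
  email: string;
  parentOuId: string;
  roleName?: string;
}

// Shape of an AWS::Organizations::Account resource in a CloudFormation template.
interface OrganizationsAccountResource {
  Type: "AWS::Organizations::Account";
  Properties: {
    AccountName: string;
    Email: string;
    ParentIds: string[];
    RoleName: string;
  };
}

// Name AWS Organizations gives the cross-account access role by default.
const DEFAULT_ACCESS_ROLE = "OrganizationAccountAccessRole";

// Fill in defaults and map the spec onto CloudFormation property names.
function toAccountResource(spec: AccountSpec): OrganizationsAccountResource {
  return {
    Type: "AWS::Organizations::Account",
    Properties: {
      AccountName: spec.name,
      Email: spec.email,
      ParentIds: [spec.parentOuId],
      RoleName: spec.roleName !== undefined ? spec.roleName : DEFAULT_ACCESS_ROLE,
    },
  };
}

const devAccount = toAccountResource({
  name: "portfolio-dev",
  email: "aws+portfolio-dev@example.com", // placeholder address
  parentOuId: "ou-examp-00000000", // placeholder OU ID
});
console.log(JSON.stringify(devAccount, null, 2));
```

Keeping the defaults in one place means every member account ends up with the same access role name unless a spec overrides it.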

IAM Identity Center (SSO)

Access to all accounts is managed through IAM Identity Center. A single SSO user gets permission set assignments across accounts — AdministratorAccess and PowerUserAccess for the child accounts, plus ReadOnlyAccess for the management account.
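One way to picture the assignments is as a flat list of (account, permission set) pairs for the user. The sketch below is illustrative only: the account IDs, the user name, and the expandAssignments helper are made up, and the real stacks may model this differently.

```typescript
// Hypothetical description of which permission sets apply to which account.
interface AccountAccess {
  accountId: string;
  permissionSets: string[];
}

interface Assignment {
  accountId: string;
  permissionSet: string;
  userName: string;
}

// Expand the per-account lists into one assignment per (account, permission set) pair.
function expandAssignments(userName: string, access: AccountAccess[]): Assignment[] {
  const result: Assignment[] = [];
  for (const entry of access) {
    for (const permissionSet of entry.permissionSets) {
      result.push({ accountId: entry.accountId, permissionSet: permissionSet, userName: userName });
    }
  }
  return result;
}

// Placeholder account IDs: management, dev, prod.
const assignments = expandAssignments("sso-user", [
  { accountId: "111111111111", permissionSets: ["ReadOnlyAccess"] },
  { accountId: "222222222222", permissionSets: ["AdministratorAccess", "PowerUserAccess"] },
  { accountId: "333333333333", permissionSets: ["AdministratorAccess", "PowerUserAccess"] },
]);
console.log(assignments.length); // 5
```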

The bootstrap credential (AdministratorAccess on the management account) is manually managed to avoid a chicken-and-egg problem — you need admin access to deploy the CDK stacks that would create that same access. Everything else is CDK-managed.

Governance with SCPs

Two SCPs are attached to both OUs. The first restricts all API calls to allowed regions, with a carve-out for AWS service-linked roles that need global access:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyOutsideAllowedRegions",
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {
      "StringNotEquals": {
        "aws:RequestedRegion": ["eu-central-1", "us-east-1"]
      },
      "ArnNotLike": {
        "aws:PrincipalArn": ["arn:aws:iam::*:role/aws-service-role/*"]
      }
    }
  }]
}
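The two condition entries in this policy are combined with a logical AND: the deny applies only when the request targets a region outside the allowed list and the caller is not a service-linked role. The sketch below models that logic in plain TypeScript. It is a simplification for illustration, not how AWS evaluates policies; the RequestContext shape and the regular expression standing in for the ARN wildcard are assumptions.

```typescript
// Simplified model of the request context the SCP condition inspects.
interface RequestContext {
  requestedRegion: string;
  principalArn: string;
}

const ALLOWED_REGIONS = ["eu-central-1", "us-east-1"];

// Rough stand-in for the wildcard pattern arn:aws:iam::*:role/aws-service-role/*
const SERVICE_LINKED_ROLE = /^arn:aws:iam::.*:role\/aws-service-role\/.*$/;

// The Deny statement applies only when BOTH conditions hold.
function isDeniedByRegionScp(ctx: RequestContext): boolean {
  const outsideAllowedRegions = ALLOWED_REGIONS.indexOf(ctx.requestedRegion) === -1;
  const isServiceLinkedRole = SERVICE_LINKED_ROLE.test(ctx.principalArn);
  return outsideAllowedRegions && !isServiceLinkedRole;
}

const adminArn = "arn:aws:iam::123456789012:role/Admin";
const serviceArn =
  "arn:aws:iam::123456789012:role/aws-service-role/organizations.amazonaws.com/AWSServiceRoleForOrganizations";

console.log(isDeniedByRegionScp({ requestedRegion: "eu-west-1", principalArn: adminArn })); // true
console.log(isDeniedByRegionScp({ requestedRegion: "eu-west-1", principalArn: serviceArn })); // false
console.log(isDeniedByRegionScp({ requestedRegion: "eu-central-1", principalArn: adminArn })); // false
```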

The second denies all actions by the root user in member accounts:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyRootActions",
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {
      "ArnLike": { "aws:PrincipalArn": "arn:aws:iam::*:root" }
    }
  }]
}

Delegated DNS

The management account owns only the root hosted zone for the domain. Each child account creates its own subdomain zone and registers NS delegation records in the root zone via a cross-account IAM role.

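The delegation role grants route53:ChangeResourceRecordSets, route53:ListResourceRecordSets, and route53:GetHostedZone, scoped to the root hosted zone ARN. Resource scoping alone confines a child account to that one zone; restricting it to NS records for its own subdomain additionally requires the route53:ChangeResourceRecordSetsRecordTypes and route53:ChangeResourceRecordSetsNormalizedRecordNames condition keys on the change permission.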
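As a sketch of what a tightly scoped delegation policy could look like, the helper below builds IAM statements that allow record changes only for NS records with one specific name. Whether the actual role uses these Route 53 condition keys is not stated here, so treat this as one possible hardening. The delegationStatements helper and the zone ID are hypothetical.

```typescript
interface PolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string;
  Condition?: Record<string, Record<string, string[]>>;
}

// Build statements for a delegation role limited to NS records for one subdomain.
function delegationStatements(rootZoneId: string, subdomain: string): PolicyStatement[] {
  const zoneArn = `arn:aws:route53:::hostedzone/${rootZoneId}`;
  // Route 53 compares against the normalized name: lowercase, no trailing dot.
  const normalized = subdomain.toLowerCase().replace(/\.$/, "");
  return [
    {
      Effect: "Allow",
      Action: ["route53:ChangeResourceRecordSets"],
      Resource: zoneArn,
      Condition: {
        "ForAllValues:StringEquals": {
          "route53:ChangeResourceRecordSetsRecordTypes": ["NS"],
          "route53:ChangeResourceRecordSetsNormalizedRecordNames": [normalized],
        },
      },
    },
    {
      Effect: "Allow",
      Action: ["route53:ListResourceRecordSets", "route53:GetHostedZone"],
      Resource: zoneArn,
    },
  ];
}

// "Z0000000EXAMPLE" is a placeholder hosted zone ID.
const statements = delegationStatements("Z0000000EXAMPLE", "Portfolio.dawalnut.com.");
console.log(JSON.stringify(statements, null, 2));
```

With the conditions in place, a request that tries to change an A record, or an NS record for a different name, falls outside the allow statement and is rejected.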

CDK's CrossAccountZoneDelegationRecord construct handles the delegation from the child account side, creating the NS records in the root zone automatically.

Certificates

ACM certificates for CloudFront must be in us-east-1, but the portfolio stack deploys to eu-central-1. CDK's crossRegionReferences feature bridges the two regions, so no hand-written custom resources are needed.

Root Domain Certificate

The management account creates a RootCertificateStack in us-east-1 with a certificate for the root domain (dawalnut.com) and a wildcard SAN (*.dawalnut.com). This covers any future services hosted directly under the root domain. The certificate validates against the root hosted zone in the same account.

Child Account Certificates

Each child account gets two stacks per stage:

  1. CertificateStack (us-east-1) — creates the subdomain zone, NS delegation, and ACM certificate with CertificateValidation.fromDns(zone)
  2. PortfolioStack (eu-central-1) — references the certificate and zone ID cross-region

All certificates include a wildcard SAN (e.g. *.portfolio.dawalnut.com) alongside the apex domain, covering future subdomains without needing new certificates.
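A wildcard SAN matches exactly one additional label, so *.portfolio.dawalnut.com covers api.portfolio.dawalnut.com but not a.b.portfolio.dawalnut.com. The sketch below derives the name pair for a certificate and checks coverage under that rule. Both helper functions are hypothetical and exist only to illustrate the naming scheme.

```typescript
interface CertificateNames {
  domainName: string;
  subjectAlternativeNames: string[];
}

// Apex domain plus a single wildcard SAN, as described above.
function certificateNamesFor(apex: string): CertificateNames {
  return { domainName: apex, subjectAlternativeNames: [`*.${apex}`] };
}

// Check whether a host name is covered; a wildcard matches exactly one label.
function isCoveredBy(host: string, names: CertificateNames): boolean {
  if (host === names.domainName) return true;
  return names.subjectAlternativeNames.some((san) => {
    if (san.indexOf("*.") !== 0) return san === host;
    const suffix = san.slice(1); // e.g. ".portfolio.dawalnut.com"
    if (host.length <= suffix.length) return false;
    if (host.slice(-suffix.length) !== suffix) return false;
    const label = host.slice(0, host.length - suffix.length);
    return label.indexOf(".") === -1;
  });
}

const names = certificateNamesFor("portfolio.dawalnut.com");
console.log(isCoveredBy("portfolio.dawalnut.com", names)); // true
console.log(isCoveredBy("api.portfolio.dawalnut.com", names)); // true
console.log(isCoveredBy("a.b.portfolio.dawalnut.com", names)); // false
```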

CDK automatically creates SSM parameters to pass the certificate ARN and zone ID between regions. CloudFormation handles DNS validation natively, and ACM keeps a DNS-validated request pending for up to 72 hours before it times out. That is far more forgiving than a custom-resource Lambda, whose execution time is capped at 15 minutes.

One subtlety: the ACM certificate must have an explicit dependency on the NS delegation record. CloudFormation creates resources in parallel by default, and certificate validation queries public DNS. If the delegation is not in place yet, validation can stall until the request times out. Calling certificate.node.addDependency(delegation) ensures the right ordering.

Serverless Hosting

The SsrSite construct bundles the TanStack Start output into a serverless deployment:

  • CloudFront sits in front with two cache behaviors
  • /_assets/* routes to an S3 bucket via OAC with optimized caching
  • /* (everything else) routes to API Gateway → Lambda for server-side rendering with caching disabled
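The routing split above can be modeled as a small path matcher. This is a simplified sketch of how the two behaviors divide traffic, not CloudFront's actual matching engine, and the type names and origin labels are made up for illustration.

```typescript
type Origin = "s3-assets" | "ssr-lambda";

interface Behavior {
  pathPattern: string;
  origin: Origin;
  cached: boolean;
}

// Specific behaviors are checked first; the default catches everything else.
const assetBehaviors: Behavior[] = [
  { pathPattern: "/_assets/*", origin: "s3-assets", cached: true },
];
const defaultBehavior: Behavior = { pathPattern: "/*", origin: "ssr-lambda", cached: false };

// Minimal matcher: supports exact paths and a single trailing "*".
function matchesPattern(pattern: string, path: string): boolean {
  if (pattern.charAt(pattern.length - 1) === "*") {
    const prefix = pattern.slice(0, -1);
    return path.indexOf(prefix) === 0;
  }
  return pattern === path;
}

function selectBehavior(path: string): Behavior {
  for (const behavior of assetBehaviors) {
    if (matchesPattern(behavior.pathPattern, path)) return behavior;
  }
  return defaultBehavior;
}

console.log(selectBehavior("/_assets/app.js").origin); // "s3-assets"
console.log(selectBehavior("/about").origin); // "ssr-lambda"
```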

The Lambda runs Node.js 22.x with the TanStack Start server handler. Static assets are deployed separately to S3 with a _assets/ prefix, so CloudFront can serve them directly without hitting Lambda.

When a custom domain is configured, the construct creates Route53 alias records (A + AAAA) in the local subdomain zone.

Config Validation

All CDK context values pass through Zod schemas before any stack is created. No type assertions with "as", no runtime surprises:

const managementConfigSchema = z.object({
  accountId: z.string().min(1),
  region: z.string().min(1),
  organizationRootId: z.string().min(1),
  allowedRegions: z.array(z.string().min(1)).min(1),
  billingAlertEmail: z.string().email(),
  billingThresholds: z.array(z.number().positive()),
  accounts: z.array(accountSchema).default([]),
  domain: z.string().min(1).optional(),
  subdomains: z.array(subdomainSchema).default([]),
  sso: ssoSchema.optional(),
});

If someone passes an invalid email or forgets a required field, cdk synth fails immediately with a clear validation error instead of deploying broken infrastructure.

Billing Alerts

CloudWatch alarms monitor AWS/Billing estimated charges at configurable USD thresholds. Each threshold gets its own alarm backed by an SNS topic with email notifications. These always deploy to us-east-1 since billing metrics only exist in that region.

Simple, but it catches runaway costs before they become a problem.
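The one-alarm-per-threshold idea can be sketched as a mapping from the configured thresholds to alarm definitions. The BillingAlarmSpec shape, the alarm naming scheme, and the sort-and-deduplicate step are assumptions for illustration. The namespace, metric name, and Currency dimension are the ones CloudWatch uses for estimated charges.

```typescript
interface BillingAlarmSpec {
  alarmName: string;
  region: "us-east-1";
  namespace: "AWS/Billing";
  metricName: "EstimatedCharges";
  dimensions: { Currency: "USD" };
  threshold: number;
  comparisonOperator: "GreaterThanThreshold";
}

// One alarm definition per distinct threshold, in ascending order.
function billingAlarms(thresholds: number[]): BillingAlarmSpec[] {
  const sorted = thresholds.slice().sort((a, b) => a - b);
  const unique = sorted.filter((t, i) => i === 0 || t !== sorted[i - 1]);
  return unique.map((threshold): BillingAlarmSpec => ({
    alarmName: `billing-over-${threshold}-usd`, // hypothetical naming scheme
    region: "us-east-1", // billing metrics are only published here
    namespace: "AWS/Billing",
    metricName: "EstimatedCharges",
    dimensions: { Currency: "USD" },
    threshold: threshold,
    comparisonOperator: "GreaterThanThreshold",
  }));
}

const alarms = billingAlarms([100, 10, 50, 10]);
console.log(alarms.map((a) => a.alarmName));
// [ 'billing-over-10-usd', 'billing-over-50-usd', 'billing-over-100-usd' ]
```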