Amazon AWS

AWS Solutions Architect Associate (SAA-C03)

The most popular AWS certification — and one of the highest-paying in cloud. This course walks you through every domain on the SAA-C03 exam: resilient multi-tier architectures, high-performance compute and databases, security best practices, and cost optimization. Scenario-first, no fluff.

Amazon AWS · Intermediate · 8 modules · ~40 hours · Free
Secure Applications & Architectures 30%
Design Resilient Architectures 26%
High-Performing Architectures 24%
Cost-Optimized Architectures 20%
01 AWS Fundamentals & the Exam Blueprint ~3h

Before building architectures, you need a solid mental model of AWS's global infrastructure and the exam's grading logic. SAA-C03 is a judgment exam — most questions have two reasonable-sounding answers, and the right one depends on context clues like "cost-effective", "least operational overhead", or "minimum downtime".

AWS Global Infrastructure

  • Region: geographic cluster of 3+ AZs (e.g., us-east-1, eu-west-2). Choose based on: latency, data residency, service availability.
  • Availability Zone (AZ): one or more physically separate data centers in a region. AZs are connected by low-latency fiber. Your app's HA depends on spanning multiple AZs.
  • Edge Locations: CloudFront and Route 53 endpoints globally (400+). Used for content caching, DNS, and DDoS protection — not for running compute.
  • Local Zones: AWS infrastructure placed in a metro area for ultra-low latency (e.g., LA, Chicago). Extension of a region.

How to Read SAA-C03 Questions

  • Identify the constraint: "most cost-effective", "least operational overhead", "minimal downtime", "no code changes"
  • Eliminate answers that violate the constraint first
  • Between two valid options: prefer managed services over self-managed, prefer existing AWS features over custom Lambda code
  • Watch for "RTO" (Recovery Time Objective) and "RPO" (Recovery Point Objective) — they point to specific architecture patterns
Exam tip: "Least operational overhead" almost always points to fully managed services (Fargate over EC2, Aurora Serverless over RDS, API Gateway over ALB for Lambda). "Most cost-effective" points to auto-scaling, Spot Instances, Reserved Instances, or S3 lifecycle policies.

The Shared Responsibility Model

  • AWS manages: physical hardware, hypervisor, network infrastructure, managed service patches (RDS, Lambda runtime)
  • You manage: guest OS patching (on EC2), application code, data encryption, IAM permissions, security group rules, S3 bucket policies
02 Compute — EC2, Auto Scaling & Load Balancers ~6h

Compute decisions show up across every exam domain. The exam tests your ability to pick the right EC2 purchasing option, design Auto Scaling policies that respond correctly to load, and choose the right load balancer type for the use case.

EC2 Purchasing Options

  • On-Demand: pay per second/hour, no commitment. For unpredictable workloads or short-term spikes.
  • Reserved (1 or 3 year): 30-72% discount. For steady-state, predictable workloads. Standard vs Convertible (can change instance type).
  • Savings Plans: flexible commitment to $/hour compute spend. Covers EC2, Lambda, Fargate.
  • Spot: up to 90% discount, but AWS can reclaim with 2-minute notice. For fault-tolerant batch jobs, stateless processing, flexible HPC.
  • Dedicated Hosts: physical server for your exclusive use. For software licensing compliance (per-core or per-socket), regulatory requirements.
Exam tip: If a question mentions "fault-tolerant", "batch", or "can handle interruption" → Spot Instances. If it mentions "consistent workload" or "predictable" → Reserved or Savings Plans. If it mentions "licensing" or "compliance" → Dedicated Hosts.
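The keyword matching in the exam tip can be sketched as a small lookup. This is an illustrative study aid, not an AWS API: the function name and keyword lists are assumptions taken from the tip, and real questions need the full context read.

```python
import re

# Hypothetical study aid: keyword hints from the exam tip, checked in order.
KEYWORD_HINTS = [
    (("fault-tolerant", "batch", "can handle interruption"), "Spot Instances"),
    (("consistent workload", "predictable"), "Reserved Instances or Savings Plans"),
    (("licensing", "compliance"), "Dedicated Hosts"),
]


def _mentions(text: str, keyword: str) -> bool:
    # Whole-word match so "unpredictable" does not count as "predictable".
    return re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None


def suggest_purchasing_option(scenario: str) -> str:
    """Return the first purchasing option whose keywords appear in the scenario."""
    text = scenario.lower()
    for keywords, option in KEYWORD_HINTS:
        if any(_mentions(text, keyword) for keyword in keywords):
            return option
    return "On-Demand"  # no commitment or interruption hint found


print(suggest_purchasing_option("A nightly batch job that can handle interruption"))
```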

Auto Scaling Policies

  • Target Tracking: maintain a metric at a target value (e.g., CPU at 50%). Simplest, recommended default.
  • Step Scaling: add/remove instances in steps based on alarm severity. Faster response than simple scaling.
  • Scheduled Scaling: pre-scale before known traffic events (e.g., market open at 9am).
  • Predictive Scaling: ML-based forecast + proactive pre-scaling. Best for recurring patterns.
  • Warm Pool: pre-initialized stopped instances ready for near-instant launch. Critical for sub-minute spike response.
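As a rough sketch of the target tracking example (CPU at 50%), the function below builds the request parameters for a scaling policy without calling AWS. The group name and policy naming scheme are hypothetical, and the dictionary shape follows the Auto Scaling PutScalingPolicy request as commonly documented; check it against the current API reference before use.

```python
import json


def target_tracking_policy(asg_name: str, target_cpu_percent: float) -> dict:
    """Build keyword arguments for a CPU-based target tracking policy."""
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


policy = target_tracking_policy("web-asg", 50)
print(json.dumps(policy, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this would typically be passed as `boto3.client("autoscaling").put_scaling_policy(**policy)`.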

Elastic Load Balancers

  • ALB (Application LB): Layer 7 HTTP/HTTPS. Path-based routing, host-based routing, Lambda targets, redirects, WAF integration.
  • NLB (Network LB): Layer 4 TCP/UDP. Ultra-low latency, static IP, Elastic IP support. PrivateLink provider.
  • GWLB (Gateway LB): Layer 3 network appliances (IDS/IPS, firewalls). Not for app load balancing.
Architecture pattern: ALB is the default choice for web applications. Use NLB when you need a static IP, extreme performance, or to front a PrivateLink service. ALB supports sticky sessions (duration or application-based), cross-zone load balancing (enabled by default), and connection draining.

EC2 Instance Store vs EBS

  • Instance Store: physically attached NVMe, extremely fast, ephemeral (data lost on stop/terminate). For temp files, caches, scratch space.
  • EBS: network-attached block storage, persists independently of instance lifecycle. Types: gp3 (general purpose), io2 (high IOPS), st1 (throughput HDD), sc1 (cold HDD).
03 Storage — S3, EFS, Storage Gateway & Snowball ~5h

S3 is the most tested service in the entire SAA-C03 exam. You need to know every storage class, lifecycle transition rule, security feature, and architecture pattern cold. This module covers everything from basic CRUD to complex cross-account access patterns.

S3 Storage Classes (know these for cost optimization questions)

  • S3 Standard: 99.99% availability, 3+ AZ, instant retrieval. For frequently accessed data.
  • S3 Intelligent-Tiering: auto-moves objects between tiers based on access patterns. Monthly monitoring fee per object. Good for unknown access patterns.
  • S3 Standard-IA: lower storage cost, higher retrieval cost. Minimum billable object size 128KB, 30-day minimum storage duration. For infrequently accessed data needing rapid retrieval.
  • S3 One Zone-IA: same as Standard-IA but single AZ. Cheaper but loses data if AZ fails. For reproducible infrequent data.
  • S3 Glacier Instant Retrieval: lowest-cost archive with millisecond retrieval. Min 90 days.
  • S3 Glacier Flexible Retrieval: minutes to hours retrieval. 90-day minimum. For backups.
  • S3 Glacier Deep Archive: ~$0.001/GB-month, cheapest storage in AWS. 12-hour retrieval. 180-day minimum. For 7-10+ year compliance archives.
Exam tip: "Accessed once a year" or "long-term compliance archive" → S3 Glacier Deep Archive. "Rarely accessed but need immediate retrieval" → S3 Standard-IA. "Don't know access pattern" → Intelligent-Tiering.
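Storage classes are usually combined through a lifecycle rule that moves objects to cheaper classes as they age. The sketch below builds such a rule as a plain dictionary in the shape used by the S3 lifecycle configuration API. The prefix, rule ID, day counts, and 10-year expiration are illustrative assumptions chosen to respect the minimum storage durations listed above.

```python
def archive_lifecycle_rule(prefix: str) -> dict:
    """Build a lifecycle configuration that ages objects into archive classes."""
    return {
        "Rules": [
            {
                "ID": "age-out-" + prefix.strip("/"),  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},  # Flexible Retrieval
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "Expiration": {"Days": 3650},  # assumed 10-year retention
            }
        ]
    }


config = archive_lifecycle_rule("logs/")
for step in config["Rules"][0]["Transitions"]:
    print(step["Days"], step["StorageClass"])
```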

S3 Security

  • Bucket Policy: resource-based policy on the bucket. Controls cross-account access, public access, VPC endpoint access.
  • IAM Policy: identity-based, attached to users/roles. Controls what that identity can do on S3.
  • S3 Block Public Access: overrides bucket policies and ACLs that would grant public access. Enabled by default on new buckets; keep it on unless a bucket must be public.
  • Presigned URLs: time-limited access using the creator's credentials. For user uploads (PUT) or downloads (GET) without exposing credentials.
  • aws:PrincipalOrgID: restrict bucket access to principals within your AWS Organization only.
  • S3 Object Lock: WORM (Write Once Read Many). Governance mode: override with special permission. Compliance mode: NOBODY can delete, including root.
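To make the aws:PrincipalOrgID bullet concrete, here is a minimal sketch that renders a bucket policy allowing object reads only for principals inside one AWS Organization. The bucket name and organization ID are placeholders, and the single s3:GetObject action is an assumption chosen for brevity.

```python
import json


def org_only_bucket_policy(bucket: str, org_id: str) -> str:
    """Return a bucket policy (JSON) that allows reads from one organization."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromMyOrgOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only principals that belong to this organization match.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder bucket name and organization ID.
print(org_only_bucket_policy("example-shared-bucket", "o-exampleorgid"))
```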

S3 Encryption Options

  • SSE-S3: AWS manages keys. AES-256. Transparent.
  • SSE-KMS: Customer controls key via KMS. Full audit trail in CloudTrail. Can disable key to block access.
  • SSE-C: you provide the encryption key with every request. AWS doesn't store the key.
  • Client-side encryption: encrypt before sending to S3. You manage everything.

Data Migration Services

  • AWS DataSync: online transfer of NFS/SMB/HDFS data to S3, EFS, or FSx over the network. Automated scheduling, data validation.
  • AWS Snowball Edge: physical appliance for 10TB-100PB migrations. Use when internet bandwidth makes online transfer take more than 5-7 days.
  • AWS Snowmobile: 100PB per truck for exabyte-scale migrations. Extremely rare exam scenario.
Decision framework: Calculate the transfer time at your bandwidth. If it exceeds 5-7 days at 100% utilization, choose Snowball. For 50TB at 100Mbps = ~46 days → use Snowball.
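The arithmetic behind the decision framework can be checked with a few lines. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits/s) and uses a 7-day cutoff from the upper end of the 5-7 day range; the function names are illustrative.

```python
def transfer_days(data_tb: float, bandwidth_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move data_tb terabytes over a link of bandwidth_mbps."""
    bits = data_tb * 1e12 * 8  # decimal terabytes -> bits
    seconds = bits / (bandwidth_mbps * 1e6 * utilization)
    return seconds / 86_400


def recommend_transfer(data_tb: float, bandwidth_mbps: float, max_days: float = 7.0) -> str:
    """Pick Snowball when the online transfer would exceed max_days."""
    if transfer_days(data_tb, bandwidth_mbps) > max_days:
        return "Snowball"
    return "Online transfer (e.g., DataSync)"


print(f"{transfer_days(50, 100):.1f} days")  # 46.3 days
print(recommend_transfer(50, 100))           # Snowball
```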
04 Networking — VPC, Security Groups & Connectivity ~6h

VPC networking is heavily tested. You need to understand how traffic flows from the internet to a private EC2 instance, why private subnets reach the internet through a NAT Gateway rather than a direct route to the Internet Gateway, and when to use Transit Gateway vs VPC Peering.

VPC Core Components

  • Internet Gateway (IGW): enables two-way internet communication for resources with public IPs. Attached to a VPC, not a subnet.
  • Public Subnet: has a route 0.0.0.0/0 → IGW. Resources need a public or Elastic IP to communicate with the internet.
  • Private Subnet: no direct route to the internet. Resources access internet via NAT Gateway (outbound only).
  • NAT Gateway: in a public subnet, allows private subnet instances to initiate outbound internet connections. Blocks all inbound. Managed, scales automatically.
  • Route Table: controls where subnet traffic is forwarded. Each subnet has one route table.
Classic architecture: Web tier in public subnet (behind ALB) → App tier in private subnet → DB tier in private subnet. ALB accepts internet traffic; app tier reaches internet via NAT Gateway; DB has no internet route.

Security Groups vs NACLs

  • Security Groups: stateful (return traffic allowed automatically), applied to ENI/instance level, allow-only rules, support SG-to-SG references.
  • NACLs (Network ACL): stateless (must explicitly allow return traffic), applied at subnet level, support allow and deny rules, numbered rule order.
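The "numbered rule order" of a NACL means rules are evaluated from the lowest number upward and the first match wins, with an implicit deny at the end. The sketch below models that evaluation for inbound traffic only; the rule class and sample rules are hypothetical. Because NACLs are stateless, a real setup would need a matching outbound rule for the return traffic as well.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    cidr: str
    port_low: int
    port_high: int
    allow: bool


def nacl_allows(rules, source_ip: str, port: int) -> bool:
    """Evaluate rules in ascending number order; first match decides."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip in ipaddress.ip_network(rule.cidr)
        if in_cidr and rule.port_low <= port <= rule.port_high:
            return rule.allow
    return False  # implicit deny when nothing matches


rules = [
    NaclRule(100, "0.0.0.0/0", 443, 443, allow=True),
    NaclRule(90, "203.0.113.0/24", 0, 65535, allow=False),  # evaluated first
]
print(nacl_allows(rules, "203.0.113.7", 443))   # False: rule 90 denies
print(nacl_allows(rules, "198.51.100.5", 443))  # True: rule 100 allows
```

A security group has no equivalent of rule 90: it only holds allow rules, which is why blocking a specific IP range is a NACL job.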

VPC Connectivity Options

  • VPC Peering: direct, low-latency connection between two VPCs (same or different accounts/regions). Not transitive — if A peers B and B peers C, A cannot reach C via B.
  • AWS Transit Gateway: hub-and-spoke router connecting hundreds of VPCs and on-premises networks. Supports transitive routing. Best for large-scale multi-VPC architectures.
  • AWS PrivateLink: expose a service from one VPC to others without peering. Backed by an NLB. No CIDR overlap concerns. Traffic stays on AWS network.
  • AWS Site-to-Site VPN: encrypted tunnel over the public internet between on-premises and VPC. Fast to provision, lower cost. Variable latency.
  • AWS Direct Connect: dedicated private link from on-premises to AWS. Consistent bandwidth, low latency. Takes weeks to provision.
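The non-transitive nature of VPC Peering is easy to see in a toy model. In this sketch a peering is a direct link between two named VPCs, while a Transit Gateway is modeled as a set of attached VPCs that can all reach each other. This simplification ignores route tables and is only meant to illustrate the A/B/C example above.

```python
def reachable_via_peering(peerings, src: str, dst: str) -> bool:
    """Peering only connects the two VPCs named in the peering itself."""
    return (src, dst) in peerings or (dst, src) in peerings


def reachable_via_transit_gateway(attachments, src: str, dst: str) -> bool:
    """Simplified: any two VPCs attached to the same hub can route to each other."""
    return src in attachments and dst in attachments


peerings = {("A", "B"), ("B", "C")}
print(reachable_via_peering(peerings, "A", "B"))  # True
print(reachable_via_peering(peerings, "A", "C"))  # False: no transit through B

tgw_attachments = {"A", "B", "C"}
print(reachable_via_transit_gateway(tgw_attachments, "A", "C"))  # True
```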

VPC Endpoints (reduce NAT Gateway costs)

  • Gateway Endpoint: free. Only for S3 and DynamoDB. Add a route to the route table. Traffic stays on AWS network without going through NAT.
  • Interface Endpoint: paid (hourly + GB). For most other AWS services (SSM, STS, ECR, Secrets Manager, etc.). Creates an ENI in your subnet.
Cost trap: EC2 accessing S3 via NAT Gateway pays NAT data processing fees ($0.045/GB). EC2 accessing S3 via a Gateway VPC Endpoint pays nothing. A common exam scenario asks how to reduce costs for heavy S3 traffic from private subnets.
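A quick calculation shows how the cost trap adds up. The sketch uses the $0.045/GB processing figure quoted above, which varies by region, and ignores the NAT Gateway hourly charge. The 10,000 GB monthly volume is an illustrative assumption.

```python
NAT_PROCESSING_USD_PER_GB = 0.045   # figure quoted above; region-dependent
GATEWAY_ENDPOINT_USD_PER_GB = 0.0   # Gateway Endpoints carry no data charge


def monthly_s3_path_cost(gb_per_month: float, usd_per_gb: float) -> float:
    """Data processing cost for S3 traffic leaving private subnets."""
    return round(gb_per_month * usd_per_gb, 2)


volume_gb = 10_000  # assumed monthly S3 traffic
via_nat = monthly_s3_path_cost(volume_gb, NAT_PROCESSING_USD_PER_GB)
via_endpoint = monthly_s3_path_cost(volume_gb, GATEWAY_ENDPOINT_USD_PER_GB)
print(f"NAT Gateway: ${via_nat:.2f}/month")
print(f"Gateway Endpoint: ${via_endpoint:.2f}/month")
```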
05 Databases — RDS, Aurora, DynamoDB & ElastiCache ~6h

The database module covers one of the richest decision trees in AWS architecture. You need to know which database to pick for a workload, and how to configure it for HA, performance, and disaster recovery.

RDS — Relational Database Service

  • Multi-AZ: synchronous standby in another AZ. Automatic failover (~60-120s). Same endpoint. For HA, not performance. Zero code change needed.
  • Read Replicas: asynchronous copy. Reduces load on primary. Different endpoint — app must route reads there. Supports cross-region replicas. Can be promoted to standalone (planned migration).
  • RDS Proxy: connection pooling for Lambda and serverless. Reduces DB connection storms. Supports IAM auth and Secrets Manager integration.
Multi-AZ vs Read Replica: "Available if AZ fails" = Multi-AZ. "Read performance / offload reporting" = Read Replica. They can be combined.

Amazon Aurora

  • AWS-proprietary engine: MySQL and PostgreSQL compatible. AWS claims up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL.
  • Storage auto-scales from 10GB to 128TB in 10GB increments across 6 copies in 3 AZs.
  • Up to 15 read replicas sharing the same storage (vs RDS's 5 replicas with separate storage). Failover promotes a replica in ~30s.
  • Aurora Serverless v2: auto-scales compute capacity in fractions of an ACU (Aurora Capacity Unit). Ideal for variable or unpredictable workloads.
  • Aurora Global Database: single Aurora cluster spanning multiple regions with <1 second replication lag. Supports cross-region disaster recovery and global reads.

Amazon DynamoDB

  • Fully managed serverless NoSQL — key-value and document model.
  • Partition key: uniquely identifies items and determines which partition stores them. Choose high-cardinality keys for even distribution.
  • GSI (Global Secondary Index): query on non-key attributes. Own provisioned throughput. Eventually consistent only.
  • LSI (Local Secondary Index): alternate sort key. Must be created at table creation. Uses table's throughput.
  • DAX (DynamoDB Accelerator): in-memory cache for DynamoDB. Microsecond latency for cached reads. API-compatible — no code changes.
  • DynamoDB Streams: time-ordered change log of every item modification. Triggers Lambda for real-time processing or audit trail.
  • ConsistentRead=true: strongly consistent reads (costs 2x RCU). Cannot be used on GSIs.
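The "2x RCU" cost of strong consistency follows from how read capacity is counted: one RCU covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads. A minimal sketch of that calculation, with an illustrative function name:

```python
import math


def read_capacity_units(item_size_kb: float, reads_per_second: int,
                        strongly_consistent: bool) -> int:
    """Provisioned RCUs needed for a steady read rate."""
    units_per_read = math.ceil(item_size_kb / 4)  # each RCU covers up to 4 KB
    rcu = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return rcu if strongly_consistent else math.ceil(rcu / 2)


print(read_capacity_units(6, 100, strongly_consistent=True))   # 200
print(read_capacity_units(6, 100, strongly_consistent=False))  # 100
```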

Amazon ElastiCache

  • Redis: persistent, replication, Multi-AZ, pub/sub, sorted sets. For session caching, leaderboards, real-time analytics.
  • Memcached: simple, multi-threaded, horizontal scaling with sharding. No persistence. For simple key-value caching.
Caching pattern: Session state should go in ElastiCache Redis, not in EC2 instance memory. This makes the app stateless and enables Auto Scaling to add/remove instances without losing sessions.
06 Security — IAM, KMS, WAF & Compliance Services ~5h

Security is 30% of the exam, the largest SAA-C03 domain. The focus is on IAM architecture patterns (roles > users), encryption choices, and knowing which AWS security service to use for which threat.

IAM Best Practices

  • Never use the root account for daily tasks. Enable MFA on root. Lock away the root access keys.
  • Use IAM roles for EC2, Lambda, ECS — never embed access keys in code or instances.
  • Cross-account roles: use sts:AssumeRole for temporary cross-account access instead of IAM users.
  • Least privilege: grant the minimum permissions required. Use condition keys (aws:RequestedRegion, aws:SourceIp, aws:MultiFactorAuthPresent) to further restrict.
  • Permission Boundaries: set the maximum permissions an IAM user or role can have (even if the attached policy grants more). Used when delegating IAM management.
MFA enforcement: Use an IAM policy with Deny on all actions, except those needed to enroll an MFA device, under the condition aws:MultiFactorAuthPresent: false. Users can log in but can't do anything else until they complete MFA.
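A sketch of such a policy, built as a Python dictionary and printed as JSON. It exempts the MFA self-enrollment actions through NotAction so users can still register a device. It also uses BoolIfExists so that requests carrying no MFA context at all (for example, long-term access keys) are denied too. The exact action list is an assumption based on commonly published examples and should be adjusted to your environment.

```python
import json

# Actions a user still needs before MFA is set up (assumed list; adjust as needed).
MFA_SETUP_ACTIONS = [
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:GetUser",
    "iam:ListMFADevices",
    "iam:ListVirtualMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken",
]


def deny_without_mfa_policy() -> dict:
    """Deny everything except MFA enrollment when the request lacks MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptMfaSetupIfNoMfa",
                "Effect": "Deny",
                "NotAction": MFA_SETUP_ACTIONS,
                "Resource": "*",
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


print(json.dumps(deny_without_mfa_policy(), indent=2))
```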

AWS KMS

  • AWS-managed keys: AWS creates and rotates. You see CloudTrail entries but can't control the key.
  • Customer-managed keys (CMK): you create, define key policy, enable/disable, rotate (automatic annual or manual). Full audit trail.
  • Envelope encryption: KMS encrypts a data encryption key (DEK), the DEK encrypts your data. You only make KMS API calls for the DEK, not for each byte of data.

Security Services — Match to Threat

  • AWS WAF: Layer 7 HTTP protection. SQL injection, XSS, rate limiting. Attaches to ALB, CloudFront, API Gateway.
  • AWS Shield Standard: automatic, free DDoS protection at Layers 3/4 for all AWS customers.
  • AWS Shield Advanced: paid, 24/7 DDoS response team, cost protection, detailed attack diagnostics.
  • Amazon GuardDuty: ML-powered threat detection analyzing VPC Flow Logs, CloudTrail, DNS logs. Detects compromised instances, unusual API calls, crypto-mining.
  • Amazon Inspector: automated vulnerability scanning for EC2 instances and Lambda functions. Checks against CVE database.
  • Amazon Macie: uses ML to discover and protect sensitive data (PII, credentials) in S3 buckets.
  • AWS Config: continuous resource configuration tracking and compliance evaluation. Automatic remediation via SSM Automation.
  • AWS Secrets Manager: stores and automatically rotates database credentials, API keys. Native integration with RDS, Redshift, DocumentDB.
  • AWS Systems Manager Parameter Store: configuration data and secrets. SecureString type uses KMS. Free for standard tier.
07 Serverless & Messaging — Lambda, SQS, SNS, API Gateway ~5h

Serverless architectures and asynchronous messaging patterns are heavily tested. The SAA-C03 exam loves asking about "decoupling" — whenever you see that word, think SQS, SNS, or EventBridge.

AWS Lambda

  • Max execution: 15 minutes. Max memory: 10GB. Max deployment: 250MB (unzipped), or use container images up to 10GB.
  • Lambda scales automatically: each function can add up to 1,000 concurrent executions every 10 seconds, until the account concurrency limit is reached.
  • Lambda in VPC: function gets an ENI in your VPC. Requires private subnet with NAT Gateway for internet access. Needed to access RDS, ElastiCache in private subnets.
  • Lambda@Edge: run Lambda at CloudFront edge locations to customize requests and responses. Max execution is 5s for viewer triggers and 30s for origin triggers.
  • Provisioned Concurrency: pre-warms Lambda instances to eliminate cold starts for latency-sensitive workloads.

Amazon SQS

  • Standard Queue: unlimited throughput, at-least-once delivery, best-effort ordering. For most decoupling use cases.
  • FIFO Queue: exactly-once processing, strict ordering. Max 3000 msg/s with batching. For financial transactions, order processing.
  • Visibility Timeout: time a message is hidden after being received. If processing fails, message reappears after timeout.
  • Dead Letter Queue (DLQ): receives messages that failed maxReceiveCount times. Prevents poison pills from blocking the queue.
  • Long Polling: wait up to 20s for messages. Reduces empty receive calls and cost.
Fan-out pattern: Producer → SNS Topic → Multiple SQS Queues (one per consumer). Each consumer service independently processes from its own queue. This decouples producers from consumers and allows independent scaling.
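The fan-out pattern can be illustrated without any AWS calls. In this in-memory sketch a topic copies each published message into every subscribed queue, so consumers drain their own queue independently. The class and queue names are hypothetical stand-ins for SNS and SQS, and the model ignores delivery retries, ordering, and visibility timeouts.

```python
from collections import deque


class Topic:
    """Toy stand-in for an SNS topic with SQS queue subscriptions."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name: str) -> deque:
        """Create a dedicated queue for one consumer and return it."""
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message: str) -> None:
        """Deliver a copy of the message to every subscribed queue."""
        for queue in self.queues.values():
            queue.append(message)


orders = Topic()
billing = orders.subscribe("billing")
shipping = orders.subscribe("shipping")

orders.publish("order-1")
orders.publish("order-2")

print(billing.popleft())  # billing consumes order-1 ...
print(len(shipping))      # ... while shipping still holds both messages
```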

Amazon SNS

  • Pub/sub: publish once to a topic, deliver to multiple subscribers (SQS, Lambda, HTTP, email, SMS).
  • No message retention — if a subscriber is unavailable, the message is lost (use SQS + SNS for durability).
  • SNS FIFO: ordered delivery to SQS FIFO queues only. For strict ordering fan-out.

Amazon API Gateway

  • REST API (v1): full feature set, request/response transformation, usage plans.
  • HTTP API (v2): lower latency, lower cost (~70% cheaper). Best for Lambda proxy and JWT authentication.
  • WebSocket API: bidirectional persistent connections for real-time apps (chat, games, live feeds).
  • Max timeout: 29 seconds — never use API Gateway for long-running jobs.
  • Authorizers: Lambda Authorizer (custom auth logic), Cognito User Pool Authorizer (JWT validation).
Architecture decision: Jobs taking more than 29 seconds cannot be synchronous through API Gateway. Submit to SQS, process async, notify via SNS or webhook when done.

AWS Step Functions

  • Orchestrate multi-step workflows with state management. Visual workflow editor.
  • Handles retries, error catching, parallel execution, wait states.
  • Use when Lambda + SQS isn't enough — when you need explicit state, conditional branching, or workflow visualization.
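Retries, error catching, and wait states are declared in the state machine definition rather than written in function code. The sketch below expresses a small definition in Amazon States Language as a Python dictionary. The state names, Lambda ARN, account ID, and timing values are placeholders invented for illustration.

```python
import json

state_machine = {
    "Comment": "Illustrative order workflow with retry, catch, and wait",
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            # Placeholder ARN; replace with a real function.
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-order",
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
            "Next": "WaitForSettlement",
        },
        "WaitForSettlement": {"Type": "Wait", "Seconds": 60, "Next": "Done"},
        "NotifyFailure": {
            "Type": "Fail",
            "Error": "OrderFailed",
            "Cause": "Processing failed after retries",
        },
        "Done": {"Type": "Succeed"},
    },
}

print(json.dumps(state_machine, indent=2))
```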
08 Cost Optimization, DR Strategies & Exam Prep ~4h

Cost optimization is 20% of the exam, so don't underestimate it. This module also covers disaster recovery patterns (know your RTO/RPO), and gives you a final exam strategy checklist.

Cost Optimization Pillars

  • Right-sizing: use AWS Compute Optimizer recommendations. Don't over-provision EC2.
  • Pricing model matching: On-Demand for variable, Reserved/Savings Plans for steady-state, Spot for fault-tolerant batch.
  • Storage lifecycle: transition S3 objects to cheaper tiers as they age. Delete objects when no longer needed.
  • Data transfer: use S3 Gateway Endpoints to eliminate NAT Gateway fees for S3 traffic. Keep traffic within the same region/AZ where possible (AZ cross-traffic is charged).
  • Serverless: pay only for what you use. Lambda + Fargate + DynamoDB On-Demand are ideal for spiky workloads.

Disaster Recovery Strategies (RTO / RPO)

  • Backup & Restore: cheapest. Highest RTO (hours). Highest RPO (hours). Restore from S3/Glacier backups.
  • Pilot Light: core services (DB) always running in secondary region. Scale up on failover. Medium cost, medium RTO (minutes-hours).
  • Warm Standby: scaled-down but fully functional system running in secondary. Fast failover (minutes). Higher cost.
  • Multi-Site Active-Active: full capacity in both regions, Route 53 splits traffic. Near-zero RTO/RPO. Most expensive.
RTO/RPO matching: "RPO of 1 minute" → Aurora Global Database (sub-second lag). "RPO of 24 hours OK" → daily snapshots. "RTO of 1 hour OK" → Pilot Light. "RTO near-zero" → Multi-site active-active or Aurora Global.
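The matching logic can be sketched as a function that picks the cheapest strategy meeting the tighter of the two objectives. The minute thresholds below are illustrative assumptions, not AWS-published limits; real designs weigh RTO and RPO separately and depend on the services involved.

```python
def suggest_dr_strategy(rto_minutes: float, rpo_minutes: float) -> str:
    """Map recovery objectives to a DR strategy (assumed thresholds)."""
    tightest = min(rto_minutes, rpo_minutes)
    if tightest < 1:
        return "Multi-Site Active-Active"
    if tightest <= 15:       # assumed cutoff for "minutes"
        return "Warm Standby"
    if tightest <= 120:      # assumed cutoff for "minutes to hours"
        return "Pilot Light"
    return "Backup & Restore"


print(suggest_dr_strategy(rto_minutes=60, rpo_minutes=60))      # Pilot Light
print(suggest_dr_strategy(rto_minutes=1440, rpo_minutes=1440))  # Backup & Restore
```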

Key AWS Services for Analytics & Migration

  • Amazon Athena: serverless SQL on S3. Pay per TB scanned. Use Parquet format + partitioning to reduce scan size.
  • Amazon Redshift: petabyte-scale columnar data warehouse. For complex OLAP queries. Redshift Spectrum: query S3 directly from Redshift.
  • AWS DMS: Database Migration Service. Online migration with CDC (Change Data Capture) for minimal downtime. Schema Conversion Tool for heterogeneous migrations (Oracle → Aurora).
  • Amazon Kinesis: Data Streams (real-time ingestion), Data Firehose (delivery to S3/Redshift/OpenSearch), Analytics (real-time SQL on streams).

Final Exam Checklist

  • Know the 7 S3 storage classes and when to use each
  • Know Multi-AZ vs Read Replica for RDS
  • Know Gateway Endpoint (S3/DynamoDB, free) vs Interface Endpoint (everything else, paid)
  • Know VPC Peering (non-transitive) vs Transit Gateway (transitive, multi-account)
  • Know PrivateLink for exposing services without full VPC peering
  • Know the 4 DR strategies and their RTO/RPO tradeoffs
  • Know IAM role = EC2/Lambda credentials, never embed access keys
  • Know SQS DLQ prevents poison pills from blocking the queue
  • Know SNS + SQS fan-out for multiple consumers
  • Know Spot Instances for fault-tolerant batch; Reserved/Savings Plans for steady-state
Timing strategy: You have 130 minutes for 65 questions (~2 min/question). Flag uncertain questions and return to them. On the real exam, 2-3 questions per domain may reference the same scenario (case study format).

Ready to test your SAA-C03 knowledge?

60 scenario-based practice questions covering all 4 exam domains. Your progress is saved locally — no signup required.