AWS Networking: VPC, Subnets, and Transit Gateway — Complete Guide
A thorough, practical tutorial for architects and engineers — from VPC basics to multi-VPC, hybrid connectivity, HA, costs, security and troubleshooting.
- Amazon VPC gives you complete control over a logically isolated virtual network in AWS.
- Subnets, route tables, IGW, NAT and security groups form the building blocks of secure VPC designs.
- AWS Transit Gateway simplifies multi-VPC connectivity with a scalable hub-and-spoke model.
1. Introduction to AWS Networking
AWS networking is the collection of services and constructs that enable secure, reliable, and performant connectivity between resources running on Amazon Web Services and the outside world. At its core stands the Virtual Private Cloud (VPC) — an isolated virtual network you design and operate inside AWS.
Whether you migrate an on-premises datacenter or build cloud-native applications, networking in AWS encourages a shift from physical appliances to software-defined network design patterns (subnets, route tables, gateway services and native security controls).
Why this matters: cloud networking isn't just connectivity — it's where security, cost, and performance intersect. Good designs scale; poor ones cost time, money, and outages.
2. Understanding Amazon VPC (Virtual Private Cloud)
Definition: A VPC is a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define. You control IP blocks, route tables, network gateways and security settings.
VPCs mirror many constructs from on-premises networks but with cloud-native benefits: elastic scaling, managed gateways, built-in multi-AZ capabilities and deep integration with IAM and PaaS services.
Quick fact: For authoritative details on Transit Gateway and VPC fundamentals, refer to the official AWS docs.
3. VPC Components Overview
Key VPC elements you will use in almost every architecture:
- Subnets — Logical subdivisions of a VPC (public, private, isolated).
- Route tables — Direct traffic within and outside the VPC.
- Internet Gateway (IGW) — Enables internet access for resources in public subnets.
- NAT Gateway / NAT instance — Provide outbound internet for private subnets without exposing inbound access.
- Security groups — Stateful firewall attached to ENIs.
- Network ACLs (NACLs) — Stateless ACLs applied at the subnet level.
- Endpoints (Gateway & Interface) — Private access to AWS services (S3, DynamoDB) without crossing the public internet.
- VPC Peering & Transit Gateway — Interconnect multiple VPCs.
Each component interacts to create network topology that enforces segmentation, controls traffic flow, and connects to other networks.
4. VPC IP Addressing and CIDR Blocks
VPCs use CIDR notation for IP address allocation (for example, 10.0.0.0/16). When planning CIDR space:
- Start with a block large enough for growth (e.g., /16 rather than /20); a VPC IPv4 CIDR can be anywhere from /16 to /28.
- Avoid overlapping CIDR with on-prem networks unless you intend to NAT or re-IP when connecting.
- Prefer RFC1918 space for private addressing: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
- Use smaller subnet CIDRs (/24 or /26) per AZ corresponding to expected instance counts and elastic scaling.
Tip: think multi-region and multi-account early. If you will connect multiple VPCs or extend to on-prem, give each network its own non-overlapping IP range to avoid future collisions.
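Overlap checks are easy to automate before any VPC is created. The following is a minimal sketch using Python's standard `ipaddress` module; the network names and ranges in `plan` are hypothetical and only illustrate the idea.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_cidrs):
    """Return every pair of network names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan; names and ranges are illustrative only.
plan = {
    "on-prem": "10.0.0.0/16",
    "vpc-prod": "10.1.0.0/16",
    "vpc-dev": "10.2.0.0/16",
    "vpc-legacy": "10.0.128.0/20",  # sits inside the on-prem range
}

for a, b in find_overlaps(plan):
    print(f"overlap: {a} ({plan[a]}) and {b} ({plan[b]})")
```

Running a check like this against the full address plan before connecting networks through peering, TGW, or VPN surfaces collisions while they are still cheap to fix.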
5. Subnets in AWS
What is a subnet? A subnet is a subdivision of a VPC's IP address range. Each subnet maps to a single Availability Zone (AZ).
Public vs Private Subnets
Public subnets have a route to an Internet Gateway (IGW) and typically host load balancers, bastion hosts or resources that must receive inbound internet traffic.
Private subnets do not have direct routes to the IGW; resources there access the internet via NAT (for updates, patches) while remaining inaccessible from the internet.
Subnet sizing & best practices
- Use /24 subnets as a general-purpose default: 256 addresses, of which AWS reserves 5 per subnet, leaving 251 usable.
- Distribute subnets across at least two AZs to protect against AZ failures.
- Don’t carve too many tiny subnets; it complicates routing and IP management. Use subnet tagging and naming conventions for clarity.
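The sizing arithmetic above can be sketched in a few lines. This example assumes a 10.0.0.0/16 VPC, two AZs, and two tiers; the AZ names, tier names, and the helper functions are illustrative, while the five reserved addresses per subnet follow AWS's documented behavior.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one in each subnet


def usable_addresses(cidr):
    """Addresses left for resources after the AWS-reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def carve_subnets(vpc_cidr, new_prefix, azs, tiers):
    """Assign one subnet per (tier, AZ), taken in order from the start of the VPC range."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return {(tier, az): str(next(blocks)) for tier in tiers for az in azs}


# Hypothetical layout: two tiers across two AZs in a /16 VPC.
layout = carve_subnets(
    "10.0.0.0/16", 24, azs=["us-east-1a", "us-east-1b"], tiers=["public", "private"]
)
for (tier, az), cidr in layout.items():
    print(f"{tier:<8} {az}  {cidr}  usable={usable_addresses(cidr)}")
```

Allocating sequentially from the start of the range leaves the rest of the /16 free for later tiers or AZs.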
6. Internet Gateway (IGW)
The Internet Gateway enables resources in public subnets to communicate with the internet. To make a subnet public:
- Attach an IGW to the VPC.
- Add a route in the subnet's route table pointing 0.0.0.0/0 to the IGW.
- Ensure instances have public IPv4 addresses or Elastic IPs, and that security group rules allow the required traffic.
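aws ec2 create-internet-gateway --query 'InternetGateway.InternetGatewayId' --output text
aws ec2 attach-internet-gateway --internet-gateway-id igw-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123456789abcdef0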
7. NAT Gateway and NAT Instances
Private subnet resources often need outbound internet (to download packages or call external APIs) while staying unreachable from the internet. Two main options exist:
- NAT Gateway: managed and redundant within a single AZ (not across AZs), charged per hour and per GB processed. It scales automatically and is the recommended option for most architectures.
- NAT Instance: an EC2 instance acting as NAT. It offers more control (for example, custom packet inspection), but you manage, scale, and patch it yourself.
High availability: If you use NAT Gateways, deploy one per AZ and configure route tables per AZ to avoid cross-AZ data transfer charges and to maintain resiliency in case an AZ goes down.
Cost tip: NAT Gateway charges can add up for high egress. Consider VPC endpoints for S3/DynamoDB to avoid NAT egress charges when accessing AWS services.
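The one-NAT-Gateway-per-AZ rule can be checked mechanically. The sketch below works on plain dictionaries rather than live AWS data; the table names, gateway IDs, and the `cross_az_default_routes` helper are all hypothetical.

```python
def cross_az_default_routes(route_tables, nat_gateways):
    """Return route table names whose default route targets a NAT Gateway in another AZ.

    route_tables: table name -> (AZ of its subnets, NAT Gateway ID used for 0.0.0.0/0)
    nat_gateways: NAT Gateway ID -> AZ where that gateway lives
    """
    return [
        name
        for name, (az, nat_id) in route_tables.items()
        if nat_gateways[nat_id] != az
    ]


# Hypothetical inventory; IDs are placeholders.
nat_gateways = {"nat-a": "us-east-1a", "nat-b": "us-east-1b"}
route_tables = {
    "private-a": ("us-east-1a", "nat-a"),
    "private-b": ("us-east-1b", "nat-a"),  # default route crosses AZs
}

print(cross_az_default_routes(route_tables, nat_gateways))
```

Any table this flags both pays cross-AZ transfer and loses outbound access if the other AZ fails.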
8. Route Tables and Routing Mechanisms
Route tables define how traffic is routed from a subnet. Each subnet is associated with one route table (explicitly or via the main table).
Routing types:
- Static routing: Typical in cloud setups — you configure specific prefixes and next hops.
- Dynamic routing: For hybrid links (BGP with Direct Connect or VPN) where routes are learned via BGP.
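Example routes: 0.0.0.0/0 to an IGW (public table), 0.0.0.0/0 to a NAT Gateway (private table), the peer VPC CIDR to a peering connection, and remote CIDRs to a Transit Gateway. When several routes match a destination, the most specific (longest-prefix) route wins.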
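Route selection can be illustrated with a simplified longest-prefix-match lookup. This sketch ignores AWS-specific tie-breakers (such as static versus propagated routes), and the targets in `private_table` are placeholder names, not real resource IDs.

```python
import ipaddress


def select_route(route_table, destination_ip):
    """Return (cidr, target) for the most specific matching route, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    net, target = max(matches, key=lambda match: match[0].prefixlen)
    return str(net), target


# Hypothetical private-subnet route table.
private_table = {
    "10.0.0.0/16": "local",
    "10.1.0.0/16": "pcx-example",  # peered VPC
    "10.0.0.0/8": "tgw-example",   # everything else in 10/8 via Transit Gateway
    "0.0.0.0/0": "nat-example",    # default route to the NAT Gateway
}

for dest in ("10.0.5.9", "10.1.2.3", "10.9.9.9", "8.8.8.8"):
    print(dest, "->", select_route(private_table, dest))
```

The same destination can match several entries (10.1.2.3 matches three here), which is why route order in the table does not matter but prefix length does.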
9. Security Groups vs Network ACLs
Layered security is essential:
| Feature | Security Group | Network ACL (NACL) |
|---|---|---|
| State | Stateful — return traffic automatically allowed | Stateless — return traffic must be explicitly allowed |
| Level | Instance/ENI level | Subnet level |
| Best use | Host-level firewall for EC2 & services | Extra layer for edge rules, logging or broad blocking |
Best practice: use security groups for positive allow rules and NACLs for broad "deny" patterns where necessary. Enable VPC Flow Logs to monitor traffic patterns for policy tuning.
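The stateful/stateless difference is easiest to see in a toy model. The sketch below matches on port only (real rules also match protocol and CIDR), and the ephemeral port range 1024-65535 is an assumption for illustration. The NACL evaluation order, lowest rule number first with first match winning and an implicit deny, mirrors AWS behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    port_range: tuple  # (low, high), inclusive
    allow: bool


def nacl_allows(rules, port):
    """Evaluate rules in ascending number order; first match wins, otherwise deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        low, high = rule.port_range
        if low <= port <= high:
            return rule.allow
    return False  # implicit deny when nothing matches


def nacl_round_trip(inbound, outbound, server_port, client_port):
    """Stateless: the request and its reply are each checked against explicit rules."""
    return nacl_allows(inbound, server_port) and nacl_allows(outbound, client_port)


def sg_round_trip(inbound_ports, server_port):
    """Stateful: once the inbound request is allowed, the reply is allowed too."""
    return server_port in inbound_ports


inbound = [NaclRule(100, (443, 443), True)]
outbound_ephemeral = [NaclRule(100, (1024, 65535), True)]

print(sg_round_trip({443}, 443))                                  # security group
print(nacl_round_trip(inbound, [], 443, 50000))                   # NACL, no outbound rule
print(nacl_round_trip(inbound, outbound_ephemeral, 443, 50000))   # NACL, reply allowed
```

Forgetting the outbound ephemeral-port rule is a common cause of connections that are accepted inbound but never complete.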
10. VPC Peering
VPC peering creates a direct network route between two VPCs. It's low-latency and useful for simple, point-to-point connectivity.
Characteristics & limitations
- Peering is non-transitive: A peered VPC will not forward traffic to a third VPC through the peer connection.
- Works across accounts and regions (with inter-region peering) but managing many peers becomes complex.
- No single central control plane: each peering requires route table updates on both sides.
Use peering for a small number of VPCs or when point-to-point connectivity is sufficient. For many VPCs, Transit Gateway is easier to scale.
11. AWS Transit Gateway (TGW) Overview
What is Transit Gateway? Transit Gateway is a managed network transit hub that interconnects VPCs and on-premises networks. It provides a hub-and-spoke model so each VPC only attaches to the TGW instead of peering each VPC to every other VPC. This significantly reduces routing complexity at scale.
Transit Gateway supports route propagation and multiple route tables (each attachment is associated with one), and it can connect across AWS Regions through inter-Region peering of TGWs.
12. Architecture of Transit Gateway
Transit Gateway uses a hub-and-spoke topology:
- Attach VPCs as spokes using TGW attachments.
- Attach on-premises links via Site-to-Site VPN or Direct Connect.
- Use route tables and route propagation to control which attachments can talk to each other.
Architectural considerations:
- Use multiple TGW route tables to enforce segmentation (e.g., separate prod, test, shared services).
- Control attachment association and propagation strictly to implement least-privilege networking.
- Use Transit Gateway Connect for SD-WAN integrations and third-party appliances.
Reference: AWS prescriptive guidance and whitepapers on building multi-VPC networks provide patterns and considerations when deploying TGW.
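Segmentation with association and propagation can be reasoned about with a small model. In this sketch each attachment is associated with exactly one TGW route table, and each route table lists the attachments that propagate routes into it. The attachment and table names are hypothetical, and static routes and blackhole routes are left out.

```python
def has_route(src, dst, association, propagation):
    """True if the route table associated with src holds a route propagated from dst."""
    return dst in propagation[association[src]]


def can_communicate(a, b, association, propagation):
    """Two-way traffic needs a route in each direction."""
    return has_route(a, b, association, propagation) and has_route(
        b, a, association, propagation
    )


# Hypothetical design: prod and dev are isolated from each other, both reach shared services.
association = {
    "prod-a": "rt-prod",
    "prod-b": "rt-prod",
    "dev-a": "rt-dev",
    "shared": "rt-shared",
}
propagation = {
    "rt-prod": {"prod-a", "prod-b", "shared"},
    "rt-dev": {"dev-a", "shared"},
    "rt-shared": {"prod-a", "prod-b", "dev-a"},
}

for pair in (("prod-a", "shared"), ("dev-a", "shared"), ("prod-a", "dev-a")):
    print(pair, can_communicate(*pair, association, propagation))
```

Writing the intended matrix down this way before building it makes it easier to review which propagations are actually required.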
13. Benefits of AWS Transit Gateway
- Centralized connectivity management — Attach once, communicate with many.
- Simplified routing — Reduce number of peering connections and manual route churn.
- Scalability — Built to scale with managed bandwidth and distributed architecture.
- Security — Per-attachment route tables and route propagation help enforce topology.
14. Transit Gateway vs VPC Peering
| Aspect | VPC Peering | Transit Gateway |
|---|---|---|
| Scale | Best for small number of VPCs (N*(N-1)/2 grows fast) | Designed for large numbers of VPCs |
| Transitive | No | Yes (via TGW routing) |
| Complexity | Route tables per pair | Central route control |
| Cost | Only data transfer charges between peered VPCs | TGW hourly and per-GB charges (evaluate at scale) |
Choose peering when you have a few VPCs and low operational complexity. Choose Transit Gateway for multi-account, multi-VPC topologies and when central control and segmentation are required.
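The scaling argument is simple arithmetic: a full mesh of N VPCs needs N*(N-1)/2 peering connections, while a hub-and-spoke design needs one TGW attachment per VPC. A short sketch (function names are illustrative):

```python
def peering_connections(vpc_count):
    """Connections needed for a full mesh of VPC peerings."""
    return vpc_count * (vpc_count - 1) // 2


def tgw_attachments(vpc_count):
    """Attachments needed when every VPC connects to one Transit Gateway."""
    return vpc_count


for n in (3, 5, 10, 50):
    print(
        f"{n:>3} VPCs: {peering_connections(n):>5} peering connections "
        f"vs {tgw_attachments(n):>3} TGW attachments"
    )
```

Each peering connection also needs route entries on both sides, so the operational gap grows faster than the raw connection count suggests.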
15. AWS VPN and Direct Connect Integration
Transit Gateway accepts Site-to-Site VPN attachments directly and Direct Connect through a Direct Connect gateway, giving hybrid connectivity a single target. With Direct Connect, you create a transit virtual interface to the Direct Connect gateway and associate that gateway with the TGW.
Use BGP for dynamic route exchange on hybrid links. Dynamic routing simplifies route updates and failover handling.
Example use case: a corporate datacenter connects via Direct Connect to a TGW; multiple VPCs attach to the TGW so on-premises services can talk to each VPC without separate VPNs to each VPC.
16. Monitoring and Troubleshooting AWS Networking
Monitoring is essential. Key tools:
- VPC Flow Logs — collect IP flow information to/from ENIs. Great for identifying dropped traffic, allowed flows, and unexpected patterns.
- CloudWatch — for metrics and TGW/VPC alarms.
- CloudTrail — audit configuration changes and API activity.
- Network Access Analyzer — identify unintended network access paths.
Common troubleshooting checklist
- Check the route table associated with the subnet of each ENI involved (source and destination).
- Confirm security groups allow the traffic (remember: stateful).
- Check NACLs for stateless denies.
- Review VPC Flow Logs for source/destination and the ACCEPT or REJECT action on each flow (flow logs record the outcome, not the rule that caused it).
- For hybrid links, verify BGP session and advertised routes.
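For the flow log step, a few lines of parsing are often enough to isolate rejected traffic. This sketch assumes the default (version 2) space-separated record format; the sample records are made up for illustration.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one default-format flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_flows(lines):
    """Return parsed records whose action is REJECT."""
    return [rec for rec in map(parse_record, lines) if rec["action"] == "REJECT"]


# Made-up sample records.
sample = [
    "2 123456789012 eni-0abc 10.0.1.15 10.0.2.40 49761 5432 6 3 180 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0abc 10.0.1.15 10.0.2.41 49762 443 6 12 4200 1700000000 1700000060 ACCEPT OK",
]

for rec in rejected_flows(sample):
    print(f'{rec["srcaddr"]} -> {rec["dstaddr"]}:{rec["dstport"]} rejected')
```

A REJECT only says that a security group or NACL blocked the flow; working out which one still means walking the checklist above.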
17. High Availability and Fault Tolerance Design
Design HA by using multiple Availability Zones and redundant components:
- Deploy subnets across at least two AZs.
- Use one NAT Gateway per AZ to avoid single-AZ failure and to minimize cross-AZ charges.
- Transit Gateway is a managed, highly available service, but each VPC attachment needs a subnet in every AZ whose resources must reach the TGW; resources in an AZ without an attachment subnet cannot use it.
- For on-prem links, use redundant Direct Connect links (with AWS Direct Connect Gateway) or redundant VPN tunnels with failover.
18. Cost Optimization in AWS Networking
Major cost drivers:
- NAT Gateway hourly and per-GB charges.
- Transit Gateway per-attachment hourly charges and per-GB data processing fees for traffic sent through it.
- Data transfer between AZs and across regions.
Cost reduction tips
- Use VPC endpoints (Gateway endpoints for S3/DynamoDB) to bypass NAT for AWS service access — reduces NAT egress and public internet transfers.
- Consolidate egress to central egress VPC only when necessary; sometimes distributed NAT (per AZ) is cheaper.
- Evaluate traffic patterns: if a lot of inter-VPC traffic flows through the TGW, compare peering and TGW costs for the expected bandwidth.
Always model expected GB data transfer and hourly TGW/NAT usage to compare cost tradeoffs before large-scale deployment.
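A rough model needs only two inputs per service: an hourly rate and a per-GB rate. The sketch below is one way to set it up. The rates are placeholder assumptions, not quoted AWS prices, so substitute current figures for your Region from the AWS pricing pages.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_cost(units, hourly_rate, gb_processed, per_gb_rate, hours=HOURS_PER_MONTH):
    """Fixed hourly charge for each unit plus a per-GB processing charge."""
    return units * hourly_rate * hours + gb_processed * per_gb_rate


# Placeholder rates in USD: assumptions for illustration only.
NAT_HOURLY, NAT_PER_GB = 0.045, 0.045
TGW_HOURLY, TGW_PER_GB = 0.05, 0.02

nat_total = monthly_cost(units=3, hourly_rate=NAT_HOURLY, gb_processed=5000, per_gb_rate=NAT_PER_GB)
tgw_total = monthly_cost(units=10, hourly_rate=TGW_HOURLY, gb_processed=5000, per_gb_rate=TGW_PER_GB)

print(f"3 NAT Gateways, 5 TB processed:    ${nat_total:,.2f}/month")
print(f"10 TGW attachments, 5 TB processed: ${tgw_total:,.2f}/month")
```

Varying `gb_processed` shows where the per-GB component starts to dominate the fixed hourly charge, which is usually the point at which endpoints or peering become worth evaluating.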
19. Security and Compliance Best Practices
- Apply least privilege to network paths — only allow required CIDRs and ports.
- Use security groups as the primary host firewall and keep them small and named consistently.
- Enable VPC Flow Logs and ingest into SIEM for threat detection.
- Use AWS Network Firewall, Security Hub, and Network Access Analyzer for continuous posture checking.
- Encrypt traffic where needed (TLS for app traffic; AWS encrypts inter-data-center traffic at the physical layer for some services — see AWS docs for details).
20. Real-World Design Scenarios
Small-scale app (single VPC)
1 VPC, public subnets for ALB, private subnets for app & DB, NAT for outbound updates, flow logs enabled, security groups per tier.
Multi-account environment with Transit Gateway
Central Transit Gateway in a shared services account. Multiple accounts attach VPCs to the TGW; route tables segregate prod/test/dev and route propagation controls which VPCs can communicate.
Hybrid cloud (VPN + Direct Connect)
Direct Connect for steady-state high throughput; Site-to-Site VPN as backup or for remote sites. Both attach to the TGW so multiple VPCs can use the same on-premises connection.
Reference designs and step-by-step patterns appear in AWS whitepapers — use them as templates and adapt to organizational constraints.
Bonus: Diagrams, Comparison Table & Quick Setup
[Diagram: multi-VPC with Transit Gateway]
Peering vs Transit Gateway — Quick Decision Table
| When to pick | VPC Peering | Transit Gateway |
|---|---|---|
| Few VPCs (<= 3-5) | Yes — simple & cheap | Overkill |
| Many VPCs / accounts | Operationally heavy | Preferred — central control |
| Need transitive routing | No | Yes |
| Cost for heavy traffic | Often cheaper: no hourly or per-GB processing fee, only standard data transfer rates | Adds hourly and per-GB processing fees; weigh these against the operational savings at scale |
Quick CLI snippet — create TGW and attach a VPC (example)
aws ec2 create-transit-gateway --description "example-tgw"
aws ec2 create-transit-gateway-vpc-attachment --transit-gateway-id tgw-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0 --subnet-ids subnet-aaa subnet-bbb
Adjust options & tagging to match your policies. Use AWS IAM least-privilege for any automation account.