Dynamic Subnet Management Like a Boss with Terraform
Background
Networking remains a core concern in cloud-native and Kubernetes-native platforms.
Getting it right can be a great enabler, and getting it wrong a real blocker, for DevOps, Platform, Security and Site Reliability Engineers.
I often see AWS VPCs provisioned and subdivided statically, which makes me worry about future-proofing and about reusing the same modules across multiple regions.
The second thing that worries me is large public / private subnets in which lots of different concerns (VPN, ingress) are placed together.
This can lead to very complex security rules and can fail to meet the network segregation requirements set out by many security standards.
It also means that changes to subnet tags, security groups, and other security functions have to be coordinated across teams and become a bottleneck.
I believe both of these are very easily solvable, if we let Terraform do the thinking for us!
After all, that’s what it’s there to do!
Desirable Properties
- Avoid clumping everything into one set of public / private subnets. This enables different teams to own their subnet configurations and change them as needed without having to “consult” other teams.
- Utilise the maximum number of availability zones for public and private subnets without planning in advance.
- Automatically calculate subnet masks and IP ranges and expose these to other modules which may need them.
- Minimise the human involvement in rolling out subnets for different regions and availability zones.
- Be future-proof without having to make wholesale changes to the initial implementation.
Mental Model: Implementation Approach
- We will use a 10.x.x.x/16 CIDR for the VPC.
- We will use /20 subnets, which we call “large-subnets”. These are suitable for K8s node networks and for subdivision into smaller networks.
- We will use /26 subnets, which we call “small-subnets”. These will be used for things like ALB ingresses, VPN landing zones and databases.
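To make these sizes concrete, here is a minimal sketch using Python's standard `ipaddress` module. The `10.0.0.0/16` base is an assumed example value standing in for the 10.x.x.x/16 VPC, and the usable counts subtract the five addresses that AWS reserves in every subnet.

```python
import ipaddress

# Assumed example VPC range; any 10.x.x.x/16 block behaves the same way.
vpc = ipaddress.ip_network("10.0.0.0/16")

large = list(vpc.subnets(new_prefix=20))       # "large-subnets"
small = list(large[0].subnets(new_prefix=26))  # "small-subnets" inside one large subnet

AWS_RESERVED = 5  # AWS reserves the first four addresses and the last one in each subnet

print(len(large), "large subnets per VPC")
print(len(small), "small subnets per large subnet")
print(large[0].num_addresses - AWS_RESERVED, "usable addresses per /20")
print(small[0].num_addresses - AWS_RESERVED, "usable addresses per /26")
```

So a single /16 holds 16 large subnets, and each large subnet can be cut into 64 small ones.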
Phase 1: Reserve A Subnet for Subdivision
- NOTE: we could reserve multiple networks for subdivision.
- Reserving the first network(s) works well with the Terraform module which we will use later on.
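The reservation step can be sketched as follows. This is only an illustration of the arithmetic in Python, not the Terraform code itself; the `10.0.0.0/16` range and the `RESERVED_COUNT` knob are assumptions made for the example.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")  # assumed example VPC range
large_subnets = list(vpc.subnets(new_prefix=20))

RESERVED_COUNT = 1  # hypothetical knob: how many leading /20s to set aside
reserved = large_subnets[:RESERVED_COUNT]
available_large = large_subnets[RESERVED_COUNT:]

# Carve every reserved /20 into /26 small-subnets.
small_pool = [s for block in reserved for s in block.subnets(new_prefix=26)]

print("reserved:", [str(n) for n in reserved])
print("first large subnet left for nodes:", available_large[0])
print("small pool:", small_pool[0], "...", small_pool[-1], f"({len(small_pool)} total)")
```

Raising `RESERVED_COUNT` to 2 would set aside a second /20 and double the pool of small subnets, at the cost of one fewer large subnet.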
Phase 2: Dynamic Subdivision based on Number of Availability Zones
- US-EAST-1 has 6x Availability Zones
- AP-SOUTHEAST-2 has 3x Availability Zones
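A rough sketch of the idea: the only per-region input is the list of zone names, and everything else is derived from it. The `allocate` helper, the group names and the zone-name lists below are all hypothetical; in Terraform the zone list would typically come from the `aws_availability_zones` data source rather than being typed out.

```python
import ipaddress

def allocate(vpc_cidr, az_names, small_groups):
    """Return {group: {az: cidr}} for one region.

    Hypothetical helper: one large /20 per AZ, plus one /26 per AZ for
    every small group, all taken in order from the reserved first /20.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    large = list(vpc.subnets(new_prefix=20))
    small = large[0].subnets(new_prefix=26)  # generator over the reserved /20
    plan = {"large": {az: str(net) for az, net in zip(az_names, large[1:])}}
    for group in small_groups:
        plan[group] = {az: str(next(small)) for az in az_names}
    return plan

us_east_1 = [f"us-east-1{c}" for c in "abcdef"]          # 6 AZs
ap_southeast_2 = [f"ap-southeast-2{c}" for c in "abc"]   # 3 AZs

for azs in (us_east_1, ap_southeast_2):
    plan = allocate("10.0.0.0/16", azs, ["ingress", "database"])
    for group, nets in plan.items():
        print(group, nets)
```

The same call produces six subnets per group in US-EAST-1 and three per group in AP-SOUTHEAST-2, with no region-specific planning.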
Terraform Module Selection
There are several options for modules that can do the heavy lifting.
- https://registry.terraform.io/modules/drewmullen/subnets/cidr/latest
- https://registry.terraform.io/modules/hashicorp/subnets/cidr/latest
I have chosen drewmullen's module for a couple of reasons:
- Working with network masks (for example /26) is a bit easier than working with the number of additional network bits.
- The ability to group subnets dynamically is useful for output parsing.
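Both reasons can be illustrated with a few lines of Python. This is a sketch of the ideas only: `new_bits` and `group_by_separator` are hypothetical helpers written for this example, the `group/az` naming scheme is an assumption, and the real module's inputs and output shape may differ.

```python
import ipaddress
from collections import defaultdict

BASE = ipaddress.ip_network("10.0.0.0/20")  # assumed: the reserved large subnet

def new_bits(netmask, base=BASE):
    """Bits to add to the base prefix to reach the wanted netmask."""
    return netmask - base.prefixlen

def group_by_separator(cidrs_by_name, separator="/"):
    """Group {"group/az": cidr} into {"group": {"az": cidr}}."""
    grouped = defaultdict(dict)
    for name, cidr in cidrs_by_name.items():
        group, _, member = name.partition(separator)
        grouped[group][member] = cidr
    return dict(grouped)

small = BASE.subnets(new_prefix=26)
flat = {f"{g}/{az}": str(next(small)) for g in ("ingress", "vpn") for az in ("a", "b", "c")}

print(new_bits(26))  # a /26 inside a /20 needs 6 extra bits
print(group_by_separator(flat))
```

With netmasks you state the size you want directly; with additional bits you have to redo the subtraction whenever the base prefix changes.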
Code
Results
Small and Large Subnet Division in AP-SOUTHEAST-2
Small and Large Subnet Division in US-EAST-1
Adding VPNs to Small Subnets in US-EAST-1
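As a rough illustration of why adding a group such as VPN is low-risk, the sketch below hands out small subnets strictly in order, so a group appended at the end never moves the ranges already in use. The `small_plan` helper, the group names and the `10.0.0.0/20` reserved block are assumptions for the example, not output from the module.

```python
import ipaddress

def small_plan(groups, az_names, reserved="10.0.0.0/20"):
    """Hand out one /26 per (group, AZ) pair, in order, from the reserved block."""
    pool = ipaddress.ip_network(reserved).subnets(new_prefix=26)
    return {g: {az: str(next(pool)) for az in az_names} for g in groups}

azs = [f"us-east-1{c}" for c in "abcdef"]
before = small_plan(["ingress", "database"], azs)
after = small_plan(["ingress", "database", "vpn"], azs)

# Existing groups keep their ranges because the new group is appended last.
assert all(after[g] == before[g] for g in before)
print(after["vpn"])
```

Inserting a new group in the middle of the list would shift every later allocation, so new groups belong at the end.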
Conclusion
- Although this example focused on AWS, the subnetting module (and subnetting itself) is fairly vendor-neutral. The code demonstrated here should be useful for Azure, GCP and other cloud providers.
- I didn’t manipulate the outputs too much, but having a squashed-down CIDR list would be useful. This can be achieved fairly easily, as the grouped_by_separator output is a very useful map.
- There is a small chance that more availability zones become available in areas like Oregon (4 currently), which could undo the maths. One possible fix is to provision subnets in multiples of 3, which will give you a good balance everywhere except Oregon.
- There are a whole lot of ways to make the dynamic map generation smarter. I have not done this in the example for code readability.
- I generally think keeping the subnet mask the same for the small-subnets and just reserving another large subnet is a better choice. Remember, we want to grab a network that is close to our needs and provision it without overthinking the subnet math!
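The squashed-down CIDR list mentioned above can be sketched with `ipaddress.collapse_addresses` from Python's standard library. The `grouped` map here is an assumed shape, loosely modelled on a per-group map of subnets, and `squash` is a hypothetical helper.

```python
import ipaddress

# Assumed shape for the example: {group: {az: cidr}}.
grouped = {
    "ingress": {"a": "10.0.0.0/26", "b": "10.0.0.64/26", "c": "10.0.0.128/26"},
    "vpn": {"a": "10.0.0.192/26", "b": "10.0.1.0/26", "c": "10.0.1.64/26"},
}

def squash(cidrs):
    """Collapse adjacent CIDRs into the fewest covering networks."""
    nets = (ipaddress.ip_network(c) for c in cidrs)
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

squashed = {group: squash(azs.values()) for group, azs in grouped.items()}
print(squashed)
```

Three /26 ranges per group collapse to two entries each, which keeps security group rules and route definitions shorter.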