software engineering Aug 27, 2018

Deploying Multiple Environments with Terraform

Terraform is a great tool for provisioning immutable infrastructure. It allows you to create infrastructure programmatically in a declarative manner while tracking the state of the infrastructure. For more complex deployments, and for more reusable code, one has to make Terraform work for them. Luckily, Terraform provides most of the tools needed to make this possible. In this post I will show how I made Terraform work for me while successfully deploying infrastructure to multiple environments.

Background

When I started this Kubernetes infrastructure project I had never used Terraform before, though I was familiar with it. Terraform is a tool by HashiCorp offered in both open source and enterprise versions. It is written in Go and uses its own DSL, the HashiCorp Configuration Language (HCL), for user interaction.

In the past, I have used numerous other tools such as Puppet, Ansible, The Foreman, and CloudFormation as well as other “roll your own” tooling around various SDKs and libraries. With this project though I wanted to learn something new; enter Terraform.

This project had a few simple requirements, specifically that we needed to be able to deploy infrastructure to the following environments:

  • Development
  • QA
  • Staging
  • Production

It also needed to be able to run from a developer's machine and deploy personal infrastructure, and the user experience needed to be simple. This meant that the command line arguments needed to be kept to a minimum and no complex configuration files could be required.

Building the Tooling

As stated above, I wanted to learn a new tool for this project, something I was familiar with but had never used hands-on. Another challenge I set for myself was to implement DRY principles wherever possible.

DRY principles are aimed at reducing repetition in patterns and code. Adhering to this meant that I would not have duplicate Terraform code for each environment I wanted to deploy. I would write the minimal amount of code necessary to achieve the goal by working toward maximum reusability.

After a lot of reading on Terraform, I settled on a handful of features to meet the requirements. My Terraform project was going to be built around remote state, workspaces, modules, locals, and variables.

Workspaces

Terraform workspaces are the successor to Terraform environments. Workspaces allow you to separate your state and infrastructure without changing anything in your code. This was a great start to building my tool since I wanted the exact same code base to deploy to multiple environments without overlap. I decided that all workspace names would be supported by the tool and each workspace would be considered an environment.

For the developers’ personal infrastructure I was not going to put restrictions on naming based on some arbitrary convention I came up with. I decided that any workspace name a developer wanted to use would be supported and would be considered a development environment.
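
As a rough illustration of how the selected workspace can flow into resources, the sketch below uses the built-in terraform.workspace value. The resource and the bucket name are hypothetical and not part of the original project; the syntax follows the Terraform 0.11-era style used in the rest of this post.

```hcl
# Hypothetical resource, for illustration only.
resource "aws_s3_bucket" "artifacts" {
  # terraform.workspace evaluates to the name of the currently selected
  # workspace, so the same code yields a different bucket per workspace.
  bucket = "example-artifacts-${terraform.workspace}"

  tags = {
    Workspace = "${terraform.workspace}"
  }
}
```

A workspace is created with terraform workspace new <name> and switched with terraform workspace select <name>; the code itself stays untouched.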

Remote State

Keeping Terraform state in a remote file is a must. I shouldn’t have to go into the reasons why. Take it from me, just do it. I decided to store my project’s state in S3 since it was quick and easy to set up. There are two key points to keep in mind when using a remote state file.

  1. This is the file Terraform uses to calculate changes in the infrastructure. If this isn’t kept in a single centralized location then you aren’t going to be managing your infrastructure responsibly.
  2. Each of my environments required a separate state file to avoid collisions. With workspaces I was able to prepend the workspace name to the path of the state file, ensuring each workspace (environment) had its own state. In turn, this meant that each environment had its own separate infrastructure, including developers' personal clusters.
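
A minimal sketch of an S3 backend that behaves this way might look like the following. The bucket, key, region, and prefix values are placeholders I made up, not the project's real settings. With the S3 backend, state for any non-default workspace is written under <workspace_key_prefix>/<workspace name>/<key>, which is one way to get per-workspace state paths.

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state"      # placeholder bucket name
    key    = "kubernetes/terraform.tfstate" # placeholder state path
    region = "us-east-1"                    # placeholder region

    # Optional; when omitted the prefix defaults to "env:".
    workspace_key_prefix = "workspaces"
  }
}
```

Backend blocks cannot use interpolation, so the per-workspace path comes from the backend itself rather than from an expression in the key.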

Locals

Terraform has a feature called local values (locals). A local has a name that is assigned an expression, such as a map lookup or a ternary. These values are comparable to the local variables of a function. I'm sure there are other great use cases for them, but I mainly used them as inputs for other modules.

Variables

Just like any tool or language, Terraform supports variables. All of the typical data types are supported which makes them really useful and flexible. One nice thing I found in Terraform is the concept of an input variable. An input variable is essentially a parameter for a Terraform module. These input variables can be used to populate configuration input for resources or even determine the value of other variables.

In my project I decided that I was going to make unique use of variables in Terraform. I was going to make variables for determining the environment and the size of the infrastructure, as well as a dedicated module to house all configuration variables for the resources.

Modules

If Terraform code couldn’t be reusable and modular it wouldn’t be worth using. So, like every other configuration tool out there today, Terraform supports modules. Modules are nothing more than generic, highly parameterized code that can easily be reused in multiple use cases. For my use case, I made a variables module as mentioned above.

Putting It All Together

The first thing I did was create a set of variables for the project. These variables lived in a variables.tf file. I created a variable mapping the Terraform workspace name to the environment name. This created a set of "known environments" which allowed the support for dev, qa, staging, and production, but did not allow the support of personal developer infrastructure.

variable "workspace_to_environment_map" {
  type = "map"
  default = {
    dev     = "dev"
    qa      = "qa"
    staging = "staging"
    prod    = "prod"
  }
}

With the environment now being known, the next variable was to map the environment to the size of the infrastructure. For this I went with the good old “t-shirt sizes”. Each of these sizes will map to different sized instances for the infrastructure. For example, small might be t2.small and medium might be t2.medium.

variable "environment_to_size_map" {
  type = "map"
  default = {
    dev     = "small"
    qa      = "medium"
    staging = "large"
    prod    = "xlarge"
  }
}

Now that the “known environments” were supported, I needed to support developers’ personal infrastructure. To do that I created a variable mapping Terraform workspace to size. Without surrounding context this doesn’t make much sense, but trust me, it’s coming.

variable "workspace_to_size_map" {
  type = "map"
  default = {
    dev = "small"
  }
}

Now that I had the base variables, I needed to find a way to use them in the code base…enter locals.

I created two locals in main.tf, one for environment and one for size. Each has its own expression to calculate the correct value.

locals {
  environment = "${lookup(var.workspace_to_environment_map, terraform.workspace, "dev")}"
  size        = "${local.environment == "dev" ? lookup(var.workspace_to_size_map, terraform.workspace, "small") : var.environment_to_size_map[local.environment]}"
}

Let’s break this down a bit before moving on…

The environment local value is a lookup of the workspace_to_environment_map using the Terraform workspace name. If the Terraform workspace name does not exist in the map, it will default to dev for the environment.

The size local value is a ternary with a lookup and uses the aforementioned environment local value. If the environment local value is dev, then it will look up the workspace_to_size_map using the Terraform workspace name. If it does not exist in the map, it will default to size small. If the environment local value is not dev, then it will get the value of the environment_to_size_map for the value of the environment local value.
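
To make the two expressions concrete, here is how they would evaluate for a few workspace names, given the default maps shown earlier. The workspace name alice is a hypothetical personal workspace, and the two outputs are only an assumed debugging aid, not part of the original project.

```hcl
# Workspace   local.environment   local.size
# ---------   -----------------   ----------
# prod        prod                xlarge  (from environment_to_size_map)
# qa          qa                  medium  (from environment_to_size_map)
# dev         dev                 small   (from workspace_to_size_map)
# alice       dev                 small   (in neither workspace map, so both
#                                          lookups fall back to their defaults)

# Assumed helper outputs for inspecting the computed values.
output "environment" {
  value = "${local.environment}"
}

output "size" {
  value = "${local.size}"
}
```

If a developer needed bigger instances, adding an entry such as alice = "medium" to workspace_to_size_map would presumably override the default size for that one workspace, which is where that third map earns its keep.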

Just like that, we now have support for all “known environments” as well as any workspace name a developer wants to give their personal infrastructure. Believe it or not, that was the hardest part of this. So the hard work is done and now the only thing left is building the variables module.

Since all of the variables for my project were configuration values, I chose to keep them completely separate from the project code. This allows the configuration to be updated without touching the actual code base.

In main.tf I declared a variables module using a GitHub URL as the source. The module has two inputs configured, environment and size, which are the values of the locals described above.

module "variables" {
  source      = "git::https://github.com/project/config//variables"
  environment = "${local.environment}"
  size        = "${local.size}"
}

Once this is declared, any variable in the module can be used in the project by referencing it as module.variables.<variable name>. This allows the project to dynamically retrieve the proper values for all variables thanks to the inputs that are passed to the module.

In the variables module code there is an inputs.tf which declares the input variables for the module. These inputs are used to look up the proper values.

variable "environment" {
  description = "The cluster deployment environment"
}
variable "size" {
  description = "The size of the instances"
}

When building out the variables in the module I needed to take into consideration which variables needed to be retrieved based on environment, vs which variables needed to be retrieved based on size.

An example of writing a variable based on environment could be subnet ids.

variable "subnet_map" {
  description = "A map from environment to a comma-delimited list of the subnets"
  type = "map"
  default = {
    dev     = "subnet-c59403abe,subnet-69483bdb33c"
    qa      = "subnet-e48unjd9a1,subnet-c085uhd93a4"
    staging = "subnet-65489uuhfn9,subnet-448hjdh86b"
    prod    = "subnet-6dfjn2344f,subnet-0f4u3bjbd47"
  }
}
output "subnets" {
  value = ["${split(",", var.subnet_map[var.environment])}"]
}

An example of writing a variable based on size could be instance type.

variable "instance_type_map" {
  description = "A map from size to the type of EC2 instance"
  type = "map"
  default = {
    small  = "t2.large"
    medium = "t2.xlarge"
    large  = "m4.large"
    xlarge = "m4.xlarge"
  }
}
output "instance_type" {
  value = "${var.instance_type_map[var.size]}"
}

Once all of the variables are created in the module, they can now be referenced in the Terraform code as inputs to resources and other modules. This makes the core code base completely dynamic when deploying infrastructure to different environments that have different configurations and requirements.
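
For instance, a resource in the main project might consume the module outputs as sketched below. The aws_instance resource, its name, and the AMI ID are hypothetical placeholders; only the instance_type and subnets outputs come from the variables module shown above.

```hcl
# Hypothetical consumer of the variables module outputs.
resource "aws_instance" "worker" {
  ami = "ami-00000000" # placeholder AMI ID

  # Resolved from the size passed into the variables module.
  instance_type = "${module.variables.instance_type}"

  # Resolved from the environment; element() picks the first subnet in the list.
  subnet_id = "${element(module.variables.subnets, 0)}"
}
```

Nothing in this resource names an environment, so the same block serves dev, qa, staging, prod, and any personal workspace.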

The Outcome

After building out this design and running the Kubernetes cluster through some development cycles, it appears to be working well and reliably for all environments. The Kubernetes cluster has been running in production with infrastructure built using Terraform in a CI/CD environment for a little over four months without issue.

Maybe you’re using Terraform and this is a new design pattern that you could implement in your code base. Maybe you’re using another tool and this has sparked your interest in exploring how Terraform could make its way into your environment. Either way, I would encourage you to explore Terraform and how others are using it. You never know when adopting a new tool will pay dividends in the long run.

Chris Pisano
Senior Software Engineer

DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2018 Capital One.