Introduction to Terraform

September 30, 2018, by kex

Basics

Terraform is a tool used to create cloud infrastructure from code. It is quickly becoming the industry standard and has little in the way of competition at this time. Let’s go over some key points:

  • Created and maintained by HashiCorp, a company known for making great quality tools in the DevOps space
  • Open source (written in Go)
  • Written and operated in a declarative style
  • Uses the HashiCorp Configuration Language, or HCL (a DSL with fairly simple syntax)

As we mentioned before, we will use Terraform to define and deploy our infrastructure as code. Defining our infrastructure as code gives us significant benefits over the ‘old’ manual ways. To name a few:

  • We can version control our infrastructure
  • We can create repeatable patterns for infrastructure the same way we do with software
  • A common language for code lowers the overhead when introducing new staff to an environment

The Terraform binary acts as an interpreter for the various cloud provider APIs. In very basic terms, it takes instructions written in HCL and translates them into calls against the specified cloud provider’s API endpoint, which creates the resources described in the HCL.

So far there are two components we’re interested in: our HCL files and the Terraform binary. Let’s look a bit closer at these, starting with the Terraform binary.

When we execute the Terraform binary we will get an output like this:

Terraform binary options

These are all the commands supported by Terraform. The main ones we need to be concerned with are ‘init’, ‘fmt’, ‘plan’, ‘apply’ and ‘destroy’. Here is a brief overview:

  • Init: Parses your HCL files and pulls in any plugins/dependencies that Terraform needs to execute them (these are stored in a .terraform directory in the same directory you are running the command from)
  • Fmt: Formats your code to make sure it conforms to HCL standards (whitespace, indentation, etc.)
  • Plan: Parses your HCL files and prints out what changes Terraform would make if you ran ‘apply’
  • Apply: Applies the changes specified in your HCL files to the cloud target
  • Destroy: Destroys all the resources created by the current project (or, more precisely, everything referenced in the state file)

Your workflow will run in the same order. Typically you will download the HCL files from version control, navigate to the root directory and run terraform init to gather dependencies, followed by terraform fmt to ensure the whitespace is consistent. Then run terraform plan, inspect its output and run terraform apply. Wait to see if there are any errors, then finally check your cloud service to see if the infrastructure has been created successfully.
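
As a quick reference, the whole loop looks something like this:

terraform init      # pull the required provider plugins into .terraform/
terraform fmt       # normalise whitespace and indentation in your HCL files
terraform plan      # preview the changes that apply would make
terraform apply     # create or update the resources for real
terraform destroy   # tear everything down when you are finished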

Let’s move on to writing our actual HCL files so we have something to execute. At the time of writing, Terraform uses HCL version 1; this is scheduled to be replaced by HCL version 2 at some point in the future, which may bring significant changes to the language.
To start with, let’s create a directory to hold all our files; I will just call mine terraform.

The first thing we need to create is our ‘provider’ definition. In Terraform, a provider is the cloud platform that we are building our infrastructure on. For now we will add this to its own file; I will call mine provider.tf and add the following:

provider "aws" {
region                  = "eu-west-1"
shared_credentials_file = "~/.aws/credentials"
profile                 = "personal"
}
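
The ‘personal’ profile above assumes a matching entry in my ~/.aws/credentials file; as a sketch, it would look something like this (the key values are placeholders):

[personal]
aws_access_key_id     = AKIA...
aws_secret_access_key = ...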

Note that the ‘profile’ setting is not necessary to include. There are other ways to store secrets/credentials in Terraform, which we will discuss another time. Now that we’ve set up the provider, we need Terraform to pull in everything it needs to run against AWS. To instruct Terraform to pull in these requirements, run terraform init from the project directory; you should see something like the following:

Terraform init output

We can see that Terraform has now created a directory called .terraform that contains everything it needs to run against AWS.

The result of running terraform init

Now that we have our project initialised and our provider defined, we can start to write our actual code. First off we will create a VPC, so create another file called vpc.tf and populate it with the following:

resource "aws_vpc" "core" {
cidr_block            = "10.0.0.0/16"
enable_dns_hostnames  = true

tags {

Name                = "core"

terraform          = "true"

}

}

This code block requires a bit of explaining, so let’s break it down piece by piece:

  • resource: This is where we define our object type. This object is an example of type ‘resource’, but it could be ‘variable’, ‘data’ etc (see the sketch after this list)
  • "aws_vpc": This is our resource type, or how Terraform sets the context for what we are trying to create; in this example it tells Terraform we need to create a VPC in AWS
  • "core": This is the name of our object, nested within ‘aws_vpc’; it is how we will reference our object in the overarching dataset
  • cidr_block and enable_dns_hostnames: These are the parameters or properties of our object, where we define the details of the resource we plan to deploy
  • tags: Another nested block, this one defining the tags for our AWS VPC resource
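
For comparison, here is a minimal sketch of the other object types mentioned above (the names are illustrative and not part of our project):

# A variable: an input value that other blocks can reference as "${var.vpc_cidr}"
variable "vpc_cidr" {
  default = "10.0.0.0/16"
}

# A data source: reads existing information from the provider instead of creating anything
data "aws_availability_zones" "available" {}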

When writing HCL files we will normally use the Terraform documentation as a reference; it would be almost impossible to remember the sheer number of objects, resources and configurations you can put into your files. For this reason the Terraform documentation is laid out more like reference material than an actual usage guide. Here is a link to the AWS provider documentation for you to have a look at.

Now we can attempt to execute our Terraform code, but before we do that we need to run a plan. Save the file, go back to the project root and run terraform plan. You should see something like the following:

Expected output of running terraform plan

Great, we know our code will now create a VPC. Let’s tidy up our code a bit: run terraform fmt to ensure all the files in our directory fit the standard HCL formatting, and check the files after running the command to see how they ‘should’ look. Now that we’re happy with our file formatting, let’s execute the code for real and create our AWS resources. Run terraform apply. You should get something like this as the output; make sure to type ‘yes’ in full if you are ready to have the resources created:

Expected output of 'terraform apply'

After entering ‘yes’, Terraform will connect to the AWS API using the credentials it finds in your ~/.aws/credentials file and then create the resources, which can take a while depending on what is being created. When the resources have been created we can check their status using the AWS CLI or the console; we should see our resources like so:

Shows the VPC we created
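
If you prefer the CLI, a command along these lines should show the new VPC (assuming the same ‘personal’ profile and region as our provider block):

aws ec2 describe-vpcs --profile personal --region eu-west-1 --filters Name=tag:Name,Values=core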

At this point we should note that Terraform is declarative: if we run ‘apply’ again without changing anything, Terraform will take no action, and this is expected behaviour. However, if we rename our VPC resource in the HCL file, Terraform will delete the VPC and recreate it under the new name; this is true for any resource.
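
For example, changing only the resource label is enough to force a replacement (‘main’ here is an arbitrary new name):

# Renaming "core" to "main" makes Terraform plan a destroy of
# aws_vpc.core and a create of aws_vpc.main, even though the
# settings are identical.
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags {
    Name      = "core"
    terraform = "true"
  }
}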

So how does Terraform know what I have in the cloud? The answer is in the ‘state file’, which will have been created under the name terraform.tfstate in the same directory in which you ran terraform apply. The state file contains JSON that represents the resources we currently have in AWS that Terraform is aware of. Here is an example of state file contents:

Terraform state file contents
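
You rarely need to read the JSON by hand; Terraform can summarise the state for you:

terraform state list   # list the address of every resource in the state
terraform show         # print a human-readable view of the state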

The state file is an incredibly important concept in Terraform. Here are a few things to note, and some best practices:

  • The state file can be edited manually to represent resources that already exist in your cloud environment, so that Terraform becomes aware of them without having to recreate them
  • You can, and should, have multiple state files for a project. As your project grows you don’t necessarily want environments to have shared state; for example, if you have one state file representing both ‘production’ and ‘nonprod’, then a change to nonprod could end up causing issues in prod. Read Charity’s blog here to find out about a real world example of this issue
  • If you lose your state file, Terraform has no way to recognise what resources it has created or manages in the cloud
  • Normally you want to use remote state, where the state file is stored in some kind of remote/shared storage (such as an S3 bucket). This enables multiple engineers to work using Terraform without having to worry about dirty state files being put into version control (see the backend sketch after this list)
  • The state file stores secrets in plaintext; you should never put any kind of secrets into your state file, or at the very least secure the state file somehow to ensure nobody can access it in its plaintext form
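
As an illustration, remote state in an S3 bucket is configured with a ‘backend’ block like this (the bucket and key names are placeholders):

terraform {
  backend "s3" {
    bucket = "my-terraform-state"       # hypothetical bucket, must already exist
    key    = "core/terraform.tfstate"   # path of the state object within the bucket
    region = "eu-west-1"
  }
}

Run terraform init again after adding or changing a backend so Terraform can set up (and offer to migrate) the state.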

Now that we’ve gained a basic understanding of these concepts, we should wipe out the VPC we just created. To do this just run terraform destroy; you will see something like the following:

Example of Terraform destroy output

Enter yes and Terraform will destroy the VPC we created earlier. Running destroy is normally something you do during testing, for example to shut down environments you won’t be using overnight to save money, since it will wipe out everything it finds in the current project. If we check our state file we can now see that Terraform has cleaned itself up:

An example of the empty Terraform state file

Now we’re back where we started, with a blank state file and no resources running in AWS. That’s it for the introduction; this guide will be followed up by cloud provider specific guides, but it should have prepared you to use Terraform in any environment.