RDS provisioning: A comparison between Ansible and Terraform

I recently had to compare these two configuration managers to provision MySQL environments in RDS. I had previous Ansible knowledge but I was fairly new to Terraform. With that in mind, I put together a list of resources to see how complicated it would be to create them with Terraform compared to Ansible:

  • A security group, allowing traffic on TCP 3306
  • An RDS instance
  • A read replica for the instance created in the previous step
  • A read-only and a read-write user for the MySQL pair

Below are some highlights from this experiment.

Installation

To install Ansible on Linux (or Ubuntu in this case), I first added the Ansible repositories, and then used apt to install the tool. To install Terraform, I just downloaded the appropriate binary from www.terraform.io.
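On Ubuntu, the steps look roughly like this (the PPA name and the Terraform version in the URL are illustrative; check the official download page for the current release):

```shell
# Add the official Ansible PPA and install via apt
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible

# Terraform ships as a single binary: download, unzip and put it on the PATH
# (replace 0.11.7 with whatever version is current)
wget https://releases.hashicorp.com/terraform/0.11.7/terraform_0.11.7_linux_amd64.zip
unzip terraform_0.11.7_linux_amd64.zip
sudo mv terraform /usr/local/bin/
```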

Dependencies

In order for Ansible to interact with the AWS APIs, the boto3 Python module needs to be present on your system. The MySQL-python module is also required to interact with MySQL. Both can be installed with pip, and the process should be fairly straightforward, unless you don’t have pip installed or, as in my case, MySQL-python complains about a missing libmysqlclient library version, which I had to install using apt.
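A minimal sketch of that dependency setup (package names are as used at the time on Ubuntu; on newer releases the client library package may differ):

```shell
# pip itself, if missing
sudo apt install -y python-pip

# AWS API bindings and the MySQL driver used by Ansible's mysql_* modules
pip install boto3 MySQL-python

# If MySQL-python fails to build, the MySQL client development library
# is usually the missing piece
sudo apt install -y libmysqlclient-dev
```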

Terraform handles requirements a bit more elegantly, as it attempts to fetch them automatically when you run terraform init. It first checks the Terraform configuration for the specific plugins required and then downloads the files from HashiCorp’s servers. If there is no internet connection, the plugins can be downloaded manually from releases.hashicorp.com and unzipped into the working directory, under .terraform/plugins/<os>_<architecture>.
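For example, pinning the AWS provider in the configuration tells terraform init exactly which plugin to fetch (the version constraint below is illustrative):

```hcl
provider "aws" {
    # terraform init reads this constraint and downloads a matching
    # aws provider plugin into .terraform/plugins/<os>_<architecture>
    version = "~> 1.20"
    region  = "us-east-1"
}
```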

Code

To keep the post size under control, I’ve uploaded most of the sample code to Github. The Ansible play can be found here. Terraform configuration files can be found here.

You will notice that the code is fairly similar in structure: both take input variables such as instance_name or the VPC ID, and both register the output of each resource on execution so it can be used as input for the next task.
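In Terraform, for instance, the attributes of one resource feed the next one through interpolation; a simplified sketch (required arguments such as engine and instance class are omitted for brevity):

```hcl
variable "instance_name" {}

resource "aws_security_group" "rds_sg" {
    name = "${var.instance_name}-sg"
}

resource "aws_db_instance" "master" {
    identifier             = "${var.instance_name}"
    # the security group ID created above is consumed here, which also
    # creates an implicit dependency between the two resources
    vpc_security_group_ids = ["${aws_security_group.rds_sg.id}"]
}

output "endpoint" {
    value = "${aws_db_instance.master.endpoint}"
}
```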

Download the Ansible play from the link above to a local directory and execute it using a simple bash script similar to the following:

export EC2_REGION=us-east-1
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxx

ansible-playbook -vv --extra-vars "instance_name=ansible-rds-poc admin_username=foo admin_password=Foobar!9 environment_name=dev vpcid=vpc-xxxxx" create_rds.yml

To execute the Terraform configuration, download both .tf files from the link above and put them into a local directory called rds_instance. Then create a test.tf file with the following content:

provider "aws" {
    access_key = "xxxxxxxxxxxxxxxxxxxxxx"
    secret_key = "xxxxxxxxxxxxxxxxxxxxxxxxx"
    region = "us-east-1"
}

module "rds-instance" {
    source = "./rds_instance"
    instance_name = "tf-rds-poc"
    vpcid = "vpc-xxxxx"
    admin_username = "foo"
    admin_password = "Foobar!9"
}

It is worth mentioning that in a production environment, we would probably avoid including the AWS keys in the configuration file directly. Storing them in a .tfvars file is the right thing to do, as these can be encrypted and are usually listed in .gitignore.
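A sketch of that approach: declare the variables in the configuration, reference them in the provider block, and keep the actual values in a terraform.tfvars file that stays out of version control:

```hcl
# declared in the configuration (e.g. variables.tf)
variable "aws_access_key" {}
variable "aws_secret_key" {}

provider "aws" {
    access_key = "${var.aws_access_key}"
    secret_key = "${var.aws_secret_key}"
    region     = "us-east-1"
}

# terraform.tfvars -- listed in .gitignore, loaded automatically by Terraform
# aws_access_key = "xxxxxxxxxxxxxxxxxxxxxx"
# aws_secret_key = "xxxxxxxxxxxxxxxxxxxxxxxxx"
```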

The directory structure should look like this:

./test.tf
./rds_instance/main.tf
./rds_instance/variables.tf

Finally, just execute terraform init and then terraform apply to create the resources. You can also run terraform plan to preview what Terraform is planning to do.

Default behavior

A few points to highlight in favor of Terraform: first, it automatically detects implicit resource dependencies from how variables are passed between resources. Based on that information, it decides which resources can be created in parallel and which need to be created sequentially. Furthermore, it waits for a resource to become active by checking its status periodically, as opposed to Ansible, which requires the wait clause to be included explicitly.
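With Ansible’s rds module, for example, the wait has to be requested explicitly or the play moves on as soon as the API call succeeds (a sketch; some required parameters and the region are omitted):

```yaml
- name: Create RDS instance and block until it is available
  rds:
    command: create
    instance_name: ansible-rds-poc
    db_engine: MySQL
    size: 10
    instance_type: db.t2.micro
    username: foo
    password: Foobar!9
    # without these two lines, the task returns before the instance is active
    wait: yes
    wait_timeout: 1800
  register: rds_out
```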

Pulling code from a repo

Another interesting Terraform feature is the possibility of pulling your configuration directly from a repository such as GitHub, Bitbucket or an S3 bucket. If you replace the source parameter with something like git::ssh://git@github.com/gabocic/gitests.git//rds_instance, the code for the Terraform module will be pulled from my public GitHub repository automatically. If any changes are made to the files, you can update the local copy of the code by running terraform get -update.
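In the test.tf from the earlier example, that would look like this:

```hcl
module "rds-instance" {
    # the double slash separates the repository from the subdirectory
    # that holds the module
    source = "git::ssh://git@github.com/gabocic/gitests.git//rds_instance"

    instance_name  = "tf-rds-poc"
    vpcid          = "vpc-xxxxx"
    admin_username = "foo"
    admin_password = "Foobar!9"
}
```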

Destroying resources

Terraform also gives you the ability to destroy the resources listed in a configuration without having to write any additional code. By just running terraform destroy, the resources created by the configuration are removed, again taking dependencies into consideration.

Provisioning similar environments

Another evaluation point was the ability of each technology to provision similar environments by running the same code with different input variables automatically. To achieve this with Ansible, and based on the example provided above, the following inventory file was used:

[rds_instances]
enviro1 ansible_connection=local instance_name=ansible-poc-1 environment_name=enviro1
enviro2 ansible_connection=local instance_name=ansible-poc-2 environment_name=enviro2
enviro3 ansible_connection=local instance_name=ansible-poc-3 environment_name=enviro3

[rds_instances:vars]
admin_username=foo
admin_password=Foobar
vpcid=vpc-xxxxx

By just setting hosts: rds_instances within the Ansible play and making sure the inventory file is in the default path or passed explicitly, we can provision three copies of the environment using different instance_name values. Of course, passing the variable values to ansible-playbook explicitly is no longer required.
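With the inventory saved as, say, rds_instances.ini (the filename is arbitrary), the invocation reduces to:

```shell
export AWS_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxx

# one play, three environments: Ansible runs the tasks once per inventory host
ansible-playbook -i rds_instances.ini create_rds.yml
```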

To obtain something equivalent with Terraform, you can make use of workspaces. A full explanation of how they work is beyond the scope of this blog post, but they are basically a way to keep different deployments independent by maintaining separate state data. You can add workspaces by running terraform workspace new <workspace_name> and then move between them with terraform workspace select <workspace_name>.

Once the workspaces are created, you just need to check which one your configuration is running under and change the input variables accordingly. Two modifications are needed for this: first, create a variables.tf file at the same level as test.tf and declare a map holding all workspace names:

variable "workspaces" {
    type = "map"
    default = {
       enviro1 = "enviro1"
       enviro2 = "enviro2"
       enviro3 = "enviro3"
       dev = "dev"
    }
}

In test.tf, we need to add some logic to detect which workspace the configuration is being launched from:

...
    region = "us-east-1"
}

locals {
    environment = "${lookup(var.workspaces, terraform.workspace, "dev")}"
}

module "rds-instance" {
    ...
    instance_name = "${local.environment}"
}

Once the changes are implemented, the instance_name variable will match the workspace name at runtime. Using this value as a key, you can access different maps holding values for each environment.
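For example, a map of per-environment instance sizes could be keyed on the same value (the instance_type input is hypothetical and would need to exist in the module):

```hcl
variable "instance_sizes" {
    type = "map"
    default = {
        enviro1 = "db.t2.micro"
        enviro2 = "db.t2.small"
        enviro3 = "db.t2.medium"
        dev     = "db.t2.micro"
    }
}

module "rds-instance" {
    ...
    # the same workspace-derived key selects the size for this environment
    instance_type = "${lookup(var.instance_sizes, local.environment, "db.t2.micro")}"
}
```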

Wrapping up

As a first point, and as mentioned in its own documentation, Terraform was not meant to replace configuration managers, although some functionality overlaps. Terraform describes itself as a “tool for building, changing, and versioning infrastructure safely and efficiently”, and that’s where its sweet spot is: infrastructure as code. Configuration managers like Ansible are arguably more flexible, in the sense that you can both provision and configure resources with them, while you can’t really configure a resource with Terraform alone. You can, however, integrate it with configuration managers through provisioners to set up resources after they are created.
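A common pattern is a local-exec provisioner that hands off to Ansible once the resource exists (the playbook name here is hypothetical):

```hcl
resource "aws_db_instance" "master" {
    # ... resource arguments ...

    # once the instance is created, run an Ansible play against it
    provisioner "local-exec" {
        command = "ansible-playbook -e db_endpoint=${self.endpoint} configure_mysql.yml"
    }
}
```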

Now, specifically for provisioning RDS environments, Terraform seems simpler to set up, handles dependencies and parallelism automatically, and lets you destroy the resources you created without writing additional code.
