Creating your serverless static website in Terraform, part 2

Let's review the architecture we put together in Part 1 (check out Part 1 if you missed it):
- a main website that is hosted at a www. subdomain
- an HTML/JS/CSS app that is hosted in an S3 bucket
- a CloudFront distribution to give us an HTTPS URL
- a certificate that will be used with the CloudFront distribution for that domain
- a series of vanity URLs that redirect to the main website

Let's start building some Terraform.

Create your project directory

  • you need a main.tf to hold all your resources
  • you need a variables.tf to declare the inputs to your infrastructure
  • you need a var-file (a .tfvars file) to hold the values of those inputs
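
Your project will end up looking something like this (the directory name my-site is just an example, and var-files is the convention this post follows):

my-site/
├── main.tf
├── variables.tf
└── var-files/
    └── my-site.tfvars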

Create your initial tf files

The main website we want to build is www.example.com.

resource "aws_s3_bucket" "logs" {
  bucket = "${var.site_name}-site-logs"
  acl = "log-delivery-write"
}

resource "aws_s3_bucket" "www_site" {
  bucket = "www.${var.site_name}"

  logging {
    target_bucket = "${aws_s3_bucket.logs.bucket}"
    target_prefix = "www.${var.site_name}/"
  }

  website {
    index_document = "index.html"
  }
}

variables.tf

variable "site_name" {
  description = "The domain name of the site, e.g. mysite.com"
}

var-files/my-site.tfvars

site_name = "mysite.com"
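
With these files in place you can plan and apply, pointing Terraform at your var-file (note that Terraform expects the .tfvars extension for var-files):

terraform init
terraform plan -var-file=var-files/my-site.tfvars
terraform apply -var-file=var-files/my-site.tfvars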

I believe we learn best when we see how things break! And there's an opportunity to break things here.

Why put access logs in a separate bucket?

Let's make things "easier" by putting access logs in the same bucket as the website. Why should I need 2 buckets?

resource "aws_s3_bucket" "www_site" {
  bucket = "www.${var.site_name}"

  logging {
    target_bucket = "www.${var.site_name}"
  }

  website {
    index_document = "index.html"
  }
}

Plan and apply your terraform, then visit your website directly via the S3 website URL. Refresh the page a few times and then look at the contents of your S3 bucket.

  • Your access logs are now also public because they can be accessed via your website URL.
  • Your access logs have lengthy and obscure file names and they clutter your bucket. If you deploy using aws s3 sync --delete to your bucket, you'll wind up deleting your access logs every time you deploy.
  • Wait 10 minutes and review the activity in your access logs. You'll notice that writing an access log file generates an access log entry for the act of writing that file, and so on, recursively.

I've done it before. I was so confused. Use a separate bucket.

Create your certificate

We need a certificate, which we can get for free via ACM. One note: a certificate used with CloudFront must be created in us-east-1. And there's a catch:

resource "aws_acm_certificate" "cert" {
  domain_name = "www.${var.site_name}"
}

Applying this Terraform resource will begin the act of requesting a certificate, but the certificate needs to be approved and validated out-of-band. There are two ways of handling this scenario:

  • Don't manage the certificate with Terraform. Create it manually and drop the ARN of the certificate wherever you need to use it.
  • Manage the certificate with Terraform, but create it manually and then use terraform import to bring the resource under management after it's been approved and issued.
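
With the second approach, the import takes the certificate's ARN as the resource ID (the ARN below is a placeholder):

terraform import aws_acm_certificate.cert arn:aws:acm:us-east-1:123456789012:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx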

Now we can create the CloudFront distribution and finish connecting the pieces together:

resource "aws_cloudfront_origin_access_identity" "origin_access_identity" {
  comment = "cloudfront origin access identity"
}

resource "aws_cloudfront_distribution" "website_cdn" {
  enabled      = true
  price_class  = "PriceClass_200"
  http_version = "http1.1"
  aliases = ["www.${var.site_name}"]

  origin {
    origin_id   = "origin-bucket-${aws_s3_bucket.www_site.id}"
    domain_name = "${aws_s3_bucket.www_site.bucket_regional_domain_name}"

    s3_origin_config {
      origin_access_identity = "${aws_cloudfront_origin_access_identity.origin_access_identity.cloudfront_access_identity_path}"
    }
  }

  default_root_object = "index.html"

  default_cache_behavior {
    allowed_methods = ["GET", "HEAD"]
    cached_methods  = ["GET", "HEAD"]
    target_origin_id = "origin-bucket-${aws_s3_bucket.www_site.id}"

    min_ttl     = "0"
    default_ttl = "300"  // consider raising to 3600 once the site is stable
    max_ttl     = "1200" // consider raising to 86400 once the site is stable

    // This redirects any HTTP request to HTTPS. Security first!
    viewer_protocol_policy = "redirect-to-https"
    compress               = true

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    acm_certificate_arn      = "${aws_acm_certificate.cert.arn}"
    ssl_support_method       = "sni-only"
  }

}

A few important items here:

  • aws_cloudfront_origin_access_identity creates and manages the origin access identity for you in Terraform. You can either share one origin access identity across multiple distributions, or use one origin access identity per distribution.
  • The viewer_protocol_policy of redirect-to-https enforces HTTPS for site visitors.

Lock down the S3 bucket

The bucket policy lets us define the security on the bucket. In this scenario, we can keep the bucket private and only accessible by Cloudfront by using an Origin Access Identity. We can express this in the bucket policy:

main.tf

data "template_file" "bucket_policy" {
  template = "${file("bucket_policy.json")}"

  vars {
    # the policy principal needs the OAI's IAM ARN, not its cloudfront path
    origin_access_identity_arn = "${aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn}"

    # pass just the bucket name; the policy template builds the full ARN
    bucket = "www.${var.site_name}"
  }
}

resource "aws_s3_bucket" "www_site" {
  bucket = "www.${var.site_name}"
  policy = "${data.template_file.bucket_policy.rendered}"
  ...
}

bucket_policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OnlyCloudfrontReadAccess",
      "Principal": {
        "AWS": "${origin_access_identity_arn}"
      },
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}

Create the DNS record

main.tf

resource "aws_route53_record" "www_site" {
  zone_id = "${data.aws_route53_zone.site.zone_id}"
  name = "www.${var.site_name}"
  type = "A"
  alias {
    name = "${aws_cloudfront_distribution.website_cdn.domain_name}"
    zone_id  = "${aws_cloudfront_distribution.website_cdn.hosted_zone_id}"
    evaluate_target_health = false
  }
}
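
The record above looks up the hosted zone via a data source, which we also need to declare. Assuming your hosted zone is named after your apex domain, it looks like this:

data "aws_route53_zone" "site" {
  name = "${var.site_name}."
}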

Apply your terraform and test it out.

Writing automated tests for your infrastructure

Like any programming project, you want to make sure you have tests for your code. But we're writing Terraform, and there's no test runner for Terraform. Or is there?

All we need to test our Terraform is a language that lets us make web requests. Here we'll use Ruby with serverspec, creating two sets of tests to make sure everything is working for mysite.com.

Create the tests to verify your bucket configuration

s3_bucket_spec.rb

require 'serverspec'

set :backend, :exec

context 's3 bucket' do

  describe command('aws s3 ls s3://www.mysite.com') do
    its(:stderr) { should match /Access Denied/ }
  end

  describe command('curl -i https://s3.amazonaws.com/www.mysite.com') do
    its(:stdout) { should match /Access Denied/ }
  end
end

context 'cloudfront' do
  describe command('curl -i https://www.mysite.com') do
    its(:stdout) { should match /200 OK/ }
  end

  describe command('curl -i http://www.mysite.com') do
    its(:stdout) { should match /301 Moved Permanently/ }
  end
end

Run the tests with rspec s3_bucket_spec.rb to validate the infrastructure you just created.

Tried to learn DevOps on AWS and gotten lost?

If you liked the content of this post and want to learn more, I'm putting together an online course where I'll walk you through real examples of how I've practiced DevOps from beginning to end, using a static website deployed to AWS S3, with a framework to help you apply it to your own workflows.

Check it out here
