This problem landed in my lap a while ago, and I've seen it come up time and time again, so I wanted to share how we went about it. Our marketing team was submitting one of our products, which has a ton of content, to a client for review. There was a long list of deep links in the product that we wanted the reviewers to see, but we needed to create shortlinks so they could embed them in a doc.
There wound up being a total of 20k shortlinks, which meant that:
- I needed to be able to upload all 20,000 in one go, so it needed a scriptable API
- whatever we used needed to stay up for at least 6 months so the links were available throughout the review period
- the links needed to work, and be available
- we needed to be able to create the redirects from a Google Sheet
- they wanted to know if/when links were clicked
We ultimately landed on a solution with just AWS that comes out dirt cheap - but there were a few things we had to consider first.
Why not just use something like bit.ly?
For this specific case, based on their pricing, we'd be in the enterprise range under the "Contact Us for a Quote" plan, which puts us north of $400/year (plus a "quick call" with an account manager) for something I only need once.
$400/year sounds steep for something that... should... be simple and cheap. 301 redirects have been around since at least 1999 (20 years ago!) - why isn't this a solved problem? (It kinda is, but it isn't.)
Even outside of enterprise-y considerations:
- Bitly comes with its own limits for the free tier, and even something like bulk redirects requires their enterprise plan.
- $30/month for their basic plan is a bit steep for some 301s, and we can do it for cheaper. (You trade off some features though, primarily analytics - if you don't care about that, fantastic!)
Apart from that, it's a good opportunity to learn the ins and outs of AWS, and it doesn't involve a lot of services. If you're familiar with CloudFormation or Terraform, you can set it all up that way.
How do you build it?
We worked out a solution using CloudFront + S3, where each object has Website-Redirect-Location metadata attached (see docs here). We built it out in an afternoon and had all 20k redirects generated and working, and it costs just a few bucks/month to run.
Let's say you own short.url and you want to be able to have short.url/2019barbecue redirect to a Facebook page.
At a high level, take a look at this architecture (images courtesy of cloudcraft.co):
You point your domain (which can be hosted anywhere, it doesn't need to be Route 53) at a CloudFront distribution that sits in front of an S3 bucket served as a website. In that bucket are many objects, each with the Website Redirect Location metadata set to our target URL - so short.url/2019barbecue redirects to example.com.
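To make the object-plus-metadata idea concrete, here's a rough sketch in Python of how rows exported from a sheet as CSV could be turned into redirect objects. A few things here are my assumptions for illustration and not our actual setup: the `slug` and `target` column names, the `short-url-redirects` bucket name, and the use of boto3, whose `put_object` call accepts a `WebsiteRedirectLocation` argument. Treat it as a sketch, not the script we ran.

```python
import csv
import io


def build_put_requests(csv_text, bucket):
    """Turn CSV rows of (slug, target) into keyword arguments for S3 put_object."""
    requests = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        slug = (row.get("slug") or "").strip().lstrip("/")
        target = (row.get("target") or "").strip()
        # Skip blank rows and targets that are not absolute http(s) URLs.
        if not slug or not target.startswith(("http://", "https://")):
            continue
        requests.append({
            "Bucket": bucket,
            "Key": slug,                        # short.url/<slug>
            "Body": b"",                        # empty object; only the metadata matters
            "WebsiteRedirectLocation": target,  # where the 301 points
        })
    return requests


def upload(requests):
    """Send each request to S3. Needs boto3 and AWS credentials."""
    import boto3  # third-party; imported here so the builder works without it

    s3 = boto3.client("s3")
    for kwargs in requests:
        s3.put_object(**kwargs)


# Hypothetical sheet export, reusing the example from above.
SHEET_CSV = """slug,target
2019barbecue,https://example.com/
"""

# To actually create the redirects (bucket name is made up):
# upload(build_put_requests(SHEET_CSV, "short-url-redirects"))
```

Keep in mind that S3 only honors this metadata when the object is requested through the bucket's website endpoint, which is why the bucket is served as a website.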
Are your redirects business critical?
Maybe you're running a launch and you need people to click a link. You are now responsible for uptime and so AWS' uptime is your uptime.
In this case, your availability is capped by the weaker of CloudFront and S3 (99.9% each), and since every request needs both, the combined figure is a little lower than either one alone. You do, however, have the benefit of a globally distributed CDN and a highly durable, replicated object store, and you can configure failover to another region in case one craps out - not to mention it's cheap-ish to implement.
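If you want to put a number on that, a common back-of-the-envelope approach for services that sit in series is to multiply their availabilities. The sketch below does that with the 99.9% figure from above; the function names are mine, and the result is a rough estimate, not a guarantee from AWS.

```python
HOURS_PER_YEAR = 365 * 24


def composite_availability(*parts):
    """Availability of a chain where every part must be up for a request to succeed."""
    total = 1.0
    for availability in parts:
        total *= availability
    return total


def downtime_hours_per_year(availability):
    """Expected hours of downtime per (non-leap) year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# CloudFront in front of S3, each taken at 99.9%.
combined = composite_availability(0.999, 0.999)
print(f"combined availability: {combined:.4%}")
print(f"expected downtime: {downtime_hours_per_year(combined):.1f} hours/year")
```

That works out to roughly 99.8%, or about 17.5 hours a year, which is worth knowing before you promise anyone that a launch link will never be down.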
Do you want analytics?
Bitly analytics are pretty neat, so that's a tradeoff: it's something you'll need to do yourself. If you already have an analytics/monitoring platform, though, you can get your data through there. For things on my personal site, I don’t care about seeing analytics - I just want to share a customized short link.
Out of the box, you get some level of CloudWatch monitoring around CloudFront, including error rates, data in/out, and total requests. (full list here)
If you want more info, then you need to dump CloudFront access logs to another S3 bucket and use Athena (the docs for that here) for some quick analytics to see things like source IP, referer, requested path, and response code.
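If you'd rather poke at a log file locally before setting up Athena, here's a small sketch that counts clicks per short link. It relies on the `#Fields:` header that CloudFront standard logs include (space-separated names, followed by tab-separated data lines), so the column order isn't hard-coded. The sample log is made up and trimmed to a handful of columns; real logs have many more.

```python
from collections import Counter


def parse_log(text):
    """Parse CloudFront-style log text into a list of dicts keyed by field name."""
    fields = []
    rows = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows


def clicks_per_path(rows):
    """Count requests that were answered with a 301, grouped by requested path."""
    return Counter(r["cs-uri-stem"] for r in rows if r.get("sc-status") == "301")


# Made-up sample with only a few of the real columns.
SAMPLE_LOG = "\n".join([
    "#Version: 1.0",
    "#Fields: date time c-ip cs-uri-stem sc-status cs(Referer)",
    "\t".join(["2019-06-01", "12:00:00", "203.0.113.10", "/2019barbecue", "301", "-"]),
    "\t".join(["2019-06-01", "12:05:00", "203.0.113.11", "/2019barbecue", "301", "-"]),
    "\t".join(["2019-06-01", "12:06:00", "203.0.113.12", "/nope", "404", "-"]),
])

for path, count in clicks_per_path(parse_log(SAMPLE_LOG)).most_common():
    print(path, count)
```

Filtering on the 301 status keeps typos and scanners hitting nonexistent paths out of your click counts.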
Any gotchas I should be aware of?
Be mindful of updating where a redirect points to. 301 redirects CAN be cached indefinitely by your browser, since the spec defines a 301 as Moved Permanently.
If someone has already clicked your link before, it's possible that they may be taken to the old location for some period of time. In practice, I've seen a long tail of requests to the old value of the redirect.
302 redirects are temporary and aren't cached by default, so they cause fewer issues. Some architectures limit you to 301 redirects, others give you the option to do a 302. In this case, CloudFront + S3 with object redirect metadata only lets you do 301s, and that's just a constraint we need to deal with unless we want to throw Lambda in the mix as well.
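For the curious, here's roughly what "throwing Lambda in the mix" could look like. This is a sketch of a Lambda@Edge handler in Python attached to a viewer-request trigger, which is my assumption about how you'd wire it up and not something we built. The `REDIRECTS` table and the five-minute `Cache-Control` value are hypothetical; in practice you'd look the target up somewhere else.

```python
# Hypothetical lookup table; a real setup would load this from somewhere.
REDIRECTS = {
    "/2019barbecue": "https://example.com/",
}


def handler(event, context):
    """Answer known short links with a 302; pass everything else through."""
    request = event["Records"][0]["cf"]["request"]
    target = REDIRECTS.get(request["uri"])
    if target is None:
        # Returning the request unchanged lets CloudFront continue to the origin.
        return request
    return {
        "status": "302",
        "statusDescription": "Found",
        "headers": {
            "location": [{"key": "Location", "value": target}],
            # Assumed value: let browsers cache the redirect for 5 minutes only.
            "cache-control": [{"key": "Cache-Control", "value": "max-age=300"}],
        },
    }
```

The upside is that changing a target takes effect quickly; the downside is one more moving part to pay for and keep running.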
If you find yourself needing to create some customized short links or redirects, consider using CloudFront + S3 to serve them using the architecture above - if you're already familiar with AWS, you get most of the benefits of an enterprise-grade solution (minus the built-in analytics) at a fraction of the cost.
I'm working on a series of posts to thoroughly explain how to set up short links on AWS. Next up, I'll be covering how to set up the short links in the console, and then how you could do it in Terraform.