(This is a paraphrasing of a talk I did at WebCamp KL in October 2012.)
(This presentation is intended to give a brief overview of some of the core AWS services and how iProperty has made use of them. It’s not particularly technical.)
What is AWS? Amazon publicly launched their web services division in 2006 – the first service out of the door was S3 (Simple Storage Service) which was quickly followed by EC2 (Elastic Compute Cloud). Although the base services have been around for 6 years or more, they’ve really only become ubiquitous in the last 3 or 4 years with many major sites (such as Instagram, Pinterest, Netflix and Reddit) relying on them.
By some estimates earlier this year, there are over 450,000 servers worldwide powering the AWS infrastructure and it’s reckoned to be a US$1b+ business.
At iProperty we’re in the process of moving many of our hosting services from our own managed infrastructure in KL to Amazon Web Services in Singapore. This talk follows the journey we’ve taken to get into AWS and some of the services we’re planning to adopt soon.
Our first step into AWS was to use S3 for storing and delivering images. As a property site we store lots of images (nearly every listing has an image and we never delete them). We used to store all of our images on network storage and ran a number of webservers to deliver those images to visitors. The problem we faced was that our disk usage kept growing and the only way to keep pace was to add more disks. However, when you add capacity you tend to add it in large steps rather than gradually. This leads to a disparity between storage demand and storage supply. The graph below illustrates this:
You can see that you may end up in a situation where you’ve not expanded fast enough and your demand is greater than your available storage. Now you have a problem – either people cannot upload new images because the storage is full or you need to delete some images that you wanted to keep. However, when you buy more storage you have the opposite problem – you’ve now paid money for new storage but you’re only using a fraction of that new capacity.
With S3 we pay for the storage we use and whenever we need new capacity it’s available. This makes capacity planning much more predictable. This elasticity of supply is a key theme and feature of many of the AWS products.
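The mismatch between stepped supply and growing demand can be sketched in a few lines of Python. This is only an illustration: the function name `storage_gap` and all of the numbers are made up for the example and are not iProperty's real figures.

```python
def storage_gap(demand, capacity):
    """Compare storage demand with installed capacity, period by period.

    Returns two lists: the shortfall (demand that does not fit) and the
    idle capacity (storage paid for but not used) in each period.
    """
    shortfall = [max(d - c, 0) for d, c in zip(demand, capacity)]
    idle = [max(c - d, 0) for d, c in zip(demand, capacity)]
    return shortfall, idle


# Hypothetical monthly demand in TB (illustrative numbers only).
demand = [2, 3, 5, 6, 8, 11]

# Self-managed storage: disks arrive in one 4 TB batch in month 4.
capacity = [4, 4, 4, 8, 8, 8]

shortfall, idle = storage_gap(demand, capacity)

# Usage-based storage: supply tracks demand, so both gaps are zero.
elastic_shortfall, elastic_idle = storage_gap(demand, demand)
```

With these numbers the self-managed setup is short of space in two months and has paid-for but unused disk in three others, while the usage-based case has neither problem.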
S3 also has the advantage that we no longer need to run webservers to deliver images – they’re just served direct from S3. In front of S3 we run CloudFront, which is Amazon’s CDN (content delivery network) product – it gives us access to a number of geographically dispersed sites where our images are served from. In our region the two key POPs (points of presence) are in Singapore and Hong Kong.
We have also used CloudFront for streaming video – it’s a very cost-effective way of doing video delivery as the data you use is the data you pay for. You no longer need to subscribe to a fixed pipe (e.g. 40Mbps) to cope with peak demand; now you just pay for the amount of data you transfer – so if you do lots of video streaming on Sunday but none on Monday, no problem – just pay for Sunday’s data usage.
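To put rough numbers on the fixed-pipe comparison, here is a small Python sketch. The traffic figures and the per-GB price are hypothetical and chosen only to show the arithmetic; they are not real CloudFront prices.

```python
def pipe_capacity_gb_per_day(mbps):
    """Most data a link of `mbps` megabits per second can carry in 24 hours.

    Result is in GB, taking 1 GB as 10^9 bytes.
    """
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second * 86_400 / 1_000_000_000


def metered_cost_cents(daily_gb, cents_per_gb):
    """Cost when you pay only for the data actually transferred."""
    return sum(daily_gb) * cents_per_gb


# Hypothetical week: no streaming Monday to Saturday, 300 GB on Sunday.
week_gb = [0, 0, 0, 0, 0, 0, 300]

# A 40Mbps pipe can carry this much per day, whether you use it or not.
daily_capacity = pipe_capacity_gb_per_day(40)

# Fraction of the fixed pipe's weekly capacity that was actually used.
utilisation = sum(week_gb) / (daily_capacity * len(week_gb))

# Metered bill at an assumed 10 cents per GB.
bill_cents = metered_cost_cents(week_gb, cents_per_gb=10)
```

In this made-up week the fixed pipe sits idle for six days and is used at under a tenth of its weekly capacity, while the metered bill covers Sunday's 300 GB and nothing else.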
The next step we took was to move our commodity services over to AWS. Commodity services are those which are non-core to your business (but usually essential to its running).
For me, the most obvious of these is DNS. When I first started with iProperty we were running our own DNS servers, which was painful and not something I wanted the team to be spending time and energy on. So we quickly switched over to a service provider who could run DNS on our behalf. A little later Amazon launched their Route 53 service, which offers a very similar service for a fraction of the cost. So it was really a no-brainer to move all of our DNS zones to AWS.
Similarly to DNS, email routing and delivery is an essential part of the business but not one which is a differentiator for us. Therefore we are moving over to SES (Simple Email Service) as we migrate sites into AWS. This is low-impact as there is an SMTP endpoint and no coding changes should be necessary. The nice thing about SES is that deliverability is handled by Amazon – they deal with all the intricacies of running an internet-facing mail server (making sure the appropriate reverse DNS records exist, for example).
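The reason an SMTP endpoint makes the move low-impact is that an application which already sends mail over SMTP only needs different connection settings. Below is a minimal Python sketch of that idea using the standard library. The host name, port and credentials are placeholders (assumptions for the example), not a real SES endpoint, and the helper names are hypothetical.

```python
import smtplib
from email.message import EmailMessage

# Placeholder settings: switching mail provider means changing these values,
# not the code that builds and sends messages.
SMTP_SETTINGS = {
    "host": "smtp.example.com",  # hypothetical relay, not a real endpoint
    "port": 587,
    "username": "smtp-user",
    "password": "smtp-password",
}


def build_message(sender, recipient, subject, body):
    """Create a plain-text email; independent of which relay delivers it."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send(message, settings=SMTP_SETTINGS):
    """Deliver a message through whichever SMTP relay is configured."""
    with smtplib.SMTP(settings["host"], settings["port"]) as server:
        server.starttls()
        server.login(settings["username"], settings["password"])
        server.send_message(message)


enquiry = build_message(
    "noreply@example.com", "agent@example.com", "New enquiry", "Hello"
)
# send(enquiry) would deliver it via the configured relay.
```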
Having already adopted some of Amazon’s services, we then moved into the meat and potatoes of their offering – EC2. Think of it as a virtual machine “in the cloud”. This is the first real illustration of computing as a utility service (comparable to other utilities like gas, power and water). Werner Vogels (Amazon’s CTO) illustrates this with the story of a brewery around the start of the 20th Century which used to employ people to run generators in order to provide power to brew beer. As you can imagine, once the power companies came knocking and offered to hook them up to mains electricity they jumped at the chance. All those staff who were generating electricity could now be employed to brew beer and add value. As an IT team, chances are you don’t want to generate your own power, so why do you want to run your own servers? Is your team’s time best spent fixing motherboards and power supplies?
The danger with EC2 is to think of it as just another server, but there are subtle differences. For example, if you’re running your own server and it stops responding, you’d go through a troubleshooting process to find out why and to recover that server. However, because your insight into what EC2 is doing is limited, a better way to recover an EC2 instance is simply to spin up a new one and turn off the old one. For this reason, it’s important that you can reliably and repeatably deploy your application and infrastructure.
EC2 is an abstraction of a server – you lose some insight into, and control over, what your servers are doing but you gain convenience and elasticity in return. One level above EC2 is RDS (Relational Database Service) which allows you to access MySQL, Oracle or SQL Server databases without having to manage an operating system or database software. As with EC2 this does mean that you lose some level of control over how your database is implemented but you gain the ability to bring up a database very rapidly.
As you continue moving further up the stack you start to get into PaaS (platform-as-a-service) territory. Amazon has a number of products which could fit into this bucket. At iProperty we’re trialling and investigating these. So far we have just started using auto-scaling in EC2 – this is where you can set parameters within an auto-scaling group (e.g. a farm of web instances) and have more instances auto-spawn as you start to reach capacity. This can certainly be useful but caution is required to make sure that you’re monitoring how many instances you’re running at any time to avoid unexpected costs.
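The scaling decision, and the cap that guards against unexpected costs, can be sketched as a simple rule in Python. This is not the AWS API and not how iProperty's groups are actually configured; the function name, thresholds and limits are all assumptions made for illustration.

```python
def desired_instances(current, avg_cpu, scale_out_at=70.0, scale_in_at=30.0,
                      minimum=2, maximum=6):
    """Decide how many instances a group should run next.

    Adds one instance when average CPU is above `scale_out_at`, removes one
    when it is below `scale_in_at`, and always stays within the
    [minimum, maximum] range so that costs cannot grow without bound.
    """
    if avg_cpu > scale_out_at:
        target = current + 1
    elif avg_cpu < scale_in_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


busy = desired_instances(current=2, avg_cpu=85.0)       # scales out by one
capped = desired_instances(current=6, avg_cpu=95.0)     # already at maximum
quiet = desired_instances(current=2, avg_cpu=10.0)      # already at minimum
steady = desired_instances(current=4, avg_cpu=50.0)     # no change needed
```

The `maximum` bound is the important part for budgeting: however busy the site gets, the group never grows past a number of instances you have agreed to pay for.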
CloudFormation is a way to template your environment (or a subset of services) to allow you to reliably and repeatably deploy those services. Conceptually this is similar to tools such as Chef or Puppet; however it tends to complement rather than replace those tools.
And Elastic Beanstalk is probably the most interesting of the bunch – a true PaaS service for PHP, Python, .NET (IIS) and Java (Tomcat). Once you have set a few configuration details and pointed it at your code (in a git repository), Elastic Beanstalk can take care of everything else about running your app. As with EC2 and RDS, you again gain convenience but sacrifice control.
And finally, a few of the other tools which we find interesting and relevant.
ElastiCache is a drop-in replacement for memcached (a popular in-memory cache for data). Because it uses the same APIs as memcached, no coding is required to start using it.
However, the same is not true of CloudSearch (full text searching) and DynamoDB (a very efficient NoSQL datastore). Both look like interesting products and we are actively investigating them; however we do treat them with caution because they are proprietary products and there’s a risk of vendor lock-in if you build your application to use only those products.
We are in the process of adopting many of the AWS products and we have had good success so far in running parts of our infrastructure there. We are very excited by the possibilities that AWS offers and are exploring further ways to make use of the services.
And a quick plug for the Malaysia AWS User Group which has a group on Facebook as well as hosting meetups for those interested in more detailed discussions of AWS.