Demystifying web asset storage and CDN solutions
July 04, 2018
Introduction
Cloud storage and services for the web have been around for a long time and have grown in popularity, with new companies finding a niche digital offering all the time. There are the big ones like Amazon Web Services (AWS) and Google Cloud Platform, and also smaller contenders like Digital Ocean and many others. The sheer number of cloud services available is astonishing, and if you’re new to cloud systems (or even if you’re not…) this post aims to very briefly cover some of my favourite and most common ones which web developers can use day-to-day, along with the main points for each one.
This post was written in March 2018 so the prices or features mentioned may be different at the time of reading.
Object storage
Back in the day it was common to store all web assets on the website’s server. Bandwidth was expensive and sites didn’t have as much rich media as they do now, but that’s changed. Bandwidth is now cheaper in most of the world, and people expect more media and functionality. If your site has high traffic and large files, cloud storage is pretty much a necessity as it’s generally cheaper than hosting on your own server or VPS.
The storage I’ll be discussing here is object storage, where data is accessed through an API or over HTTP/HTTPS. The contrast is block storage, which is data on a standard drive that can only be accessed when it is attached to an operating system.
Why would you want to store media files in the cloud? Loads of reasons:
- Cloud storage is generally quite cheap, and serving files from it can cost less than paying your hosting company for the same bandwidth.
- Your website will scale very well because of the vast number of servers and technology available.
- Switching your website between different environments will mean you don’t need to worry about changing the location of your assets, as they will all be on the cloud. This can be amazing for speed of development and saving time.
- You can store the files which are needed to make your website run (HTML, CSS, JS, images) as well as large files which users can download from your site, like videos.
- It’s easy! There are many ways to get S3 storing your website assets automatically such as CMS plugins, directly editing your .htaccess file, or you can simply place the S3 URL directly into your code.
So how do you set up object storage?
It’s really simple. Create an account on any of the storage solutions, and each of them will allow you to drag and drop assets into a bucket. You can then access the asset via the given URL!
Although the above way is easy, it might not be the best solution for web applications. Say for example you have an area of your web app where users upload a profile image. You’d then need a programmatic way of sending images to your bucket. That’s where an API will be used.
There’s an API for pretty much all of the object storage solutions. They provide the functionality to upload, download, delete, create buckets etc. It’s common for these APIs to become part of the front end or back end process for media management. For example, you may upload directly to asset buckets from a React application, or upload to your own server and transfer from there. It’s then up to you how you store and retrieve your information/assets.
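Whichever provider's API you end up calling, your back end usually has to decide where in the bucket an uploaded file will live. Here's a minimal sketch of that step for the profile image example. The bucket host, the key layout and the function names are all made up for illustration, and the actual upload call (which differs per provider) is left out.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical public host for the bucket; replace with your provider's URL.
BUCKET_HOST = "my-bucket.example-storage.com"

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def profile_image_key(user_id: int, filename: str, data: bytes) -> str:
    """Build the object key a user's profile image would be stored under."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {extension or 'none'}")
    # A short content hash in the key means a changed image gets a new URL,
    # so an old cached copy is never served in its place.
    digest = hashlib.sha256(data).hexdigest()[:12]
    return f"users/{user_id}/profile-{digest}{extension}"


def public_url(key: str) -> str:
    """URL the object would be reachable at once it has been uploaded."""
    return f"https://{BUCKET_HOST}/{key}"
```

The key returned by profile_image_key is what you would pass to the provider's upload call, and public_url gives the address to save against the user's record.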
I’ll include some differentiating benefits of each of my chosen storage solutions below, as well as rough pricing at the time of this post (although the pricing changes so often, it’s worth going straight to the site).
AWS S3
Amazon’s S3 (Simple Storage Service) is probably the most popular, with huge companies like Airbnb, Slack and even Netflix using it to serve their content to millions of people. S3 allows you to manage a “bucket”, which is an object-based storage container for your website assets.
- You can host your own static website. This is a huge deal if you have a personal blog built with a static site generator (like I do!). There’s a great tutorial on Medium which explains exactly how this is done.
- Supports custom URLs. S3 allows you to set up a subdomain to access your S3 bucket via your own domain name, which is nice to look at, whilst also making it easy to move your site's assets to S3 (just prefix the current image paths with the S3 subdomain, for example).
- There are CMS plugins to help you store website assets. I currently use WordPress and Craft CMS, and both of these CMS solutions allow the integration of Amazon S3. WordPress requires installing a couple of plugins, and Craft CMS has built in support with the Pro version.
- Costs. S3 will set you back around $0.024/GB per month of storage, plus fees for data requests to and from the S3 account. Some see the pricing as a bit misleading because of charges you might not expect, like increased prices depending on location.
Digital Ocean Spaces
Digital Ocean have been well known for their Droplets for a while, and have recently (end of 2017 I think) released a new object storage service called Spaces. Spaces is an alternative to AWS’s S3, with an offering focused on being a lot simpler and easier to use than AWS, which many say is complicated.
- You can’t host a static website. Or at least you can, but you can’t point your own custom domain to the storage. See this comment thread for an update as it looks like it may be coming soon.
- Does not support custom URLs. URLs are stuck to the standard Digital Ocean domain structure.
- Can be easily used as a drop in replacement for S3. The team at Digital Ocean realise that people may want to transfer their object storage to a cheaper alternative, so they have made their API compatible with S3’s SDK.
- Costs. Spaces is a simple $5/month which gets you 250GB of storage and 1TB of data requests from the server. After that, storage is $0.02/GB + $0.01/GB of data requests from the server. Putting data up to the server is free!
Backblaze
Backblaze is perhaps a lesser-known one, but definitely one which is on the rise at the moment. The huge deal about it is the low cost:
- You can host a static site. There is support for the hosting of a static site in a similar way to S3.
- Now supports custom URLs. Backblaze can effectively host a static site now that they have added support for custom URLs with the use of Cloudflare. Although it can be seen as a little hacky, it does the job! They have a recent blog post on this if you’re interested, and I’d like to give a shout out to Marcello Curto for sending this in!
- Has a brilliantly simple API. Although not a drop-in replacement for S3, it does have a very simple API.
- Costs. Backblaze is also simple at $0.005/GB/m + $0.01/GB of data requests from the server. Uploading to the server is free. Also the first 10GB of storage is free, and 1GB of download per day is free! Perfect for starters and the standard price afterwards is incredibly small (about a fifth of Amazon’s S3 price, with no extras).
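To make the pricing a bit more concrete, here's a rough calculator built only from the figures quoted in this post. It assumes 1TB means 1,000GB, treats Backblaze's free 1GB of download per day as 30GB a month spread evenly, and only covers storage for S3 because the request and transfer fees aren't listed above. The function names are my own, and real bills will differ.

```python
def spaces_monthly_cost(stored_gb: float, downloaded_gb: float) -> float:
    """Spaces: $5 covers 250GB stored and 1TB (taken as 1000GB) downloaded."""
    extra_storage = max(0.0, stored_gb - 250)
    extra_download = max(0.0, downloaded_gb - 1000)
    return 5.0 + extra_storage * 0.02 + extra_download * 0.01


def backblaze_monthly_cost(stored_gb: float, downloaded_gb: float,
                           days_in_month: int = 30) -> float:
    """Backblaze: first 10GB stored and 1GB of download per day are free."""
    billable_storage = max(0.0, stored_gb - 10)
    # Assumes downloads are spread evenly, so the daily allowance adds up.
    billable_download = max(0.0, downloaded_gb - 1.0 * days_in_month)
    return billable_storage * 0.005 + billable_download * 0.01


def s3_storage_cost(stored_gb: float) -> float:
    """S3 storage only; request and transfer fees are not modelled here."""
    return stored_gb * 0.024
```

For a storage-heavy site holding 500GB and serving 100GB a month, this works out at about $10 on Spaces, about $3.15 on Backblaze, and $12 for the storage part alone on S3.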
CDN solutions
A CDN serves static website files such as CSS, JS and images from copies which have been cached on a global network of servers. This means these assets will likely load much faster, anywhere in the world. I won’t go into loads of detail about specific CDNs here, but this article from WPMUDEV covers the main ones. Instead, I’ll go into what the options are for CDN usage with web development in mind.
It’s common for CDNs to be used alongside the object storage solutions mentioned above. The object storage saves your server bandwidth by keeping your files on a server in a single location, whereas a CDN is great for allowing quick access to files around the world.
Why would you want to use a CDN?
- Static files are served from cached versions which sit on a global network of servers (number of servers and locations depend on CDN provider).
- Access to cached files means reduced bandwidth on either your locally hosted files, or from your external storage solution.
- End users will gain a huge benefit in terms of snappy site load and file download times.
- Can be great for security, depending on your CDN provider and setup.
CDNs can offer “Push” and/or “Pull” options. A Push is where the developer uploads the website content directly to the CDN. In contrast, a Pull is where the developer uploads their files to their server like normal, and the CDN fetches the assets from their server. Once the Pulled assets expire, the CDN will retrieve the file from the server again. Push CDN is said to be good for larger files which don’t change over time, whereas Pull is good for handling smaller, static files like the JS, CSS and images.
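The Pull behaviour described above can be sketched as a tiny cache with an expiry time. This is a toy model of a single edge server rather than any provider's real API. The class name, the fetch_from_origin callback and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class PullCache:
    """Toy model of one pull-CDN edge server (not a real provider API)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a path to (cached body, time at which it expires).
        self._store: Dict[str, Tuple[bytes, float]] = {}
        self.origin_hits = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh, served without touching the origin
        # Missing or expired: pull from the origin server and cache it again.
        body = self._fetch(path)
        self.origin_hits += 1
        self._store[path] = (body, now + self._ttl)
        return body
```

Repeated requests for the same path inside the expiry window only hit your server once, which is where the bandwidth saving comes from. A Push setup would skip the fetch step, since the files are uploaded to the CDN ahead of time.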
So how do you set up a CDN?
In order to set up a standard CDN such as KeyCDN (which offers a 30 day free trial), simply sign up and create a “Zone”. This, like many others, defaults to a Pull Zone, for which you must specify the place you want the CDN to pull your assets from. Each zone allows you to apply settings such as cache expiry time, which makes sense as you may want to create separate zones for assets you change more or less frequently than others.
To keep things simple, you can just add your current website URL for the Pull Zone, which creates a cache of your whole site under the same zone.
Alternatively, you can add the zone to be your asset storage bucket at S3 or Backblaze for example!
Creating a Zone will give you a URL to use to access the cached data. If you add your whole site as the pull zone, you can then visit the CDN URL which will give you… your whole site!
The idea now is that you can use this URL to link up your assets on your live site. In my example, my site was at main-b2de.kxcdn.com, so my CSS would be at main-b2de.kxcdn.com/assets/main.css. This could be embedded into the head of the site on my server, but it would look odd coming from a different domain of main-b2de.kxcdn.com. CDNs allow you to create a subdomain of your own domain (i.e. a Zone Alias) to access the cached assets. In KeyCDN and others, this is very simple.
So now the domain for my CSS file would be cdn.jaygould.co.uk/assets/main.css. Perfect! You could now go through and manually add the assets to your live site, served from the new CDN domain, but this might not be the most efficient way.
Instead, it’s common to either:
- Use CMS plugins for things like WordPress and Craft, which handle everything for you automatically.
- Use the techniques built into back end frameworks like Express or Laravel, which link to your CDN and give you templating functionality which gets assets from the CDN for you.
- Or do it manually by creating a global variable to store the CDN URL.
It all depends on your setup and project. If you’re running a CMS site, it’s definitely worth using the plugins as they get you sorted in minutes.
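For the manual option, the idea is to keep the CDN hostname in one place and build every asset link from it. Here's a minimal sketch, using the Zone Alias from my example as the default. The asset_url helper and the https scheme are assumptions for illustration rather than part of any framework.

```python
from typing import Optional

# The one place the CDN hostname lives. Pass None (for example in local
# development) to serve assets from the site itself.
CDN_HOST = "cdn.jaygould.co.uk"


def asset_url(path: str, cdn_host: Optional[str] = CDN_HOST) -> str:
    """Return the URL a template should use for a static asset."""
    clean_path = "/" + path.lstrip("/")
    if not cdn_host:
        return clean_path
    return f"https://{cdn_host}{clean_path}"
```

Templates then call asset_url("assets/main.css") rather than hard-coding the domain, so switching CDN provider or turning the CDN off becomes a one-line change.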
Cloudflare is a little different
Cloudflare is almost synonymous with CDNs, and is probably the most well known one because (as far as I know) it’s the only CDN with a completely FREE tier which offers unlimited bandwidth, a free SSL certificate for personal and blog sites, and takes about 5 minutes to set up.
After setting up Cloudflare on my Whiskr web app, it reduced my load time from 2 seconds to just under a second on GTmetrix. That’s pretty incredible for a free solution which takes 5 minutes to set up.
Cloudflare is not like other CDNs though. It’s more of a package which includes a CDN. Other features of this package include unprecedented protection against DDoS attacks, SQL injections and other malicious attacks - even comment spam on sites. This is possible because, unlike other CDNs, Cloudflare sits in the middle of the end user and your website, intercepting all traffic. This means that your whole site is cached and distributed globally via the CDN, as opposed to messing around with setting up Zones with push and pull etc. for specific files as I mentioned above.
CDNs and object storage can work hand in hand
Earlier I mentioned that Amazon’s S3 service allows you to use your own domain to access files in a bucket, so requests to S3 from your site are consistent, but Backblaze and DigitalOcean don’t have this feature (yet). A CDN can be a good way to use your own file structure to access assets in your Backblaze or Spaces storage. Backblaze have put together a helpful article to show just that with the use of Cloudflare.
Another example of CDNs and storage buckets working together is AWS S3 and AWS CloudFront. CloudFront is a traditional CDN solution which can be automatically set up when upgrading an S3 bucket.
Asset services
One final solution I wanted to mention here is full media management services. Where the above storage and CDN solutions are based around fast website access and offloading static files from your server, there are also external services which handle everything (or close to everything) you can want relating to web media - specifically images and videos.
One such service is Cloudinary, and it really is incredible what the service offers, and how much time it can save developers when building small to large scale applications. Cloudinary is able to store images and videos (similar to S3/DO Spaces/Backblaze B2), offer a lightning fast CDN (like Cloudflare/MaxCDN), whilst providing additional services:
- Simple API to upload and fetch images and videos, written for a huge number of platforms (Node.js included).
- Manipulate images by providing URL parameters to get an instant, on-the-fly updated image
- Manipulate videos in the same way
- Many parts of images and videos can be updated, including size, quality, radius, watermark, and much more!
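To give a feel for the URL parameter approach, here's a small sketch which builds an image URL with a few transformations. As far as I know the res.cloudinary.com/<cloud name>/image/upload/<transformations>/<file> layout and the w_, h_, c_, q_ and r_ parameters match Cloudinary's delivery URLs, but treat that as an assumption and check their docs. The cloud name and file name are placeholders, and the helper function is my own rather than part of Cloudinary's SDK.

```python
from typing import List, Optional


def cloudinary_image_url(cloud_name: str, public_id: str,
                         width: Optional[int] = None,
                         height: Optional[int] = None,
                         crop: Optional[str] = None,
                         quality: Optional[int] = None,
                         radius: Optional[int] = None) -> str:
    """Build an image URL whose path carries the requested transformations."""
    params: List[str] = []
    if width is not None:
        params.append(f"w_{width}")
    if height is not None:
        params.append(f"h_{height}")
    if crop is not None:
        params.append(f"c_{crop}")
    if quality is not None:
        params.append(f"q_{quality}")
    if radius is not None:
        params.append(f"r_{radius}")

    parts = ["https://res.cloudinary.com", cloud_name, "image", "upload"]
    if params:
        # All transformations go in one comma-separated path segment.
        parts.append(",".join(params))
    parts.append(public_id)
    return "/".join(parts)
```

Asking for a different size or quality is then just a different URL, with no image processing on your own server.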
Images can be uploaded from the server, or from the client. Passing uploads through a back end server allows you to store images on another service manually if needed (for example, an S3 bucket), although Cloudinary offers free backups with paid plans.
Cloudinary also uses AWS S3 buckets behind the scenes!
Thanks for reading
There are new services coming out all the time. Even since I drafted this blog post about 4 months ago, great new contenders have appeared offering similar solutions. See which services work best for your project and give them a try! As I mentioned earlier, many services offer a free tier to get you started which is great for hobby dev work.
Drop me an email or tweet if you want to ask a question or correct anything :)
Senior Engineer at Haven