Speeding up Travelfish.org part 1
A common refrain regarding Travelfish has been that the site is slow. People using the site would complain, and Google would tell me at every possible opportunity that it was slow (apparently about 80% of the sites on the web were faster). So over the last couple of months, as time has allowed, I’ve been reading up on what’s involved and slowly making changes to the site.
This is a bit of a work in progress, but I thought I’d blog it as others may have some useful suggestions … plus the power is out so I’ve got nothing else to do but run down the laptop battery.
Broadly speaking there are four main choke points:
A picture says a thousand words
And a picture is about a thousand times the size of a single word.
Getting close to your readers
Using a Content Delivery Network (CDN) to make sure that your files are being served from a data centre close to your readers.
Don’t ask too many questions
An average browser is configured to request no more than about 8 files simultaneously, so every extra file on a page means more time waiting in the queue.
Doing the loop the loop
Optimising your database code and following best practices in your scripting. Caching is your friend.
But what’s the point?
Well if someone is motivated enough to email me to complain about the site speed, then it just needs to be addressed. How many others just left?
Travelfish.org is in part an advertising-supported website. What this means is that in some cases the more pages people look at, the more they learn and the more we earn. For a content site like Travelfish, the mantra seems to be the faster the pages load, the more pages people will read. Makes sense really — do you look forward to returning to a restaurant with glacial service?
Since we added the destination blogs earlier this year, the overall site bounce rate has increased significantly. In part due to their design, but also reader behaviour, blogs tend to have higher bounce rates than a “normal” site. Again this is something I hoped to address through speeding up the blogs: the faster I can load a page, the more chance I have that they’ll see something else of interest. I also made some changes specific to the blogs to assist readership, which I’ll be covering in the coming weeks.
So in summary, in speeding up the site we’re hoping to address an issue readers have complained about and help them get the most out of the site, while simultaneously improving our bottom line.
So what have I done so far?
I signed up with Amazon’s S3 and Cloudfront Content Delivery Network. This allows me to store files on their servers (which are considerably more powerful than my setup) and make use of their CDN.
This is a double whammy in that S3 is fast and Cloudfront, with data centres in the USA, the EU, Tokyo and Singapore, means that my files are being served from a good deal closer to you. (The actual Travelfish.org server is in Texas, USA.)
While I haven’t shifted all the files yet, I have moved a lot, including many of the images, all the Javascript and all the Stylesheets.
Making this change alone improved load time on my main testing page (http://www.travelfish.org/country/thailand) from over 11 seconds to under 3 seconds.
Not bad.
S3 and Cloudfront take a bit of getting used to, but Labnol has a fabulous set of S3 tutorials that were of immense help. There’s also an S3 Firefox plugin that is very helpful. And, for WordPress users, Tan Tan Noodles has a near perfect WordPress plugin for S3 image uploads.
I am stuck on one thing though in this area. When I upload an image to S3 I can set an expiry date long in the future (this assists with caching) and when I request the image from S3, the header is correct. BUT when I request it from Cloudfront, no expiry date is set. I’ve been trying to find an answer to this problem to no avail, so suggestions are welcome.
Update at end of the entry regarding the above point.
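For anyone wondering what an expiry date "long in the future" looks like on the wire, here is a minimal Python sketch. It only builds the header values; how they get attached to a file depends on whichever S3 upload tool you use, which is not shown. The function name far_future_headers and the one-year default are my own assumptions, and pairing Expires with Cache-Control is common practice rather than anything specific to S3.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(now, days=365):
    """Build response headers that let browsers cache a file for `days` days.

    `now` must be a timezone-aware datetime in UTC.
    """
    expires = now + timedelta(days=days)
    return {
        # HTTP dates are always written in GMT.
        "Expires": format_datetime(expires, usegmt=True),
        # max-age is the same lifetime expressed in seconds.
        "Cache-Control": f"public, max-age={days * 24 * 60 * 60}",
    }


if __name__ == "__main__":
    for name, value in far_future_headers(datetime.now(timezone.utc)).items():
        print(f"{name}: {value}")
```

The catch with a far-future expiry is that browsers will not come back to check for a newer copy, so a changed file needs a changed name.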
Don’t ask too many questions
If you look at the output from Pingdom for the subject page, you’ll see there is a stack of files being requested. This was my next target.
I started with the javascript files — there used to be four — and combined them into just two files (one needs to remain separate as it is not always loaded).
That was easy.
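Combining scripts really is just sticking the files end to end in the right order. A rough Python sketch of the idea follows; it is not the script actually used on Travelfish.org, and the file names in the usage note are made up for illustration.

```python
from pathlib import Path


def combine_scripts(sources, target):
    """Concatenate JavaScript files, in the order given, into one file.

    Returns the number of files that were combined.
    """
    parts = []
    for source in sources:
        text = Path(source).read_text(encoding="utf-8")
        # The extra ";" stops the last statement of one file from
        # running into the first statement of the next.
        parts.append(f"/* {Path(source).name} */\n{text.rstrip()}\n;\n")
    Path(target).write_text("".join(parts), encoding="utf-8")
    return len(parts)
```

Calling combine_scripts(["menu.js", "search.js", "forms.js"], "site.js") would leave a single site.js to upload. Order matters: a script that depends on another has to come after it in the list.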
I also looked at some of the third party javascripts that were being loaded. A World Nomads one related to their select box was a bit sluggish so I grabbed the script and uploaded it to S3. One note on this is I’ll have to check back occasionally with Nomads to make sure I have the most up-to-date file to keep this working properly. I’ll probably do the same with AWeber, which can sometimes really bog down the load, while Reinvigorate I plan to stop using, so will just remove it.
The other third party scripts (Quantcast, Facebook and Google Analytics) are better left on their respective servers.
In reducing the number of requests I’m not only building a faster site, I’m also saving myself money. Amazon charges for the S3/Cloudfront service in part by request and my last bill charged me for almost four million requests. (Don’t panic, the bill was under $10.) But essentially what Amazon is saying is that it is in everyone’s interests (mine, Amazon’s and Travelfish.org readers’) to keep the total number of requests down.
Talking about requests, the next step (which I’m starting on this week and will write about next week) is all about reducing them even further — with Sprites.
So many pretty pictures
Many of the icons (stars, checkboxes, flags etc) on Travelfish.org could be combined into a single file and displayed using CSS. The technique is known as CSS sprites and there’s an old but good general wrap on sprites at alistapart.
Basically if you have 30 images that are common on many pages, you combine them into one and reference different parts of the single image to display the icons it contains. In this example this effectively reduces the number of files you are requesting from 30 to 1 and probably results in a lower overall filesize as well.
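To make the "reference different parts of the single image" bit concrete, here is a small Python sketch that writes out the CSS for a sprite sheet. It assumes the simplest possible layout: every icon is the same height and the icons are stacked top to bottom in one image. The function name, class names and sprites.png are all hypothetical.

```python
def sprite_css(names, icon_height, sheet_url, class_prefix="icon"):
    """Return CSS rules for equal-height icons stacked vertically in one image."""
    rules = [
        f".{class_prefix} {{ background-image: url({sheet_url}); "
        f"background-repeat: no-repeat; }}"
    ]
    for index, name in enumerate(names):
        # Slide the sheet upwards so that icon number `index` shows through.
        offset = index * icon_height
        position = f"0 -{offset}px" if offset else "0 0"
        rules.append(
            f".{class_prefix}-{name} {{ background-position: {position}; }}"
        )
    return "\n".join(rules)


if __name__ == "__main__":
    print(sprite_css(["star", "flag", "tick"], 16, "sprites.png"))
```

An element then needs both classes, for example class="icon icon-flag", plus a width and height matching one icon so its neighbours in the sheet stay hidden.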
Optimal images
There are a number of online services you can use to optimise your web images. This generally revolves around stripping out data that isn’t needed (eg EXIF blah blah) and making the image file as compact as possible. Travelfish.org has around 15,000 image files on it, so I’ll be saving this one for the wet season.
Often the file saving is nominal (in hundreds of bytes rather than thousands), but every little bit counts — especially when you have an image being served thousands of times a day, 365 days a year.
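As an idea of what "stripping out data that isn’t needed" involves, the Python sketch below removes the APP1 segments (where EXIF and XMP metadata live) from a JPEG without touching the picture data. It assumes a well-formed file and makes no attempt to handle unusual marker layouts, so treat it as an illustration rather than a replacement for a proper optimiser. Note that it also throws away the EXIF orientation flag, which some photos rely on to display the right way up.

```python
import struct

SOI = b"\xff\xd8"   # start of image
APP1 = b"\xff\xe1"  # segment type used for EXIF and XMP metadata
SOS = b"\xff\xda"   # start of scan: compressed picture data follows


def strip_exif(jpeg_bytes):
    """Return a copy of the JPEG data with all APP1 segments removed."""
    if jpeg_bytes[:2] != SOI:
        raise ValueError("not a JPEG file")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        marker = jpeg_bytes[pos:pos + 2]
        if marker[0] != 0xFF or marker == SOS:
            break  # picture data (or something unexpected): stop parsing
        # The 2-byte length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if length < 2:
            break  # malformed segment: leave the rest untouched
        segment_end = pos + 2 + length
        if marker != APP1:
            out += jpeg_bytes[pos:segment_end]
        pos = segment_end
    out += jpeg_bytes[pos:]  # copy everything from the stop point onwards
    return bytes(out)
```

Comparing len(data) with len(strip_exif(data)) for a given file shows how much the metadata was costing.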
Code reworking
Travelfish is all handmade, by me, and it generally works. I’m not bragging, but rather saying I could make a glider as well, that would probably fly — but it sure as hell wouldn’t be the Concorde.
I’ll be the first to say there is significant room for improvement in this area, especially with regard to caching, but as this will be different for every site, there’s little point in going into it here. The other points above though are applicable to any website.
Resources
Amazon S3
Amazon Cloudfront
Labnol’s tutorials
S3 Firefox plugin
WordPress plugin for S3 image uploads
alistapart on sprites
Pingdom results on the sample page
Next week, sprites and WordPress readership helpers.
Update
Thanks to Carl Hancock for pointing me to this entry regarding Cloudfront and S3.
It points out that Cloudfront doesn’t re-request the file from S3 unless the filename has changed. I’d been updating the Expires header on the image on S3, but the file had already been pulled across to Cloudfront, so the revised version wasn’t being fetched.
What I need to do is upload a renamed version of the file to S3 with the new Expires header, then upload a new version of the HTML file that references the image under its new name. This (in theory, I’ve not tested it yet) should result in the new image being pulled over.
Tedious, but better to learn this now rather than after I’ve uploaded the other 10,000 images!
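One way to take some of the tedium out of renaming is to let a script pick the new names. The Python sketch below is an alternative to renaming by hand, not what I actually did: it tucks a short hash of the file’s contents into the filename, so the name changes whenever the file does and stays the same otherwise. The function name and the example path are made up.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Return `path` with a short hash of `content` added before the extension."""
    # MD5 is fine here: it is only a fingerprint, not a security measure.
    digest = hashlib.md5(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Given the bytes of a file, versioned_name("images/beach.jpg", data) returns something of the form images/beach.xxxxxxxx.jpg, which is the name to upload under and to reference from the HTML.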
Second update
So having tested the above, it does indeed work. I just wish I’d posted this entry before I uploaded all the images I did, as now I have to rename all of them… doh!