Wednesday, December 18, 2013

Breaking the PageSpeed Barrier with Bootstrap

A post by Dan Riti

The Right Stuff: Breaking the PageSpeed Barrier with Bootstrap

I recently had the pleasure of listening to Ilya Grigorik give a talk at Velocity in NYC on Breaking the 1000ms Mobile Barrier. During the talk, Ilya used PageSpeed Insights to demonstrate that several high profile websites had overlooked some very simple and common optimizations, resulting in poor PageSpeed scores. For the unfamiliar, PageSpeed Insights is a web-based tool created by Google that analyzes the content of a web page, then generates suggestions to make that page faster.
After Ilya’s talk ended, I started to think more about why performance always seems to be an afterthought with developers. As I pondered this thought, I kept coming back to the following question:
How hard is it to get a perfect PageSpeed Insights score?
It can’t be that hard, right? Well…there is only one way to find out!

Breaking the PageSpeed Barrier

To answer this question, I’ve decided to keep things as simple as possible, yet realistic. So I selected the following constraints based on what I would consider to be a simple website:
  1. Use Amazon EC2 to host
  2. Use a m1.small instance running Ubuntu 12.04 64-bit
  3. Use Apache as a webserver with mod_pagespeed
  4. Use an off-the-shelf Bootstrap example that depends on external CSS and JS (including jQuery)
  5. Modify the Bootstrap example to add a single image (26 KB in size)
I picked Apache over Nginx simply because I’m more familiar with setting it up and configuring it. I chose the Bootstrap example because I felt it is composed of the many elements you’ll find on a modern website. These elements include several external CSS and JS dependencies (3 CSS & 3 JS), a top-oriented navigation bar, and a decent amount of content. Plus, Bootstrap is widely used across the web, so why not look at how we can make it faster?

Going the Distance

I broke the experiment into several steps, where I would:
  1. Get the current PageSpeed score
  2. Pick a single failed optimization from the list of suggested improvements
  3. Research and implement a change to overcome the failed optimization
  4. Rinse and repeat until success
Easy enough, so let’s get started!

1. Bootstrap off the shelf

Let’s begin by generating a PageSpeed score from the off the shelf Bootstrap example. This will act as a baseline for the rest of the test.
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Bootstrap off the shelf     77            90             833 ms
So it seems Bootstrap scores much better on desktop than on mobile, along with a DOMContentLoaded of 833 ms. Not too shabby to start off with, so let’s see how we can improve.

2. Enable mod_pagespeed

UPDATE: Below is a mod_pagespeed configuration that will get you a perfect PageSpeed score for this Bootstrap example (without any manual work)! A big thanks to all the folks at Google for providing feedback to this article (read the original discussions here and here)!
<IfModule pagespeed_module>
    ModPageSpeed on
    ModPagespeedRewriteLevel CoreFilters
    ModPagespeedEnableFilters prioritize_critical_css
    ModPagespeedEnableFilters defer_javascript
    ModPagespeedEnableFilters sprite_images
    ModPagespeedEnableFilters convert_png_to_jpeg,convert_jpeg_to_webp
    ModPagespeedEnableFilters collapse_whitespace,remove_comments
</IfModule>
For the first optimization, we’re simply going to enable the Apache PageSpeed module and let it do all the hard work for us! Well, think again, because enabling PageSpeed with its default set of filters only gives a boost of 3 points for both mobile and desktop.
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Enable mod_pagespeed        80            93             660 ms
Not much score improvement, but mod_pagespeed did automatically concatenate all our CSS and apply cache control to both CSS and JS, so that’s kind of nice.
It looks like we’re gonna have to get under the hood and get our hands dirty with some good old manual optimizations. So let’s start with some low hanging fruit.

3. Minify CSS

Fortunately, Bootstrap ships with minified copies of most of its CSS, with the exception of theme.css. So we’ll be using the trusty old yuicompressor to get the job done!
$ cp bower_components/bootstrap/dist/css/bootstrap.min.css app/styles/
$ cp bower_components/bootstrap/dist/css/bootstrap-theme.min.css app/styles/
$ yui-compressor app/styles/theme.css -o app/styles/theme.min.css
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Minify CSS                  80            94             843 ms
This change is very straightforward, but doesn’t yield many points. So let’s continue onto the next one.
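To put a number on what minification buys, here is a minimal shell sketch that compares the byte counts of a file before and after compression. The function name savings_pct is hypothetical (it is not part of yui-compressor), and it relies only on wc and shell arithmetic.

```shell
# Print the percentage of bytes saved by minification, rounded down.
# Usage: savings_pct ORIGINAL MINIFIED
# NOTE: savings_pct is a hypothetical helper written for this example.
savings_pct() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "usage: savings_pct ORIGINAL MINIFIED" >&2
        return 1
    fi
    # $(( ... )) also strips the padding some wc implementations add.
    orig_bytes=$(( $(wc -c < "$1") ))
    min_bytes=$(( $(wc -c < "$2") ))
    if [ "$orig_bytes" -eq 0 ]; then
        echo "savings_pct: $1 is empty" >&2
        return 1
    fi
    # Integer arithmetic: (bytes saved * 100) / original bytes.
    echo $(( (orig_bytes - min_bytes) * 100 / orig_bytes ))
}
```

For example, `savings_pct app/styles/theme.css app/styles/theme.min.css`. With the sizes shown in the directory listing in step 6 (199 bytes before, 158 bytes after), this prints 20.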

Enter The Fold

Welcome to The Fold. No, not the band, but that imaginary line in a website that divides the top 600 pixels of content a user first sees from the rest of the content they will eventually scroll to.
In the world of The Fold, anything “below the fold” is considered a second-class citizen. And according to Google, things below the fold need to be eliminated from blocking our need for speed.
So optimizing for above the fold is basically:
  1. Prioritize the delivery of any content that is “above the fold”. This ensures the minimal amount of time for content to be rendered in the browser, and ultimately should make users happy.
  2. Defer everything else, especially anything that will block rendering for “below the fold” content.
Let’s get started, shall we?

4. Remove render-blocking Javascript

After reading over Google’s recommendations for removing render-blocking Javascript, it’s clear that we have way too much Javascript to simply inline it all, so we will have to use their inline Javascript snippet to defer the loading.
Immediately, there is a problem. If I defer the loading of all the Javascript files, how can I guarantee that they are loaded in order? Both holder.js and bootstrap.js have a dependency on jquery.js, and loading them out of order will result in broken Javascript. So I’m forced to manually concatenate (and minify) all the Javascript files before deferring the loading:
$ cat jquery.js bootstrap.js holder.js > all.js
$ yui-compressor all.js -o all.min.js
$ ls -alh
total 476K
drwxrwxr-x 2 ubuntu ubuntu 4.0K Nov 11 03:06 .
drwxrwxr-x 5 ubuntu ubuntu 4.0K Nov 11 02:58 ..
-rw-rw-r-- 1 ubuntu ubuntu 161K Nov 11 02:58 all.js
-rw-rw-r-- 1 ubuntu ubuntu 127K Nov 11 03:06 all.min.js
-rw-rw-r-- 1 ubuntu ubuntu  58K Nov  9 01:27 bootstrap.js
-rwxrwxr-x 1 ubuntu ubuntu  13K Nov  9 01:26 holder.js
-rw-rw-r-- 1 ubuntu ubuntu 2.4K Nov  9 01:25 html5shiv.js
-rw-rw-r-- 1 ubuntu ubuntu  91K Nov  9 01:26 jquery.js
-rw-rw-r-- 1 ubuntu ubuntu 4.0K Nov  9 01:25 respond.min.js
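Because the bundle only works if jquery.js comes first, it can be worth guarding the concatenation step. The sketch below assumes a hypothetical helper named bundle_js: it refuses to write anything if an input is missing (a plain cat would print an error and still produce a partial file), and it concatenates in exactly the order given. It also writes a newline and a semicolon after each file, a common precaution against one script running into the next; that separator is an addition of this sketch, not something the commands above do.

```shell
# Concatenate JS files in the order given, refusing to write a bundle if
# any input is missing. Usage: bundle_js OUTPUT INPUT...
# NOTE: bundle_js is a hypothetical helper written for this example.
bundle_js() {
    out=$1
    shift
    # Check every input before touching the output file.
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "bundle_js: missing input: $f" >&2
            return 1
        fi
    done
    for f in "$@"; do
        cat "$f"
        # The newline ends any trailing "//" comment; the ";" ends any
        # statement left open at the end of the file.
        printf '\n;\n'
    done > "$out"
}
```

Usage would mirror the cat command above: `bundle_js all.js jquery.js bootstrap.js holder.js`.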
Now we simply include the defer loading snippet into our HTML and see what happens…
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Remove render blocking JS   91            98             286 ms
Not only did our scores jump considerably, but our time to first render (DOMContentLoaded) has improved significantly! However, it looks like defer loading has some side effects, so let’s dig deeper.

5. Leverage browser caching

As a side effect of deferring the loading of Javascript, we are no longer getting the automagic browser caching from mod_pagespeed. So sounds like we need to get a bit hands on with Apache.
  • The HTML Boilerplate has a fantastic example of expires headers for cache control.
  • Make sure you’re using a cache busting file name scheme so users get served new files.
  • You should also read this article by Steve Souders.
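One possible cache busting scheme is to embed a checksum of the file’s contents in its name, so the name changes whenever the contents do. The following is a minimal sketch under that assumption; the function name fingerprint is hypothetical, and cksum is used only because it is part of POSIX (a stronger hash such as sha256sum could be swapped in the same way).

```shell
# Copy FILE to a name that embeds a checksum of its contents, e.g.
# all.min.js -> all.min.<checksum>.js, and print the new path.
# NOTE: fingerprint is a hypothetical helper written for this example.
fingerprint() {
    src=$1
    if [ ! -f "$src" ]; then
        echo "fingerprint: no such file: $src" >&2
        return 1
    fi
    # cksum prints "<crc> <size>" when reading stdin; keep the CRC.
    sum=$(cksum < "$src" | cut -d ' ' -f 1)
    dir=$(dirname "$src")
    base=$(basename "$src")
    case $base in
        *.*) new="${base%.*}.$sum.${base##*.}" ;;  # insert before extension
        *)   new="$base.$sum" ;;                   # no extension: append
    esac
    cp "$src" "$dir/$new" && echo "$dir/$new"
}
```

The HTML would then reference the printed name rather than the original one, and the fingerprinted files can be served with far-future expiry headers.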
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Leverage browser caching    92            98             231 ms
Only a slight score improvement in mobile, but we’re now tied with Google’s PageSpeed score!
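To confirm that the caching headers are really being sent, one option is to inspect the response headers directly. The sketch below assumes a hypothetical helper named cache_max_age that reads raw HTTP headers on stdin and prints the max-age value, in seconds, from the Cache-Control header.

```shell
# Read HTTP response headers on stdin and print the max-age value (in
# seconds) from the Cache-Control header. Exits non-zero if none is found.
# NOTE: cache_max_age is a hypothetical helper written for this example.
cache_max_age() {
    tr -d '\r' |
    awk '
        tolower($0) ~ /^cache-control:/ {
            if (match(tolower($0), /max-age=[0-9]+/)) {
                # Skip the 8 characters of "max-age=" and print the digits.
                print substr($0, RSTART + 8, RLENGTH - 8)
                found = 1
                exit
            }
        }
        END { exit !found }
    '
}
```

It could be fed from a HEAD request, for example `curl -sI http://your-host/scripts/all.min.js | cache_max_age`, where the URL is a placeholder for wherever your assets are served.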

6. Optimize CSS Delivery

Unfortunately, none of Google’s suggestions would help us out much here. So now the only thing left is to remove render blocking CSS.
To frame the situation, external CSS is network bound and the network is slow. Thus, we want to remove this dependency so we can render our above the fold content as fast as possible.
To overcome this hurdle, we’re going to attempt to leverage an extremely experimental technique I learned from Addy Osmani and Paul Kinlan that goes something like this:
  1. Run a Javascript bookmarklet to detect and list CSS that is “above the fold”
  2. Inline the “above the fold” CSS directly into the HTML
  3. Defer loading the rest of the “below the fold” CSS using a simple, yet not cross browser compatible solution by Paul Irish
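Step 2 can be scripted rather than pasted by hand. The sketch below assumes the “above the fold” rules have already been saved to a file, and that the HTML contains a placeholder comment, `<!-- critical-css -->`, where the inline style block should go. Both the placeholder and the function name inline_css are hypothetical conventions chosen for this example, not part of the technique described above.

```shell
# Print HTML_FILE with the placeholder line "<!-- critical-css -->"
# replaced by the contents of CSS_FILE wrapped in a <style> element.
# NOTE: inline_css and the placeholder are hypothetical conventions.
inline_css() {
    css_file=$1
    html_file=$2
    if [ ! -f "$css_file" ] || [ ! -f "$html_file" ]; then
        echo "usage: inline_css CSS_FILE HTML_FILE" >&2
        return 1
    fi
    awk -v css="$css_file" '
        /<!-- critical-css -->/ {
            print "<style>"
            # Copy the CSS file into the output line by line.
            while ((getline line < css) > 0) print line
            close(css)
            print "</style>"
            next
        }
        { print }
    ' "$html_file"
}
```

A call such as `inline_css critical.css index.template.html > index.html` (file names are placeholders) would then produce the page with its critical CSS inlined.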
NOTE: For more technical detail on this method, visit the links below:
Like before, let’s concatenate all our CSS, minify and see what happens.
$ cat bootstrap.css bootstrap-theme.css theme.css > all.css
$ yui-compressor all.css -o all.min.css
$ ls -alh
total 516K
drwxrwxr-x 2 ubuntu ubuntu 4.0K Nov 11 04:31 .
drwxrwxr-x 5 ubuntu ubuntu 4.0K Nov 11 04:29 ..
-rw-rw-r-- 1 ubuntu ubuntu 134K Nov 11 04:31 all.css
-rw-rw-r-- 1 ubuntu ubuntu 110K Nov 11 04:31 all.min.css
-rw-rw-r-- 1 ubuntu ubuntu 118K Nov  9 01:24 bootstrap.css
-rw-rw-r-- 1 ubuntu ubuntu  96K Nov 11 02:21 bootstrap.min.css
-rw-rw-r-- 1 ubuntu ubuntu  17K Nov  9 01:24 bootstrap-theme.css
-rw-rw-r-- 1 ubuntu ubuntu  15K Nov 11 02:21 bootstrap-theme.min.css
-rw-rw-r-- 1 ubuntu ubuntu  199 Nov  9 01:23 theme.css
-rw-rw-r-- 1 ubuntu ubuntu  158 Nov 11 02:24 theme.min.css
Commit                      Mobile Score  Desktop Score  DOMContentLoaded
Remove render blocking CSS  100           100            151 ms
Did you hear the sonic boom?


#  Commit                      Mobile Score  Desktop Score  DOMContentLoaded
1  Bootstrap off the shelf     77            90             833 ms
2  Enable mod_pagespeed        80            93             660 ms
3  Minify CSS                  80            94             843 ms
4  Remove render blocking JS   91            98             286 ms
5  Leverage browser caching    92            98             231 ms
6  Remove render blocking CSS  100           100            151 ms
For the given constraints of this experiment, I have been able to achieve a perfect PageSpeed score. Over the course of this experiment, I’ve made the following observations:
  • PageSpeed optimizations directly result in an improved time to first render. This can have a significant impact for a mobile site.
  • Only installing mod_pagespeed is not enough. If anything, it’s only the beginning when it comes to tuning performance for your website. It offers an impressive list of configurable filters that you should read about.
  • Asset concatenation is useful for reducing the number of HTTP requests, DNS lookups, and overall round trip times (RTT).
  • Asset minification is a must and it’s useful for reducing payload size.
  • Browser caching is a must. Seriously, do it.
  • The defer loading for Javascript seems safe to use. However, further experimentation is necessary to determine its effects on Javascript heavy sites, especially those that are built with Javascript MVC frameworks.
  • The defer loading for CSS is definitely not “production ready” and still needs considerable improvement.
  • Implementing any optimization should always be weighed against the many established web performance best practices.
Finally, I can’t help but think there are further improvements that can be made to my solutions. Thus, I’d like to encourage further discussion on this topic by ending on the following question:
How many of my “solutions” are actually anti-patterns? If so, how can they be improved?
Want to read more about what all this actually means to your users? Check out my post on client latency!