Compression & the Web
September 17th, 2007
Technology is like a koi carp: it grows to fill the space that it’s given.
Now that broadband has become the standard for a large part of the market, web development is using as much of that bandwidth as possible for richer and more immersive user experiences. It means these days we’re facing the same issues we had back in 1996, only now, rather than worrying about how a site loads on a 56k modem, we’re worrying about how it loads on a 5gig FIOS pipe.
Of course, as I always say when cable modem users make fun of my pathetic DSL connection at home, the web only gets served as fast as the slowest point in the pipe. So it doesn’t matter how enormous the pipe coming into your house is: if the site is being served over throttled bandwidth from the host, you’ll only get it as fast as that slow point.
This bandwidth is now being filled with richer content - larger images, more CSS, large JS libraries, Flash content, video files, etc.
So being in the same situation we were 10 years ago with the web, we have to apply the same rules we applied back then (just on a larger scale). The two easiest approaches are:
- Reduce the base size of things you’re serving
- Then compress the things you’re serving
Some might argue those are the same thing, but I think there is a difference - you can reduce the number of images you serve, and then compress the images you’re left with. You can reduce the lines of JS code, and then compress the lines that are left.
Good ole fashioned GZIP
The long-standing answer to the compression side of the problem is GZIP.
GZIPed content basically means that you have a server configured so that, if the client (the browser) says it will accept content in a GZIPed format, it zips the content up before sending it back in the response. In most instances the browser supplies a list of the content encodings it will accept (in the Accept-Encoding request header), and you configure your server to respond accordingly.
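The negotiation described above can be sketched as a small function. This is a minimal sketch, not how any of the servers below implement it: `acceptsGzip` and `chooseEncoding` are hypothetical helper names, and the sketch assumes the browser advertises support through the HTTP `Accept-Encoding` request header. It deliberately ignores the `*` wildcard.

```javascript
// Hypothetical helper: decide whether a response may be gzipped, based on
// the Accept-Encoding request header sent by the browser.
// Simplification: the "*" wildcard is not handled.
function acceptsGzip(acceptEncoding) {
  if (!acceptEncoding) {
    return false; // no header at all: serve the regular, uncompressed file
  }
  return acceptEncoding.split(",").some(function (entry) {
    var parts = entry.trim().split(";");
    var coding = parts[0].trim().toLowerCase();
    if (coding !== "gzip" && coding !== "x-gzip") {
      return false;
    }
    // An optional quality value of 0 (e.g. "gzip;q=0") means "not acceptable".
    var q = 1;
    for (var i = 1; i < parts.length; i++) {
      var match = /^\s*q\s*=\s*([0-9.]+)\s*$/i.exec(parts[i]);
      if (match) {
        q = parseFloat(match[1]);
      }
    }
    return q > 0;
  });
}

// Pick the Content-Encoding for a response; null means "send it as-is".
function chooseEncoding(acceptEncoding) {
  return acceptsGzip(acceptEncoding) ? "gzip" : null;
}
```

A browser sending `Accept-Encoding: gzip, deflate` would get the zipped response, while one that sends no such header simply gets the original file.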
On the server side of things, it’s fairly simple (if not an out-of-the-box configuration) to get GZIP to work.
Apache
With Apache you just use something like mod_deflate or mod_gzip (more info is available for Apache 1.3; for Apache 2.0 just use mod_deflate).
lighttpd
lighttpd’s lighttpd.conf file can also be easily edited to allow GZIPed content responses.
# this assumes that mod_compress is loaded
compress.cache-dir = "/var/lighttpd/cache/compress/"
compress.filetype  = ("text/plain",
                      "text/html",
                      "text/javascript",
                      "text/css",
                      "application/xml")
MS IIS
IIS can be configured to handle GZIPed content.
Resin
With Resin you can just add…
<filter-mapping url-pattern='/*' filter-name='com.caucho.http.filter.GzipFilter'/>
…inside your hosts or web app and it’ll handle the rest for you.
All pretty straightforward on the server. Now let’s look at the client.
Browsers that like GZIP
Thankfully in these days of people upgrading pretty quickly it all looks rather good. The basic breakdown goes like this…
- Netscape 6.2+
- Mozilla 0.9.9+
- Internet Explorer 4.0+ (although both 5.5 and 6.0 had “issues”)
- Opera 4.0+
- Lynx 2.6+
From schroepl.net
…So aside from some slight quirks on the IE side of things, it would seem that anyone, apart from those who have purposefully avoided updating their OS or browser or who like to run Netscape 3.0 Gold Edition for some sort of nostalgic reason, will be OK with GZIPed content.
Now just to mention, the neat thing here is that unless the browser says it will accept GZIP as a content encoding, none of the servers mentioned above will serve it that way; they all fall back to the regular file format. So even if Mr. Netscape 3.0 Gold comes along, they won’t be impacted by the GZIP work, they just won’t get the benefit - it’s good old-fashioned Progressive Enhancement.
So there is next to no reason not to have GZIP configured on your server and be serving files that are nicely compressed in size. In fact I’d wager that if you’re not hosting your own content, it’s probably already being done for you by your paid host (if it isn’t you should ask them why not, as it would save you and them bandwidth costs… oh wait, you pay for bandwidth, don’t you ;) ).
Build Process Integration
Stepping backwards in our two-step approach, we have to reduce the size of the files we are serving. I won’t go over images, as that’s old school and anyone who’s been doing web work for any length of time will have come across this issue many, many times.
Let’s instead focus on reducing the size of the text-based files that are being served.
There are a number of tools that have been developed over the last few years to compress the text content of code so that it becomes smaller in size. When I say compressed, I don’t mean file compression (we’re doing that with GZIP); this is just reducing the size of the text within the file. Some might say it’s just code optimization.
JavaScript Compression
Almost every JavaScript toolkit worth its salt has some kind of JS optimization/compression/obfuscation engine associated with it. Most of them work by…
Using shortened function names…
someWonderfullyDescriptiveName = function(){
}
// becomes something like
a = function(){
}
Stripping out comments and as much whitespace as possible.
Some take it one step further and compress everything into a sort of regex eval single function (which in all honesty I have no idea how it works), but that usually results in extremely optimized (and obfuscated) code.
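As a rough illustration of the comment and whitespace stripping step, here is a deliberately naive sketch. `naiveMinify` is a hypothetical name and this is not the algorithm used by any real compressor. It assumes the input has no string or regex literals containing `//`, `/*` or runs of whitespace, and that every statement ends in a semicolon. Those assumptions are exactly why real tools parse the code instead of using regular expressions, and why aggressive compression can break things.

```javascript
// Naive sketch only: strips comments and collapses whitespace with regexes.
// Unsafe for code whose string literals contain "//", "/*" or significant
// runs of whitespace, and for code that relies on newlines instead of
// semicolons to end statements.
function naiveMinify(source) {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, "") // drop /* block */ comments
    .replace(/\/\/[^\n]*/g, "")       // drop // line comments
    .replace(/\s+/g, " ")             // collapse whitespace runs (incl. newlines)
    .trim();
}

const original =
  "// say hello\n" +
  "var greet = function (name) {\n" +
  "  /* build the message */\n" +
  "  return 'Hello ' + name;\n" +
  "};\n";

console.log(naiveMinify(original));
// prints: var greet = function (name) { return 'Hello ' + name; };
```

Renaming functions and variables, as in the example further up, needs real knowledge of scope and is well beyond a sketch like this.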
The best examples I have found are:
- Dean Edwards’ Packer
- Dojo’s ShrinkSafe
- Julien Lecomte’s YUI Compressor
- JSMinifier
- CompressorRater
- Memtronic
Some are “better” than others, both in terms of how much they break stuff and how much they compress. My only observation is that the most aggressive compression tends to be the most fragile in terms of what code breaks under the compressor, and the byte-size difference between the best compressor and the worst seems small. Another important consideration is how easily the compressor fits with your project, the environment and your build steps. Most of my commercial work uses the Dojo toolkit, so we’ve used ShrinkSafe for almost everything, as it just works with the Dojo build steps, which you can add to your Ant or Maven build process.
There are a ton of alternatives for integrating the compression into your build process: there are Java filters, Maven plugins and Ant tasks/targets.
The Future
All of these compression/optimization tools work by taking an entire JS file and compressing it as a whole.
There is one other approach that I am very interested to see progress, and that’s the Dojo Toolkit’s JS Linker application. The thing I like about JS Linker is that it takes a whole UI (HTML and JS files), looks for relationships and dependencies in the code, and then strips out everything in the JS that isn’t used. It means you could take a library like Dojo that might run into the hundreds of KB, run JS Linker on it, and it would remove all the code that isn’t actually “needed” by any of the code in the project (a task that is usually done manually or not at all). Unfortunately JS Linker hasn’t moved much in the last year, and although the Dojo Foundation are keen to progress it, it won’t go anywhere without some volunteers taking ownership and working on it more… but watch this space.
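The idea of keeping only the code that is actually reachable can be sketched as a walk over a dependency graph. This is an illustrative sketch only, not JS Linker’s actual implementation: the `findUsedModules` name, the shape of the dependency map and the module names in it are all hypothetical, and a real tool would first have to extract those dependencies from the HTML and JS.

```javascript
// Hypothetical sketch: given a map of module -> modules it depends on,
// return every module reachable from the entry points, sorted by name.
function findUsedModules(dependencies, entryPoints) {
  const used = new Set();
  const pending = entryPoints.slice();
  while (pending.length > 0) {
    const name = pending.pop();
    if (used.has(name)) {
      continue; // already visited; this also guards against cycles
    }
    used.add(name);
    const deps = dependencies[name] || [];
    for (const dep of deps) {
      pending.push(dep);
    }
  }
  return Array.from(used).sort();
}

// Made-up dependency map for a page that only uses a button widget.
const exampleDependencies = {
  "app": ["widgets.Button", "dom"],
  "widgets.Button": ["dom", "events"],
  "widgets.Tree": ["dom", "events", "dnd"],
  "dnd": ["dom"],
  "dom": [],
  "events": []
};

const usedModules = findUsedModules(exampleDependencies, ["app"]);
const strippable = Object.keys(exampleDependencies).filter(function (name) {
  return usedModules.indexOf(name) === -1;
});

console.log(usedModules); // [ 'app', 'dom', 'events', 'widgets.Button' ]
console.log(strippable);  // [ 'widgets.Tree', 'dnd' ]
```

Everything in the second list could be dropped from the build before the remaining code is compressed and GZIPed.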

November 5th, 2007 at 10:47 am
Just adding a link to a rather good Max Keisler post on this topic that links to a ton of CSS and JS compression resources.