Sometimes, Poor Design Works
Published: 06/04/2009
Brain Dump, Programming
As programmers, we can get obsessed with the small things. I personally have no problem spending hours trying to optimize a section of code, not for performance or to improve user experience or anything like that, but because there’s just something funny about the algorithm.
The professional in us tries, sometimes futilely, to keep this behavior in check. Sometimes, though, you have to do “just” one more thing to really make it perfect. This, more often than not, balloons into a weeklong chore that ends with you digging yourself out of the hole you made, with zero progress on the initial problem.
This is the way of the code monkey.
Which is why it’s a little surprising to see how HTTP cookies are implemented. According to the official RFC (HTTP State Management Mechanism):
The user agent makes a series of requests on the origin server, after each of which it receives a new cookie. All the cookies have the same Path attribute and (default) domain. Because the request URLs all have /acme as a prefix, and that matches the Path attribute, each request contains all the cookies received so far.
Think about that for a second. On every request the cookies are sent to the server. ON EVERY REQUEST.
Try this: install a Firefox plugin called Live HTTP Headers (if you don’t already have it installed). Start the plugin and refresh this page.
You should notice that every request for ANYTHING (images, js, css) sends the cookies up to the server.
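If you don’t have the plugin handy, here is a minimal sketch of the rule the RFC describes: a cookie goes along whenever its Path is a prefix of the request path. This is a simplification, not a real user agent (it ignores domains, expiry, and everything else), and the cookie names and paths are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    value: str
    path: str = "/"


def cookie_header(jar, request_path):
    """Build the Cookie header a simplified user agent would send.

    Assumption: a cookie matches when its Path is a plain string
    prefix of the request path. Domain and expiry are ignored.
    """
    matching = [c for c in jar if request_path.startswith(c.path)]
    return "; ".join(f"{c.name}={c.value}" for c in matching)


# Hypothetical cookies, all set by pages under /acme
jar = [
    Cookie("customer", "wile_e", "/acme"),
    Cookie("cart", "rocket", "/acme"),
    Cookie("admin_token", "xyz", "/acme/admin"),
]

# Hypothetical requests: a page, an image, a stylesheet, and an unrelated path
for path in [
    "/acme/index.html",
    "/acme/img/logo.png",
    "/acme/css/site.css",
    "/other/page.html",
]:
    print(path, "->", cookie_header(jar, path) or "(no cookies)")
```

The image and the stylesheet get exactly the same cookies as the HTML page, even though nothing on the server is going to read them for those requests.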
I don’t know about you, but I only have one, maybe two, points in an application that evaluate cookies, so sending them on every request is a little… wasteful.
I don’t know for sure if there’s more work on the server side to deal with the cookies (though I would imagine the HTTP server has to do something to make the cookies available to a scripting language) but what I focus on is the bandwidth.
Now, I know it’s the 21st century and bandwidth is now fast and cheap. Yay us.
But consider the state of the Internet and bandwidth in 1997 when the specification was first drafted. Most people were using 28.8 and 33.6 kbps modems to browse the Internet and 56k was still a year away.
According to the spec:
Practical user agent implementations have limits on the number and size of cookies that they can store. In general, user agents’ cookie support should have no fixed limits. They should strive to store as many frequently-used cookies as possible. Furthermore, general-use user agents should provide each of the following minimum capabilities individually, although not necessarily simultaneously:
* at least 300 cookies
* at least 4096 bytes per cookie (as measured by the size of the characters that comprise the cookie non-terminal in the syntax description of the Set-Cookie header)
* at least 20 cookies per unique host or domain name
User agents created for specific purposes or for limited-capacity devices should provide at least 20 cookies of 4096 bytes, to ensure that the user can interact with a session-based origin server.
So, unless I’m crazy, the above makes it acceptable for a site to use 20 cookies, each with a maximum size of 4096 bytes (4 KB). That works out to a possible 81,920 bytes (80 KB) of cookie data being sent on EVERY REQUEST.
This basically means that fetching an image that weighs in at a cool 2 KB can cost an extra 80 KB of upstream bandwidth for the request alone. Do the math on a full page with, say, 20 images, 1 HTML file, 1 CSS file and 3 JS files, and it really starts to add up: 25 requests at 80 KB each is roughly 2 MB of cookie data for a single page view. So at a time when bandwidth was scarce, cookie usage was a good way to screw up the user experience if you didn’t pay attention.
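For the curious, here is the back-of-the-envelope math as a quick sketch. It assumes the worst case from the spec (20 cookies of 4096 bytes each), the example page described above, and raw modem line rates with no compression, latency, or protocol overhead, so treat the timings as rough.

```python
# Worst case taken from the spec's minimum capabilities
COOKIES_PER_HOST = 20
BYTES_PER_COOKIE = 4096

cookie_bytes_per_request = COOKIES_PER_HOST * BYTES_PER_COOKIE

# The example page: 20 images, 1 HTML file, 1 CSS file, 3 JS files
requests = {"images": 20, "html": 1, "css": 1, "js": 3}
total_requests = sum(requests.values())

total_cookie_bytes = cookie_bytes_per_request * total_requests


def seconds_to_send(num_bytes, bits_per_second):
    """Time to push num_bytes over a link at the given raw bit rate."""
    return num_bytes * 8 / bits_per_second


print(f"cookie data per request: {cookie_bytes_per_request} bytes")
print(f"requests per page view:  {total_requests}")
print(f"cookie data per page:    {total_cookie_bytes} bytes")

# Raw line rates of the modems mentioned above (an idealized assumption)
for label, bps in [("28.8k modem", 28_800), ("33.6k modem", 33_600)]:
    minutes = seconds_to_send(total_cookie_bytes, bps) / 60
    print(f"{label}: about {minutes:.1f} minutes just for cookies")
```

Even if the real numbers were a fraction of this, that is a lot of time spent uploading data nobody asked for.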
Granted, you’d have to be an idiot to write a program that used 20 cookies, each containing 4 KB of data, but we’re programmers. Most of us are stupid; some are really stupid.
It’s easy to point at cookies and laugh.
The thing we need to keep in mind, though, is that this isn’t that big a problem anymore. Sure, it may once have been an issue, but technology has almost solved it. Faster connections and faster computers will eventually make just about any issue like this disappear.
Which is kind of the point; something I try to keep in mind.