Slashdot runs out of room for threaded comments

Slashdot ran into a major problem with their comment system today. Apparently they have a column called “parent_id” or some such that they forgot to switch an index to a 32 bit integer. The net result is they have disabled threaded comments until they can alter the table to 32 bit integers. They figure on that table it will take them 3+ hours to re-index it.

As a database guy I find this hilarious on multiple levels. First of all, I have to wonder who’s stupid idea it was to use a 24 bit integer anyway? On a 32 bit machine you don’t gain anything by using 24 bits intead of 32. The biggest thing to think about is your register size. You want your keys to align to the register size because to scan the database it takes fewer operations. Here is an example:

Let’s say you want to scan the ORDER table and find all the ORDER_DETAIL entries for each Order. Assuming your order_id is something reasonable like a 32 bit integer for 32 bit machines, the process would work something like this. You fetch order_id from the Order table into a register, you then fetch an order_id from the ORDER_DETAIL table into another register and you perform one operation to compare them. Then you move to the next order_id in either the ORDER or the ORDER_DETAIL table depending. The thing to note here is that the smallest thing you can load into the register for a comparison is 32 bits, or 4 bytes (again, assuming a 32 bit machine). So you save nothing by using a 24 bit integer, in fact it’s very possible you could waste extra operations aligning the integers into 32 bit sections of memory! Conversely imagine if you used something like a GUID for your order_id, but instead of storing it as a native GUID type and storing it in 16 bytes, you decide to store it as varchar(36), (don’t laugh, I’ve actually seen this). Now to compare one row in ORDERS to one row in ORDER_DETAILS you have to load each character into a corresponding register and compare them. After comparing the first character you then have to load the next character for each order_id into the registers and compare them, and so on. This is obviously going to take you much, much longer. How much longer will depend on how well optimized the code is, but in general I would suspect this is going to take about nine times longer than just comparing an integer key.

I honestly feel slashdot’s pain, but as someone said in one of the comments, “yes, this could happen to anyone, it’s just much funnier when it’s happening to someone else.”

Comments are closed.