Popup div behavior

Posted by apejoy on April 7th, 2008

Well, this is a somewhat windy path…

Recently, I was working on a little project to provide Javascript that could easily be used to put up a popup div (or many, if desired) centered in the viewport on any page. I wrote some OO Javascript that is, I think, not too bad. I’m still learning about simulating true OO behavior in Javascript “classes”, but it’s a start.

The code worked very well on all the browsers that I care about, and the client was pleased… until they went to use my little two-class library on their site. The placement of the popup went all crazy on all browsers, IE6’s behavior being the wackiest, which just really surprised the crap out of me, of course.

Puzzle lovers, note the behavior on these two links, and see if you can figure out, in 3 hours or less, what’s wrong.
Works | Hosed

Rather than being centered in the viewport, the popup was appearing much lower on the page. Eyeballing things, it looked to me like the popup was now being centered on the page rather than in the viewport. Hmm. A little Firebugging confirmed that notion. In my positioning method, I use Prototype’s viewport.getDimensions() method, and in my client’s page, the number coming back from that method was wrong, wrong, wrong. What? Huh?
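To make the symptom concrete, here is a minimal sketch of the centering arithmetic. It is not the library's actual code: centerInViewport and its parameter names are hypothetical, and the viewport and scroll objects merely mimic the { width, height } and { left, top } shapes that Prototype's document.viewport.getDimensions() and document.viewport.getScrollOffsets() return.

```javascript
// Hypothetical helper (not from the original library): compute the top-left
// corner that centers a popup in the visible viewport.
function centerInViewport(viewport, scroll, popup) {
  // Clamp at zero so an oversized popup still starts inside the visible area.
  var offsetX = Math.max(0, Math.floor((viewport.width - popup.width) / 2));
  var offsetY = Math.max(0, Math.floor((viewport.height - popup.height) / 2));
  return { left: scroll.left + offsetX, top: scroll.top + offsetY };
}

// An 800x600 viewport scrolled 1000px down a long page, with a 300x200 popup.
var good = centerInViewport(
  { width: 800, height: 600 },
  { left: 0, top: 1000 },
  { width: 300, height: 200 }
);

// Same call, but the "viewport" height is really the 3000px page height.
var bad = centerInViewport(
  { width: 800, height: 3000 },
  { left: 0, top: 1000 },
  { width: 300, height: 200 }
);
```

With the correct height the popup lands at top 1200, inside the visible range of 1000 to 1600. With the page height fed in instead, it lands at top 2400, well below the fold, which matches the behavior described above.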

Ok, after a little headscratching I think to myself that -duh- I’d better see if my client is hooking up to a different version of Prototype than myself. Indeed. When developing, I used scriptaculous-1.8.1 with prototype-1.6.0, but when my client went to use my library on their site, they hooked it up to scriptaculous with prototype-1.6.0.2. Ah ha! I wonder if that could be it? Did the viewport.getDimensions() method change between Prototype 1.6.0 and Prototype 1.6.0.2? I cracked open both of those versions of Prototype in my favorite editor, and indeed, viewport.getDimensions() was significantly changed. (The change to that method actually appeared in prototype-1.6.0.1.)
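A quick way to catch this kind of mismatch is to compare version strings at load time. Prototype exposes its version as the string Prototype.Version; the compareVersions helper below is a hypothetical sketch written for this post, not part of either library.

```javascript
// Hypothetical helper: compare dotted version strings numerically.
// Returns -1 if a is older than b, 1 if newer, 0 if equal.
function compareVersions(a, b) {
  var partsA = a.split('.');
  var partsB = b.split('.');
  var length = Math.max(partsA.length, partsB.length);
  for (var i = 0; i < length; i++) {
    // Missing components count as 0, so "1.6.0" equals "1.6.0.0".
    var numA = parseInt(partsA[i] || '0', 10);
    var numB = parseInt(partsB[i] || '0', 10);
    if (numA !== numB) {
      return numA < numB ? -1 : 1;
    }
  }
  return 0;
}

// The two versions involved in this story.
var developedAgainst = '1.6.0';
var deployedAgainst = '1.6.0.2';
var mismatch = compareVersions(developedAgainst, deployedAgainst) !== 0;
```

In a page that loads Prototype, passing Prototype.Version as one of the arguments and logging a warning when mismatch is true would have flagged the difference before any Firebug session.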

As far as I know, as of the date all this was going on, Scriptaculous had not yet released a version of its library that included prototype-1.6.0.2. In fact, the current version of Scriptaculous as of 4/7/08 includes prototype-1.6.0.1. Perhaps my client was checking out directly from the Scriptaculous source repository, thereby getting something fresher than the current version? Perhaps they just chose to stick the freshest version of Prototype into the Scriptaculous library? One shouldn’t do that (and I’m not saying they did, just that they might have). Scriptaculous includes Prototype with its download so that people don’t go breaking things by having an incompatible version of Prototype. At any rate, a possible cause of the trouble looked now like a change to Prototype.

At first a fellow might grumble, “Why would there be anything in the 1.6.0.2 release that would break code written using 1.6.0 and why wouldn’t this be documented?” Ok, well, changes to the viewport.getDimensions() method are documented in the release notes (relevant lines are 75 and 287-292), but perhaps not adequately, and they gave me no insight at all into what might be causing my problem.

Presumably, because version 1.6.0.2 was relatively new when I was working on this project, I didn’t find a whole lot of talk about problems with viewport.getDimensions(). I did find one post that gave me pause. It claimed that issues with the method were not a Prototype bug, but rather almost certainly a doctype issue. Doctype! Yeesh. So misunderstood and so often copied and pasted from who-knows-where. Doctypes are a problem. There are lots of reasons for this and lots of smart people working hard on figuring out the right thing to do.

If you wade through the muck, it’s not too hard to find valid doctypes to put in your HTML documents. What is a little harder to find is some help choosing which of the various valid doctypes is right for you. I confess >cringe< to being one of the ill-informed and have been using an xhtml doctype, served as text/html and including an xml declaration. Doh! It causes no obvious problems most of the time, but the arguments against doing things this way are numerous and pretty much irrefutable.

My client’s page, on which the popup positioning was whacked, had this for a doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html lang="en">

Well, that’s only half a doctype, and the half that’s there should be updated to 4.01. Yes, yes you can use half a doctype, which causes browsers to slide into quirks mode (those that support it anyway). Some suggest that this is fine for table-based vs. CSS-based layouts, for old browsers, for IE6 if that’s the only browser you’re targeting, or various other, largely unsupportable, reasons. It’s definitely not fine if you’re using CSS for layout or if you’re trying to write standards-compliant markup and styles. Oh the pain that has been suffered by people who have written code that validates but don’t know about quirks mode or how it’s triggered. “Why is my site fine here and broken there?” they wail.

This is a great example highlighting what’s annoying about the whole doctype thing. There is no indication that my client’s doctype is old or partial. The page renders fine (ignoring the popup behavior, of course) in all the browsers that my client cared about and, once a few other things are fixed, the W3C validator validates the page! So, can it be that if you use half a doctype it’s still valid as far as the W3C’s validator is concerned, but that there is other code out there (Prototype, for example), which expects HTML to have a similar yet longer and also valid doctype in order for that code to work properly? Golly Beave, that’s a bit fuzzy in’t? Without actually understanding and knowing that you have just the right flavor of valid doctype, it can take something like this situation to lead you down the circuitous path of discovery.
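Browsers do report which mode a page landed in: document.compatMode is "CSS1Compat" in standards mode and "BackCompat" in quirks mode. The sketch below shows one plausible reason a doctype can change a viewport measurement, namely that the element whose clientHeight reflects the viewport differs between the two modes. This is an illustration under that assumption, not a description of what Prototype actually does internally; isQuirksMode and viewportHeight are hypothetical names, and the fake document objects stand in for a real browser document.

```javascript
// Hypothetical helper: anything other than "CSS1Compat" is treated as quirks
// mode, including very old browsers where compatMode is undefined.
function isQuirksMode(doc) {
  return doc.compatMode !== 'CSS1Compat';
}

// Hypothetical helper: pick the element whose clientHeight reflects the
// viewport in the current rendering mode.
function viewportHeight(doc) {
  var root = isQuirksMode(doc) ? doc.body : doc.documentElement;
  return root.clientHeight;
}

// Fake documents with made-up numbers, shaped like the real thing.
var standardsDoc = {
  compatMode: 'CSS1Compat',
  documentElement: { clientHeight: 600 },
  body: { clientHeight: 3000 }
};
var quirksDoc = {
  compatMode: 'BackCompat',
  documentElement: { clientHeight: 3000 },
  body: { clientHeight: 600 }
};
```

Both fake documents yield 600 from viewportHeight, while code that always read documentElement.clientHeight would get 3000 from the quirks-mode one. In a real page, checking document.compatMode in Firebug is a fast way to see whether a half doctype has tipped the browser into quirks mode.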

The correct doctype is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

or

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
"http://www.w3.org/TR/html4/strict.dtd">

After getting a valid doctype in place, the code works with Prototype 1.6.0 or 1.6.0.1 or 1.6.0.2, which is really what I’d expect given the minor-ness of the dot release! So, in the end, it was all about the doctype. My suspicion is that people who actually understand doctypes and know which one to use and how to use it would say, “Duh, what do you expect if you don’t even have the right doctype in your HTML?” Ah, but isn’t this always the case? Things are clear as day once you understand them.

My requests, therefore, are two:
1. All writers of HTML, please, please, take the time to read about doctypes, and always use them correctly. I mean, should the Prototype team really be blamed if you can’t get your doctype right? There are several links above that are worth struggling through to get a grasp on this. Even if you’re lucky and the doctype declaration that you cut and pasted from somewhere is actually correct for your situation, it’s still worth understanding something more about it than just CTRL-C and CTRL-V.
2. Prototype contributors, add some documentation if changes to the code base are going to have radically different behavior dependent, not on deprecated library code, but on something external like doctype declarations. Suspecting that varying doctype declarations trigger radically different behavior between minor dot releases of a Javascript library is hardly an obvious, first-thing-that-comes-to-mind conclusion. I discovered that there is someone thinking about this, though it’s not been done yet.

A pile of useful db stuff

Posted by apejoy on November 30th, 2007

I use MySQL, so it may be that some of these items are specific to that DBMS. For the most part, however, this information should be generally applicable to data modeling and implementation.

Sources:
http://hudzilla.org/phpwiki/index.php?title=Main_Page

When building your tables, think about the data types you choose!

In your ID columns – you know the ones you almost always include in every table – the ones that auto-increment – the ones that can always serve as a primary key (though perhaps not a very meaningful one) – set the data type to INT UNSIGNED. That allows for 4,294,967,295 entries, which ought to be sufficient. This is preferable to just using INT because a) it doesn’t make much sense to use negative numbers in your ID columns (does it?), and b) a plain (signed) INT spends half its range on negative numbers, so if you’re only using the positive range, you’ll only be able to have 2,147,483,647 entries : ).
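Those two limits fall straight out of the storage size: a MySQL INT is 4 bytes, or 32 bits. A small sketch of the arithmetic (intRange is a hypothetical helper name):

```javascript
// Hypothetical helper: value range of an integer column stored in `bytes`
// bytes. Exact in JavaScript for sizes up to 6 bytes.
function intRange(bytes, unsigned) {
  var distinctValues = Math.pow(2, 8 * bytes);
  if (unsigned) {
    return { min: 0, max: distinctValues - 1 };
  }
  // Signed types split the same number of values around zero.
  return { min: -distinctValues / 2, max: distinctValues / 2 - 1 };
}

var unsignedInt = intRange(4, true);  // max is 4294967295
var signedInt = intRange(4, false);   // max is 2147483647
```

The number of distinct values is identical either way; UNSIGNED just moves all of them to the side of zero that an auto-increment column actually uses.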

CHAR vs VARCHAR. CHAR is fixed-length, so if you say CHAR(255) the DBMS always uses 255 bytes for this field (assuming a single-byte character set) regardless of how much data is actually stored. You store “Hi” and the database locks up 255 bytes in spite of the fact that you only need two. VARCHAR(255) would, if you wanted to store “Hi”, use only 2 of the 255 maximum bytes that you specified when you defined the table, plus a small length prefix (one byte for a column this size). Now, here’s what really matters: variable-length fields take longer to search but use relatively less space while fixed-length fields are faster to search but take up relatively more space.
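A rough sketch of that space trade-off, assuming a single-byte character set. The helper names are hypothetical, and the length prefix reflects how MySQL stores VARCHAR values: one byte when the column can hold at most 255 bytes, two bytes otherwise.

```javascript
// Hypothetical helper: CHAR(n) always occupies the declared length.
function charBytes(declaredLength) {
  return declaredLength;
}

// Hypothetical helper: VARCHAR(n) occupies the stored data plus a length
// prefix of 1 byte (columns up to 255 bytes) or 2 bytes (larger columns).
function varcharBytes(value, declaredLength) {
  var prefixBytes = declaredLength > 255 ? 2 : 1;
  return value.length + prefixBytes;
}

var asChar = charBytes(255);              // 255 bytes for "Hi"
var asVarchar = varcharBytes('Hi', 255);  // 3 bytes for "Hi"
```

Multi-byte character sets such as utf8 change the byte counts, so treat these numbers as the simplest case only.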

I admit I don’t know what MySQL does exactly, but it’s not hard to illustrate how searching through fixed-length fields is faster than searching through variable-length fields. With fixed-length fields you can always tell where the start of the next field is. In fact, a simple calculation can get you quickly and easily to the Nth field. In an array-based implementation, where each byte is one element in the array, the only operation required to move to the next field is a simple addition. currentFieldStartIndex + sizeOf( dataType) gets you to the next field. Change the array so that it holds one field per element and the addition becomes currentIndex++. Simple, and there’s probably no faster operation a computer can do than addition. In general, we’re talking O(1).
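The fixed-length case can be sketched in a few lines. This is a toy model of the array-based layout described above, not a claim about MySQL's internals; the sample data and the nthFixedField helper are made up for illustration.

```javascript
// Four 5-character fields packed back to back, padded with spaces.
var FIELD_WIDTH = 5;
var fixedTable = 'carp ' + 'trout' + 'pike ' + 'bass ';

// Hypothetical helper: jump straight to field n (zero-based) with one
// multiplication, no matter how many fields come before it.
function nthFixedField(data, width, n) {
  var start = n * width;
  var raw = data.slice(start, start + width);
  // Strip the trailing padding that fixed-width storage requires.
  return raw.replace(/ +$/, '');
}

var thirdFish = nthFixedField(fixedTable, FIELD_WIDTH, 2); // "pike"
```

The cost of finding the start of a field is the same for field 2 as for field 2,000,000, which is the O(1) behavior the paragraph above describes.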

Now, still assuming array-based implementation, to search through variable length fields… One option is to terminate each field with a special character. In moving through the array you look at each and every element until you see the terminator, at which point you know you are at the start of a new field. To find the next field requires a number of operations equal to the length of what you are storing, plus the terminator, plus one more step onto the next field: n + 2 steps, which is O(n). Another option is to build an index, which can be searched just as quickly as the fixed length example above, but then you have the overhead of building and maintaining the index, and there’s always at least one more operation to do after peeking into the index – you still have to go get the actual field. Both of the above options are obviously slower to search through than fields with fixed-length data types. But, how much worse? So much that we care? Let’s see.
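The terminator approach can be sketched the same way. Again this is a toy model rather than MySQL's actual storage format; the NUL terminator, the sample data, and the nthTerminatedField helper are all assumptions made for illustration. The function counts how many characters it has to inspect before it reaches the requested field.

```javascript
var TERMINATOR = '\0';
var terminatedTable = ['carp', 'trout', 'pike', 'bass'].join(TERMINATOR) + TERMINATOR;

// Hypothetical helper: walk the data one character at a time until n
// terminators have gone by, then read field n (zero-based).
function nthTerminatedField(data, n) {
  var operations = 0;
  var fieldsPassed = 0;
  var start = 0;
  for (var i = 0; i < data.length && fieldsPassed < n; i++) {
    operations++; // every character before the target must be inspected
    if (data.charAt(i) === TERMINATOR) {
      fieldsPassed++;
      start = i + 1;
    }
  }
  if (fieldsPassed < n) {
    return null; // fewer than n + 1 fields in the data
  }
  var end = data.indexOf(TERMINATOR, start);
  if (end === -1) {
    end = data.length;
  }
  return { value: data.slice(start, end), operations: operations };
}

var found = nthTerminatedField(terminatedTable, 2);
```

Reaching "pike" here takes 11 inspections (5 for "carp" and its terminator, 6 for "trout" and its terminator), and the count keeps growing with every field and every character that precedes the target.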

Given a table called Fish with 1,000,000 records, each having one field that stores char data:
If the data type is fixed-length, say CHAR(20), we have a table that requires about 20MB of disk space. The worst case in searching for a particular record is that it’s the last one in the table, so it took 1,000,000 operations to find the correct field. Best case is 1.
If the data type is variable-length, say VARCHAR(20), we have a table that, worst case, requires about 20MB of disk space, the worst case being that each field is filled with 20 characters. The worst case search will take 20,000,000 – 20 operations to get to the start of the last field. Ok, that’s 20 times more operations. Best case? Again, 1. But, consider this. The advantage to variable length fields is less space, so let’s say that there are still 1,000,000 records, but that all of them except the last one contain just one character. The disk space required for this table is around 1MB (vs. 20MB for the fixed length version). Let’s also say that the variable length example is implemented with a special terminating character. The search algorithm will do 2 operations per record now: 1 to look at the single stored character, 1 to see the terminator, indicating that the next byte is the start of the next field. We’re still looking at double the number of operations compared to the fixed length implementation.
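The numbers in the Fish example can be reproduced with a little arithmetic. This sketch uses the same simplified cost model as the text (one operation per byte stepped over); stepsToLastField is a hypothetical helper name.

```javascript
var RECORDS = 1000000;

// Hypothetical helper: steps needed to walk past every field before the
// last one, when each field occupies bytesPerField bytes.
function stepsToLastField(records, bytesPerField) {
  return (records - 1) * bytesPerField;
}

// Fixed-length: one addition per field skipped.
var fixedSteps = stepsToLastField(RECORDS, 1);     // 999999

// Variable-length worst case: every field holds all 20 characters.
var fullSteps = stepsToLastField(RECORDS, 20);     // 19999980, i.e. 20,000,000 - 20

// Variable-length short case: one character plus one terminator per field.
var shortSteps = stepsToLastField(RECORDS, 1 + 1); // 1999998
```

The text rounds the fixed-length figure up to 1,000,000 by also counting the final look at the last record; either way the ratios come out at roughly 20 to 1 and 2 to 1.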

Perhaps performing this test, even on an average home machine, might produce a shrug of the shoulders, but what if we pump this database up an order or two of magnitude? And what if this query we’ve been looking at is requested 1,000,000 times a day? Now the difference in performance becomes noticeable and a potential problem. By using CHAR more often than VARCHAR, you are almost certainly going to use more disk space, but you’ll have better performance. Disk space is cheap. CPU cycles are not. The users of your system will care about speed, not about how much disk space is being used by the database. Your boss’ boss’ boss will care about $$$, and unless you are fantastically careless about your use of disk space it is very unlikely that saving space by using VARCHAR will translate into a cost savings that justifies less than the best possible performance.

When you’re going to use an INT type for a field, can you state with certainty what the maximum value could be? In an age field (assuming ages are of humans and you’re not building a database for cryogenic freezing or some other such nonsense), TINYINT UNSIGNED, with its maximum value of 255, will do just fine. TINYINT takes one byte, while INT takes four. If you have 1,000,000 records that include that age field you can store that data in 1MB or in 4 MB. Why waste 3MB (even as cheap as they are) with no gain whatsoever in performance? If you’re running a huge database with many fields in many tables defined lazily like the example above, and if your boss’ boss’ boss understood the example above, you might get in trouble. It’s a straight-up waste of money to use a larger INT type than you need.
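Picking the smallest type that fits is mechanical once you know the maximum value. The sketch below uses MySQL's integer storage sizes (TINYINT 1 byte, SMALLINT 2, MEDIUMINT 3, INT 4, BIGINT 8); the helper names are hypothetical.

```javascript
// MySQL integer types, smallest first, with their storage sizes in bytes.
var UNSIGNED_INT_TYPES = [
  { name: 'TINYINT', bytes: 1 },
  { name: 'SMALLINT', bytes: 2 },
  { name: 'MEDIUMINT', bytes: 3 },
  { name: 'INT', bytes: 4 }
];

// Hypothetical helper: the smallest UNSIGNED type that can hold maxValue.
function smallestUnsignedType(maxValue) {
  for (var i = 0; i < UNSIGNED_INT_TYPES.length; i++) {
    var type = UNSIGNED_INT_TYPES[i];
    if (maxValue <= Math.pow(2, 8 * type.bytes) - 1) {
      return type;
    }
  }
  // Anything larger than INT UNSIGNED falls through to BIGINT.
  return { name: 'BIGINT', bytes: 8 };
}

// Hypothetical helper: bytes saved across a table by shrinking one column.
function bytesSaved(rows, fromBytes, toBytes) {
  return rows * (fromBytes - toBytes);
}

var ageType = smallestUnsignedType(130);              // TINYINT, 1 byte
var saved = bytesSaved(1000000, 4, ageType.bytes);    // 3000000 bytes, about 3MB
```

A maximum of 130 is an arbitrary, generous ceiling for a human age; anything up to 255 still fits in a TINYINT UNSIGNED.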

Ramping up…

Posted by apejoy on October 9th, 2007

A long, long time ago Audrey and I discussed the whole child-rearing/career/bringing-home-the-bacon dilemma. The plan has always been to switch places so that she gets more time with the girls while I “go to the office”. It’s time to start making the change. What sort of mix we’ll end up with depends on all sorts of things, but I will certainly be looking to gainfully occupy myself for at least 30 hours/week, ramping up over the next few months.

To that end, I’m starting this technical blog so that contacts and potential employers can readily see some evidence of my professional skills and interests.