Google Mini KeyMatch and Related Query Imports

The Web is ever changing, and this article is relatively ancient having been published 10 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

File this under “you mean the documentation doesn’t exist somewhere?”: I ran into trouble importing KeyMatches and Related Queries into a Google Mini we have set up for a client. Google’s documentation on the import itself is very straightforward (select the file, press import), but it lacks instructions on how to set up the import file. Since I took the time to sort it all out, here’s a quick rundown.

Import File Specifics

Your import file should be in .csv format, with UTF-8 encoding and, most importantly (especially for those of us using OS X), UNIX (LF) line endings.

Your favorite text editor may be able to change those settings if they are not the default when you save a file. I used Coda to do so.

Related Queries

Your related queries import should have two fields (don’t give your columns headings, though):
[search term],[suggested/related query]

Set up your file per the above import file specifics, and import.


KeyMatches

Your KeyMatch import will have four fields:
[search term],[KeywordMatch|PhraseMatch|ExactMatch],[URI],[Result Title]

Word of warning: there is a maximum of five KeyMatch results per URI, but default settings will only show three.
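If you’d rather generate the import file than fiddle with editor settings, here is a minimal sketch in JavaScript (runnable in Node.js). The helper names, the example search terms and the example.com URLs are all made up for illustration, and the quoting of fields that contain commas follows the usual CSV convention; whether the Mini accepts quoted fields is an assumption, so try it with a small file first.

```javascript
// Hypothetical helper: turns an array of rows (each an array of field values)
// into import-file text with one row per line and UNIX (LF) line endings.
function buildImportCsv(rows) {
  return rows.map(function (fields) {
    return fields.map(quoteIfNeeded).join(",");
  }).join("\n") + "\n";
}

// Wraps a field in double quotes only when it contains a comma, a quote or a
// line break. Assumption: the appliance accepts conventionally quoted CSV.
function quoteIfNeeded(value) {
  var text = String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// No heading row: [search term],[match type],[URI],[Result Title]
var keyMatches = buildImportCsv([
  ["student loans", "KeywordMatch", "http://www.example.com/loans/", "Student Loan Basics"],
  ["financial aid", "PhraseMatch", "http://www.example.com/aid/", "Financial Aid Office"]
]);

// In Node.js, writing with the "utf8" option covers the encoding requirement:
// require("fs").writeFileSync("keymatches.csv", keyMatches, "utf8");
```

The same function works for a Related Queries file; each row just has two fields instead of four.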

Simple, yet previously undocumented (as far as my Google-fu skills could determine). Hope it helps.

New Project Launch

The Web is ever changing, and this article is relatively ancient having been published 12 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

I launched a new Web project a couple of months ago, but am just now getting a chance to post about it. The project is the Student Loan Facts Page, a site and blog about student loans. The importance of it to this Front-end dev blog, however, is that I want to let you know my thoughts on the HTML5 Boilerplate.

I’ll start with a caveat: this is a personal project, so I made a conscious decision to use elements in the design and coding that won’t work in any current release of IE. That’s not to say that IE users can’t see the site or anything, just that some stuff might be a little wonky. For example, the homepage makes use of the CSS content property to display numbers next to the six main links.

But my main goal with this site was to play around with WordPress a little more and start something with HTML5. HTML5 Boilerplate seemed like a good place to start. I’m a fan of CSS resets, so that’s all there. I’m also intrigued by some of their approaches:

  • CDN hosted jQuery with local fallback failsafe.
  • JS located at the bottom of the page.
  • IE/JS-specific classes on the html tag (or body tag, depending on which version you’re using) to allow for progressive enhancement.
  • Asynchronous Google Analytics use.
  • Forcing a scrollbar in non-IE browsers for design consistency.
  • Consideration for a:focus.
  • Text input/label alignment.
  • Progressive HTML5 form error classes using the :valid and :invalid selectors.

… and much more.

There’s also some stuff that I’m not as happy with, like non-semantic classes, but I can see why they included them.

As a whole, I think it’s a great start for developers who know what all of the code does. It is a very simple setup to modify as a solid basis for your own personal framework. It’s not really a straight out of the box solution, though.

It’s also not a great solution for a newbie to use. They’ll end up with bloated code and unnecessary stuff. Although, my current code on that site isn’t exactly pretty.

I’m impressed enough to use it as a jumping-off point for a redesign of this site coming later in the year, though. I hope you have a good experience with it as well.

IE Change Event Delay

I recently developed a sign-up form for a client that includes on-page price total calculation using JavaScript (jQuery). The premise is simple: the user provides information and specifies options, then clicks a radio button to choose a specific price plan. The initial total price calculation is triggered by the change() event for the radio button elements. But the client was concerned (and with user testing, it turned out rightly so) because in IE, the price calculation didn’t happen until the user clicked somewhere else on the page. In cases where they first clicked one option, then a different one, the price would seemingly lag behind because of IE’s delayed change event firing. It was confusing to the user, and worse, it was confusing for me to “fix” IE’s implementation.

The awesome news is that this has been fixed in jQuery 1.4, but the concern is still valid for older versions (which my application was using) and straight-JavaScript implementations.

The problem

In Internet Explorer, the change event fires when focus leaves the form element (an event known as blur). That means that the event happens only once a user has clicked on, or used the keyboard to navigate to, another element on the page (form or other). In cases like mine, where a user is expecting instant feedback to their click, this causes issues with user experience. Unfortunately, this isn’t exactly a “bug,” as it’s how IE handles this event in 6, 7 and 8.

In other browsers—Firefox, Webkit-based (Safari/Chrome) and Opera—the event fires off immediately, so in order to have consistent, intelligent operation, we have to hack IE’s basic behavior. The easiest solution is to bind your function to a different event, such as the click event, but that’s generally not the right solution. There is a better one.

Why using the click event is wrong

One word: accessibility. Users—whether they have a disability that restricts their use of the mouse or like to tab about the page with the keyboard for speed—don’t always use the mouse to move from form element to form element. So, if you bind your functionality to the click event, you may end up messing with a user’s workflow, which makes for unhappy visitors. In some cases, it may even make a user unable to use your application. So don’t use that as your solution.

The real solution

If IE needs a blur event to know that the change event should fire, give it one when the element is clicked. In jQuery, that looks something like:

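	$("input:radio").click(function(){ // the selector is illustrative; target your own radio buttons
		// In IE, we need to do blur then focus to trigger a change event
		if ($.browser.msie) { this.blur(); this.focus(); }
	}).change(function(){ actionIWantOnChange(); });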

This code tricks IE into thinking that focus has moved away from the element when it is clicked. Since that triggers the change event right away, the actions attached to that event also happen, which means the feedback is instantaneous. Keyboard navigability still works, because even though there is no click, the change event will still fire when the user moves to another field with the keyboard.

Now, you can probably improve my above example by using better feature-sniffing to test for IE instead of the browser object in jQuery, but my time for creating a fix was limited—and this code gets the job done.

Why I Switched to WordPress

I’m pro-intensive-customization, anti-bloat, pro-knowing-what-your-code-does. For over a year, my site ran on a code-base authored at least 90% by me. Don’t get me wrong, I know when to say “hmm, that’s not my strong point, let me use someone else’s code,” but I wanted to have total control over every aspect of my site’s backend. Then, last night I went live with WordPress.

I was resistant to using an out-of-the-box CMS for my own site for a long time. Some of those reasons were tenable, but most were at best me being stubborn and at worst illogical.

Frankly, as often as I get stuck doing backend work at, well, work, why do I want to be doing it in my free time as well? There are so many features I want to develop on my site that I just haven’t gotten around to, because at the end of the day, I don’t want to touch that kind of code. Using WP, I can just enable those features, add a plug-in or tweak a little code and quickly have that functionality. It saves me time and headaches.

But wait, why are you using WordPress?

…instead of <ANY OTHER CMS>.

WP isn’t necessarily my favorite CMS out there. I even tweeted an anti-WP haiku once:

A haiku inspired by bad advice. “Wordpress: blog system. / Bloated, not secure, real slow. / Not true CMS. ” (My opinion only)

But I promise I’m not being as hypocritical as it may seem.

  • First and foremost, my site is a blog. Or four blogs, depending on how you look at it. It’s not some gigantic commercial site. A blog needs blogging software. WP is a pretty solid blogging software.
  • My anti-WP tweet was focusing on advice someone gave a client that their 1000+ page site absolutely had to be in WP because that’s the only CMS in existence that is any good. Uhh, no.
  • More and more often, I’m finding myself using it for client sites (where the decision to use WP was already made or where it seems like the best solution). In order to be the successful, efficient developer I like to be, I need to become more intimately familiar with the system, and what better way to do that than to use it on my own site?

That last bullet has about 70% of the weight in this decision. Maybe when I start inheriting a bunch of Joomla sites, I’ll develop a different personal project on it. Although I probably won’t port this one to anything else again. That bit was painful.

Great, I’m going to switch to WP too!

…just because of this article.

Hold up there. WP isn’t necessarily the right fit for every site.

If you want to start a blog, sure, go for it. It’s simple, has a ton of themes, and mostly uses pretty good, standards-compliant code.

If you already have a blog using another CMS, stop. Evaluate. Why do you need to switch? Are you unhappy with your current system? Is there some very important feature you want that doesn’t exist in the current system? Why WP? Those are all questions to consider.

If you don’t blog (or your site is separate from your blogs), but have heard that your business absolutely has to have a site built on WordPress, stop. You need to talk to a developer. And I’m not talking about your wife’s niece that took a few design classes at the community college and is now a “Web designer.” WP very likely might not be the right CMS for you. But don’t fret, there’s something out there that will help you meet your goals, and the right developer can help you decide what that is and how to get it going.

What’s your CMS of choice? Are you happy with what you’re using? Think I’m totally bonkers? Everyone has their preferences. When porting the site over, I found quite a few things I’m not 100% happy about with WP, but that’s what custom themes and plug-ins are for, right? So, as I learn more, you’ll surely start seeing some changes around here. And probably some WP-centric posts as well.

Online Retailers: Help Your Customers Find Your Products

One of my favorite (only because it has the best, low-price selection) sources of quality fabrics has horrible UX on their store site. Horrible to the point that I’m only able to do rudimentary filtering because I understand the GET variables they use to display their catalog. In fact, I wrote a post to help non-developer users figure out how to filter the catalog by changing the URL. The fact that I had to do either of those things really bothers me as a front-end developer; your users should never have to resort to manipulating the URL in order to filter or find products.

If a user can’t easily figure out that you carry a certain product, you just lost a sale. Think about that and what it means to your business.

Say a customer walks into your brick-and-mortar store wanting a specific type of whatchamacallit with a budget of $5. They see that there is a whatchamacallit aisle, but it’s just overwhelming: you have 200 different colors of that specific type of whatchamacallit. There are different sizes, weights, and patterns. The customer’s not too picky about some of the finer points, or maybe they are, but it doesn’t matter because they only have $5 that they want to spend. So, they want to quickly narrow it down to all whatchamacallits under $5. They could go down the aisle looking at every one, but it’s easier for them to find a store employee and ask, “can you show me the whatchamacallits that are under $5?”

Being a store that cares about making a sale, your employee says yes, helps the customer, maybe even upsells them, and the customer makes a purchase. Money in the bank for the store.

Now stop. Why should this interaction be any different in your online store? That’s right, it shouldn’t be. Sure, it’s a bit difficult to have a person there, so to speak (you could have a live chat function). But there are alternatives:

  • An obvious search box that accepts queries like “whatchamacallits under $5” and returns useful, intuitive results.
  • Obvious, intuitive filtering and sorting options for search results with obvious, intuitive controls.
  • Obvious, intuitive filtering and sorting options within categories with obvious, intuitive controls.
  • An obvious, intuitive way to change the number of items shown on a page.

Notice the similarities up there: obvious and intuitive. It’s not simply adequate to provide those controls; people need to be able to find them and use them easily, just like they’d need to be able to easily find a knowledgeable employee. Good online store software will have all of those things out of the box, or your developer will set them up as a standard part of your store creation. All of these filters should persist until the user says to remove them, as well: if a user filters to blue whatchamacallits then applies a $5 filter, they should be shown blue, $5 whatchamacallits, not just $5 ones in any color. Sounds like common sense, right? Too many online stores don’t seem to get it.

Now, back to the earlier example. What if your employee said to the customer, “no, find it yourself.” Maybe the customer would spend time looking, but more likely they’d realize that you don’t care about their patronage and they’d go somewhere else to find the whatchamacallit. You just lost the sale. You’d probably fire an employee that did that consistently.

That’s exactly what’s happening at the fabric store. And many other online stores with outdated store software that has no filtering. There’s no provided way to easily narrow the products down to a specific one you’re searching for. There’s a search function, but it’s not that great, and there’s no way to narrow down the results. There’s absolutely no way to sort or filter the category item listings at all. They need to fire their store software.

Keep in mind that simple categorization is not sufficient criteria for browsing or searching. Just because I want a whatchamacallit doesn’t mean I’m interested in any whatchamacallit you carry. Shoppers, whether they’re just browsing or are looking for a very specific item, are more likely to buy from you if they can find what they are looking for. If they can’t find it, they can’t buy it. If they’re browsing a sales listing hoping to find something they think they need, you’ll have more success on a conversion if they can narrow down that sales list to items they are interested in. Sure, they can start at the top (viewing everything), and maybe they see a coolthing that intrigues them, but it’s not the perfect coolthing they want to buy. If they can easily narrow that listing to all coolthings, your ability to sell them something just increased. They know you carry the type of item, and are easily able to drill down to all items of that type. If they can’t, then maybe they keep looking, or maybe they get bored after going through another 2 pages with no more coolthings because all the other coolthings are 12 pages into the sales. A bored visitor leaves, and so does their money.

What do you expect from online shopping user interfaces? What makes you go to another store? Stay at the current store? Feel free to share good or bad experiences and examples in the comments.

Losing Values When Cloning Forms

The Web is ever changing, and this article is relatively ancient having been published 13 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

I’ve finally started development of a book recommendation widget for the musings and reviews on books I read section of my site. The general functionality is pretty simple: visitors have a few fields to complete with info about the book; upon submission, their recommendation is saved to a database; the new recommendation is shown to all and sundry in a “last recommendation” section; rinse & repeat. The no-JS-needed, server-side processing involved is simple and straightforward, and was quickly completed. Being a front-end developer, however, I want to make sure this can all be done in a smooth JS-enhanced way as well (for some nifty UX). That’s where I encountered yet another annoying JavaScript problem.

Each browser interprets “clone” differently

Due to differences in how each browser implemented cloneNode() in their JavaScript engines, there is an issue with values for form elements not persisting to the copy of an element. Inconsistencies like this are one of the many reasons why I’m a fan of using a JavaScript library for most projects: the library has usually worked out the issues and can handle copying without losing important data. Unfortunately, jQuery still has some issues to work out; it loses select and textarea values on clone(). Other input types don’t seem to have any issues, including hidden fields.

How this affected my widget

In order to display the submitted form data in the “last recommendation” section, I decided the best approach would be to copy the form, then replace each form element with an inline element containing the value based on type of element. It might not be the most elegant solution, but it seemed better than specifying each individual element by name and manipulating it that way.

So, step one: use clone() to copy the form (no need for cloning events and the like). Step two: do the replacement. Step three: realize that regardless of selection, the last recommendation always displays the first option in a select box. And an empty string for any text area.


Honestly, my current solution is just a dirty hack. I only have two affected fields, so I explicitly copy the original values to where they need to go. I’m more concerned about getting this up and running, knowing that there won’t be much in the way of extension in the future. I am about 99% positive I won’t be adding any additional fields, at least.
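For the curious, that kind of explicit copy could look something like the sketch below. It isn’t the exact code from my widget: copyFieldValues is a made-up helper name, #recommendForm is a hypothetical form ID, and the function assumes both lists contain the same fields in the same order (which is the case when one form is a clone of the other). It also only copies the single value property, so a multi-select would need more work.

```javascript
// Hypothetical helper, not part of jQuery: copies the current value of each
// field in the original list onto the field at the same position in the
// cloned list. Works on any array-like (a jQuery set, a NodeList, an array).
function copyFieldValues(originalFields, clonedFields) {
  for (var i = 0; i < originalFields.length; i++) {
    if (clonedFields[i]) {
      clonedFields[i].value = originalFields[i].value;
    }
  }
  return clonedFields;
}

// Possible use in the page (browser only; "#recommendForm" is an assumption):
// var $form = $("#recommendForm");
// var $copy = $form.clone();
// copyFieldValues($form.find("select, textarea"), $copy.find("select, textarea"));
```

Because the helper only touches the value property, it can be tried out with plain objects before it goes anywhere near a real form.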

For other projects though, this could be a major issue for scalability and ongoing maintenance, especially if there are multiple affected elements. A quick search around the Internet shows a lot of inelegant solutions. One that approaches a decent solution for jQuery is this solution, although it only attacks the select issue (and later inputs as well with a bit of overkill [see the comments], but no textarea). That approach could work with some modifications.

Have you seen this before? How’d you solve it/work around it?

Trigger AJAX Error Event

When I was new to working with AJAX functions—especially in the realm of form submission—one hurdle I often encountered was how to handle processing errors in my back-end script and give meaningful feedback to my users. I didn’t understand how to designate a response as an error instead of successful processing. Early on, I sometimes employed a dirty hack of prepending any error with the string ERROR: and then adding in some quick checking to see if that substring existed in my response text. While that may get the job done, it’s not good form. It causes convoluted code usage, thumbs its nose at existing error handling functionality and makes future maintenance a headache. But there is a better way by simply utilizing your processing language’s inherent header and error handling functionality.

N.B. From a JavaScript standpoint, I’m showing code based on the jQuery library, because I use it on a regular basis. The concept of triggering the XMLHttpRequest object error handling with proper headers is applicable to any type of JavaScript coding. Likewise, my server-side processing examples in this article are coded in PHP, but that is not the only applicable language. You can produce similar results with other languages as well, so long as that language allows you to send header information. If you’re willing to translate my examples into another language or non-libraried JavaScript, please do so in the comments or e-mail me, and I’ll add it into this article (and give you credit, of course).

The information in this article refers to AJAX requests with return dataTypes of html or text. JSON and XML dataTypes are for another day.

The client side of things

Let’s say we’re working with a bare-bones comment form: your users add a name, e-mail address and their comment. Three fields, all required, easy-peasy. For the purposes of this article, we’re going to ignore all of the validation you would want to do on the form and focus solely on the process of sending it via AJAX to your PHP processing script. The resulting AJAX call with jQuery might look something like:

//[warning: most of this is pseudo-code, not code you can copy+paste and expect to immediately work]
$.ajax({
    type: "get",
    url: "/processing/process_comment.php",
    data: $("#commentForm").serialize(),
    dataType: "html",
    async: true,
    beforeSend: function(obj) {
        //give user feedback that something is happening
    },
    success: function(msg) {
        //add a success notice
    },
    error: function(obj,text,error) {
        //show error
    },
    complete: function(obj,text) {
        //remove whatever user feedback was shown in beforeSend
    }
});

Essentially, the above JS expects the server-side processing script to return a message to show the user. We’ll set up such a script next.

The server side of things

Processing our simple comment form is close to trivial. We’d want to do some basic validation, make sure the submitter hasn’t been blacklisted for spamming or other reasons (in this example based on IP address), and then add the comment to a DB. The interesting part, however, is how to have the server signal that an error occurred during the processing and have that error propagate back to the AJAX call properly. This is where the header and exit functions come in handy. Look at this example:

<?php //[warning: the "processing" is pseudo-code functions, however the error throwing parts are valid]
    // perform validation
    if (validValues($_GET)) {
        if (blacklisted()) {
            header('HTTP/1.1 403 Forbidden');
            exit("Uh, hi. Your IP address has been blacklisted for too many spammy attempts. Back away from the keyboard slowly. And go away.");
        }
        if (addComment($_GET)) {
            // We have success!
            print("Your comment has been successfully added.");
            exit(0);
        }
        // if the code reaches this point, something went wrong with saving the comment to the db, so we should show an error
        header('HTTP/1.1 500 Internal Server Error');
        exit("Something went wrong when we tried to save your comment. Please try again later. Sorry for any inconvenience");
    } else {
        header('HTTP/1.1 400 Bad Request');
        exit("This is a message about your invalid fields. Please correct before resubmitting.");
    }

In PHP, the header function allows you to send headers to the client. In the syntax used above, it is allowing us to specify error status. For more info on the header function, head to the PHP manual on header. exit is a handy construct that ends script execution while allowing you to print an error. Upon successful completion of the processing, we make sure to call exit(0), which signifies successful completion of the script. Any non-0 value indicates that an error occurred. Learn more at the PHP manual on exit. For errors, you can also use the die construct, which is equivalent to exit.

Putting it all together: View a Demo

Let’s get advanced

The above examples for the error function are pretty simple, but you can create very elegant error handling solutions with it by utilizing other properties of the XMLHttpRequest object. One possibility is to take unique actions based on the status code returned. By using the status property of the object, you can customize your error handling based on the status returned. For instance, this error function script would alert a user to modify form information if needed, but completely remove the form if the user turns out to be on a blacklisted IP (using the same server-side script from above).

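    error: function(obj,text,error) {
        if (obj.status == "403") { $("#commentForm").remove(); } // blacklisted IP: remove the form
        else { alert(obj.responseText); } // anything else: show the message the server sent
    }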
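If the branching grows beyond one or two statuses, it may be easier to pull the decision out into its own small function that can be tested without a browser. This is only a sketch: actionForStatus and the action names are invented for this example, and the only status codes it knows about are the three sent by the PHP script above (400, 403 and 500).

```javascript
// Hypothetical helper: maps the HTTP status set by the server-side script to
// the action the page should take. The action names are made up for this sketch.
function actionForStatus(status) {
  switch (Number(status)) {
    case 400:
      return "fix-fields";   // validation failed: let the user correct the form
    case 403:
      return "remove-form";  // blacklisted IP: take the form away
    default:
      return "show-message"; // 500 or anything unexpected: just show the error
  }
}

// Possible use inside the jQuery error callback (browser only):
// error: function(obj, text, error) {
//     var action = actionForStatus(obj.status);
//     if (action === "remove-form") { $("#commentForm").remove(); }
//     else { alert(obj.responseText); }
// }
```

Keeping the mapping in one place means a new status code from the server only needs one new case here.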

Have you utilized this method in your AJAX programming before? Let us know how it worked for you.

Q: Where Does JavaScript Go?

A: The bottom of your page, just before the </body> tag is your safest bet. Of course, with Web development, nothing is as easy as a blanket statement like that, right? But, when I’m helping people troubleshoot their JavaScript problems, 95% of the time the first step is to move the JS to the bottom, order the scripts properly and wrap it in some sort of function that starts only after the page is loaded. This not only fixes their problem, but often speeds up content loading. Read on to learn why this is a good rule.

You can’t affect an element that doesn’t yet exist

When attaching events to an object (one of the most often used JavaScript things I see), that object must exist to have an event attached. When you have a bit of script at the top of the page, it is run as soon as the script appears, meaning the object you’re trying to attach the event listener to doesn’t exist. For starters, the DOM, which is the structure that allows you to interact with elements in your page, has not finished loading at this point, nor have your elements inside of the <body>.

If your script is at the bottom, however, even if the DOM isn’t complete, chances are good that your element will at least exist on the page, able to be manipulated. That’s why nothing is happening when your code in the <head> is trying to set up some cool thing to happen when you click an object; the object didn’t “hear” you tell it to do that cool thing—it was in rendering limbo, outside the realm of your script’s reach.

Waiting for the DOM to be ready

Simply moving the code to the bottom doesn’t necessarily mean everything will be loaded, however. The best practice is to explicitly tell your code not to run until either the body is loaded (for older browsers), or until the DOM is ready (for modern browsers). I can’t explain how to do this in every library that exists, because I haven’t used them all (check out their documentation), and that’d make for a very long article, but in jQuery, it can be done with this code:

$(document).ready(function() {      ... all your code here ...    });

Or the shorthand version that looks like:

$(function(){    ... all your code here ...   });

However, if you’re manipulating images, you’ll need to wait until those are downloaded as well. That’s not necessarily done before the DOM says it’s ready. In that situation, you would wrap your code with:

$(window).load(function() {  ... your code ... });

The above is triggered after every piece of the document has been downloaded, including all of your images.

In plain-Jane non-libraried JavaScript, you’d be using something along the lines of

window.onload = function(){  ... your code ... };

This is essentially the same as the method of <body onload="some code goes here">, which should not be used because you should be separating your function from content. Using window.onload allows you to have your code in a separate file, easily included or changed in multiple documents.

Does anyone know how to check if the DOM is ready with plain JavaScript? Forgive me, but the process escapes me at the moment.
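One commonly used answer, sketched here with a few assumptions: browsers that support it fire a DOMContentLoaded event on the document once the DOM has been built, and document.readyState tells you whether that has already happened. The helper name onDomReady is made up, and the document is passed in as a parameter only so the function can be tried outside a browser. IE 8 and earlier don’t support DOMContentLoaded, so they would still need window.onload or a library.

```javascript
// Hypothetical helper: runs the callback once the DOM is ready, or right away
// if the document has already finished parsing.
function onDomReady(doc, callback) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the browser to announce that the DOM is built.
    doc.addEventListener("DOMContentLoaded", callback);
  } else {
    // "interactive" or "complete": the DOM is already there, so run now.
    callback();
  }
}

// In a page: onDomReady(document, function () { ... your code ... });
```

Unlike window.onload, this does not wait for images, so the image caveat above still applies.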

On the subject of page speed

Now, a slow-loading page isn’t necessarily a broken page, although some research shows that visitors will quickly bounce if they have to wait too long for the page to load. But page speed can often be improved, and many times it’s hanging because of scripts in the <head>. Most pages consist of a multitude of files in addition to the basic page: CSS, JavaScript, images, etc. To speed up loading times, browsers will try to download these in parallel (multiple files at the same time). JavaScript, however, is not pulled down in parallel. So, once a JavaScript file download starts, everything else is put on hold until the script finishes. If the JavaScript is in the <head> of your document, this means your users are staring at a blank screen while the script finishes. If your JavaScript is at the bottom of the page, your content loads first. Your well-written, interesting content should keep visitors busy while the rest of the scripts finish loading.

How I roll

I use the following order for setting up my scripts at the bottom of the page. I’ve found that it provides the best results for my uses both here on my personal site and on sites I develop for others.

  1. JS Library (usually jQuery)
  2. Plug-ins, if any (such as Lightbox, SimpleModal)
  3. My site-specific code (usually invoking plug-ins, form validation, etc)
  4. Social media plug-in (AddThis)
  5. Tracking code

You can view the source of this page to see all of that code at the bottom. (Or don’t, I really need to clean it up! I’m causing you to load the contact form validation even when there is no form. That’s very naughty of me, and I promise to fix it post haste.) The exceptions are ads and the custom search script which both appear at the point in the code where they show on the page, due to requirements of the code and companies.

When should you not place it at the bottom?

When making such a generalized statement like “at the bottom,” there are going to be exceptions. The number one exception to placing the code at the very bottom of the page is: when the provided documentation, manual, or instructions say otherwise. Now, I rarely come across such instructions except for in outdated code that shouldn’t be used anyhow*. One notable exception is SWFObject, a wonderful script for use with Flash on a page. They say put the code in the <head>. I haven’t done enough testing to say that it’ll work with the script at the bottom, so <head> it is.

The other exceptions I see regularly are ads and widgets. Ad servers, such as Google AdSense, are usually structured so that you place the scripts wherever you want the ad to appear. Unfortunately, these slow down your page load, but there’s nothing to be done until they improve their scripts. Likewise, some widgets require placement at the point in the code where they should appear rather than at the bottom of the page. Try to find alternatives if possible, otherwise, do as the documentation instructs.

A final note, speaking of code that’s supposed to go in the <head> element: please, please, for the love of all things sacred, do not use MM_Preload or MM_Menu or any other old Dreamweaver/Fireworks JavaScript that is prefixed by “MM.” Without exception, I have never seen this code do anything that cannot be accomplished in a better way, often without the use of JavaScript to begin with. </rant>

Do you have any other quick JS debugging tips? Has this strategy fixed your JS issues?

Converting From Named to Numbered Entities

The Web is ever changing, and this article is relatively ancient having been published 13 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

I’ve been having some feed issues lately, thanks to my propensity for using proper(ish) typography, such as real quotes (“=&ldquo;, ”=&rdquo;, ‘=&lsquo;, ’=&rsquo;) in my content and headlines. The problem was that XML doesn’t behave very well with some of the named HTML entities. My feed-generation code had some conversion set up using html_entity_decode() and a declared charset of UTF-8 for the document and decoded entities, which can handle them all, but for whatever reason no luck; it was still generating invalid RSS feeds. Matt Robinson’s code for conversion to numbered entities fixed it all up nice and clean-like. Thanks, man.
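For anyone who wants the gist of the technique, here is a minimal sketch of a named-to-numbered conversion in PHP. It is not Matt Robinson’s code: the function name named_to_numeric_entities is made up for illustration, and it assumes PHP 7.2 or later with the mbstring extension (for mb_ord). It builds a lookup from PHP’s own HTML 4.01 entity table and leaves alone the handful of entities XML already understands.

```php
<?php
/**
 * Replace named HTML entities (&ldquo;, &rsquo;, ...) with numeric ones
 * (&#8220;, &#8217;, ...) so the text is safe to drop into an XML feed.
 * Hypothetical helper for illustration; assumes PHP 7.2+ with mbstring.
 */
function named_to_numeric_entities(string $text): string
{
    // PHP's table maps each character to its entity, e.g. "“" => "&ldquo;".
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Entities that XML predefines; these must stay exactly as they are.
    $xmlSafe = ['&amp;', '&lt;', '&gt;', '&quot;'];

    $map = [];
    foreach ($table as $char => $entity) {
        // Skip XML-safe entities and anything PHP already lists numerically.
        if (in_array($entity, $xmlSafe, true) || strpos($entity, '&#') === 0) {
            continue;
        }
        // Build "&ldquo;" => "&#8220;" from the character's code point.
        $map[$entity] = '&#' . mb_ord($char, 'UTF-8') . ';';
    }

    // strtr() with an array replaces every key it finds in a single pass.
    return strtr($text, $map);
}
```

Calling named_to_numeric_entities('&ldquo;Hi&rdquo; &amp; bye') returns '&#8220;Hi&#8221; &amp; bye', which an XML parser will accept.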

Server Ignoring PHP 301 Header: IIS Known Bug

The Web is ever changing, and this article is relatively ancient having been published 13 years ago. It is likely out of date or even blatantly incorrect in relation to modern better practices, so proceed at your own risk.

This is a little more back-end/server-side than front-end, but I happened upon an absolutely aggravating bug found with IIS/the FastCGI module when moving a client over to a new site structure. Like any good development company, we wanted to make sure the old site structure URLs redirected permanently (301 Redirect) to the correct new URLs so that off-site links continued to work. Unfortunately, I ran into some roadblocks setting it up.

The Problem and Roadblocks

This is a large site; something on the order of 2500 pages needed redirects for the new structure. Or at least, 2500 URLs. Many of the pages were dynamically generated from a CMS with redirects already set up to resolve SEO-friendly URLs to the server-based variable-laden ones. The fastest and most convenient (not to mention maintainable and scalable) method of doing this seemed to be writing a quick PHP script (as it is the scripting language used by the site and CMS) to map the old URLs to the new. We’d simply redirect all requests to that script and then let it handle the mappings with a header statement:

header("Location: ".$newURL,true,301);

Trivial, quick, workable. At least on Apache. I traced my headers though, and IIS was sending a 302 (Temporary) redirect despite the explicit 301 call. I threw out some very naughty words at that point.
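To make the idea concrete, here is a rough sketch of what such a mapping script could look like. Everything in it beyond the header() call is an assumption for illustration: the $redirectMap array, the resolve_redirect function and the example.com URLs are hypothetical, and the real script would have pulled its mappings from the CMS. On a server hit by the bug described here, this would still go out as a 302.

```php
<?php
// Hypothetical old-path => new-URL map; the real one would come from the CMS.
$redirectMap = [
    '/old/about.php'   => 'https://example.com/about/',
    '/old/contact.php' => 'https://example.com/contact/',
];

/**
 * Return the new URL for an old request URI, or null if nothing is mapped.
 */
function resolve_redirect(string $requestUri, array $map): ?string
{
    // Drop any query string so "/old/about.php?x=1" still matches.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    return $map[$path] ?? null;
}

// Only act on real web requests, so the file can be included elsewhere safely.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'])) {
    $newURL = resolve_redirect($_SERVER['REQUEST_URI'], $redirectMap);
    if ($newURL !== null) {
        // The explicit 301 from the article.
        header('Location: ' . $newURL, true, 301);
    } else {
        // No mapping found: answer with a plain 404.
        http_response_code(404);
    }
    exit;
}
```

Keeping the lookup in its own function means the mapping can be checked without sending any headers at all.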

Roadblock #1: Using a Microsoft Server

Ok, so, maybe I’m being a little flippant. I suppose there’s nothing wrong with IIS. It’s just that I, personally, prefer LAMP development if I have to do back-end/server-based stuff.

But facetiousness aside, IIS really is the cause of our problems in this case. Ultimately, two issues were complicating what should have been a simple task. Specifically, a bug in the IIS FastCGI module and a size limit on web.config files.

Roadblock #1.1: IIS/FastCGI and PHP 301 Redirects

Two dirty words: “Known Bug.” Essentially, the IIS/FastCGI module somehow loses track of the fact that you said “I want a 301 redirect” and handles it as a 302. Every. Single. Time.

So, our quick, easy, maintainable, scalable script idea goes out the window.

Roadblock #2: web.config filesize limit

Evidently there is a max size for web.config files. Don’t ask me what it is, though: I can’t figure that out. My Google-fu isn’t strong enough to find anything about it.

I found this out when moving to Redirect Mapping Plan B: Spit Out web.config Redirects for Every URL Mapping In One File. That, my friends, spit out a wonderful 500: Internal Server Error when I uploaded my new web.config file.

Guess it was time for a Plan C.

Our Workaround


I don’t even want to mention it. It’s depressing. Hopefully if you run into this, you can do something better.

We wrote a script that generated the URLs we needed and the mapping to the new URLs. Then, we formatted them for web.config redirects (everything to this point was actually part of Plan B. Moving on:) and made individual web.config files for each. Individual. Folder. under the old structure. This meant leaving old, useless folders online with nothing but web.config files in them. Like I said. Yuck.
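A sketch of how that per-folder generation could be scripted follows. The post does not say which web.config syntax was used, so this assumes the IIS 7 httpRedirect element with a Permanent (301) response status. The function names build_redirect_config and write_redirect_configs, and the shape of the $mappings array (old folder => new URL), are hypothetical.

```php
<?php
/**
 * Build the contents of a web.config that permanently redirects
 * requests for its folder to $destination (assumes IIS 7 httpRedirect).
 */
function build_redirect_config(string $destination): string
{
    // Escape &, <, > and quotes so the URL is safe inside an XML attribute.
    $safe = htmlspecialchars($destination, ENT_QUOTES | ENT_XML1, 'UTF-8');

    return <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <httpRedirect enabled="true" destination="{$safe}" exactDestination="true" httpResponseStatus="Permanent" />
  </system.webServer>
</configuration>

XML;
}

/**
 * Write one web.config per old folder under $root.
 * $mappings is a hypothetical array of old folder => new URL.
 * Returns the number of files written.
 */
function write_redirect_configs(array $mappings, string $root): int
{
    $written = 0;
    foreach ($mappings as $oldFolder => $newUrl) {
        $dir = rtrim($root, '/\\') . '/' . trim($oldFolder, '/\\');
        // Create the (otherwise empty) old folder if it is missing.
        if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
            continue; // skip folders that cannot be created
        }
        if (file_put_contents($dir . '/web.config', build_redirect_config($newUrl)) !== false) {
            $written++;
        }
    }
    return $written;
}
```

Each generated file stays tiny, which is how this approach sidesteps the single-file size limit at the cost of leaving all those old folders in place.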

“Why not use another language, like ASP, to get around the bug?” you ask. In theory, that would work. But it would also take more time. See, we needed to interface with the CMS to get the correct URLs, so not only would we have to author the quick mapping script, but we’d have to write a more involved script to actually retrieve the correct URLs from the PHP-based CMS, and that was determined to be a non-option. (Hey, it all comes down to business: time==money && faster_workaround == less_time == more_money_per_time.)

Will This Ever Be Fixed?

According to user ruslany (Microsoft employee?) on the IIS forums:

This is a bug in IIS FastCGI module. It will be fixed in Windows 7 RTM. We are also looking into possible ways for making this fix available for IIS 7.

So, short answer, no, not really, unless you upgrade. Maybe they’ll get around to fixing it in IIS 7 with a patch. Maybe. They’re looking. He says. Here’s the full thread.

As for the max web.config file size, it actually makes a lot of sense, so I can’t imagine them changing it. It was just frustrating to hit that unknown amount after the other bug. If anyone reading this happens to know that mysterious byte size cap, though, please let me know. It’d be useful info in the future, I suppose, and I am too lazy to experiment to determine the answer.