
Case Sensitive URLs

Does capitalization matter?

by Ted Kuik

Does capitalization matter in web page URLs? I recently did some research on this topic when I switched most of my websites from Windows-based hosting to Linux-based. I figured this would be a good time to impose a little more order and uniformity on the way I named my URLs. I was just about to embark on a massive binge of URL "case standardization" when it struck me that it might be a good idea to see how this might affect search engines, bookmarks, and other links to my pages.

Well, it seems that if your page is hosted on Windows, links will get to your pages regardless of capitalization (or lack of it). So if the search engine indexed your page when it was www.example.com/page01.htm, it will still find it even if you renamed it www.example.com/PAGE01.htm (but read on, as there ARE ways that capitalization might affect you).

In a Linux or Unix-based environment, things are a little different. The good news is that the base URL (www.example.com) will resolve correctly regardless of capitalization. The (potentially) bad news is that the rest of the URL, the part naming the folder and page, will not. So if your site is hosted in a Linux/Unix environment and you rename your page www.example.com/PopularPage.htm to www.example.com/popularpage.htm, you could suddenly find yourself facing a drop in traffic as people clicking on the search engine link get a 404 error telling them that the page cannot be found! Of course, search engines do crawl through the web and update their indices from time to time, so your page would probably get corrected in most search engines ... eventually. In the meantime, however, you lose traffic, potential revenue, and possibly links from other sites, all because people just gave up on your page when they got a 404 error. (And of course any existing user bookmarks or links from sites other than search engines will have the same problem as the search engines and might take longer ... or forever ... to get fixed.)
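To make the host-versus-page distinction concrete, here is a minimal Python sketch. It assumes a case-sensitive (Linux/Unix-style) host, and the function name same_page_on_case_sensitive_host is made up for this illustration. The example URLs carry an http:// prefix only because the standard library parser needs a scheme to recognize the host name.

```python
from urllib.parse import urlsplit


def same_page_on_case_sensitive_host(url_a: str, url_b: str) -> bool:
    """Return True if both URLs would reach the same page on a
    case-sensitive server (hypothetical helper for illustration)."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    # Host names are case-insensitive; urlsplit's .hostname is already
    # lower-cased, so WWW.EXAMPLE.COM and www.example.com compare equal.
    same_host = (a.hostname or "") == (b.hostname or "")
    # The path is compared exactly, which is how a case-sensitive
    # file system would treat it.
    return same_host and a.path == b.path


# Capitalization of the base URL does not matter ...
print(same_page_on_case_sensitive_host(
    "http://WWW.EXAMPLE.COM/page01.htm",
    "http://www.example.com/page01.htm"))
# ... but capitalization of the page name does.
print(same_page_on_case_sensitive_host(
    "http://www.example.com/PopularPage.htm",
    "http://www.example.com/popularpage.htm"))
```

On a Windows host the second comparison would effectively succeed as well, which is exactly why a case change can go unnoticed there.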

So what's a webmaster to do? Well, ideally decide at the start on a "case convention" you can stick with. Think long and hard before changing the case of a page, particularly if it draws a lot of visitors who come directly to it from search engines, links, or bookmarks. If you feel you must change the case of these pages, you might want to make sure that you have a custom 404 error page with a link to enable visitors to find your main index page and perhaps links to several of your other important pages as well. For a really important page, you might want to create another page with the URL in the original case, giving your visitors a link to the "new" page, perhaps automatically forwarding them there (www.example.com/PAGE01.htm forwards to www.example.com/page01.htm and so on).
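The forwarding idea can also be handled by a lookup instead of a separate stub page for every old name. The Python sketch below is one possible approach and not something described above: it assumes you can list your current page paths and that your server lets you answer a request with a redirect. The names build_case_index and resolve are hypothetical. HTTP status 301 is the standard "moved permanently" response.

```python
def build_case_index(current_paths):
    """Map the lower-cased form of each current path to its real spelling."""
    index = {}
    for path in current_paths:
        # If two paths differ only by case, the first one listed wins.
        index.setdefault(path.lower(), path)
    return index


def resolve(requested_path, index):
    """Return (status, path): 200 for an exact match, 301 when only the
    case differs, 404 when no page matches at all."""
    canonical = index.get(requested_path.lower())
    if canonical is None:
        return 404, None
    if canonical == requested_path:
        return 200, canonical
    return 301, canonical


index = build_case_index(["/page01.htm", "/URL/Information.htm"])
print(resolve("/PAGE01.htm", index))   # old capitalization gets forwarded
print(resolve("/page01.htm", index))   # current capitalization is served
print(resolve("/missing.htm", index))  # unknown page still falls to the 404
```

A visitor following an old link would then land on the renamed page rather than the custom 404 page, and the 404 page is kept for requests that match nothing.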

"But I'm hosted in a Windows environment," you say. "How does this affect me?" Ah, not at all ... as long as you stay with Windows hosting ... FOREVER. A day might come when you decide to move to Unix or Linux in order to be able to utilize features unavailable in a Windows hosting environment (which was a primary factor in my move - that, and some of the unique features available at FutureQuest). Or your current hosting provider could make the decision for you at some point by dropping Windows hosting and forcing you to migrate to Unix/Linux or find a new host. When and if that day comes, if you've changed the case of pages in the past, you could find yourself in a quandary. You might have one set of search engines or links pointing to www.example.com/Page01.htm and another to www.example.com/page01.htm. It's a problem that is best avoided by careful planning in the first place. If you've intentionally (or unintentionally) changed the case of a page name though, don't despair. If you stick with the new name long enough, most search engines will eventually update to the current version. That way, you'll be ready when and if the Linux/Unix move ever comes.
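Before such a move, it may help to find out whether inbound links already disagree about capitalization. The following Python sketch assumes you have gathered a list of linked paths from somewhere (server logs or a link report, for instance; the list here is invented) and groups them by their lower-cased form. The function name case_conflicts is hypothetical.

```python
from collections import defaultdict


def case_conflicts(linked_paths):
    """Return the paths that appear with more than one capitalization,
    keyed by their lower-cased form."""
    groups = defaultdict(set)
    for path in linked_paths:
        groups[path.lower()].add(path)
    return {key: sorted(variants)
            for key, variants in groups.items()
            if len(variants) > 1}


# Invented sample data: two spellings of the same page plus one clean path.
links = ["/Page01.htm", "/page01.htm", "/about.htm", "/Page01.htm"]
print(case_conflicts(links))
```

Every entry in the result is a page that works under all of its spellings on Windows but would work under only one of them after a move to Linux/Unix.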

As long as you are consistent, it probably does not matter much which particular convention you adopt for naming your page URLs. There are several schemes which have some appeal. You could go with all lower case, the advantages being that it is easy to be consistent and easy for someone who types your page URL directly into their browser to get it right (although in most cases users will probably either bookmark a page or just navigate to it from the main index of your site). Lower case does have the disadvantage of being harder to read in long URLs (www.example.com/thisishardtoread.htm as opposed to www.example.com/ThisIsHardToRead.htm), so you might want to consider mixed case for that reason. I'd avoid using all upper case except where it is standing in for a short abbreviation (www.example.com/URL/Information.htm).

There's really no great problem with mixing the conventions either, as long as you don't change a URL once it is in place. (Aesthetically though, it irritates me if I find that I have not been consistent with pages having similar names, such as Page01.htm, Page02.htm, page03.htm.) Incidentally, inconsistency can sometimes make managing your site harder too, as the pages sometimes don't sort alphabetically in the order you would expect.
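The sorting surprise is easy to reproduce. This short Python sketch uses invented file names and compares a plain, case-sensitive sort (where every upper-case letter sorts before every lower-case one) with a sort that ignores case. Which behavior you actually see depends on the tool doing the listing.

```python
pages = ["page03.htm", "Page01.htm", "Page02.htm", "about.htm"]

# Case-sensitive: names starting with a capital letter come first, so
# about.htm lands between Page02.htm and page03.htm.
case_sensitive = sorted(pages)

# Case-insensitive: compare lower-cased names, keep the original spelling.
case_insensitive = sorted(pages, key=str.lower)

print(case_sensitive)
print(case_insensitive)
```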

One final point - in most of my examples I've used a page name, but the principles apply equally well to folders on your web site (www.example.com/folder/pg01.htm vs www.example.com/Folder/pg01.htm, etc.). Indeed, an unwise renaming of a folder could be potentially far worse, as it would affect all of the folders and pages below it, and that could be a problem with a capital P!

 
 

Last Revised October 26, 2004

Copyright 2004, Ted Kuik/Kuik Computer Services. All rights reserved.