During my Search Engine Optimization for ASP.NET Developers presentation at the Western Michigan Day of .NET, a young man asked me a question about formatting URLs. We were running short on time and I tried to answer as many questions as I could, but I didn't completely understand his question and I didn't want to hold up the entire presentation to work it out. Instead I talked to him afterwards, and then found that he had emailed me yesterday for clarification.
By the way, when I wrote "young man" that is exactly what I meant. The question came from 12-year-old Aaron Gillion who is an ASP.NET developer!!
I had discussed using URL Rewriting to create "pretty" URLs that are ripe for stuffing your keywords into. I then mentioned that Google recommends using hyphens as word separators. Afterward Aaron asked [paraphrased by me], "If you capitalize all the words and use no hyphens, well of course it is human readable, but do search engines read it as separate words?"
I actually think that there are two issues here, though Aaron was asking about only one, which is what threw me off the first time: whether a URL with joined, capitalized words really is human readable, and whether search engines read those joined words as separate words.
Before trying to answer either question let's first discuss URLs and their purposes. At its simplest a URL is a Uniform Resource Locator that serves as nothing more than a unique identifier for the location of a page on the Internet. As long as it is unique and follows some simple formatting rules, a user can get to your site by clicking on a link that points to that URL, no matter what it looks like. The following are all examples of valid URLs:
All of the above could successfully point to a page on your web site.
But let's look at the last example a little more closely. The last example could actually be the same URL as the one before it, just written to be "pretty". It seems to suggest some type of hierarchy to the page, doesn't it? If you did a search for "Purple Petunias" and you saw this link among the first couple of search results you would probably think that it was the one you were looking for. You would be much more likely to click that link than the one before it (assuming all other factors like page title, description, brand recognition, etc. were the same).
So while the basic purpose of a URL is to serve as a pointer to a page we have the ability to use them for so much more. For one, because the pretty URL contains keywords it serves as another indicator from us to Google about the content of the page. Second, in search results if the URL contains keywords that match the search term then many search engines, including Google, will highlight those terms (just like they do in page title and description).
Is using all capitals in a URL with joined words that are not separated with hyphens (or any other word separators) human readable? Or better yet is it following good usability practice?
Now let's take a look at the first question, which wasn't actually Aaron's question but rather the assumption behind it, and I'm going to try to challenge that assumption. Let's take our last URL example from above and create a couple of different variations:
I'll agree with Aaron that each of the above is human readable; however, I think that the first four are far more readable than the last two. Additionally, I think that the all caps version is probably the worst of the bunch, given that all of the letters seem to have the same weight and nothing rises above or drops below the line of text to help break up the flow. Of course each of them works fine purely as a URL, but the first four add a little more something as far as human readability goes. Aaron, my advice to you would be to avoid all caps and to start using hyphens as word separators.
Do search engines (Google) parse the individual words out of a URL that has joined words that are not separated with hyphens (or any other word separator)?
My general understanding is that Google can understand underscores as word separators but prefers hyphens. Additionally, I believe that I have read that the Google spider will try to break apart words that are put together without separators, but Google cannot guarantee it. Why not take the time to build your URL exactly the way that you want Google to see and interpret it, especially when all you need to do is add a couple of hyphens?
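To show how little work those hyphens take, here is a minimal sketch in Python (chosen only to keep the example short and self-contained; the same idea carries over to ASP.NET). The function name `slugify` and its rules are my own assumptions for illustration, not anything Google or ASP.NET prescribes.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated URL segment.

    Hypothetical helper for illustration only.
    """
    # Drop accents so that, for example, "é" becomes "e".
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Split joined capitalized words: "PurplePetunias" -> "Purple Petunias".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", ascii_title)
    # Collapse every run of non-alphanumeric characters into one hyphen.
    hyphenated = re.sub(r"[^A-Za-z0-9]+", "-", spaced)
    return hyphenated.strip("-").lower()


print(slugify("Rockits Hoe"))       # rockits-hoe
print(slugify("PurplePetunias"))    # purple-petunias
print(slugify("purple_petunias"))   # purple-petunias
print(slugify("PURPLEPETUNIAS"))    # purplepetunias
```

Notice that the all caps version comes out unchanged apart from the lowercasing: once the word boundaries are gone, a simple helper like this has nothing left to work with.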
For instance, suppose I decide to write a post about my imaginary company's imaginary new product, Rockits Hoe, the newest and coolest all-in-one gardening tool. I write a post with the URL: http://www.mysite.com/blog/rockitshoe.aspx. How is Google supposed to know what my keywords in that URL are?
I could let Google try to figure it out or I could explicitly make my URL: http://www.mysite.com/blog/rockits-hoe.aspx and remove any ambiguity.
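To make the ambiguity concrete, here is a small Python sketch that lists every way "rockitshoe" can be split into words. The `segmentations` function and the tiny `VOCABULARY` set are made up for this example; this is not a description of how Google actually parses URLs, just a way to see the problem.

```python
def segmentations(text, vocabulary):
    """Return every way to split text into words found in vocabulary."""
    if not text:
        return [[]]  # one way to split nothing: no words at all
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocabulary:
            # Keep this word and try to split whatever remains.
            for rest in segmentations(text[end:], vocabulary):
                results.append([head] + rest)
    return results


# Hypothetical vocabulary, just large enough to show the problem.
VOCABULARY = {"rock", "its", "rockit", "rockits", "shoe", "hoe"}

for words in segmentations("rockitshoe", VOCABULARY):
    print(" ".join(words))
# rock its hoe
# rockit shoe
# rockits hoe
```

Three plausible readings come out of one joined string, and only one of them is my product. With the hyphenated URL there is exactly one.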
Enough of my thoughts, how about something from the horse's mouth:
Consider using punctuation in your URLs. The URL http://www.example.com/green-dress.html is much more useful to us than http://www.example.com/greendress.html. We recommend that you use hyphens (-) instead of underscores (_) in your URLs.
Aaron, I hope this helped answer your question in my normal long-winded style. Good luck and keep attending events like Western Michigan Day of .NET and other code camps!