Good blog URL structure

Welcome to version 2.1! I’ve been playing with the site over the last couple of weeks. I’ve done some thinking about the structure of the site and I’ve tried to make better use of space on the home page, adding the last few comments posted. The CSS is a bit busted in Internet Explorer at the moment, but I’m sure you’ll learn to forgive me! For the moment, I want to concentrate on writing instead of playing with CSS!

Right then, URL structures

In the process of tweaking the design and structure of the site over the past couple of weekends, I’ve modded textpattern to use what I believe to be the most usable and future-proof URL structure. We now have year, followed by month, followed by hyphenated entry title: /2007/apr/good-blog-url-structure
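To make that structure concrete, here is a minimal sketch of how such a URL could be put together from a publish date and a title. It's written in Python purely for illustration and is not the actual textpattern mod; the names `slugify` and `permalink` are made up, and the day of the month in the example is arbitrary.

```python
import re
from datetime import date

# Three-letter month abbreviations, as used in the URL structure above.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def slugify(title: str) -> str:
    """Lower-case the title and join its words with hyphens.

    Assumption: only ASCII letters and digits are kept in this sketch.
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def permalink(published: date, title: str) -> str:
    """Build /year/month/hyphenated-title for an entry."""
    month = MONTHS[published.month - 1]
    return f"/{published.year}/{month}/{slugify(title)}"


print(permalink(date(2007, 4, 1), "Good blog URL structure"))
# prints /2007/apr/good-blog-url-structure
```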

I’ve also ensured that textpattern will redirect (permanent 301) links formatted in my previous URL structure to the new locations. For those interested in how I did this, I’ll do a little write-up soon.

Having discussed information architecture with Jon T over recent months and then reading a few posts about good URL structure recently, I got thinking and began tweaking, so I thought I’d post up my thought process behind the changes.

Take time to think

In the interests of avoiding link rot, it’s advantageous to think about your URLs before you have too many of them. With a blog, you can quite easily rack up a bunch of posts and then decide you want to restructure things, which often leads to a foray into htaccess and a bunch of redirects. That is, unless you like to break people’s bookmarks or screw up your results in search engines.

Blogging systems commonly refer to the location of an entry as its permanent link, so it follows that we should avoid changing that link in the future. I have fallen short of that ideal on this blog a couple of times now as I’ve shifted content around. I guess it’s inevitable that we move things about as our websites evolve or as we learn.

If you’re just starting up a blog, or if you’re rehashing your website, my advice is to take a little time to think about the structure of your URLs. Think about which information will identify the content of the entries you’ll be writing and which is most helpful to readers without being verbose. Doing this early on will save you time in the future. Even if you think you may make changes down the line, you should be in a better position to do so having already thought this through.

Unique identifiers

My appreciation for good information architecture has grown over time, especially since my involvement with Grow. Now, I try to think of blog entry URLs as unique identifiers – permalinks, remember – which we should avoid changing in the future. I try to think about what information actually identifies the content of the entry and what does not.

Entry titles

A prime place for summarising the content of your entries is in their title. I try to think of these as one-liner headlines, as you tend to find in the sidebars of the BBC News website.

I try to take a little time to think about my entry titles before I publish, making sure they sum up the content so that I can avoid making changes later on.

Entry IDs

The database ID of an entry isn’t terribly useful information to anything other than your content management system. It’s certainly unique to the entry, but I think it looks messy and is not valuable to readers.

Date entry published

The date that you publish an entry ages its content. Including that information in your URL structure means visitors can quickly see how old your entry is. As a reader, I find this useful. Also remember that the content of your entry may not necessitate archiving – old content may still be valid, especially if it has updates attached.

Site sections

The site section an entry belongs in is not unique to an entry and is more likely to change than you might think. For example, I started off just posting under a “blog” section, but I decided to expand, which entailed moving some of my entries out of the blog section. Now I see site sections as a means of navigating to and accessing entries.

I also notice when people post their entries to an archive section as soon as they’re published – does this mean the content is not up to date?

Entry tags

The tags an entry is given can tell you a lot about the content of that entry but, again, they are not unique to an entry. I think of such information as peripheral – metadata, I suppose – and, as with site sections, a means to categorise and browse entries. Remember that many tags may be applied to any one item – especially so in a community environment like Flickr – making them unfeasible for URLs.

Deciding what to use

So, I like the idea of making the most of entry titles, ensuring that they describe the content. You could just use the title alone to identify an entry in the URL. There are two problems with this for me.

Firstly, all your post titles must be different, which I admit is not necessarily a problem. It may become a problem if you post a regular update, say once a month, with the same entry title.

Secondly, if you decide to change the title in the future, having a single reference doesn’t give your readers a contingency plan. You may well publish more than one entry in any one day or month, but offering a secondary reference in your URLs adds a level of redundancy. This allows your blogging system to look up possible entries when an incorrect or out-of-date URL is accessed.
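Here's a rough sketch of what that kind of lookup might look like, assuming URLs of the form /year/month/title. It's Python for illustration only, not how textpattern does it; the `Entry` type, the `resolve` function and the sample entries are all hypothetical.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    year: int
    month: str  # three-letter abbreviation, e.g. "apr"
    slug: str   # hyphenated entry title


def resolve(path: str, entries: List[Entry]) -> List[Entry]:
    """Return the exact entry for a path, or likely candidates if none matches."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or not parts[0].isdecimal():
        return []
    year, month, slug = int(parts[0]), parts[1], parts[2]

    exact = [e for e in entries if (e.year, e.month, e.slug) == (year, month, slug)]
    if exact:
        return exact

    # Title changed or mistyped: fall back on the date part of the URL.
    same_month = [e for e in entries if (e.year, e.month) == (year, month)]
    if same_month:
        return same_month

    # Date is wrong too: fall back on the title part alone.
    return [e for e in entries if e.slug == slug]


# Hypothetical entries, including a monthly post that reuses its title.
entries = [
    Entry(2007, "mar", "monthly-update"),
    Entry(2007, "apr", "monthly-update"),
    Entry(2007, "apr", "good-blog-url-structure"),
]

# An out-of-date title still narrows things down to April's entries.
print(resolve("/2007/apr/good-url-structure", entries))
```

An exact match could be served directly, a single candidate could be redirected to, and several candidates could be offered to the reader as a list of suggestions.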

I think knowing when an entry was published is useful information for readers. Adding the date to your URLs in some form is going to help identify entries and keep your URLs unique.

I decided to go with posting my entries under months and to use the three-letter, textual abbreviations. As mentioned in the comments to Chris Shiflett’s post on URL Vanity, using a numeric month can make URLs easier to skim-read. I’ve noted before that I think months are more useful as text than as numbers:

The argument is that, say, through using the name of a month in place of its numerical representation, a date becomes dependent on the language you are using. By that logic, numerical dates are better for internationalisation. Unfortunately, the simple difference between the typical English and American date formats throws a spanner in the works. Something to think about; surely, if your content is dependent on language, there should be no problem with your dates being dependent on language too, even in your URLs?

So, I’ve decided to try out using months as text rather than numbers. One advantage of making textpattern redirect sensibly is that I can change my URLs back to using numeric months without incurring a headache, so feel free to convince me I’m wrong about that!
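As a sketch of how both forms of month could live side by side, the snippet below accepts either a numeric or a textual month and works out the textual URL to redirect to. Again, this is illustrative Python rather than my textpattern changes, and `canonical_path` and `redirect_target` are made-up names.

```python
from typing import Optional

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def canonical_path(path: str) -> Optional[str]:
    """Return the /year/mon/title form of a path, or None if it doesn't fit."""
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        return None
    year, month, slug = parts
    if month.isdecimal() and 1 <= int(month) <= 12:
        # Numeric month such as "04" or "4": swap in the abbreviation.
        month = MONTHS[int(month) - 1]
    elif month.lower() in MONTHS:
        month = month.lower()
    else:
        return None
    return f"/{year}/{month}/{slug}"


def redirect_target(path: str) -> Optional[str]:
    """Return where to send a permanent (301) redirect, or None if not needed."""
    canonical = canonical_path(path)
    if canonical is None or canonical == path:
        return None
    return canonical


print(redirect_target("/2007/04/good-blog-url-structure"))
# prints /2007/apr/good-blog-url-structure
```

Switching to numeric months later would mostly be a case of flipping which form counts as canonical, with the other form redirecting to it.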

Related reading