I recently ran into a couple of problems with the wonderful Sage extension for Firefox: it wouldn’t parse some of my feeds. I figured the feed URLs must have been updated, so I went to the sites and grabbed fresh RSS URLs. Some of the feed URLs had indeed been updated, but some of them still wouldn’t work in Sage.
As it turns out, there were two problems occurring:
- The URLs had the “feed” protocol tacked on the front.
- One feed was actually invalid. Although it validated with both feedvalidator.org and the W3C feed validator, the Firefox XML parser failed on it, as did Sage.
The Feed Protocol
Once I had actually figured out what was going wrong here, it was simple enough to fix! Simply remove the feed protocol from the start of the URL to stop Sage choking on it.
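For anyone who would rather not fix these by hand, here is a minimal Python sketch of the same idea. It assumes the feed protocol turns up in one of two common shapes, feed://example.com/rss or feed:http://example.com/rss; the function name strip_feed_scheme and the example URLs are made up for illustration and are not part of Sage.

```python
def strip_feed_scheme(url: str) -> str:
    """Return a plain URL for a feed: URL; other URLs come back unchanged."""
    prefix = "feed:"
    if not url.lower().startswith(prefix):
        return url
    rest = url[len(prefix):]
    if rest.startswith("//"):
        # feed://example.com/rss -> http://example.com/rss
        # (assumption: a bare feed:// URL is meant to be fetched over HTTP)
        return "http:" + rest
    # feed:http://example.com/rss -> http://example.com/rss
    return rest


if __name__ == "__main__":
    for candidate in (
        "feed://example.com/rss",
        "feed:https://example.com/rss",
        "http://example.com/rss",
    ):
        print(candidate, "->", strip_feed_scheme(candidate))
```

An aggregator could run every subscription URL through something like this before fetching it, which would make the protocol harmless whether or not it is useful.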
Marvellous! But it did leave me wondering what use the feed protocol is to us, and whether feed aggregators should be expected to deal with the protocol just in case somebody uses it. I submitted a Sage bug report about the feed protocol all the same, so I’m hoping Sage will get updated with a fix soon enough.
Some time ago, I began to encounter more and more feed URLs using the protocol (probably as a result of WordPress including it by default at the start of its feed URLs), so I read up on it a little. Without wanting to wade into what is an old issue, I do wonder whether it is at all useful any longer, especially when so many of our beloved browsers can now handle feed subscription. We could send commands to applications aware of the feed protocol, as we can with mailto, but what useful commands would we send? Perhaps I’m stuck thinking about how these might be useful to aggregators rather than to a wider range of applications, but surely aggregators just need to know the location of a feed? With that in mind, it just seems like redundant information to me, although it could be useful for identifying a feed without needing to rely on MIME types.
Named HTML Entities in RSS
The second problem wasn’t so obvious. This time, Firefox was choking on the feed because it found an undefined entity in the RSS: &hellip;, a horizontal ellipsis. I wasn’t sure why that’d break the RSS, so I did a little digging.
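The failure is easy to reproduce outside Firefox. The sketch below uses Python's standard xml.etree.ElementTree parser, assuming the offending entity was &hellip;. The two tiny feeds and the helper name parses_as_xml are invented for the test and are not taken from the real feed. XML itself predefines only five named entities (&amp;, &lt;, &gt;, &apos; and &quot;), so a strict parser has no definition for anything else unless the document declares it.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Hypothetical minimal feeds, identical apart from how the ellipsis is written.
named_entity_feed = (
    '<rss version="2.0"><channel>'
    "<title>Read more&hellip;</title>"
    "</channel></rss>"
)
numeric_reference_feed = (
    '<rss version="2.0"><channel>'
    "<title>Read more&#8230;</title>"
    "</channel></rss>"
)

if __name__ == "__main__":
    print("named entity parses:", parses_as_xml(named_entity_feed))
    print("numeric reference parses:", parses_as_xml(numeric_reference_feed))
```

The named entity is rejected while the numeric character reference for the same character goes through, which matches what Firefox was complaining about.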