Last week, Digg announced the launch of the DiggBar, which manages to combine two trends that Web geeks can't stand. It is both a URL shortener (whose problems are captured in the excellent post by Joshua Schachter on URL shorteners) and a revival of the trend of one website putting another's content in a frame (which is criticized in detail in the Wikipedia article on framing on the World Wide Web).

The increasing popularity of URL shortening services has been fueled by the growth of Twitter. Twitter has a 140-character limit on posts, which means users sharing links on the site often have to find some way of shortening URLs to make their content fit within the limit. From my perspective, this is really a problem that Twitter should fix, given the amount of collateral damage the growth of these services may end up inflicting on the Web.

Some Web developers believe this problem can be solved by the judicious use of microformats. One such developer is Chris Shiflett, who has written a post entitled Save the Internet with rev="canonical" which states the following:

There's a new proposal ("URL shortening that doesn't hurt the Internet") floating around for using rev="canonical" to help put a stop to the URL-shortening madness. It sounds like a pretty good idea, and based on some discussions on IRC this morning, I think a more thorough explanation would be helpful. I'm going to try.

This is easiest to explain with an example. I have an article about CSRF located at the following URL:

http://shiflett.org/articles/cross-site-request-forgeries

I happen to think this URL is beautiful. :-) Unfortunately, it is sure to get mangled into some garbage URL if you try to talk about it on Twitter, because it's not very short. I really hate when that happens. What can I do?

If rev="canonical" gains momentum and support, I can offer my own short URL for people who need one. Perhaps I decide the following is an acceptable alternative:

http://shiflett.org/csrf

Here are some clear advantages this URL has over any TinyURL.com replacement:

  • The URL is mine. If it goes away, it's my fault. (Ma.gnolia reminds us of the potential for data loss when relying on third parties.)
  • The URL has meaning. Both the domain (shiflett.org) and the path (csrf) are meaningful.
  • Because the URL has meaning, visitors who click the link know where they're going.
  • I can search for links to my content; they're not hidden behind an indefinite number of short URLs.

There are other advantages, but these are the few I can think of quickly.

Let's try to walk through how this is expected to work. I type in a long URL like http://www.25hoursaday.com/weblog/2009/03/22/VideoStandardsForAggregatingActivityFeedsAndSocialAggregationServicesAtMIX09Conference.aspx into Twitter. Twitter allows me to post the URL and then crawls the site to see if it has a link tag with a rev="canonical" attribute. It finds one and then replaces the long URL with something like http://www.25hoursaday.com/weblog/MIX09talk which is the alternate short URL I've created for my talk. What could go wrong? :-)
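To make the crawling step concrete, here is a rough sketch in Python of what a service in Twitter's position might do with the HTML it fetches for each link. The class and function names are made up for illustration, and I'm assuming the short URL is advertised as a link tag with rev="canonical" and an href; nothing here reflects how Twitter actually works.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class RevCanonicalFinder(HTMLParser):
    """Hypothetical helper: remembers the href of the first <link rev="canonical">."""

    def __init__(self) -> None:
        super().__init__()
        self.short_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first match is kept.
        if tag != "link" or self.short_href is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rev holds a space-separated list of link types; compare case-insensitively.
        rev_values = attr_map.get("rev", "").lower().split()
        if "canonical" in rev_values and attr_map.get("href"):
            self.short_href = attr_map["href"]


def find_short_url(page_url: str, html: str) -> Optional[str]:
    """Return the advertised short URL as an absolute URL, or None if there is none."""
    finder = RevCanonicalFinder()
    finder.feed(html)
    if finder.short_href is None:
        return None
    # The href may be relative, so resolve it against the page's own URL.
    return urljoin(page_url, finder.short_href)
```

Note that this only covers the parsing. The service would still have to fetch the page behind every link before it could call something like find_short_url, and fall back to the long URL whenever the function returns None.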

So for this to solve the problem, every site that could potentially be linked to from Twitter (i.e. every website in the world) needs to run their own URL shortening service. Then Twitter needs to make sure to crawl the website behind every URL in every tweet that flows through the system. Oh yeah, and the fact is that the URLs still aren't as efficient as those created by sites like http://tr.im unless everyone buys a short domain name as well.

Sounds like a lot of stars have to align to make this useful to the general populace and not just a hack that is implemented by a couple dozen web geeks.  

Now Playing: Nas - Hate Me Now (feat. Puff Daddy)


 

Monday, 13 April 2009 04:14:19 (GMT Daylight Time, UTC+01:00)
So for this to solve the problem, every site that could potentially be linked to from Twitter (i.e. every website in the world) needs to run their own URL shortening service.


...or, I could outsource my short URLs to a third-party service of *my* choice.

Then Twitter needs to make sure to crawl the website behind every URL in every tweet that flows through the system.


...rather than running every URL in every tweet that flows through the system through one single service's API / point-of-failure chosen by Twitter.

Oh yeah, and the fact is that the URLs still aren't as efficient as those created by sites like http://tr.im unless everyone buys a short domain name as well.


...unless, of course, I actually choose to use http://tr.im for my short URLs.

Sounds like a lot of stars have to align to make this useful to the general populace and not just a hack that is implemented by a couple dozen web geeks.


...or a couple dozen web geeks need to write WordPress, Drupal, and Movable Type plugins that get popular and get installed with relative ease in lots of sites.

rev="canonical" (or whatever) doesn't solve the URL shortening issue, but it's a more web-friendly way to enable people to take more control over their own URL spaces and decide their own shortening conventions - and maybe even try their own solutions to URL shortening.
Monday, 13 April 2009 04:19:14 (GMT Daylight Time, UTC+01:00)
Sounds like a lot of stars have to align to make this useful to the general populace and not just a hack that is implemented by a couple dozen web geeks.


Also FWIW, I seem to remember this same sort of thing was said about RSS auto-discovery links when Mark Pilgrim first started talking about it years ago. Not to assume this current thing will catch on as well, but it *is* folded into most popular browsers now.
Monday, 13 April 2009 05:16:22 (GMT Daylight Time, UTC+01:00)
Dare:

There could be another answer to the problem and that is using XRIs (i-names) as relative URLs.

I've set up http://shortxri.net which then only generates an XRI which is short (i.e. @si*3*48u). The XRI is just the path of the short URL; the XRI doesn't live on the shortxri.net domain. However, the XRI can be attached to, say, http://xri.net or http://xri.be to be even shorter than http://shortxri.net (so http://shortxri.net/@si*3*48u and http://xri.be/@si*3*48u go to the same place).

Also the XRI specification allows for a query parameter to pull-back just the XRDS (which holds the link for the re-direction) so there is a built in access for backing up your links ... it's not in place but if feedback dictates, one could buy their own i-name or XRI "domain" (the @si) and place it where they wish (with shortxri.net or another host) then they could "own" the links while still using a short domain name (the xri.be domain). The basic system is in place today and is working sans stars *smile*.

Nika
Monday, 13 April 2009 13:34:49 (GMT Daylight Time, UTC+01:00)
The biggest problem with rev="canonical" is that it doesn't change user behavior. As a user, I don't want to type in a huge URL and hope that the site has specified a short URL with rev="canonical" after I've published it. Any sort of user interface changes that Twitter implements to address this (e.g. cutting off URLs with an ellipsis but retaining the entire URL in the hyperlink) might as well be the solution to the problem instead of relying on URL shorteners or implying that everyone implement/declare their own URL shortener.

The same goes for Twitter clients which will have to be modified to support rev="canonical" anyway.
Tuesday, 14 April 2009 03:44:50 (GMT Daylight Time, UTC+01:00)
The whole rev="canonical" idea is fundamentally and fatally flawed. For a start it implies that the current page is canonical, which means that it only works for the canonical URL - not any of the potentially infinite number of alternative URLs. As if that's not enough, if you get it wrong ("rel" instead of "rev") then your shortcut url will become your new canonical, which may well knock your entire site off its SEO perch in one fell swoop. It also implies that you're giving a complete list of inbound URLs, which you're not.

Fortunately all these issues are easily solved by using rel="shortlink". And yes, there are applications for this outside of Twitter, microblogging and mobile Internet - pretty much anywhere a URL has to be manually entered (e.g. when it's printed or spoken).

Sam
