I just read the rather brief Feed Access Control RSS and ATOM specification from the Bloglines team. It defines the access:restriction element as follows:

<access:restriction> element
Sub element of <rss> or <feed>. Used to indicate the re-distribution restrictions for a feed. The 'relationship' attribute is used to indicate whether a feed will 'allow' or 'deny' access.

To 'allow' access means a feed may be redistributed to other public sources, including search. To allow access, for example:

    <access:restriction relationship="allow" />

To 'deny' access means a feed should not be redistributed to other public sources, including search. To deny access, for example:

    <access:restriction relationship="deny" />

The default relationship is to allow access. However, if a feed is currently set to 'deny', the relationship must be explicitly set back to 'allow' for it to be registered (simply omitting it from the feed is not sufficient to turn access back on).
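
The "sticky deny" rule in that last paragraph is easy to misread, so here is a rough sketch of how I imagine an aggregator might track it. This is my own illustration, not anything from the spec: the function names are made up, and the namespace URI is a placeholder, since the excerpt above doesn't give the real one.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, for illustration only (an assumption, not the spec's URI).
ACCESS_NS = "http://example.com/ns/access"


def read_relationship(feed_xml):
    """Return 'allow', 'deny', or None if the feed has no usable restriction element."""
    root = ET.fromstring(feed_xml)
    # The spec excerpt says the element is a sub element of <rss> or <feed>,
    # so only the root's direct children are checked.
    for child in root:
        if child.tag == "{%s}restriction" % ACCESS_NS:
            value = child.get("relationship")
            if value in ("allow", "deny"):
                return value
    return None


def update_policy(previous, feed_xml):
    """Combine the last known policy (or None if never seen) with the current feed."""
    current = read_relationship(feed_xml)
    if current is not None:
        return current
    # Element omitted: keep the previous policy, so an earlier 'deny' stays in force.
    # A feed that never carried the element falls back to the default, 'allow'.
    return previous if previous is not None else "allow"
```

With this sketch, a feed that once said 'deny' and later drops the element is still treated as 'deny'; only an explicit relationship="allow" switches it back.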

The problem with this 'specification' is that it says nothing about its goals, scenarios or expected use cases. Without these, it is hard to tell whether this is a good idea or a bad one. Danny Ayers points out that this mimics the behavior of the Robots META tag that can be placed in HTML pages. I guess this means it prevents search engines from indexing your feed and showing it in search results, which makes sense in certain limited scenarios. For example, it makes sense to stop a search engine from indexing the search results page of another search engine, or the RSS feed of some search results. Hints like the Robots META tag and robots.txt are good ways to prevent this from happening for HTML pages. I guess this proposal does the same for RSS and Atom feeds.

On the other hand, it is definitely not an access control mechanism. You wouldn't want your bank to tell you that the way it prevents anyone from viewing your bank account details is via robots.txt, would you?


 

Friday, 04 August 2006 00:14:16 (GMT Daylight Time, UTC+01:00)
The Bloglines blog entry (http://www.bloglines.com/about/news#114) actually explains the expected use case a little bit, and it is much more limited in scope than the spec led me to believe. I agree, though, that
1) Server-side aggregators should not index content in password-protected feeds.
2) The presence of this spec is likely to deceive users into believing that their content will not be indexed, especially if someone like Blogger or Spaces puts a checkbox in their blog configuration that says "hide my posts from search engines".
