February 11, 2004
@ 04:02 PM

One of the big problems with arguing about metadata is that one persons data is another person's metadata. I was reading Joshua Allen's blog post entitled Trolling EFNet, or Promiscuous Memories where he wrote

  • Some people deride "metacrap" and complain that "nobody will enter all of that metadata".  These people display a stunning lack of vision and imagination, and should be pitied.  Simply by living their lives, people produce immense amounts of metadata about themselves and their relationships to things, places, and others that can be harvested passively and in a relatively low-tech manner.
  • Being able to remember what we have experienced is very powerful.  Being able to "remember" what other people have experienced is also very powerful.  Language improved our ability to share experiences to others, and written language made it possible to communicate experiences beyond the wall of death, but that was just the beginning.  How will your life change when you can near-instantly "remember" the relevant experiences of millions of other people and in increasingly richer detail and varied modality?
  • From my perspective it seems Joshua is confusing data and metadata. If I had a video camera attached to my forehead recording I saw then the actual audiovisual content of the files on my harddrive are the data while the metadata is information such as what date it was, where I was and who I saw. Basically the metadata is the data about data. The interesting thing about metadata is that if we have enough good quality metadata then we can do things like near-instantly "remember" the relevant experiences of ourselves and millions of other people. It won't matter if all my experiences are cataloged and stored on a hard drive if the retrieval process isn't automated (e.g. I can 'search' for experiences by who they were shared with, where they occured or when they occured) as opposed to me having to fast forward through gigabytes of video data. The metadata ideal would be that all this extra, descriptive information would be attached to my audiovisual experiences stored on disk so I could quickly search for “videos from conversations with my boss in October, 2003”.

    This is where metacrap comes in. From Cory Doctorow's excellent article entitled Metacrap

    A world of exhaustive, reliable metadata would be a utopia. It's also a pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated market opportunities.

    This applies to Joshua's vision as well. Data acquisition is easy, anyone can walk around with a camcorder or digital camera today recording everything they can. Effectively tagging the content so it can be categorized in a way you can do interesting things with it search-wise is unfeasible. Cory's article does a lot better job than I can at explaining the many different ways this is unfeasible, cameras with datestamps and built in GPS are just a tip of the iceberg. I can barely remember dates once the event didn't happen in the recent past and wasn't a special occassion. As for built in GPS, until the software is smart enough to convert longitude and latitude coordinates to “that Chuck E Cheese in Redmond“ then they only solve problems for geeks not regular people.  I'm sure technology will get better but metacrap is and may always be an insurmountable problem on a global network like the World Wide Web without lots of standardization.