The Associated Press has an article entitled Microsoft Changes Blog Shutdown Policies which states

SEATTLE - Microsoft Corp. is tightening its policies regarding shutting down Web journals after its much-publicized shut down of a well-known Chinese blogger at that government's request.

The Redmond software company, which operates a popular blogging technology called MSN Spaces, said Tuesday that the changes will include efforts to make the banned content available to users elsewhere in the world even if Microsoft decides it has a legal duty to block it in a particular country.

The company also pledged to provide users with a clear notice that it has shut down a Web site because it received a legally binding notice that the material violates local laws. Previously, it has simply said the content was unavailable.

Brad Smith, Microsoft's top lawyer, said in an interview that it will depend on the circumstances of the shutdown as to whether the new policy means that an archive of the blog will remain available elsewhere, or that the Web blog's author will be able to continue posting information to users outside the country that ordered the blockage.

"Some of this, I think, we just have to recognize is evolving technology and changing law," said Smith, speaking by phone from a Microsoft-sponsored government conference in Lisbon, Portugal.

MSN Spaces, which allows users to post journals, pictures and other content on the Internet, boasts 35 million users, including 3.3 million in China.

The company has maintained that it is important to be able to provide users in other countries with such tools, even as it insists it is bound by local laws when it operates in those places.

"We think that blogging and similar tools are powerful vehicles for economic development and for creativity and free expression. They are tools that do good," Smith said. "We believe that it's better to make these tools available than not, but that isn't the end of the discussion, either."

This is good to hear. You can also get the news straight from the horse's mouth in the press release Microsoft Outlines Policy Framework for Dealing with Government Restrictions on Blog Content.


 

Categories: Windows Live

If you are a regular reader of the Internet Explorer team's blog then you should know that IE7 Beta 2 Preview is now available.

I've used it for about 10 minutes now and I'm still having difficulty getting used to the changes in the user interface. They seem like rather gratuitous changes to me; the browser now feels unfamiliar, although I assume I'll eventually get used to the new look and feel. My main interest was in checking out the RSS support in the browser, and so far the experience has left me unsatisfied. Below is a screenshot of what it looks like when subscribed to a feed.



The general RSS reading experience is rather unsatisfactory. There are so many features I take for granted from using RSS Bandit that the RSS experience in IE 7 is disappointing by comparison: no search folders, no aggregated views of items within a category, no ability to flag items, no options to email an item or post it to my favorite social bookmarking site. I couldn't even figure out how to mark individual items as read or unread. As a replacement for my current RSS reader, I found it pretty unusable.

PS: For some reason since upgrading to IE 7, all the HTML mail in Outlook now hurts my eyes to look at. Does IE 7 flip on ClearType by default or something?

 

Categories: Web Development

From the official Google Blog we find the post All buttoned up which informs us that

As the Google Toolbar has gotten more popular, the greatest source of ideas about new features has come from our users. The breadth and variety of these requests is so large that it's hard to satisfy everyone. But then we started noticing engineers on the team had cool hacks on their Toolbars for doing customized searches on our internal bugs database, corporate employee directory, etc... We were barely done asking ourselves whether it was possible to offer this capability in the new Google Toolbar beta when one of the engineers started designing a feature called Custom Buttons. Here are some of the coolest aspects of Custom Buttons and why I think they're a big deal:

1) Simple API: The term API is almost a misnomer -- it literally takes seconds to make one of these. I just can't resist the urge to make a new one every time I run into a new website. A couple of simple steps and voila - a new button's sitting on your Toolbar (check out the Getting Started Guide).

2) Flexibility: The simple inclusion of RSS & Atom feeds (and particularly allowing the update of toolbar button icons through feeds) has allowed for buttons like a weather button and a mood ring button.

3) Accessibility: Most users don't even need to make buttons. It takes one click on our buttons gallery or on a website that offers them to install a button for your favorite sites. And the custom buttons we built to search our intranet showed us how valuable a customizable toolbar can be to organizations, so now there's an enterprise version of Google Toolbar that can be securely deployed across a company.

I use the Google toolbar quite frequently when performing searches and one of my biggest gripes is that it doesn't give me the option of using Google Music Search for my searches. So when I found out about the new version of the toolbar, I downloaded it and clicked on "Add Search Type", which took me to the Google Toolbar Button Gallery. Guess what? There's no option for adding Google Music Search to my list of default search types.

So I tried reading the documentation on Getting Started with the Google Toolbar API so I could figure out how to add it myself, and came up short. The entire API seems to assume that some stuff gets installed in my right-click menu in Internet Explorer, which doesn't seem to be the case. I wonder if I need to reboot to get the changes to show up? Bah. Now I feel irritated that I just wasted 15 minutes of my time on this.


 

Categories: Technology

Joe Wilcox, research analyst for Jupiter Research, has a blog post entitled What AOL Explorer Means to Microsoft which touches on a topic I've discussed in previous blog posts. He writes

In IE 7, Microsoft makes revolving a search term through several different search engines a fairly easy process. The approach makes sense for a platform provider, but it may not be the best for MSN Search--or is that Windows Live Search, now?

I'm wondering if maybe Microsoft will be forced to an internal browser war of sorts. Microsoft's IE development clearly is focused more on corporate customers and not introducing too many disruptive changes there. AOL is going after consumers with its browser. I don't see how Microsoft can effectively compete, protect its turf and extend opportunities for MSN and search with IE development so corporate focused.

Microsoft needs to more seriously treat the consumer and corporate browser markets as separate opportunities. In one sense, Windows Live seeks to resolve the consumer problem by offering consumers more products and services. To get there, Microsoft is going to have to draw an even clearer line between IE as a platform and corporate product and IE as throughway for consumer products and services.

It seems pretty obvious to me that if Microsoft is serious about Windows Live, then it doesn't make much sense for our Web browser to be tied to the operating system division with its long ship cycles and focus on making corporate customers happy (i.e. keeping disruption to a minimum). I said as much in my previous post on this topic, Mac IE's Death: A Case for Microsoft Disbanding or Transfering the Windows IE Team. I'm glad to see that some analysts are also beginning to point out this hole in our Windows Live strategy.


 

Categories: Windows Live

A few weeks ago I had a chat with Robert Rebholz about folksonomies, RSS and information overload. It was a rather fun discussion and he let me know about a tool he'd built called the OPML-o-mater. The way the tool works is described below, 

The OPML-o-mater delivers a personalized list of RSS feeds in an xml format called OPML. OPML files can be imported by any competent RSS Reader/Aggregator.

You want the feeds from the OPML-o-mater because they're known quality feeds -- at least they were when we entered them. Even if you already have all the feeds you need, it might be worth a look to discover if we've one or two you didn't know about.

In general it works this way:

  • We've tagged the feeds.
  • You select the tags that describe your interests
  • The OPML-o-mater finds and displays feeds associated with the tags you've selected
  • You pick the feeds you want
  • Press the generate OPML button
  • Save the OPML file to your local machine
  • Import it into your feed reader
More specifically, we've tagged all the feeds. The first column of the OPML-o-mater lists the tags. You select a single tag from column one that describes an area of interest for you. Column two displays the tags that were also used anytime the tag you selected was used to describe a url (bear in mind that a single feed/url may have many tags associated with it). In column three the feeds associated with the selection made in column one are displayed.
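To give a sense of what the generated file contains, here's a rough sketch (in JavaScript, with hypothetical field names and attribute escaping omitted) of the kind of minimal OPML document the "generate OPML" step produces:

function generateOpml(feeds) {
  // feeds is a hypothetical array like
  // [{title: "Dare Obasanjo", xmlUrl: "http://example.com/rss.xml"}, ...]
  var opml = '<?xml version="1.0" encoding="utf-8"?>\n' +
             '<opml version="1.0">\n  <body>\n';
  for (var i = 0; i < feeds.length; i++) {
    // Aggregators importing OPML look for the xmlUrl attribute on each outline
    opml += '    <outline type="rss" title="' + feeds[i].title +
            '" xmlUrl="' + feeds[i].xmlUrl + '"/>\n';
  }
  return opml + '  </body>\n</opml>';
}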

I think this is a very interesting way to solve the "How Do I Find Interesting Blogs?" problem which plagues users of RSS readers today. I'm currently subscribed to 158 feeds in RSS Bandit. Given that there are tens of millions of blogs out there, I'm sure there are literally thousands of blogs I'd love to read if I just knew about them. The tough question for me has been how to find them and then how to integrate that process into RSS Bandit in an automated way.

What would be cool would be for the OPML-o-mater to evolve into the equivalent of http://del.icio.us/popular/ for RSS feeds and then for it to expose an API that tools such as RSS Bandit could integrate into part of their user experience. This idea is interesting enough that I wish I had the time and dedicated server resources to build it myself.  


 

Categories: RSS Bandit | Social Software

The live.com folks recently blogged about a change to the site to support inline images. The post states

we've been listening to your feedback and one of the main things you've been asking for has been more pizzazz on the page. we just shipped something that hopefully adds a little bit of that :)
now you can view embedded images in rss feeds inline on your live.com page:
- if you have 5+ headlines you get a smaller image that rotates every 20 seconds
- if you have 1 headline you get a larger image
 
we'll let a picture do the rest of the talking :)

As you can tell from the screenshot, the change definitely jazzes up the look of the page.


 

Categories: Windows Live

January 29, 2006
@ 12:30 PM

Sometimes I've seen the U.S. media take the simplistic view that "democracy" is the answer to all of a country's problems. I often chuckle to myself when I notice that in many cases the term "democracy", when used by the American press, is really a euphemism for an American-friendly government and way of life. This is one of the reasons why I am unsurprised by the inherent contradiction in stories like Bush Says U.S. Won't Deal With Hamas, which is excerpted below

Stunned by Hamas' decisive election victory, President Bush said Thursday the United States will not deal with the militant Palestinian group as long as it seeks Israel's destruction.

"If your platform is the destruction of Israel it means you're not a partner in peace," the president said. "And we're interested in peace." He urged Hamas to reverse course.

Hamas has taken responsibility for dozens of suicide attacks on Israel over the past five years but has largely observed a cease-fire since the election of Fatah leader Mahmoud Abbas as Palestinian president last year.

Bush left open the possibility of cutting off U.S. aid to the Palestinians. He called on Abbas, a U.S. ally, to remain in office despite Fatah's defeat by Hamas in parliamentary elections. Abbas, elected separately a year ago, said he was committed to negotiations with Israel and suggested talks would be conducted through the Palestine Liberation Organization, a possible way around a Hamas-led government.

I guess that's one way of finding out what the U.S. government really thinks about exporting democracy. American foreign policy has always been about supporting governments which support its policies, regardless of whether they are democracies or brutal dictatorships. Heck, just a few months before the events of September 11, 2001, the United States government gave aid to the Taliban because they took a hard-line position in the war on drugs.

Lots of people talk about democracy without really understanding what it means. Spreading democracy isn't about making more places share American culture, it's about giving people the freedom to choose their way of life. The hard part for the U.S. government is that sometimes their choices will be different from the ones Americans would like them to make.


 

Categories: Current Affairs

The team I work for in MSN Windows Live has open developer and program management positions. Our team is responsible for the underlying infrastructure of services like Windows Live Messenger, Windows Live Mail, Windows Live Favorites, MSN Spaces, and MSN Groups. We frequently collaborate with other properties across MSN Windows Live including the Live.com, Windows Live Local, Windows Live Expo, and Windows OneCare Live teams as well as with other product units across Microsoft such as in the case of Office Live. If you are interested in building world-class software that is used by hundreds of millions of people and the following job descriptions appeal to you, then send me your resume.

Program Management Architect
The Communications Core platform team for Windows Live services is looking for an experienced, enthusiastic and highly technical PM to jump start a brand new service that helps developers adopt our platform at a very rapid pace. You will be responsible for building a platform where developers can easily take advantage of emerging technology from our large scale services (e.g. Messenger, Hotmail, Contacts, Storage services) and empower quick schema and API changes for a rapid TTD (Time to Demo!). Designing, developing, deploying, evangelizing and supporting this so-called “Sandbox” environment will require excellent cross-group working skills as you will have to interact extensively with business planning, dev/test, operations, and partner support teams. It will also require a high level of technical depth in order to intimately understand and create clones of the back end services involved as well as extensive web services and API knowledge. We are looking for someone with core technical skills, a services or high scale server background, experience with API development and web services, and a passion to win developers from the competition.

Program Manager
If you are an experienced Program Manager with strong technical skills and a strong desire to work in an enthusiastic fast paced environment then this job is for you! The Communications Core Platform team for Windows Live services owns the data store serving hundreds of millions of end users with billions of contacts, files and photos. Our systems handle tens of thousands of transactions per second. Our team owns core MSN and Windows Live platforms, including ABCH (storing Messenger and Hotmail contacts, groups and ACLs) and MSN Storage (storing files, photos and other data types). We are looking for a creative, self-driven, technical Program Manager who is interested in designing and shipping the next generation of back-end components that drive this massively scalable service in the midst of stiff competition from Microsoft's toughest competitors. You will be responsible for defining and writing technical specifications for new features, developing creative solutions for difficult scale and performance problems, driving the capacity management framework, working with teams across the company on critical cross-group projects, working extensively with development and test to drive project schedules and ultimately shipping new versions of the service that provide tremendous value for our customers and partners.

Software Development Engineer
400 million address books. 8 billion contacts. A gazillion relationships! That is the magnitude of data the Windows Live Contacts team hosts today (and it is growing fast!). The service (called the ABCH) doesn't just host contacts and address books but provides a platform for building rich permissions and sharing scenarios (sharing objects to individuals, groups or social networks). Now imagine, if this treasure trove of data were accessible via programmable APIs to all the developers in the world. Imagine the scenarios that it could enable. Imagine the interesting applications that developers around the world would build.

This is what we want to provide as part of the Windows Live vision. We are looking for an experienced software developer who can spearhead our effort in providing APIs (SOAP, DAV, REST) to our contacts and permissions service that can be used by third-party developers and ISVs.

The ideal candidate will have at least five years of demonstrated experience in shipping commercial software. The candidate should be a solid developer with above average design skills. The candidate should have a very keen sense of programmability, security and privacy and be willing to go the extra mile to make sure a user's data isn't compromised.

Email your resume to dareo@msft.com (replace msft with microsoft) if the above job descriptions sound like they are a good fit for you. If you have any questions about what working here is like, you can send me an email and I'll either follow up via email or my blog to answer any questions of general interest [within reason].


 

January 27, 2006
@ 01:04 AM

As Mike Torres notes, a BIG update to MSN Spaces just shipped. Below is a list of some of the features from his post. The ones I worked on have been highlighted in red. :)

  • Spaces Search.  This is an incredibly cool feature that lets you find interesting spaces based on keyword, a user's profile information, or by clicking on most popular interests across all of spaces.  You can also run a search from any space just by clicking "Search Spaces" in the header above.  One thing to mention about the search feature is that it will be ramping up for a few days - but you can help make it better!  Learn more about this on The Space Craft.

  • Mobile Search from Mobile Spaces!  Search for spaces from your mobile device.  Mike Smuga will be talking about this more over on his space soon.

  • Your own advertising on your space (as an option) to make money from clicks - powered by Kanoodle!  (This feature is only available in the United States and Canada at this time.)

  • Book lists with Amazon integration.  Automatically insert information from Amazon.com directly into your book list - and again, make money through Amazon Associates when people end up buying the book!  It's very cool (by the way, our book is called Share Your Story if you want to add it to your book list :)

  • Better blog navigation.  This feature is one of those things we needed to do.  You can now "view more entries" at the bottom of the page, and navigate through Previous and Next pages while looking through blog entries.

  • Customized blog entry display.  Choose how you want your blog entries to appear, by date or entry title.  This is a great feature for people who write essays or incredibly insightful posts once a month.  Date isn't really important in this case, whereas sorting by entry title may make more sense.

  • Integrated Help.  Confused?  Click the Learn link in the header above to figure out what to do next!

  • Enhanced Profile including General, Social, and Contact Info sections.  Each section will have its own permissions, so if there's any part you would like to limit access to (say, your personal contact information), you can.  There's also an updated profile module for the homepage with an actionable photo; anytime you see someone's picture anywhere you can right-click (or click on the down arrow) to view their contact card, space, profile, and more.

  • Live Contacts Beta!  Brand new feature which you'll see popping up throughout Windows Live in time.  What is it?  It's the ability to subscribe to automatic contact updates.  When your friend changes his or her address or phone number (in their Profile mentioned above), your address book will be automatically updated if you are subscribed to updates.  This is crazy cool.  Learn more here (an overview will be posted soon)

  • Which reminds me: Spaces now has contact cards like MSN/Windows Live Messenger!  Just right-click on someone's profile photo (or click on the down arrow) and select "View contact card" to see a preview of their space.

  • Better commenting for blogs and now photos as well!  This feature also has an (optional) clickable profile photo that you can leave behind when leaving a comment.  And there's a mini text editor so you can format your comments (something I'm really glad we did!) kinda like blog entries.  Note that if you would like to turn off photo comments, you can do this in Settings.

  • Photos are no longer limited to 30MB; you can now upload 500 photos per month without worrying about running out of storage space.

  • MetaWeblog API (OK, this one is from December – but it's still cool).  Read more here.

  • Better URLs!  Sometimes the little things matter the most.  This is one of those things.  Say goodbye to /members/.  You can now be reached at http://spaces.msn.com/[username].  For example, http://spaces.msn.com/mike now works!  We also have cleaner paths to pages, so if you want to give someone a link to your blog or to your photos, you can send them to http://spaces.msn.com/[username]/blog or /photos.

  • Xbox Live integration.  Themes, recent games module, and gamer card integration!  This feature has been the single biggest reason my gamer score is now clocking in at 500 instead of 0.  If you're into Xbox Live, these features rock!  Check out my theme and gamer card and you can see why.

  • New themes and categorized theme picker.  We now have well over 100 themes!

  • Do you like email or mobile publishing?  You can now publish from 3 email addresses instead of just one.

  • For those of you with private spaces (you know who you are!) people can now request access to spaces via anonymous email.  I like to think about this as "knocking on the door of someone's house".

  • Privacy controls (communication preferences) for who can request access to your space and to your contact information and how.  Check it out in Settings.

  • We doubled the size limit on the HTML PowerToy.

There's a lot of good stuff in this release and it's great to be able to work on shipping these features to our tens of millions of users.


 

Categories: Windows Live

Thanks to the recent news of the US Department of Justice's requests for information from the major web search engines, I've seen a number of people express surprise and dismay that online services track information that they'd consider private. A term that I've seen bandied about a lot recently is Personally Identifiable Information (PII), which I'd never heard before starting work at MSN.

The Wikipedia definition for Personally Identifiable Information (PII) states

In information security and privacy, personally identifiable information or personally identifying information (PII) is any piece of information which can potentially be used to uniquely identify, contact, or locate a single person.

Items which might be considered PII include, but are not limited to, a person's:

...

Information that is not generally considered personally identifiable, because many people share the same trait, include:

  • First or last name, if common
  • Country, state, or city of residence
  • Age, especially if non-specific
  • Gender or race
  • Name of the school they attend or workplace
  • Grades, salary, or job position
  • Criminal record

When a person wishes to remain anonymous, descriptions of them will often employ several of the above, such as "a 34-year-old black man who works at Target". Note that information can still be private, in the sense that a person may not wish for it to become publicly known, without being personally identifiable. Moreover, sometimes multiple pieces of information, none of which are PII, may uniquely identify a person when brought together; this is one reason that multiple pieces of evidence are usually presented at criminal trials. For example, there may be only one Inuit person named Steve in the town of Lincoln Park, Michigan.

In addition, there is the notion of sensitive PII. This is information which can be linked to a person which the person desires to keep private due to potential for abuse. Examples of "sensitive PII" are a person's medical/health conditions; racial or ethnic origin; political, religious or philosophical beliefs or affiliations; trade union membership or sex life.

Many online services such as MSN have strict rules about when PII should be collected from users, how it must be secured and under what conditions it can be shared with other entities. However many Internet users don't understand that they disclose PII when using online services. Not only is there explicit collection of PII, such as when users provide their name, address and credit card information to online stores, but there is often implicit PII collected which even savvy users fail to consider. For example, most Web servers log the IP addresses of incoming HTTP requests, which can then be used to identify users in many cases. It's easy to forget that practically every website you visit stores your IP address somewhere on its servers as soon as you hit the site. Other examples aren't so obvious. There was a recent article on Boing Boing entitled Data Mining 101: Finding Subversives with Amazon Wishlists which showed how to obtain sensitive PII such as people's political beliefs from their wishlists on Amazon.com. A few years ago I read a blog post entitled Pets Considered Harmful which showed how one could obtain sensitive PII such as someone's email password by obtaining the name of the person's pet from their blog, since "What is the name of your cat?" was a question used by GMail to allow one to change their password.
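To give a sense of how mundane this collection is, here's what a single request typically looks like in a server's access log (a made-up example in the common log format many web servers use by default). Note that the IP address, the timestamp and even the search query are all right there:

192.0.2.1 - - [29/Jan/2006:12:30:00 -0800] "GET /search?q=some+revealing+query HTTP/1.1" 200 2326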

The reason I bring this stuff up is that I've seen people like Robert Scoble make comments about wanting "a button to click that shows everything that’s being collected from their experience". This really shows a lack of understanding about PII. Would such a button prevent users from revealing their political affiliations in their Amazon wishlists or giving would-be email account hijackers the keys to their accounts by blogging about their pets? I doubt it.

The problem is that most people don't realize that they've revealed too much information about themselves until something bad happens. Unfortunately, by then it is usually too late to do anything about it. If you are an Internet user, you should be cognizant of the amount of PII you are giving away by using web applications like search engines, blogs, email, instant messaging, online stores and even social bookmarking services.

Be careful out there.


 

Categories: Current Affairs

January 26, 2006
@ 06:54 PM

From the press release Microsoft Expands Internet Research Efforts With Founding of Live Labs we learn

REDMOND, Wash. — Jan. 25, 2006 —Microsoft Corp. today announced the formation of Microsoft® Live Labs, a research partnership between MSN® and Microsoft Research. Under the leadership of Dr. Gary William Flake, noted industry technologist and Microsoft technical fellow, Live Labs will consist of a dedicated group of researchers from MSN and Microsoft Research that will work with researchers across Microsoft and the academic research community. Live Labs will provide consistency in vision, leadership and infrastructure as well as a nimble applied research environment that fosters rapid innovations.

"Live Labs is a fantastic alliance between some of the best engineering and scientific talent in the world. It will be the pre-eminent applied research laboratory for Internet technologies," Flake said. “This is a very exciting opportunity for researchers and technologists to have an immediate impact on the next evolution of Microsoft's Internet products and services and will help unify our customers' digital world so they can easily find information, pursue their interests and enrich their lives."

The Live Labs — a confederation of dedicated technologists and affiliated researchers in pre-existing projects from around Microsoft — will focus on Internet-centric applied research programs including rapidly prototyping and launching of emerging technologies, incubating entirely new inventions, and improving and accelerating Windows Live™ offerings. This complements the company’s continuing deep investment in basic research at Microsoft Research and product development at MSN.

Ray Ozzie, Craig Mundie and David Vaskevitch, Microsoft’s chief technical officers, will serve as the Live Labs Advisory Board. Ozzie sees Live Labs as an agile environment for fast-tracking research from the lab into people’s hands. "Live Labs is taking an exciting approach that is both organic and consumer-driven," Ozzie said. "Within the context of a broad range of rich usage scenarios for Windows Live, the labs will explore new ways of bringing content, commerce and community to the Internet."

You can check out the site at http://labs.live.com/. It's unclear to me why we felt we had to apply the "Live" brand to what seems to be a subsection of http://research.microsoft.com/. I guess "Live" is going to be the new ".NET" and before the end of the year everything at Microsoft will have a "Live" version.

*sigh*


 

Categories: Windows Live

Today while browsing the Seattle Post Intelligencer, I saw an article with the headline Google agrees to censor results in China which began

SAN FRANCISCO -- Online search engine leader Google Inc. has agreed to censor its results in China, adhering to the country's free-speech restrictions in return for better access in the Internet's fastest growing market.

The Mountain View, Calif.-based company planned to roll out a new version of its search engine bearing China's Web suffix ".cn," on Wednesday. A Chinese-language version of Google's search engine has previously been available through the company's dot-com address in the United States. By creating a unique address for China, Google hopes to make its search engine more widely available and easier to use in the world's most populous country.
...
To obtain the Chinese license, Google agreed to omit Web content that the country's government finds objectionable. Google will base its censorship decisions on guidance provided by Chinese government officials.

Although China has loosened some of its controls in recent years, some topics, such as Taiwan's independence and 1989's Tiananmen Square massacre, remain forbidden subjects.

Google officials characterized the censorship concessions in China as an excruciating decision for a company that adopted "don't be evil" as a motto. But management believes it's a worthwhile sacrifice.

"We firmly believe, with our culture of innovation, Google can make meaningful and positive contributions to the already impressive pace of development in China," said Andrew McLaughlin, Google's senior policy counsel.

Google's decision rankled Reporters Without Borders, a media watchdog group that has sharply criticized Internet companies including Yahoo and Microsoft Corp.'s MSN.com for submitting to China's censorship regime.

No comment.


 

Brian Jones has a blog post entitled Corel to support Microsoft Office Open XML Formats which begins

Corel has stated that they will support the new XML formats in Wordperfect once we release Office '12'. We've already seen other applications like OpenOffice and Apple's TextEdit support the XML formats that we built in Office 2003. Now as we start providing the documentation around the new formats and move through Ecma we'll see more and more people come on board and support these new formats. Here is a quote from Jason Larock of Corel talking about the formats they are looking to support in coming versions (http://labs.pcw.co.uk/2006/01/new_wordperfect_1.html):

Larock said no product could match Wordperfect's support for a wide variety of formats and Corel would include OpenXML when Office 12 is released. "We work with Microsoft now and we will continue to work with Microsoft, which owns 90 percent of the market. We would basically cut ourselves off if we didn't support the format."

But he admitted that X3 does not support the Open Document Format (ODF), which is being proposed as a rival standard, "because no customer that we are currently dealing with has asked us to do so."

X3 does however allow the import and export of portable document format (pdf) files, something Microsoft has promised for Office 12.

I mention this article because I wanted to again stress that even our competitors will now have clear documentation that allows them to read and write our formats. That isn't really as big a deal, though, as the fact that any solution provider can do this. It means that the documents can now be easily accessed 100 years from now, and can start to play a more meaningful role in business processes.

Again I want to extend my kudos to Brian and the rest of the folks on the Office team who have been instrumental in the transition of the Microsoft Office file formats from proprietary binary formats to open XML formats.


 

Categories: Mindless Link Propagation | XML

Sunava Dutta on the Internet Explorer team has written about their support for a Native XMLHTTPRequest object in IE 7. He writes

I’m excited to mention that IE7 will support a scriptable native version of XMLHTTP. This can be instantiated using the same syntax across different browsers and decouples AJAX functionality from an ActiveX enabled environment.

What is XMLHTTP?

XMLHTTP was first introduced to the world as an ActiveX control in Internet Explorer 5.0. Over time, this object has been implemented by other browsing platforms, and is the cornerstone of “AJAX” web applications. The object allows web pages to send and receive XML (or other data) via the HTTP protocol. XMLHTTP makes it possible to create responsive web applications that do not require redownloading the entire page to display new data. Popular examples of AJAX applications include the Beta version of Windows Live Local, Microsoft Outlook Web Access, and Google’s GMail.

Charting the changes: XMLHTTP in IE7 vs. IE6

In IE6 and below, XMLHTTP is implemented as an ActiveX object provided by MSXML.

In IE7, XMLHTTP is now also exposed as a native script object. Users and organizations that choose to disable ActiveX controls can still use XMLHTTP based web applications. (Note that an organization may use Group Policy or IE Options to disable the new native XMLHTTP object if desired.) As part of our continuing security improvements we now allow clients to configure and customize a security policy of their choice and simultaneously retain functionality across key AJAX scenarios.

IE7’s implementation of the XMLHTTP object is consistent with that of other browsers, simplifying the task of cross-browser compatibility.  Using just a bit of script, it’s easy to build a function which works with any browser that supports XMLHTTP:

if (window.XMLHttpRequest) {
    // If IE7, Mozilla, Safari, etc.: use the native object
    var xmlHttp = new XMLHttpRequest();
}
else if (window.ActiveXObject) {
    // ...otherwise, use the ActiveX control for IE5.x and IE6
    var xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
}

Note that IE7 will still support the legacy ActiveX implementation of XMLHTTP alongside the new native object, so pages currently using the ActiveX control will not require rewrites.
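For what it's worth, here's a minimal sketch of how a page actually uses the object once it has been created the way the snippet above shows; the URL is made up for illustration:

var xmlHttp = window.XMLHttpRequest ? new XMLHttpRequest()
                                    : new ActiveXObject("Microsoft.XMLHTTP");
xmlHttp.open("GET", "/data.xml", true); // true = asynchronous
xmlHttp.onreadystatechange = function () {
    if (xmlHttp.readyState == 4 && xmlHttp.status == 200) {
        // Response is complete; use responseXML (or responseText for non-XML data)
        var xmlDoc = xmlHttp.responseXML;
    }
};
xmlHttp.send(null);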

I wonder if anyone else sees the irony in Internet Explorer copying features from Firefox which were originally copied from IE?


 

Categories: Web Development

Brad Fitzpatrick, founder of Livejournal, has a blog post entitled Firefox bugs where he talks about some of the issues that led to the recent account hijackings on the LiveJournal service.

What I found most interesting were Brad's comments on Bug# 324253 - Do Something about the XSS issues that -moz-binding introduces in the Firefox bugzilla database. Brad wrote

Hello, this is Brad Fitzpatrick from LiveJournal.

Just to clear up any confusion: we do have a very strict HTML sanitizer. But we made the decision (years ago) to allow users to host CSS files offsite because... why not? It's just style declarations, right?

But then came along behavior, expression, -moz-binding, etc, etc...

Now CSS is full of JavaScript. Bleh.

But Internet Explorer has two huge advantages over Mozilla:

-- HttpOnly cookies (Bug 178993), which LiveJournal sponsored for Mozilla, over a year ago. Still not in tree.

-- same-origin restrictions, so an offsite behavior/binding can't mess with the calling node's DOM/Cookies/etc.

Either one of these would've saved our ass.

Now, I understand the need to innovate and add things like -moz-bindings, but please keep in mind the authors of webapps who are fighting a constant battle to improve their HTML sanitizers against new features which are added to browsers.

What we'd REALLY love is some document meta tag or HTTP response header that declares the local document safe from all external scripts. HttpOnly cookies are such a beautiful idea, we'd be happy with just that, but Comment 10 is also a great proposal... being able to declare the trust level, effectively, of external resources. Then our HTML cleaner would just insert/remove the untrusted/trusted, respectively.

Cross site scripting attacks are a big problem for websites that allow users to provide HTML input. LiveJournal isn't the only major blogging site to have been hit by them; last year the 'samy is my hero' worm hit MySpace and caused some downtime for the service.

What I find interesting from Brad's post is how on the one hand having richer features in browsers is desirable (e.g. embedded Javascript in CSS) and on the other becomes a burden for developers building web apps who now have to worry that even stylesheets can contain malicious code.
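To make Brad's favorite mitigation concrete: HttpOnly is just an extra flag the server adds when setting a cookie, telling the browser to keep that cookie out of reach of script. A minimal sketch, with a made-up cookie name:

// The server marks the session cookie HttpOnly when setting it, e.g.:
//
//   Set-Cookie: ljsession=abc123; path=/; HttpOnly
//
// Script injected into the page then can't see it in browsers that honor
// the flag, so even a successful XSS attack can't steal the session:
var stolen = document.cookie;  // HttpOnly cookies never show up here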

The major browser vendors really need to do a better job here. I totally agree with one of the follow-up comments in the bug which stated "If Moz & Microsoft can agree on SSL/anti-phishing policy and an RSS icon, is consensus on scripting security policy too hard to imagine?". Collaborating on simple stuff like what orange icon to use for subscribing to feeds is nice, but areas like Web security could do with more standardization across browsers. I wonder if the WHAT WG is working on standardizing anything in this area...


 

Categories: Web Development

January 23, 2006
@ 10:42 PM

I don't usually spam folks with links to amusing video clips that are making the rounds in email inboxes, but the video of Aussie comedy group Tripod performing their song "Make You Happy Tonight" struck a chord with me because I did what the song talks about this weekend.

The game in question was Star Wars: Knights of the Old Republic II. :)


 

One part of the XML vision that has always resonated with me is that it encourages people to build custom XML formats specific to their needs but allows them to map between languages using technologies like XSLT. However XML technologies like XSLT focus on mapping one kind of syntax to another. There is another school of thought from proponents of Semantic Web technologies like RDF, OWL, and DAML+OIL that higher-level mapping between the semantics of languages is a better approach.

In previous posts such as RDF, The Semantic Web and Perpetual Motion Machines and More on RDF, The Semantic Web and Perpetual Motion Machines I've disagreed with the thinking of Semantic Web proponents because in the real world you have to mess with both syntactical mappings and semantic mappings. A great example of this is shown in the post entitled On the Quality of Metadata... by Stefano Mazzocchi where he writes

One thing we figured out a while ago is that merging two (or more) datasets with high quality metadata results in a new dataset with much lower quality metadata. The "measure" of this quality is just subjective and perceptual, but it's a constant thing: every time we showed this to people that cared about the data more than the software we were writing, they could not understand why we were so excited about such a system, where clearly the data was so much poorer than what they were expecting.

We use the usual "this is just a prototype and the data mappings were done without much thinking" kind of excuse, just to calm them down, but now that I'm tasked to "do it better this time", I'm starting to feel a little weird because it might well be that we hit a general rule, one that is not a function on how much thinking you put in the data mappings or ontology crosswalks, and talking to Ben helped me understand why.

First, let's start noting that there is no practical and objective definition of metadata quality, yet there are patterns that do emerge. For example, at the most superficial level, coherence is considered a sign of good care and (here all the metadata lovers would agree) good care is what it takes for metadata to be good. Therefore, lack of coherence indicates lack of good care, which automatically resolves in bad metadata.

Note how this is nothing but a syllogism, yet, it's something that, rationally or not, comes up all the time.

This is very important. Why? Well, suppose you have two metadatasets, each of them very coherent and well polished about, say, music. The first encodes Artist names as "Beatles, The" or "Lennon, John", while the second encodes them as "The Beatles" and "John Lennon". Both datasets, independently, are very coherent: there is only one way to spell an artist/band name, but when the two are merged and the ontology crosswalk/map is done (either implicitly or explicitly), the result is that some songs will now be associated with "Beatles, The" and others with "The Beatles".

The result of merging two high quality datasets is, in general, another dataset with a higher "quantity" but a lower "quality" and, as you can see, the ontological crosswalks or mappings were done "right", where for "right" I mean that both sides of the ontological equation would have approved that "The Beatles" or "Beatles, The" are the band name that is associated with that song.

At this point, the fellow semantic web developers would say "pfff, of course you are running into trouble, you haven't used the same URI" and the fellow librarians would say "pff, of course, you haven't mapped them to a controlled vocabulary of artist names, what did you expect?".. deep inside, they are saying the same thing: you need to further link your metadata references "The Beatles" or "Beatles, The" to a common, hopefully globally unique identifier. The librarian shakes the semantic web advocate's hand, nodding vehemently and they are happy campers.

The problem Stefano has pointed out is that just being able to say that two items are semantically identical (i.e. an artist field in dataset A is the same as the 'band name' field in dataset B) doesn't mean you won't have to do some syntactic mapping as well (i.e. alter artist names of the form "ArtistName, The" to "The ArtistName") if you want an accurate mapping.

The example I tend to cull from in my personal experience is mapping between different XML syndication formats such as Atom 1.0 and RSS 2.0. Mapping between both formats isn't simply a case of saying <atom:published>  owl:sameAs <pubDate> or that <atom:author>  owl:sameAs <author> . In both cases, an application that understands how to process one format (e.g. an RSS 2.0 parser) would not be able to process the syntax of the equivalent  elements in the other (e.g. processing RFC 3339 dates as opposed to RFC 822 dates).
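To make the syntactic half of that mapping concrete, here's a rough sketch (handling only the simple UTC case) of the string surgery needed just to turn an Atom date into something an RSS 2.0 consumer expects:

// Illustrative sketch: converting an RFC 3339 date (Atom) to an RFC 822-style
// date (RSS 2.0). Offsets and other variants are omitted for brevity.
function rfc3339ToRfc822(s) {
    var m = s.match(/^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.\d+)?Z$/);
    if (!m) return null;
    var days   = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];
    var months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
    // Work out the day of the week from the date components
    var d = new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
    return days[d.getUTCDay()] + ", " + m[3] + " " + months[+m[2] - 1] + " " +
           m[1] + " " + m[4] + ":" + m[5] + ":" + m[6] + " GMT";
}

// rfc3339ToRfc822("2006-01-23T22:42:00Z") returns "Mon, 23 Jan 2006 22:42:00 GMT"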

Proponents of Semantic Web technologies tend to gloss over these harsh realities of mapping between vocabularies in the real world. I've seen some claims that simply using XML technologies for mapping between XML vocabularies means you will need N² transforms as opposed to needing 2N transforms if using SW technologies (Stefano mentions this in his post, as has Ken Macleod in his post XML vs. RDF :: N × M vs. N + M). The explicit assumption here is that these vocabularies have similar data models and semantics, which should be true, otherwise a mapping wouldn't be possible. However the implicit assumption is that the syntax of each vocabulary is practically identical (e.g. same naming conventions, same date formats, etc.), and this post provides a few examples where that is not the case.

What I'd be interested in seeing is whether there is a way to get some of the benefits of Semantic Web technologies while acknowledging the need for syntactical mappings as well. Perhaps some weird hybrid of OWL and XSLT? One can only dream...


 

Categories: Web Development | XML

January 21, 2006
@ 02:26 AM

There's been a bunch of speculation about the recent DOJ requests for logs from the major search engines. Ken Moss of the MSN Search team tells their side of the story in his post Privacy and MSN Search. He writes

There’s been quite a frenzy of speculation over the past 24 hours regarding the request by the government for some data in relation to a child online protection lawsuit.  Obviously privacy and child protection are both super important topics – so I’m glad this discussion is happening.

Some facts have been reported, but mostly I’ve seen a ton of speculation reported as facts.   I wanted to use this blog post to clarify some facts and to share with you what we are thinking here at MSN Search.

Let me start with this core principle statement: privacy of our customers is non-negotiable and something worth fighting to protect.

Now, on to the specifics.  

Over the summer we were subpoenaed by the DOJ regarding a lawsuit.  The subpoena requested that we produce data from our search service. We worked hard to scope the request to something that would be consistent with this principle.  The applicable parties to the case received this data, and the parties agreed that the information specific to this case would remain confidential.  Specifically, we produced a random sample of pages from our index and some aggregated query logs that listed queries and how often they occurred.  Absolutely no personal data was involved.

With this data you:

        CAN see how frequently some query terms occurred.
        CANNOT look up an IP and see what they queried
        CANNOT look for users who queried for both “TERM A” and “TERM B”.

At MSN Search, we have strict guidelines in place to protect the privacy of our customers' data, and I think you’ll agree that privacy was fully protected.  We tried to strike the right balance in a very sensitive matter.

I've been surprised at how much rampant speculation from blogs has been reported in mainstream media articles as facts without people getting information directly from the source.


 

Categories: Current Affairs

A user of RSS Bandit recently forwarded me a discussion on the atom-syntax mailing list which criticized some of our design decisions. In an email in the thread entitled Reader 'updated' semantics, Tim Bray wrote

On Jan 10, 2006, at 9:07 AM, James M Snell wrote:

In RSS there is definite confusion on what constitutes an update. In
Atom it is very clear. If <updated> changes, the item has been updated.
No controversy at all.

Indeed. There's a word for behavior of RssBandit and Sage: WRONG. Atom provides a 100% unambiguous way to say "this is the same entry, but it's been changed and the publisher thinks the change is significant." Software that chooses to hide this fact from users is broken - arguably dangerous to those who depend on receiving timely and accurate information - and should not be used. -Tim

People who write technology specifications usually have good intentions, but unfortunately they often aren't implementers of the specs they are creating. This leads to disconnects between what is actually in the spec and reality.

The problem with updates to blog posts is straightforward. There are minor updates which don't warrant signalling to the user, such as typos being fixed (e.g. 12 of 13 miner survive mine collapse changed to 12 of 13 miners survive mine collapse), and those which do because they significantly change the story (e.g. 12 of 13 miners survive mine collapse changed to 12 of 13 miners killed in mine collapse).

James Snell is right that it is ambiguous how to detect this in RSS but not in Atom due to the existence of the atom:updated element. The Atom spec states

The "atom:updated" element is a Date construct indicating the most recent instant in time when an entry or feed was modified in a way the publisher considers significant. Therefore, not all modifications necessarily result in a changed atom:updated value.

On paper it sounds like this solves the problem, and on paper, it does. However for this to work correctly, weblog software now needs to include an option such as 'Indicate that this change is significant' when users edit posts. Without such an option, the software cannot correctly support the atom:updated element. Since I haven't found any mainstream tools that support this functionality, I haven't bothered to implement a feature that is likely to annoy users more often than it is useful, since many people edit their blog posts in ways that don't warrant alerting the reader.
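If mainstream tools did populate the element properly, the consuming side would be trivial. Here's a sketch of the check an aggregator could do; the entry objects are hypothetical stand-ins, not actual RSS Bandit types:

// Hedged sketch: honoring atom:updated when re-fetching a feed.
function isSignificantUpdate(oldEntry, newEntry) {
    // Same entry (atom:id matches) but the publisher bumped atom:updated,
    // which the spec defines as a change the publisher considers significant
    return oldEntry.id == newEntry.id && oldEntry.updated != newEntry.updated;
}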

However I do plan to add features for indicating when posts have changed in unambiguous scenarios such as when new comments are added to a blog post of interest to the user. The question I have for our users is how would you like this indicated in the RSS Bandit user interface?


 

Categories: RSS Bandit

January 20, 2006
@ 02:27 AM

Richard Searle has a blog post entitled The SOAP retrieval anti-pattern where he writes

I have seen systems that use SOAP based Web Services only to implement data retrievals.

The concept is to provide a standardized mechanism for external systems to retrieve data from some master system that controls the data of interest. This has value in that it enforces a decoupling from the master system's data model. It can also be easier to manage and control than the alternative of allowing the consuming systems to directly query the master system's database tables.
...
The selection of a SOAP interface over a RESTful interface is also questionable. The SOAP interface has a few (generally one) parameters and then returns a large object. Such an interface with a single parameter has a trivial representation as a GET. A multi-parameter call can also be trivially mapped if the parameters define a conceptual hierarchy (e.g. the ids of a company and one of its employees).

Such a GET interface avoids all the complexities of SOAP, WSDL, etc. AJAX and XForm clients can trivially and directly use the interface. A browser can use XSLT to provide a human readable representation.

Performance can easily be boosted by interposing a web cache. Such optimization would probably occur essentially automatically since any significant site would already have caching. Such caching can be further enhanced by using the HTTP header timestamps to compare against the updated timestamps in the master system tables.

I agree 100%, web services that use SOAP solely for data retrieval are usually a sign that the designers of the service need to get a clue when it comes to building distributed applications for the Web.
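To make the contrast concrete, here's the kind of plain GET Searle is describing, issued with the XMLHttpRequest object I blogged about recently; the URL shape is made up for illustration:

var req = window.XMLHttpRequest ? new XMLHttpRequest()
                                : new ActiveXObject("Microsoft.XMLHTTP");
// The company/employee hierarchy maps directly onto the URL path
req.open("GET", "http://example.com/companies/1234/employees/5678", true);
req.onreadystatechange = function () {
    if (req.readyState == 4 && req.status == 200) {
        // Plain GETs like this can be cached by any intermediary web cache,
        // which is the performance win Searle points out; SOAP POSTs can't
        var doc = req.responseXML;
    }
};
req.send(null);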

PS: I realize that my employer has been guilty of this in the past. In fact, we've been known to do this at MSN as well, although at least we also provided RESTful interfaces to the service in that instance. ;)


 

Categories: XML Web Services

Since writing my post Microformats vs. XML: Was the XML Vision Wrong?, I've come across some more food for thought on the appropriateness of using microformats over XML formats. The real-world test case I use when thinking about choosing microformats over XML is whether, instead of having an HTML web page for my blog and a separate Atom/RSS feed, I should have a single HTML page with <div class="rss:item"> or <h3 class="atom:title"> embedded in it. To me this seems like a gross hack but I've seen lots of people comment on how this seems like a great idea to them. Given that I hadn't encountered universal disdain for this idea, I decided to explore further and look for technical arguments for and against both approaches.

I found quite a few discussions on how and why microformats came about in articles such as The Microformats Primer in Digital Web Magazine and Introduction to Microformats in the Microformats wiki. However I hadn't seen many in-depth technical arguments for why they were better than XML formats until recently.

In a comment in response to my Microformats vs. XML: Was the XML Vision Wrong?, Mark Pilgrim wrote

Before microformats had a home page, a blog, a wiki, a charismatic leader, and a cool name, I was against using XHTML for syndication for a number of reasons.

http://diveintomark.org/archives/2002/11/26/syndication_is_not_publication

I had several basic arguments:

1. XHTML-based syndication required well-formed semantic XHTML with a particular structure, and was therefore doomed to failure. My experience in the last 3+ years with both feed parsing and microformats parsing has convinced me that this was incredibly naive on my part. Microformats may be *easier* to accomplish with semantic XHTML (just like accessibility is easier in many ways if you're using XHTML + CSS), but you can embed structured data in really awful existing HTML markup, without migrating to "semantic XHTML" at all.

2. Bandwidth. Feeds are generally smaller than their corresponding HTML pages (even full content feeds), because they don't contain any of the extra fluff that people put on web pages (headers, footers, blogrolls, etc.) And feeds only change when actual content changes, whereas web pages can change for any number of valid reasons that don't involve changes to the content a feed consumer would be interested in. This is still valid, and I don't see it going away anytime soon.

3. The full-vs-partial content debate. Lots of people who publish full content on web pages (including their home page) want to publish only partial content in feeds. The rise of spam blogs that automatedly steal content from full-content feeds and republish them (with ads) has only intensified this debate.

4. Edge cases. Hand-crafted feed summaries. Dates in Latin. Feed-only content. I think these can be handled by microformats or successfully ignored. For example, machine-readable dates can be encoded in the title attribute of the human-readable date. Hand-crafted summaries can be published on web pages and marked up appropriately. Feed-only content can just be ignored; few people do it and it goes against one of the core microformats principles that I now agree with: if it's not human-readable in a browser, it's worthless or will become worthless (out of sync) over time.

I tend to agree with Mark's conclusions. The main issue with using microformats for syndication instead of RSS/Atom feeds is wasted bandwidth since web pages tend to contain more stuff than feeds and change more often.

Norm Walsh raises a few other good points on the trade-offs being made when choosing microformats over XML in his post Supporting Microformats, where he writes

Microformats (and architectural forms, and all the other names under which this technique has been invented) take this one step further by standardizing some of these attribute values and possibly even some combination of element types and attribute values in one or more content models.

This technique has some stellar advantages: it's relatively easy to explain and the fallback is natural and obvious, new code can be written to use this “extra” information without any change being required to existing applications, they just ignore it.

Despite how compelling those advantages are, there are some pretty serious drawbacks associated with microformats as well. Adding hCalendar support to my itineraries page reinforced several of them.

  1. They're not very flexible. While I was able to add hCalendar to the overall itinerary page, I can't add it to the individual pages because they don't use the right markup. I'm not using <div> and <span> to markup the individual appointments, so I can't add hCalendar to them.

  2. I don't think they'll scale very well. Microformats rely on the existing extensibility point, the role or class attribute. As such, they consume that extensibility point, leaving me without one for any other use I may have.

  3. They're devilishly hard to validate. DTDs and W3C XML Schema are right out the door for validating microformats. Of course, Schematron (and other rule-based validation languages) can do it, but most of us are used to using grammar-based validation on a daily basis and we're likely to forget the extra step of running Schematron validation.

    It's interesting that RELAX NG can almost, but not quite, do it. RELAX NG has no difficulty distinguishing between two patterns based on an attribute value, but you can't use those two patterns in an interleave pattern. So the general case, where you want to say that the content of one of these special elements is “an <abbr> with class="dtstart" interleaved with an <abbr> with class="dtend" interleaved with…”, you're out of luck. If you can limit the content to something that doesn't require interleaving, you can use RELAX NG for your particular application, but most of the microformats I've seen use interleaving in the general case.

    Is validation really important? Well, I have well over a decade of experience with markup languages at this point and I was reminded just last week that I can't be relied upon to write a simple HTML document without markup errors if I don't validate it. If they can't be validated, they will often be incorrect.

The complexity of validating microformats isn't something I'd considered in my original investigation, but it is a valid point. As the developer of an RSS aggregator, I've found the existence of the Feed Validator to be an immense help in tracking down issues. Not having the luxury of being able to validate feeds would make building an aggregator a lot harder and a lot less fun.
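
For the curious, here's a rough sketch of the kind of rule-based check Norm describes. This is a hypothetical Schematron 1.5 fragment I've put together for illustration, not an official hCalendar schema:

    <sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
      <sch:pattern name="hCalendar sanity checks">
        <!-- fire on any element whose class attribute contains 'vevent' -->
        <sch:rule context="*[contains(concat(' ', @class, ' '), ' vevent ')]">
          <sch:assert test=".//*[contains(concat(' ', @class, ' '), ' dtstart ')]">
            Every vevent should contain an element with class 'dtstart'.
          </sch:assert>
        </sch:rule>
      </sch:pattern>
    </sch:schema>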

I'll continue to pay attention to this discussion but for now microformats will remain in the "gross hack" bucket for me.


 

Categories: XML

January 18, 2006
@ 12:03 PM

Once people find out that they can use tools like ecto, Blogjet or W.Bloggar to manage their blog on MSN Spaces via the MetaWeblog API, they often ask me why we don't have something equivalent to the Flickr API so they can do the same for the photos they have in their space. 

My question for folks out there is whether this is something you'd like to see. Do you want to be able to create, edit and delete photos and photo albums in your Space using desktop tools? If so, what kind of tools do you have in mind?

If you are a developer, what kind of API would you like to see? Should it use XML-RPC, SOAP or REST? Do you want a web service or a DLL?

Let me know what you think.
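
To give folks something concrete to react to, here's one possible shape for such an API: a MetaWeblog-style XML-RPC interface as it would look from C# using Charles Cook's XML-RPC.NET library. The endpoint URL and every method name below are invented purely for illustration; no such API exists today.

    using CookComputing.XmlRpc;

    // Hypothetical photo API for MSN Spaces -- the endpoint and method
    // names are made up to illustrate one possible design.
    [XmlRpcUrl("https://storage.msn.com/StorageService/Photos.rpc")]
    public interface ISpacesPhotos
    {
        [XmlRpcMethod("photos.createAlbum")]
        string CreateAlbum(string spaceName, string secretWord, string albumTitle);

        [XmlRpcMethod("photos.uploadPhoto")]
        string UploadPhoto(string spaceName, string secretWord,
                           string albumId, string fileName, byte[] imageBytes);

        [XmlRpcMethod("photos.deletePhoto")]
        bool DeletePhoto(string spaceName, string secretWord, string photoId);
    }

A REST flavor of the same thing would trade the rigid method calls for plain HTTP GETs and POSTs against photo and album URLs.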


 

Categories: Windows Live | XML Web Services

A few weeks ago I wrote a blog post entitled Windows Live Fremont: A Social Marketplace about the upcoming social marketplace coming from Microsoft. Since then the project has been renamed to Windows Live Expo and the product team is now blogging.

The team blog is located at http://spaces.msn.com/members/teamexpo and they've already posted an entry addressing their most frequently asked question, "So when is it launching then?".


 

Categories: Windows Live

It's been about three years since I first started on RSS Bandit and it doesn't seem like I've run out of steam yet. With every release the application seems to become more popular, and last month we finally broke 100,000 downloads in a single month. The time has come for me to start thinking about what I'd like to see in the next family of releases and to elicit feedback from our users. The next release is codenamed Jubilee.

Below is a list of feature areas I'd like to see us work on over the next few months:

  1. Extensibility Framework to Enable Richer Plugins: We currently use the IBlogExtension plugin mechanism which allows one to add new context menu items when right-clicking on an item in the list view. I've used this to implement features such as "Email This" and "Post to del.icio.us" which ship with the default install. Torsten implemented "Send to OneNote" using this mechanism as well.

    The next step is to enable richer plugins so people can add their own menu items, toolbar buttons as well as processing steps for received feed items. Torsten used a prototype of this functionality to add ad-blocking features to RSS Bandit. I'd like to add weblog posting functionality using such a plugin model instead of making it a core part of the application, since many of our users may just want a reader and not a weblog editor as well. There's a rough sketch of what such a plugin contract could look like after this list.

  2. Comment Watching: For many blogs such as Slashdot and Mini-Microsoft, the comments are often more interesting than the actual blog post. In the next version we'd like to make it easier to not only be updated when new blog posts appear in a blog you are interested in but also when new comments show up in a post you are interested in as well.

  3. Provide better support for podcasts and other rich media in feeds: With the growing popularity of podcasts, we plan to make it easier for users to discover and download rich media from their feeds. This doesn't just mean supporting downloading media files in the background but also supporting better ways of displaying rich media in our default views. Examples of what we have in mind can be taken from the post Why should text have all the fun? in the Google Reader blog. We should have richer experiences for photo feeds, audio feeds and video feeds.

  4. Thumbs Up & Thumbs Down for Filtering and Suggesting New Feeds: A big problem with using a news aggregator is that it eventually leads to information overload. One tends to subscribe to feeds which produce lots of content, only a subset of which is of interest. At the other extreme, users often find it difficult to find new content that matches their interests. Both problems can be solved by providing a mechanism that allows the user to rate feeds or entries, with a thumbs up or thumbs down rating similar to what systems such as TiVo use today. These ratings can be used to highlight items of interest from subscribed feeds or to suggest new feeds via a suggestion service such as AmphetaRate.

  5. Applying search filters to the list view: In certain cases a user may want to perform the equivalent of a search on the items currently being displayed in the list view without resorting to an explicit search. An example is showing all the unread items in the list view. RSS Bandit should provide a way to apply filters to the items currently being displayed in the list view either by providing certain predefined filters or providing the option to apply search folder queries as filters.
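
To make item 1 concrete, below is a minimal sketch of the richer plugin contract I have in mind. IBlogExtension is the only interface that actually ships today; every name in this sketch is invented and none of it is final:

    using System;
    using System.Xml;

    // Hypothetical richer plugin contract for the Jubilee release.
    public interface IBanditPlugin
    {
        // Called once at startup so the plugin can register menu items,
        // toolbar buttons and the like with the host application.
        void Initialize(IBanditApplication app);

        // Called for each feed item as it arrives, before it is displayed.
        // Returning false drops the item, which is how an ad blocker or
        // a spam filter could be implemented.
        bool ProcessReceivedItem(XmlElement feedItem);
    }

    // Hypothetical host interface the application hands to plugins.
    public interface IBanditApplication
    {
        void AddMenuItem(string caption, EventHandler onClick);
        void AddToolbarButton(string caption, EventHandler onClick);
    }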

These are just some of the ideas I've had. There are also the dozens of feature requests we've received from our users over the past couple of months which we'll use as fodder for ideas for the Jubilee release.


 

Categories: RSS Bandit

January 15, 2006
@ 08:08 PM

Dave Winer made the following insightful observation in a recent blog post

Jeremy Zawodny, who works at Yahoo, says that Google is Yahoo 2.0. Very clever, and there's a lot of truth to it, but watch out, that's not a very good place to be. That's how Microsoft came to dominate the PC software industry. By shipping (following the analogy) WordPerfect 2.0 (and WordStar, MacWrite and Multimate) and dBASE 2.0 (by acquiring FoxBase) and Lotus 2.0 (also known as Excel). It's better to produce your own 2.0s, as Microsoft's vanquished competitors would likely tell you.

Microsoft's corporate culture is very much about looking at an established market leader and then building a competing product which (i) is integrated with a family of Microsoft products and (ii) fixes some of the weaknesses in the competitors' offerings. The company even came up with the buzzword Integrated Innovation to describe some of these aspects of its corporate strategy.

Going further, one could argue that when Microsoft does try to push disruptive new ideas, the lack of a competitor to focus on leads to floundering by the product teams involved. WinFS, Netdocs and even Hailstorm can all be cited as examples of projects that floundered due to the lack of a competitive focus.

New employees at Microsoft are sometimes frustrated by this aspect of the company's culture. For some it's hard to accept that working at Microsoft isn't about building cool, new stuff but about building cooler versions of products offered by our competitors which integrate well with other Microsoft products. This ethos not only brought us Microsoft Office, which Dave mentions in his post, but also newer examples including the Xbox (a better PlayStation), C# (a better Java) and MSN Spaces (a better TypePad/Blogger/LiveJournal).

The main reason I'm writing this is so I don't have to keep explaining it to people, I can just give them a link to this blog post next time it comes up.


 

Categories: Life in the B0rg Cube

A few members of the Hotmail/Windows Live Mail team have been doing some writing about scalability recently.

From the ACM Queue article A Conversation with Phil Smoot

BF Can you give us some sense of just how big Hotmail is and what the challenges of dealing with something that size are?

PS Hotmail is a service consisting of thousands of machines and multiple petabytes of data. It executes billions of transactions over hundreds of applications agglomerated over nine years—services that are built on services that are built on services. Some of the challenges are keeping the site running: namely dealing with abuse and spam; keeping an aggressive, Internet-style pace of shipping features and functionality every three and six months; and planning how to release complex changes over a set of multiple releases.

QA is a challenge in the sense that mimicking Internet loads on our QA lab machines is a hard engineering problem. The production site consists of hundreds of services deployed over multiple years, and the QA lab is relatively small, so re-creating a part of the environment or a particular issue in the QA lab in a timely fashion is a hard problem. Manageability is a challenge in that you want to keep your administrative headcount flat as you scale out the number of machines.

BF I have this sense that the challenges don’t scale uniformly. In other words, are there certain scaling points where the problem just looks completely different from how it looked before? Are there things that are just fundamentally different about managing tens of thousands of systems compared with managing thousands or hundreds?

PS Sort of, but we tend to think that if you can manage five servers you should be able to manage tens of thousands of servers and hundreds of thousands of servers just by having everything fully automated—and that all the automation hooks need to be built in the service from the get-go. Deployment of bits is an example of code that needs to be automated. You don’t want your administrators touching individual boxes making manual changes. But on the other side, we have roll-out plans for deployment that smaller services probably would not have to consider. For example, when we roll out a new version of a service to the site, we don’t flip the whole site at once.

We do some staging, where we’ll validate the new version on a server and then roll it out to 10 servers and then to 100 servers and then to 1,000 servers—until we get it across the site. This leads to another interesting problem, which is versioning: the notion that you have to have multiple versions of software running across the sites at the same time. That is, version N and N+1 clients need to be able to talk to version N and N+1 servers and N and N+1 data formats. That problem arises as you roll out new versions or as you try different configurations or tunings across the site.

Another hard problem is load balancing across the site. That is, ensuring that user transactions and storage capacity are equally distributed over all the nodes in the system without any particular set of nodes getting too hot.

From the blog post entitled Issues with .NET Frameworks 2.0 by Walter Hsueh

Our team is tackling the scale issues, delving deep into the CLR and understanding its behavior.  We've identified at least two issues in .NET Frameworks 2.0 that are "low-hanging fruit", and are hunting for more.

1a)  Regular Expressions can be very expensive.  Certain (unintended and intended) strings may cause RegExes to exhibit exponential behavior.  We've taken several hotfixes for this.  RegExes are so handy, but devs really need to understand how they work; we've gotten bitten by them.

1b)  Designing an AJAX-style browser application (like most engineering problems) involves trading one problem for another.  We can choose to shift the application burden from the client onto the server.  In the case of RegExes, it might make sense to move them to the client (where CPU can be freely used) instead of having them run on the server (where you have to share).  Windows Live Mail made this tradeoff in one case.

2)  Managed Thread Local Storage (TLS) is expensive.  There is a global lock in the Whidbey RTM implementation of Thread.GetData/Thread.SetData which causes scalability issues.  Recommendation is to use the [ThreadStatic] attribute on static class variables.  Our RPS went up, our CPU % went down, context switches dropped by 50%, and lock contentions dropped by over 80%.  Good stuff.

Our devs have also started migrating some of our services to Whidbey and they've found some interesting performance issues as well. It'd probably be a good idea to put together some sort of "lessons learned while building mega-scale services on the .NET Framework" article.
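
Both issues Walter describes are easy to reproduce. Below is a small C# sketch based on my reading of his post rather than on any actual Windows Live Mail code; it shows a regular expression that backtracks exponentially and the two styles of thread local storage he compares:

    using System;
    using System.Text.RegularExpressions;
    using System.Threading;

    class ScalabilityGotchas
    {
        // 1a) Catastrophic backtracking: the nested quantifier in (a+)+
        // gives the engine exponentially many ways to partition the input
        // when the overall match fails, so each extra 'a' roughly doubles
        // the running time.
        static void RegexBlowup()
        {
            string input = new string('a', 30) + "!";
            Console.WriteLine(Regex.IsMatch(input, @"^(a+)+$")); // very slow
        }

        // 2) Thread.GetData/Thread.SetData go through a global lock in the
        // Whidbey RTM CLR, so every access contends with other threads...
        static readonly LocalDataStoreSlot Slot = Thread.AllocateDataSlot();

        static int SlowCounter()
        {
            object value = Thread.GetData(Slot);
            int count = (value == null ? 0 : (int)value) + 1;
            Thread.SetData(Slot, count);
            return count;
        }

        // ...while a [ThreadStatic] field gives each thread its own copy
        // of the variable with no shared lock on the access path.
        [ThreadStatic]
        static int requestCount;

        static int FastCounter()
        {
            return ++requestCount;
        }
    }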


 

Categories: Windows Live

Recently we had some availability issues with MSN Spaces which caused complaints from some of our loyal customers. Mike Torres addresses these issues in his post Performance & uptime, which states

One of the hardest parts about running a worldwide service with tens of millions of users is maintaining service performance and overall uptime.  As a matter of fact, a member of our team (Dare) had some thoughts about this not too long ago.  While we're constantly working towards 100% availability and providing the world's fastest service, sometimes we run into snags along the way that impact your experience with MSN Spaces.
 
That seems to have happened yesterday.  For the networking people out there, it turned out to be a problem with a load balancing device resulting in packet loss (hence the overall slowness of the site).  After some investigation, the team was able to determine the cause and restore the site back to normal.
 
Rest assured that as soon as the service slows down even a little bit, or it becomes more difficult to reach individual spaces, we're immediately aware of it here within our service operations center.  Within minutes we have people working hard to restore things to their normal speedy and reliable state.  Of course, sometimes it takes a little while to get things back to normal - but don't believe for a second that we aren't aware or concerned about the problem.  As a matter of fact, almost everyone on our team uses Spaces daily (surprise!) so we are just as frustrated as you are when things slow down.  So I'm personally sorry if you were frustrated yesterday - I know I was!  We are going to continue to do everything we can to minimize any impact on your experience...  most of the time we'll be successful and every once in a while we won't.  But it's our highest priority and you have a firm commitment from us to do so.

I'm glad to see us being more transparent about what's going on with our services. This is a good step.


 

Categories: MSN

Over a year ago, I wrote a blog post entitled SGML on the Web: A Failed Dream? where I asked whether the original vision of XML had failed. Below are excerpts from that post

The people who got together to produce the XML 1.0 recommendation were motivated to do so because they saw a need for SGML on the Web. Specifically, their discussions focused on two general areas:
  • Classes of software applications for which HTML was an inadequate information format
  • Aspects of the SGML standard itself that impeded SGML's acceptance as a widespread information technology

The first discussion established the need for SGML on the web. By articulating worthwhile, even mission-critical work that could be done on the web if there were a suitable information format, the SGML experts hoped to justify SGML on the web with some compelling business cases.

The second discussion raised the thornier issue of how to "fix" SGML so that it was suitable for the web.

And thus XML was born.
...
The W3C's attempts to get people to author XML directly on the Web have mostly failed, as can be seen from the dismal adoption rate of XHTML, and in fact many [including myself] have come to the conclusion that the benefits of adopting XHTML are too few, if not non-existent, compared to the costs. There was once an expectation that content producers would be able to place documents conformant to their own XML vocabularies on the Web, with display handled entirely by stylesheets, but this is yet to become widespread. In fact, at least one member of a W3C working group has called this a bad practice since it means that User Agents that aren't sophisticated enough to understand style sheets are left out in the cold.

Interestingly enough, although XML has not been as successful as its originators initially expected as a markup language for authoring documents on the Web, it has found significant success as the successor to the Comma Separated Value (CSV) file format. XML's primary usage on the Web and even within internal networks is for exchanging machine-generated, structured data between applications. Speculatively, the largest usage of XML on the Web today is RSS, and it conforms to this pattern.

These thoughts were recently rekindled when reading Tim Bray's post Don't Invent XML Languages, in which he argues that people should stop designing new XML formats and instead advocates the use of microformats for designing new data formats for the Web.

The vision behind microformats is completely different from the XML vision. The original XML inventors started with the premise that HTML is not expressive enough to describe every possible document type that would be exchanged on the Web. Proponents of microformats argue that one can embed additional semantics over HTML and thus HTML is expressive enough to represent every possible document type that could be exchanged on the Web. I've always considered it a gross hack to think that instead of having an HTML web page for my blog and an Atom/RSS feed, I should have a single HTML page with <div class="rss:item"> or <h3 class="atom:title"> embedded in it. However, given that one of the inventors of XML (Tim Bray) is now advocating this approach, I wonder if I'm simply clinging to old ways and have become the kind of intellectual dinosaur I bemoan.


 

Categories: XML

The documentation for the implementation of the MetaWeblog API for MSN Spaces is now available on MSDN.

Developers interested in building applications that can be used to create, edit or delete blog posts on a space should read the documentation about the MetaWeblogAPI and MSN Spaces. Questions about the API should be directed to the MSN Spaces Development forum.
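
For the impatient, posting to a space boils down to a single metaWeblog.newPost call. Here's a minimal sketch using Charles Cook's XML-RPC.NET library; the endpoint is the documented one, but treat the struct members and blog ID below as placeholders and check the MSDN documentation for exactly what MSN Spaces expects:

    using CookComputing.XmlRpc;

    public struct BlogPost
    {
        public string title;
        public string description;
    }

    [XmlRpcUrl("https://storage.msn.com/StorageService/MetaWeblog.rpc")]
    public interface IMetaWeblog
    {
        [XmlRpcMethod("metaWeblog.newPost")]
        string NewPost(string blogid, string username, string password,
                       BlogPost content, bool publish);
    }

    class PostDemo
    {
        static void Main()
        {
            IMetaWeblog proxy =
                (IMetaWeblog)XmlRpcProxyGen.Create(typeof(IMetaWeblog));
            BlogPost post = new BlogPost();
            post.title = "Hello from the API";
            post.description = "Posted via metaWeblog.newPost.";
            // The user name is the name of your space; the password is the
            // secret word chosen when Email Publishing was turned on.
            string postId = proxy.NewPost("carnage4life", "carnage4life",
                                          "secret-word", post, true);
        }
    }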

PS: I forgot to blog about this over the holidays but astute folks who've been watching http://msdn.microsoft.com/msn already found this out.


 

Categories: Windows Live

January 11, 2006
@ 03:54 AM
The following is a tutorial on posting to your blog on MSN Spaces using Flickr.
  1. Create a Space on http://spaces.msn.com if you don't have one

  2. Go to 'Edit Your Space->Settings->Email Publishing'

  3. Turn on Email Publishing (screenshot below)

  4. Choose a secret word (screenshot below)

  5. Create an account on Yahoo or Flickr if you don't have one

  6. Go to the "Add a Weblog" page at http://flickr.com/blogs_add_metaweblogapi.gne

  7. Specify the API end point as https://storage.msn.com/StorageService/MetaWeblog.rpc. The user name is the name of your space (e.g. I use 'carnage4life' because the URL of my space is http://spaces.msn.com/members/carnage4life). The password is the secret word you selected when you turned on Email Publishing on your space.

  8. Click "Next", then Click "All Done"

  9. Go ahead and create blog posts on your space directly from Flickr. You can see the test post I made to my space from Flickr here.

PS: Thanks to the Yahoo! folks on the Flickr team who helped debug the issues that prevented this from working when we first shipped our MetaWeblog API support.


 

Categories: Windows Live

January 10, 2006
@ 04:27 PM

In his blog post Windows DVD Maker Not Managed Code Charles Cook writes

Last week Eric Gunnerson mentioned that he has been working on an application for Vista: Windows DVD Maker. Yesterday he posted a FAQ for the application. The answer to question 4 was disappointing:

4: Is DVD Maker written in managed code?

A: No. Yes, it is ironic that I spent so much time on C# and then spent a ton of time writing something in C++ code. Everybody on the team is a believer in managed code, and we hope we'll be able to use it for future projects.

Given that there is a whole new set of APIs in Vista for writing managed applications - Avalon, WinFX, etc - why has a new self-contained app like this been written in unmanaged C++? Actually writing real applications, instead of just samples, with the new managed APIs would be far more convincing than any amount of hype from Robert Scoble.

I agree with Charles. If Microsoft believed in managed code, we would build applications using the .NET Framework. And in fact, we do.

In his post Cha-Cha-Changes Dan Fernandez wrote

The Microsoft's not using Managed Code Myth
One of the biggest challenges in my old job was that customers didn't think Microsoft was using managed code. Well, the truth is that we have a good amount of managed code in the three years that the .NET Framework has been released including operating systems, client tools, Web properties, and Intranet applications. For those of you that refuse to believe, here's an estimate of the lines of managed code in Microsoft applications that I got permission to blog about:

  • Visual Studio 2005: 7.5 million lines
  • SQL Server 2005: 3 million lines
  • BizTalk Server: 2 million lines
  • Visual Studio Team System: 1.7 million lines
  • Windows Presentation Foundation: 900K lines
  • Windows Sharepoint Services: 750K lines
  • Expression Interactive Designer: 250K lines  
  • Sharepoint Portal Server: 200K lines
  • Content Management Server: 100K lines

We also use managed code for the online services that power various MSN and Windows Live properties, from Windows Live Messenger and Windows Live Mail to Live.com and Windows Live Expo. I find it surprising that people continue to think we don't use managed code at Microsoft.


 

Categories: Life in the B0rg Cube

There are a couple of contentious topics I tend not to bother debating online because people on both sides of the argument tend to have entrenched positions. The debate on abortion in the U.S. is an example of such a topic. Another one for me is DRM and its sister topics: piracy, copyright infringement and file sharing networks.

Shelley Powers doesn't seem to share my aversion to these topics and has written an insightful post entitled Debate on DRM which contains the following excerpt

Doc Searls points to a weblog post by the Guardian Unlimited’s Lloyd Shepherd on DRM and says it’s one of the most depressing things he’s read. Shepherd wrote:

I’m not going to pick a fight with the Cory Doctorows of the world because they’re far more informed and cleverer than me, but let’s face it: we’re going to have to have some DRM. At some level, there has to be an appropriate level of control over content to make it economically feasible for people to produce it at anything like an industrial level. And on the other side of things, it’s clear that the people who make the consumer technology that ordinary people actually use - the Microsofts and Apples of the world - have already accepted and embraced this. The argument has already moved on.

Doc points to others making arguments in refutation of Shepherd’s thesis (Tom Coates and Julian Bond), and ends his post with:

We need to do with video what we’ve started doing with music: building a new and independent industry...


I don’t see how DRM necessarily disables independents from continuing their efforts. Apple has invested in iTunes and iPods, but one can still listen to other formats and subscribe to other services from a Mac. In fact, what Shepard is proposing is that we accept the fact that companies like Apple and Google and Microsoft and Yahoo are going to have these mechanisms in place, and what can we do to ensure we continue to have options on our desktops?

There’s another issue though that’s of importance to me in that the concept of debate being debated (how’s this for a circular discussion). The Cluetrain debate method consists of throwing pithy phrases at each other over (pick one): spicey noodles in Silicon Valley; a glass of ale in London; something with bread in Paris; a Boston conference; donuts in New York. He or she who ends up with the most attention (however attention is measured) wins.

In Doc’s weblog comments, I wrote:

What debate, though? Those of us who have pointed out serious concerns with Creative Commons (even demonstrating problems) are ignored by the creative commons people. Doc, you don’t debate. You repeat the same mantra over and over again: DRM is bad, openness is good. Long live the open internet (all the while you cover your ears with your hands and hum “We are the Champions” by Queen under your breath).

Seems to me that Lloyd Shepherd is having the debate you want. He’s saying, DRM is here, it’s real, so now how are we going to come up with something that benefits all of us?

Turning around going, “Bad DRM! Bad!” followed by pointing to other people going “Bad DRM! Bad!” is not an effective response. Neither is saying how unprofitable it is, when we only have to turn our little eyeballs over to iTunes to generate an “Oh, yeah?”

Look at the arguments in the comments to Shepherd’s post. He is saying that as a business model, we’re seeing DRM work. The argument back is that the technology fails. He’s talking ‘business’ and the response is ‘technology’. And when he tries to return to business, the people keep going back to technology (with cries of ‘…doomed to failure! Darknet!’).

The CES you went to showed that DRM is happening. So now, what can we do to have input into this to ensure that we’re not left with orphaned content if a particular DRM goes belly up? That we have fair use of the material? If it is going to exist, what can we do to ensure we’re not all stuck with betamax when the world goes VHS?

Rumbles of ‘darknet’, pointers to music stores that feature few popular artists, and clumsy geeky software as well as loud hyperbole from what is a small majority does not make a ‘debate’. Debate is acknowledging what the other ’side’ is saying, and responding accordingly. Debate requires some openness.

There is reason to be concerned about DRM (Digital Rights Management–using technology to restrict access to specific types of media). If operating systems begin to limit what we can and cannot use to view or create certain types of media; if search engine companies restrict access to specific types of files; if commercial competition means that me having an iPod, as compared to some other device, limits the music or services at other companies I have access to, we are at risk in seeing certain components of the internet torn into pieces and portioned off to the highest bidders.

But by saying that all DRM is evil and that only recourse we have is to keep the Internet completely free, and only with independents will we win and we will win, oh yes we will–this not only disregards the actuality of what’s happening now, it also disregards that at times, DRM can be helpful for those not as well versed in internet technologies.

I tend to agree with Shelley 100% [as usual]. As much as the geeks hate to admit it, DRM is here to stay. The iTunes/iPod combination has shown that consumers will accept DRM in situations where they are provided value, and that the business model is profitable. Secondly, as Lloyd Shepherd points out, the major technology companies from Microsoft and Intel to Apple and Google are all building support for DRM into their products for purchasing and/or consuming digital media.

Absolutists who argue that DRM is evil and should be shunned are ignoring reality. I especially despise arguments that are little more than throwing around dogmatic, pithy phrases such as "information wants to be free" and other such mindless drivel. If you really think DRM is the wrong direction, then create the right direction by proposing or building a workable alternative that allows content creators to get paid without losing their rights. I'd like to see more discussions in the blogosphere like Tim Bray's On Selling Art instead of the kind of crud perpetuated by people like Cory Doctorow which made me stop reading Boing Boing.

PS: There's also a good discussion going on in the comments to Shelley's blog post. Check it out.


 

Categories: Technology

January 10, 2006
@ 03:23 PM

I found out about http://www.google.com/ig/dell via John Batelle's blog last night. It looks like Google now has a personalized home page for users of Dell computers.

During the Web 2.0 conference, Sergey Brin commented that "Google innovates with technology not with business". I don't know about that. The AdSense/AdWords market is business genius, and the fact that they snagged the AOL deal from more experienced companies like Microsoft shows that behind the mask of technical naiveté is a company with strong business sense.

If I were competing with a company that produced the dominant operating system and Web browser used to access my service, I'd figure out ways to disintermediate them. Perhaps by making deals with OEMs so that all the defaults for online services such as search that ship on PCs point to my services. Maybe I could incentivize them to do this with the promise of recurring revenue, by giving them a cut of ad revenue from searches performed on said portal pages.

Of course, this may not be what http://www.google.com/ig/dell is for, but even if it isn't, I wouldn't be surprised if it eventually becomes the case.


 

Categories: Current Affairs

Web usability guru, Jakob Nielsen, has written an article entitled Search Engines as Leeches on the Web which begins

Summary: Search engines extract too much of the Web's value, leaving too little for the websites that actually create the content. Liberation from search dependency is a strategic imperative for both websites and software vendors.

I worry that search engines are sucking out too much of the Web's value, acting as leeches on companies that create the very source materials the search engines index. We've known since AltaVista's launch in 1995 that search is one of the Web’s most important services. Users rely on search to find what they want among the teeming masses of pages. Recently, however, people have begun using search engines as answer engines to directly access what they want -- often without truly engaging with the websites that provide (and pay for) the services.

I've seen some people claim that "Google is Evil" is the new meme among web geeks, and this looks like a manifestation of that trend. It seems the more money Google makes, the more people resent them. Alas, that is the price of success.


 

The Wall Street Journal has an article entitled The Men Who Came To Dinner, and What They Said About Email which contains the following excerpt

"Email is one of the liveliest niches in tech right now. Google, Microsoft and Yahoo all view it as a key to winning new customers and making money off current ones. And so they are innovating with new email programs and services all the time. Since all three companies' email teams are in my neck of the woods, I thought it would be fun to have the heads of each team come over one night for dinner and conversation. The three companies were good sports and agreed, in part because I said I wasn't interested in a shouting match.

As it happened, Google's Paul Buchheit, 29 years old; Kevin Doerr, 39, of Microsoft (no relation to the venture capitalist) and Ethan Diamond, 34, of Yahoo were all on their best behavior. Whatever they may say about their competitors at work, at my table they were gracious and complimentary. Gentle teasing was about as far as they would go.

The evening began with even the Microsoft and Yahoo delegates agreeing that much of the current excitement in the email world can be traced back to last year's debut of Mr. Buchheit's Gmail. The program had a fast user interface with a fresh new look, along with a then-remarkable gigabyte of free storage.

Mr. Buchheit said he started working on Gmail after observing that other email programs were getting worse, not better. Microsoft's Mr. Doerr said that at his company, Gmail was a thunderbolt. "You guys woke us up," he told Mr. Buchheit. Yahoo's Mr. Diamond, then at a startup with its own hot, new email program, said Gmail was the final impetus that Yahoo needed to buy his company.

Mr. Buchheit responded with a victory lap. "We were trying to make the email experience better for our users," he said. "We ended up making it better for yours, too."

The evening wasn't all a Gmail love-in, though. The Microsoft and Yahoo representatives said their many millions of users might not accept some of Gmail's departures from email norms, such as the way the program groups messages into "conversations." The two men also razzed Mr. Buchheit a bit, saying that it had been easy for Google to promise a lot of storage to its users because it carefully controlled how many users Gmail would have by requiring an invitation to get an account."

As someone who has to build services that compete with Google's, the last statement in the above excerpt resonates with me. I tend to think that in a number of their products, such as GMail, Google Talk and even Google Pack, the folks at Google are practising the lessons of articles like Joel Spolsky's Fire and Motion. In that article Joel argues that large companies like Microsoft tend to create technological imperatives that force competitors to respond and keep up, thus preventing them from focusing on new features.

Examples of Google practising Fire and Motion are somewhat different from what Joel describes in his article, but the ideas are similar. Google tends to create initiatives that are either much more expensive for its competitors to provide than for Google itself (e.g. giving users gigabytes of storage space for email while limiting sign-ups to the service) or that would be detrimental to their competitors' market share to match (e.g. allowing non-Google clients to access the Google Talk servers). I've had co-workers joke that for every dollar Google spends on some of its efforts, its competitors are forced to spend five to ten dollars. Here is a back-of-the-envelope calculation that illustrates this point.

Email Service    Estimated Number of Users    Inbox Size    Total Storage Provided
GMail            5 million                    2.5 GB        12.5 petabytes
Yahoo! Mail      219 million                  1 GB          219 petabytes
Hotmail          221 million                  0.25 GB       55.25 petabytes

Of course, these numbers are off because they are based on estimates. Also, the Hotmail numbers should probably be somewhat lower since I haven't confirmed that we've rolled out the 250MB inbox in every market. The point should still be clear though: Google has forced its competitors such as Microsoft and Yahoo! to spend orders of magnitude more money on storage, which distracts them from competing with Google in the places where it is strong. More importantly, its competitors have to provide from 10 to 20 times the total amount of storage Google is providing just to be competitive.
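
The arithmetic is simple enough to sanity-check yourself; since a petabyte is a million gigabytes, millions of users multiplied by gigabytes per inbox gives petabytes directly:

    using System;

    class StorageMath
    {
        // 1 petabyte = 1 million gigabytes, so
        // (users in millions) x (inbox size in GB) = total storage in PB.
        static double Petabytes(double usersInMillions, double inboxSizeGB)
        {
            return usersInMillions * inboxSizeGB;
        }

        static void Main()
        {
            Console.WriteLine("GMail:       {0} PB", Petabytes(5, 2.5));    // 12.5
            Console.WriteLine("Yahoo! Mail: {0} PB", Petabytes(219, 1));    // 219
            Console.WriteLine("Hotmail:     {0} PB", Petabytes(221, 0.25)); // 55.25
        }
    }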

This is often the dilemma when competing with Google. On the one hand, you have customers who rightly point out that Google is more generous; on the other, the fact is that it costs us a whole lot more to do the things Google does since we have a whole lot more users than they do. The cool thing about this is that it forces us to be very imaginative about how we compete in the marketplace, and challenges are always fun.


 

Categories: Life in the B0rg Cube

January 6, 2006
@ 07:10 PM

It's a new year so it's time to make some more promises to myself which I'll likely break in a few weeks. This time I thought it would help if I wrote them up in public so I'd be better motivated to actually achieve them.

  1. Learn a New Programming Language: When I was in school, I got to explore a new programming language every couple of months. I used C, C++, Java, Smalltalk, JavaScript and Lisp while in school. In recent years I've been programming exclusively in C# although I've started toying with JavaScript again due to the AJAX hype. I've decided that I want to learn a dynamic programming language like Python or Ruby. Given that the .NET Framework now has IronPython, I suspect Python is what I'll end up picking up. Since we plan to greatly improve the plugin story for RSS Bandit, I may get some practical experience by building new plugins for RSS Bandit using IronPython.

  2. Write More Articles: Looking back on the various articles I've written, it's clear that since joining MSN and getting a new girlfriend my output has dropped. I only wrote two articles last year compared to a minimum of five or six in previous years. I've already tried to start on the right foot by promising the O'Reilly Network an article on my Seattle Movie Finder page. My big goal is to update my C# From a Java Developer's Perspective article to account for Whidbey (C# 2.0) and Tiger (Java 5.0). The article still gets thousands of hits a month even though it's over four years old.

  3. Come Up With New Career Goals: When I was in school, my dream was to become a well-known technology guru like Don Box or Scott Meyers and then get paid consulting gigs as the hero who comes in to fix people's problems and tell them how to build their software. Since then, I've seen a lot of the people I once idolized end up working in the b0rg cube. In conversations with Don Box, he's mentioned that the life isn't as glamorous as I assumed.

    It's coming on my fourth year at Microsoft and I'm no longer clear on what my long-term career goals are. I love my current job; I get to build cool stuff that impacts millions of people and work with a bunch of smart people. However I don't have a clear idea of where this leads. In recent months I've gotten pings from recruiters at AMZN and GOOG, which I've discounted, but the funny thing is that if I were looking to leave I probably couldn't articulate to a recruiter what I was looking for.

    The only thing I am sure of is that I'm not going to get my MBA after all. My main motivation for getting it was "to do it now before it got too late" but that's not enough of a motivator to put in the effort, since I don't know what I'd do with it once I got it.

    It's going to be time for my mid-year review and discussion with my boss in a couple of weeks. I hope I have a clearer idea where I want to go by then.

  4. Piss Off Fewer People with my Writing: Whatever. I've already gotten two angry emails from different folks at work about stuff I've written online and we're not even past the first week of the year. Maybe next year. ;)


 

Categories: Ramblings

January 6, 2006
@ 05:53 PM

Every couple of weeks at Microsoft I hear co-workers or executives say stuff that makes me wonder whether we are stuck in a time warp. My current pet peeve is hearing someone use the term Software as a Service, or even worse the abbreviation "SaaS", to describe Web-based software. There are people here so disconnected from the real world that to them Web-based software is some hot, new thing that deserves its own magical new buzzword. Seriously, if you go around saying stuff like "Software as a Service" in 2006 then you are a fricking dinosaur.

Another example of the kind of dinosaur mentality I'm ranting against is linked from a post on Robert Scoble's blog entitled Flickr’ing an unusual Mix06 meeting. In that post he links to the following image

At some meeting about a new Web conference coming out of Microsoft, one of the insightful ideas on the whiteboard is "The Web is inevitable and here to stay". Is this 1996? That would have been insightful a decade ago; now it just makes us seem over the hill. Competitors like Google and Yahoo! are already thinking about the next level (e.g. making deals with network service providers to increase the quality of the user experience when visiting their sites, making heavy bets on the mobile Web, and bridging the gap between the Web and television in concrete ways) yet here we are, finally admitting that maybe wishing the Web would go away isn't a winning strategy.

Sometimes it feels like I work in dinosaur country.


 

Categories: Life in the B0rg Cube

January 4, 2006
@ 10:10 PM

From the E! Online article Back to the "Futurama"? we learn

Following the hugely successful resurrection of Family Guy, Fox execs are reportedly in talks to bring Futurama back from the dead.

The studio has begun talks to revive the Emmy-winning animated series and produce a limited number of new episodes, thanks to a resurgence in the show's popularity on DVD and in reruns, Variety reports.

Reps for 20th Century Fox have declined to comment on the news, but Variety says initial negotiations have begun.

If revived, it's unclear exactly which network would air the new episodes. While Fox housed the original series, the show found new life once reruns began showing on the Cartoon Network. Comedy Central subsequently snapped up the off-air rights and will exclusively air the repeats beginning in 2008.

The brainchild of Simpsons mastermind Matt Groening and writer David X. Cohen, Futurama debuted on Fox in March 1999. The series revolved around Fry, a pizza delivery boy, who is accidentally frozen for a thousand years. He wakes up in the year 3000 and befriends sassy one-eyed pilot Leela and the cranky robot Bender, who both work for an intergalactic delivery service run by a distant nephew of Fry's.

After five seasons and three Emmys, including the 2002 prize for Best Animated Series, Futurama was shuttered in August 2003.

Should the show make its way back to the airwaves, it would follow in the footsteps of another Fox cult 'toon, Family Guy.

The latter show was brought back in 2004 thanks to robust rerun ratings and staggeringly high DVD sales--the show ranks as the fourth-biggest TV series seller ever. Since its comeback, Fox has produced two more seasons and the direct-to-DVD movie Stewie Griffin: The Untold Story.

First Family Guy, now Futurama. I can only hope that Cartoon Network borrows a leaf from Fox and brings back Samurai Jack.
 

January 4, 2006
@ 07:56 PM

Via Miguel De Icaza I found the post Fear is the mind killer from the Jesus General blog. It states

Our Leader hasn't caught Osama bin Laden, but he's doing a bang up job rounding up brown people.

From the documentary, Persons of Interest:

SYED ALI

"Syed Ali was a partner in a successful securities firm prior to September 11th. Following an unrelated business dispute, one of his partners told the FBI Syed was a terrorist. The authorities stormed his house and found, among other things, a visitor's pass to the World Trade Center and his son's flight simulator video game. Syed was held on Rikers Island for 100 days. He lost his business; family and friends became scared off by the terror allegations. The government dropped all terrorist charges against Syed Ali. He now operates a limousine franchise. Previously a homemaker, his wife, Deliliah, found work as a legal secretary and hospital clerk."

NABIL AYESH

"Nabil is originally from Palestine. He was arrested on September 11 2001 while stopped at a traffic light in Philadelphia. "Where are you from?" Nabil remembers the officer asking him. "Israel," Nabil answered. The officer asked Nabil if he was Israeli or Arabic. "I said I'm Arabic, and they said you're under arrest." Nabil was detained for one year and seventeen days. He was never charged with anything. His wife and children were all deported back to Palestine. After he was released Nabil got a working permit and a job as a contractor. "I am trying to get my life back together," he said, "But it's hard. It was hard for me in jail. Now my main concern is my family." Nabil was re-arrested in April 2003 when police in Syracuse, NY pulled over a speeding car in which he was a passenger. He was held in a Batavia, NY jail and then deported to the West Bank, where he was reunited with his wife and four children."

MATEEN BUTT

"Mateen Butt, 26, came to the United States from Pakistan when he was nine years old. He lives in Valley Stream, New York and was working as a telecommunications analyst on Sept. 11. On Sept. 18 2002, ten officers surrounded Mateen's house at 6 a.m. and took him away in shackles. He was told he was being detained because of an application for a work visa he filed when he wad 16 years old. Mateen was interrogated and asked whether he was a Muslim and attended a Mosque, but he refused to answer. He was detained in both Middlesex and Bergen County. Mateen's experience in prison affected him dramatically. He has become much more religious and no longer feels safe here in the United States. "I don't feel free any more," he said, "I don't have the same feeling." Mateen's mother, Naz, has sold her Subway sandwich shop and the family plans to return to Karachi, Pakistan, a land Mateen has not known since he was a child."

There are a couple more profiles on the site which detail some of the people whose lives have been changed by being part of the collateral damage in the United States' "War on Terror". Of course, they could have had it worse.

When I read blog posts like Shelley Powers' They're Back or Robert Scoble's Microsoft takes down Chinese blogger (my opinions on that), I wonder why I tend to see American bloggers writing angry missives about perceived injustices in faraway lands but never about government oppression in their own country. I guess it's all a case of Luke 6:41 in action.


 

Categories: Current Affairs

January 3, 2006
@ 07:31 PM

It's another year, which means it's soon going to be time to figure out which conferences I'll be attending over the next few months. So far, three conferences have come up on my radar and I suspect I'll attend at least two of them. The conferences in order of my likelihood of attending them are

  1. VSLive: A conference for Visual Studio developers. I'll likely be there with other folks from MSN Windows Live to talk about the various APIs we provide and perhaps give hints or details on some of our upcoming API plans.

  2. ETech: I attended this conference last year and found it extremely valuable. There were developers from small and large Web companies talking about technical issues they had faced while delivering services on the Web as well as announcing cool new offerings. The list of speakers is great: Danah Boyd, Joel Spolsky, Kathy Sierra, Sam Ruby, Jon Udell, Simon Willison and Ray Ozzie. I don't plan to miss this one.

  3. MIX: This is a Microsoft conference that will focus on our hip, Web-based offerings like IE7, Windows Media, Windows Live!, as well as "Atlas", Microsoft’s new AJAX framework. Given that I'll already have lost a week of work by attending ETech and I won't really be learning anything I can't find on internal websites by attending the conference, I'll probably miss this one. Of course, if my workload is light and/or I'm told I'll be speaking I might end up attending.

If you'll be at any of these conferences and would like to meet up to shoot the breeze about mindless geekery, holla at me. Also what other interesting Web geek conferences are out there?


 

Categories: Technology

A number of people have written posts about Microsoft's poor stock performance over the past week but there are two posts I thought were interesting enough to share on my blog.

In his post 7 years ago Omar Shahine writes

I started working at Microsoft. Just as a data point, the “strike price” for my first option grant was $31.7250. Today, the stock is trading around $26. At 7 years, there are two notable events:

  • You start to accrue 4 weeks of vacation per year
  • Your first stock option grant expires

So long option grant #1, I barely knew you :-). What are the chances the stock will shoot up 5 bucks in the next few hours so I can sell my grant?

On a more serious note, has the company really done so little in the last 7 years that the stock price warrants being down 19%? Will 2006 be the year MSFT rebounds? I sure hope so.

Anyway, it’s been a great 7 years. I look forward to the next few! Microsoft has been great to me over the years.

From a response by Andrew Leckey in the letters page of the Orlando Sentinel we get the item Shift to Internet impacts Microsoft stock which states

Q: I really expected more from my shares of Microsoft Corp. Has tech left it behind?

K.T., via the Internet

A: Microsoft rolled out its hot-selling Xbox 360 video-game console worldwide months ahead of rival Sony's next-generation PlayStation 3.

It is developing an online classified ad service to compete with the popular Craigslist. And it is entering the high-end supercomputer market with a version of Windows that ties together computers in a high-speed network.

But even though Chairman Bill Gates is the world's richest man and the firm has an unparalleled financial record and a mountain of cash, Microsoft gets no respect.

It is now frequently considered a value stock rather than a growth stock, a lumbering tech giant attracting investors with a relatively modest share price.

Microsoft stock fell 2 percent in 2005, following a drop of 2 percent in 2004 and a gain of 6 percent in 2003. Compare that to skyrocketing Google Inc. or Microsoft's own track record for 1996 through 2002, when its shares jumped nearly 400 percent.

Its highly profitable, industry-dominant Windows and Office software account for about 60 percent of its revenue, with an additional 25 percent coming from software for enterprise servers. New software products are continually being introduced.

The corporate vision is to expand beyond all that via the Xbox, Windows Mobile, Windows Media Center and IPTV platform to become the center of the digital home.

Because Microsoft shares look inexpensive in light of the potential to accomplish that, the Wall Street consensus recommendation is midway between "strong buy" and "buy," according to Thomson Financial. That consists of 14 "strong buys," 16 "buys," four "holds" and one "strong sell."

The biggest challenge, termed a "sea change" by Gates, is an industrywide shift to Internet-based software and services for everything from word processing to photo storage. This could make its conventional software packages less relevant.

Earnings are expected to increase 14 percent in fiscal 2006, which ends in June, the same estimate as for the application software industry.

I feel the same way Omar does. You don't have to do the math on my option grant from when I was hired four years ago to tell that it is underwater. However, working at Microsoft has been great these past four years (wow, that long?) and I look forward to the next few years building social software for MSN and Windows Live. It is interesting that Andrew Leckey feels our biggest challenge is Web-based software; it's definitely going to be an interesting year.


 

January 1, 2006
@ 07:57 PM

Reading the blogs of Tim Berners-Lee and Jon Udell this morning, I was struck by how clear it is that the JavaScript platform within the browser is somewhat immature and incomplete.

In his post Links on the Semantic Web Tim Berners-Lee writes

To play with semantic web links, I made a toy semantic web browser, Tabulator. Toy, because it is hacked up in Javascript
....
Here is the current snag, though. Firefox security does not allow a script from a given domain to access data from any other domain, unless the scripts are signed, or made into an extension. And looking for script signing tools (for OS X?) led me to dead ends. So if anyone knows how to do that, let me know. Until I find a fix for that, the power of following links -- which is that they can potentially go anywhere -- is alas not evident!

In his post Predictions for 2006 Jon Udell writes

June 15: Browser local storage

An Alchemy application, though, always works with a genuine local data model that it stores as sets of XML fragments and navigates in a relational style. Bosworth's hunch is that a Web-style thin client, driven by a rich data model intelligently synchronized with the services cloud, could do most of what we really need -- both offline and online.
That's from a column entitled Thin client, rich data. The next turn of the AJAX crank has to involve an intelligent local data store. It's been on my wishlist forever, but Mozilla CTO Brendan Eich told me to expect results in 2006, so I do.

Almost everyone who has attempted building an AJAX application has hit the issues mentioned by Jon Udell and Tim Berners-Lee in their posts. Every time I mess around with AJAX I can't help thinking how much more interesting the applications could be if I could offload the data aggregation/integration to the client browser instead of doing it on the server. I've thought the same about offline storage: why can't I store richer information than just cookie data on the local client in a cross-platform manner?

It's hard to get wrapped up in the AJAX hype when such fundamental holes exist in the functionality provided by modern web browsers. I hope Jon Udell is right and the Mozilla folks plan to fix some of these problems with building AJAX applications on the Web today.


 

Categories: Web Development