The team I work for in MSN Windows Live has open developer and program management positions. Our team is responsible for the underlying infrastructure of services like Windows Live Messenger, Windows Live Mail, Windows Live Favorites, MSN Spaces, and MSN Groups. We frequently collaborate with other properties across MSN and Windows Live, including the Live.com, Windows Live Local, Windows Live Expo, and Windows OneCare Live teams, as well as with other product units across Microsoft, such as in the case of Office Live. If you are interested in building world-class software that is used by hundreds of millions of people, and the following job descriptions interest you, then send me your resume.

Program Management Architect
The Communications Core platform team for Windows Live services is looking for an experienced, enthusiastic and highly technical PM to jump-start a brand new service that helps developers adopt our platform at a very rapid pace. You will be responsible for building a platform where developers can easily take advantage of emerging technology from our large scale services (e.g. Messenger, Hotmail, Contacts, Storage services) and empower quick schema and API changes for a rapid TTD (Time to Demo!). Designing, developing, deploying, evangelizing and supporting this so-called "Sandbox" environment will require excellent cross-group working skills, as you will have to interact extensively with business planning, dev/test, operations, and partner support teams. It will also require a high level of technical depth in order to intimately understand and create clones of the back-end services involved, as well as extensive web services and API knowledge. We are looking for someone with core technical skills; a services or high-scale server background; experience with API development and web services; and a passion to win developers from the competition.

Program Manager
If you are an experienced Program Manager with strong technical skills and a strong desire to work in an enthusiastic fast paced environment then this job is for you! The Communications Core Platform team for Windows Live services owns the data store serving hundreds of millions of end users with billions of contacts, files and photos. Our systems handle tens of thousands of transactions per second. Our team owns core MSN and Windows Live platforms, including ABCH (storing Messenger and Hotmail contacts, groups and ACLs) and MSN Storage (storing files, photos and other data types). We are looking for a creative, self-driven, technical Program Manager who is interested in designing and shipping the next generation of back-end components that drive this massively scalable service in the midst of stiff competition from Microsoft's toughest competitors. You will be responsible for defining and writing technical specifications for new features, developing creative solutions for difficult scale and performance problems, driving the capacity management framework, working with teams across the company on critical cross-group projects, working extensively with development and test to drive project schedules and ultimately shipping new versions of the service that provide tremendous value for our customers and partners.

Software Development Engineer
400 million address books. 8 billion contacts. A gazillion relationships! That is the magnitude of data the Windows Live Contacts team hosts today (and it is growing fast!). The service (called the ABCH) doesn't just host contacts and address books but provides a platform for building rich permissions and sharing scenarios (sharing objects to individuals, groups or social networks). Now imagine, if this treasure trove of data were accessible via programmable APIs to all the developers in the world. Imagine the scenarios that it could enable. Imagine the interesting applications that developers around the world would build.

This is what we want to provide as part of the Windows Live vision. We are looking for an experienced software developer who can spearhead our effort in providing APIs (SOAP, DAV, REST) to our contacts and permissions service that can be used by third-party developers and ISVs.

The ideal candidate will have at least five years of demonstrated experience in shipping commercial software. The candidate should be a solid developer with above average design skills. The candidate should have a very keen sense of programmability, security and privacy and be willing to go the extra mile to make sure a user's data isn't compromised.

Email your resume to dareo@msft.com (replace msft with microsoft) if the above job descriptions sound like they are a good fit for you. If you have any questions about what working here is like, you can send me an email and I'll either follow up via email or my blog to answer any questions of general interest [within reason].


 

January 27, 2006
@ 01:04 AM

As Mike Torres notes, a BIG update to MSN Spaces just shipped. Below is a list of some of the features from his post. The ones I worked on have been highlighted in red. :)

  • Spaces Search.  This is an incredibly cool feature that lets you find interesting spaces based on keyword, a user's profile information, or by clicking on most popular interests across all of spaces.  You can also run a search from any space just by clicking "Search Spaces" in the header above.  One thing to mention about the search feature is that it will be ramping up for a few days - but you can help make it better!  Learn more about this on The Space Craft.

  • Mobile Search from Mobile Spaces!  Search for spaces from your mobile device.  Mike Smuga will be talking about this more over on his space soon.

  • Your own advertising on your space (as an option) to make money from clicks - powered by Kanoodle!  (This feature is only available in the United States and Canada at this time.)

  • Book lists with Amazon integration.  Automatically insert information from Amazon.com directly into your book list - and again, make money through Amazon Associates when people end up buying the book!  It's very cool (by the way, our book is called Share Your Story if you want to add it to your book list :)

  • Better blog navigation.  This feature is one of those things we needed to do.  You can now "view more entries" at the bottom of the page, and navigate through Previous and Next pages while looking through blog entries.

  • Customized blog entry display.  Choose how you want your blog entries to appear, by date or entry title.  This is a great feature for people who write essays or incredibly insightful posts once a month.  Date isn't really important in this case, whereas sorting by entry title may make more sense.

  • Integrated Help.  Confused?  Click the Learn link in the header above to figure out what to do next!

  • Enhanced Profile including General, Social, and Contact Info sections.  Each section has its own permissions, so if there's any part you would like to limit access to (say, your personal contact information), you can do that.  There's also an updated profile module for the homepage with an actionable photo; anytime you see someone's picture anywhere you can right-click (or click on the down arrow) to view their contact card, space, profile, and more.

  • Live Contacts Beta!  A brand new feature which you'll see popping up throughout Windows Live in time.  What is it?  It's the ability to subscribe to automatic contact updates.  When your friend changes his or her address or phone number (in their Profile mentioned above), your address book will be automatically updated if you are subscribed to updates.  This is crazy cool.  Learn more here (an overview will be posted soon).

  • Which reminds me: Spaces now has contact cards like MSN/Windows Live Messenger!  Just right-click on someone's profile photo (or click on the down arrow) and select "View contact card" to see a preview of their space.

  • Better commenting for blogs and now photos as well!  This feature also has an (optional) clickable profile photo that you can leave behind when leaving a comment.  And there's a mini text editor so you can format your comments (something I'm really glad we did!) kinda like blog entries.  Note that if you would like to turn off photo comments, you can do this in Settings.

  • Photos are no longer limited to 30MB; you can now upload 500 photos per month without worrying about running out of storage space.

  • MetaWeblog API (OK, this one is from December – but it's still cool).  Read more here.

  • Better URLs!  Sometimes the little things matter the most.  This is one of those things.  Say goodbye to /members/.  You can now be reached at http://spaces.msn.com/[username].  For example, http://spaces.msn.com/mike now works!  We also have cleaner paths to pages, so if you want to give someone a link to your blog or to your photos, you can send them to http://spaces.msn.com/[username]/blog or /photos.

  • Xbox Live integration.  Themes, recent games module, and gamer card integration!  This feature has been the single biggest reason my gamer score is now clocking in at 500 instead of 0.  If you're into Xbox Live, these features rock!  Check out my theme and gamer card and you can see why.

  • New themes and categorized theme picker.  We now have well over 100 themes!

  • Do you like email or mobile publishing?  You can now publish from 3 email addresses instead of just one.

  • For those of you with private spaces (you know who you are!) people can now request access to spaces via anonymous email.  I like to think about this as "knocking on the door of someone's house".

  • Privacy controls (communication preferences) for who can request access to your space and to your contact information and how.  Check it out in Settings.

  • We doubled the size limit on the HTML PowerToy.

There's a lot of good stuff in this release and it's great to be able to work on shipping these features to our tens of millions of users.


 

Categories: Windows Live

Thanks to the recent news of the US Department of Justice's requests for information from the major web search engines, I've seen a number of people express surprise and dismay that online services track information that they'd consider private. A term that I've seen bandied about a lot recently is Personally Identifiable Information (PII) which I'd never heard before starting work at MSN.

The Wikipedia definition for Personally Identifiable Information (PII) states

In information security and privacy, personally identifiable information or personally identifying information (PII) is any piece of information which can potentially be used to uniquely identify, contact, or locate a single person.

Items which might be considered PII include, but are not limited to, a person's:

  • Full name (if not common)
  • National identification number
  • Telephone number or street address
  • E-mail address
  • IP address (in some cases)
  • Driver's license or credit card numbers
  • Face, fingerprints, or handwriting
  • Date and place of birth

Information that is not generally considered personally identifiable, because many people share the same trait, include:

  • First or last name, if common
  • Country, state, or city of residence
  • Age, especially if non-specific
  • Gender or race
  • Name of the school they attend or workplace
  • Grades, salary, or job position
  • Criminal record

When a person wishes to remain anonymous, descriptions of them will often employ several of the above, such as "a 34-year-old black man who works at Target". Note that information can still be private, in the sense that a person may not wish for it to become publicly known, without being personally identifiable. Moreover, sometimes multiple pieces of information, none of which are PII, may uniquely identify a person when brought together; this is one reason that multiple pieces of evidence are usually presented at criminal trials. For example, there may be only one Inuit person named Steve in the town of Lincoln Park, Michigan.

In addition, there is the notion of sensitive PII. This is information which can be linked to a person which the person desires to keep private due to potential for abuse. Examples of "sensitive PII" are a person's medical/health conditions; racial or ethnic origin; political, religious or philosophical beliefs or affiliations; trade union membership or sex life.

Many online services such as MSN have strict rules about when PII should be collected from users, how it must be secured and under what conditions it can be shared with other entities. However many Internet users don't understand that they disclose PII when using online services. Not only is there explicit collection of PII, such as when users provide their name, address and credit card information to online stores, but there is often implicit PII collected which even savvy users fail to consider. For example, most Web servers log the IP addresses of incoming HTTP requests, which can then be used to identify users in many cases. It's easy to forget that practically every website you visit stores your IP address somewhere on its servers as soon as you hit the site. Other examples aren't so obvious. There was a recent article on Boing Boing entitled Data Mining 101: Finding Subversives with Amazon Wishlists which showed how to obtain sensitive PII such as people's political beliefs from their wishlists on Amazon.com. A few years ago I read a blog post entitled Pets Considered Harmful which showed how one could obtain sensitive PII such as someone's email password by obtaining the name of the person's pet from reading their blog, since "What is the name of your cat?" was a question used by GMail to allow one to change their password.
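To make the implicit-collection point concrete, here is a small sketch of how trivially a visitor's IP address can be recovered from an ordinary web server access log. The log line below is made up, but it follows the standard Apache "common log format":

```javascript
// Extract the client IP address from an Apache-style "common log format" line.
// The sample line below is hypothetical; real logs have the same shape.
function extractClientIP(logLine) {
  // The client IP address is the first whitespace-delimited field.
  var match = logLine.match(/^(\d{1,3}(?:\.\d{1,3}){3})\s/);
  return match ? match[1] : null;
}

var sampleLine =
  '192.0.2.44 - - [26/Jan/2006:18:54:00 -0800] "GET /blog HTTP/1.1" 200 5123';
console.log(extractClientIP(sampleLine)); // "192.0.2.44"
```

Every request you make leaves a line like that behind, whether or not you ever typed your name into a form on the site.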

The reason I bring this stuff up is that I've seen people like Robert Scoble make comments about wanting "a button to click that shows everything that’s being collected from their experience". This really shows a lack of understanding about PII. Would such a button prevent users from revealing their political affiliations in their Amazon wishlists, or giving would-be email account hijackers the keys to their accounts by blogging about their pets? I doubt it.

The problem is that most people don't realize that they've revealed too much information about themselves until something bad happens. Unfortunately, by then it is usually too late to do anything about it. If you are an Internet user, you should be cognizant of the amount of PII you are giving away by using web applications like search engines, blogs, email, instant messaging, online stores and even social bookmarking services.

Be careful out there.


 

Categories: Current Affairs

January 26, 2006
@ 06:54 PM

From the press release Microsoft Expands Internet Research Efforts With Founding of Live Labs we learn

REDMOND, Wash. — Jan. 25, 2006 —Microsoft Corp. today announced the formation of Microsoft® Live Labs, a research partnership between MSN® and Microsoft Research. Under the leadership of Dr. Gary William Flake, noted industry technologist and Microsoft technical fellow, Live Labs will consist of a dedicated group of researchers from MSN and Microsoft Research that will work with researchers across Microsoft and the academic research community. Live Labs will provide consistency in vision, leadership and infrastructure as well as a nimble applied research environment that fosters rapid innovations.

"Live Labs is a fantastic alliance between some of the best engineering and scientific talent in the world. It will be the pre-eminent applied research laboratory for Internet technologies," Flake said. “This is a very exciting opportunity for researchers and technologists to have an immediate impact on the next evolution of Microsoft's Internet products and services and will help unify our customers' digital world so they can easily find information, pursue their interests and enrich their lives."

The Live Labs — a confederation of dedicated technologists and affiliated researchers in pre-existing projects from around Microsoft — will focus on Internet-centric applied research programs including rapidly prototyping and launching of emerging technologies, incubating entirely new inventions, and improving and accelerating Windows Live™ offerings. This complements the company’s continuing deep investment in basic research at Microsoft Research and product development at MSN.

Ray Ozzie, Craig Mundie and David Vaskevitch, Microsoft’s chief technical officers, will serve as the Live Labs Advisory Board. Ozzie sees Live Labs as an agile environment for fast-tracking research from the lab into people’s hands. "Live Labs is taking an exciting approach that is both organic and consumer-driven," Ozzie said. "Within the context of a broad range of rich usage scenarios for Windows Live, the labs will explore new ways of bringing content, commerce and community to the Internet."

You can check out the site at http://labs.live.com/. It's unclear to me why we felt we had to apply the "Live" brand to what seems to be a subsection of http://research.microsoft.com/. I guess "Live" is going to be the new ".NET" and before the end of the year everything at Microsoft will have a "Live" version.

*sigh*


 

Categories: Windows Live

Today while browsing the Seattle Post Intelligencer, I saw an article with the headline Google agrees to censor results in China which began

SAN FRANCISCO -- Online search engine leader Google Inc. has agreed to censor its results in China, adhering to the country's free-speech restrictions in return for better access in the Internet's fastest growing market.

The Mountain View, Calif.-based company planned to roll out a new version of its search engine bearing China's Web suffix ".cn," on Wednesday. A Chinese-language version of Google's search engine has previously been available through the company's dot-com address in the United States. By creating a unique address for China, Google hopes to make its search engine more widely available and easier to use in the world's most populous country.
...
To obtain the Chinese license, Google agreed to omit Web content that the country's government finds objectionable. Google will base its censorship decisions on guidance provided by Chinese government officials.

Although China has loosened some of its controls in recent years, some topics, such as Taiwan's independence and 1989's Tiananmen Square massacre, remain forbidden subjects.

Google officials characterized the censorship concessions in China as an excruciating decision for a company that adopted "don't be evil" as a motto. But management believes it's a worthwhile sacrifice.

"We firmly believe, with our culture of innovation, Google can make meaningful and positive contributions to the already impressive pace of development in China," said Andrew McLaughlin, Google's senior policy counsel.

Google's decision rankled Reporters Without Borders, a media watchdog group that has sharply criticized Internet companies including Yahoo and Microsoft Corp.'s MSN.com for submitting to China's censorship regime.

No comment.


 

Brian Jones has a blog post entitled Corel to support Microsoft Office Open XML Formats which begins

Corel has stated that they will support the new XML formats in Wordperfect once we release Office '12'. We've already seen other applications like OpenOffice and Apple's TextEdit support the XML formats that we built in Office 2003. Now as we start providing the documentation around the new formats and move through Ecma we'll see more and more people come on board and support these new formats. Here is a quote from Jason Larock of Corel talking about the formats they are looking to support in coming versions (http://labs.pcw.co.uk/2006/01/new_wordperfect_1.html):

Larock said no product could match Wordperfect's support for a wide variety of formats and Corel would include OpenXML when Office 12 is released. "We work with Microsoft now and we will continue to work with Microsoft, which owns 90 percent of the market. We would basically cut ourselves off if we didn't support the format."

But he admitted that X3 does not support the Open Document Format (ODF), which is being proposed as a rival standard, "because no customer that we are currently dealing with has asked us to do so."

X3 does however allow the import and export of portable document format (pdf) files, something Microsoft has promised for Office 12.

I mention this article because I wanted to again stress that even our competitors will now have clear documentation that allows them to read and write our formats. That isn't really as big of a deal though as the fact that any solution provider can do this. It means that the documents can now be easily accessed 100 years from now, and start to play a more meaningful role in business processes.

Again I want to extend my kudos to Brian and the rest of the folks on the Office team who have been instrumental in the transition of the Microsoft Office file formats from proprietary binary formats to open XML formats.


 

Categories: Mindless Link Propagation | XML

Sunava Dutta on the Internet Explorer team has written about their support for a Native XMLHTTPRequest object in IE 7. He writes

I’m excited to mention that IE7 will support a scriptable native version of XMLHTTP. This can be instantiated using the same syntax across different browsers and decouples AJAX functionality from an ActiveX enabled environment.

What is XMLHTTP?

XMLHTTP was first introduced to the world as an ActiveX control in Internet Explorer 5.0. Over time, this object has been implemented by other browsing platforms, and is the cornerstone of “AJAX” web applications. The object allows web pages to send and receive XML (or other data) via the HTTP protocol. XMLHTTP makes it possible to create responsive web applications that do not require redownloading the entire page to display new data. Popular examples of AJAX applications include the Beta version of Windows Live Local, Microsoft Outlook Web Access, and Google’s GMail.

Charting the changes: XMLHTTP in IE7 vs. IE6

In IE6 and below, XMLHTTP is implemented as an ActiveX object provided by MSXML.

In IE7, XMLHTTP is now also exposed as a native script object. Users and organizations that choose to disable ActiveX controls can still use XMLHTTP based web applications. (Note that an organization may use Group Policy or IE Options to disable the new native XMLHTTP object if desired.) As part of our continuing security improvements we now allow clients to configure and customize a security policy of their choice and simultaneously retain functionality across key AJAX scenarios.

IE7’s implementation of the XMLHTTP object is consistent with that of other browsers, simplifying the task of cross-browser compatibility.  Using just a bit of script, it’s easy to build a function which works with any browser that supports XMLHTTP:

if (window.XMLHttpRequest) {
    // If IE7, Mozilla, Safari, etc.: use the native object
    var xmlHttp = new XMLHttpRequest();
}
else if (window.ActiveXObject) {
    // ...otherwise, use the ActiveX control for IE5.x and IE6
    var xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
}

Note that IE7 will still support the legacy ActiveX implementation of XMLHTTP alongside the new native object, so pages currently using the ActiveX control will not require rewrites.
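To round out the snippet above, here's a minimal usage sketch showing how the object is typically driven once you've instantiated it; the `/api/updates.xml` URL and the callback are hypothetical, purely for illustration:

```javascript
// Issue an asynchronous GET with whichever XMLHTTP object the feature test
// above produced, and hand the response text to a caller-supplied callback.
function fetchUpdates(xmlHttp, callback) {
  xmlHttp.open("GET", "/api/updates.xml", true); // true = asynchronous
  xmlHttp.onreadystatechange = function () {
    // readyState 4 means the response has been fully received.
    if (xmlHttp.readyState === 4 && xmlHttp.status === 200) {
      callback(xmlHttp.responseText);
    }
  };
  xmlHttp.send(null);
}
```

Because the function only touches the `open`/`send`/`onreadystatechange` members that both the native object and the ActiveX control expose, it works unchanged against either implementation.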

I wonder if anyone else sees the irony in Internet Explorer copying features from Firefox which were originally copied from IE?


 

Categories: Web Development

Brad Fitzpatrick, founder of Livejournal, has a blog post entitled Firefox bugs where he talks about some of the issues that led to the recent account hijackings on the LiveJournal service.

What I found most interesting were Brad's comments on Bug# 324253 - Do Something about the XSS issues that -moz-binding introduces in the Firefox bugzilla database. Brad wrote

Hello, this is Brad Fitzpatrick from LiveJournal.

Just to clear up any confusion: we do have a very strict HTML sanitizer. But we made the decision (years ago) to allow users to host CSS files offsite because... why not? It's just style declarations, right?

But then came along behavior, expression, -moz-binding, etc, etc...

Now CSS is full of JavaScript. Bleh.

But Internet Explorer has two huge advantages over Mozilla:

-- HttpOnly cookies (Bug 178993), which LiveJournal sponsored for Mozilla, over a year ago. Still not in tree.

-- same-origin restrictions, so an offsite behavior/binding can't mess with the calling node's DOM/Cookies/etc.

Either one of these would've saved our ass.

Now, I understand the need to innovate and add things like -moz-bindings, but please keep in mind the authors of webapps which are fighting a constant battle to improve their HTML sanitizers against new features which are added to browser.

What we'd REALLY love is some document meta tag or HTTP response header that declares the local document safe from all external scripts. HttpOnly cookies are such a beautiful idea, we'd be happy with just that, but Comment 10 is also a great proposal... being able to declare the trust level, effectively, of external resources. Then our HTML cleaner would just insert/remove the untrusted/trusted, respectively.

Cross site scripting attacks are a big problem for websites that allow users to provide HTML input. LiveJournal isn't the only major blogging site to have been hit by them, last year the 'samy is my hero' worm hit MySpace and caused some downtime for the service.

What I find interesting from Brad's post is how on the one hand having richer features in browsers is desirable (e.g. embedded Javascript in CSS) and on the other becomes a burden for developers building web apps who now have to worry that even stylesheets can contain malicious code.
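As a rough illustration of that burden, here is a toy sketch of the kind of scrubbing an HTML sanitizer now has to perform on user-supplied styles. The blacklist covers only the properties named in this post (behavior, expression, -moz-binding); a real sanitizer would need to be far more thorough:

```javascript
// Scrub CSS declarations that can smuggle script into a page. This is a toy:
// it only knows about the script-bearing constructs mentioned in this post.
var BLOCKED_CSS = /(^|[^-\w])(behavior|-moz-binding)\s*:|expression\s*\(/i;

function isSafeDeclaration(declaration) {
  return !BLOCKED_CSS.test(declaration);
}

function scrubStyle(style) {
  // Split into individual declarations and keep only the safe ones.
  return style.split(";").filter(isSafeDeclaration).join(";");
}

console.log(scrubStyle("color: red; -moz-binding: url(evil.xml); font-size: 12px"));
// "color: red; font-size: 12px"
```

The point isn't that this particular filter is adequate (it isn't), but that every new script-bearing browser feature forces another rule into filters like this one on every site that accepts user HTML or CSS.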

The major browser vendors really need to do a better job here. I totally agree with one of the follow-up comments in the bug which stated If Moz & Microsoft can agree on SSL/anti-phishing policy and an RSS icon, is consensus on scripting security policy too hard to imagine? Collaborating on simple stuff like what orange icon to use for subscribing to feeds is nice, but areas like Web security could do with more standardization across browsers. I wonder if the WHATWG is working on standardizing anything in this area... 


 

Categories: Web Development

January 23, 2006
@ 10:42 PM

I don't usually spam folks with links to amusing video clips that are making the rounds in email inboxes, but the video of Aussie comedy group Tripod performing their song "Make You Happy Tonight" struck a chord with me because I did what the song talks about this weekend.

The game in question was Star Wars: Knights of the Old Republic II. :)


 

One part of the XML vision that has always resonated with me is that it encourages people to build custom XML formats specific to their needs but allows them to map between languages using technologies like XSLT. However XML technologies like XSLT focus on mapping one kind of syntax to another. Another school of thought, from proponents of Semantic Web technologies like RDF, OWL and DAML+OIL, holds that higher-level mapping between the semantics of languages is a better approach. 

In previous posts such as RDF, The Semantic Web and Perpetual Motion Machines and More on RDF, The Semantic Web and Perpetual Motion Machines I've disagreed with the thinking of Semantic Web proponents because in the real world you have to mess with both syntactical mappings and semantic mappings. A great example of this is shown in the post entitled On the Quality of Metadata... by Stefano Mazzocchi where he writes

One thing we figured out a while ago is that merging two (or more) datasets with high quality metadata results in a new dataset with much lower quality metadata. The "measure" of this quality is just subjective and perceptual, but it's a constant thing: everytime we showed this to people that cared about the data more than the software we were writing, they could not understand why we were so excited about such a system, where clearly the data was so much poorer than what they were expecting.

We use the usual "this is just a prototype and the data mappings were done without much thinking" kind of excuse, just to calm them down, but now that I'm tasked to "do it better this time", I'm starting to feel a little weird because it might well be that we hit a general rule, one that is not a function on how much thinking you put in the data mappings or ontology crosswalks, and talking to Ben helped me understand why.

First, let's start noting that there is no practical and objective definition of metadata quality, yet there are patterns that do emerge. For example, at the most superficial level, coherence is considered a sign of good care and (here all the metadata lovers would agree) good care is what it takes for metadata to be good. Therefore, lack of coherence indicates lack of good care, which automatically resolves in bad metadata.

Note how this is nothing but a syllogism, yet, it's something that, rationally or not, comes up all the time.

This is very important. Why? Well, suppose you have two metadatasets, each of them very coherent and well polished about, say, music. The first encodes Artist names as "Beatles, The" or "Lennon, John", while the second encodes them as "The Beatles" and "John Lennon". Both datasets, independently, are very coherent: there is only one way to spell an artist/band name, but when the two are merged and the ontology crosswalk/map is done (either implicitly or explicitly), the result is that some songs will now be associated with "Beatles, The" and others with "The Beatles".

The result of merging two high quality datasets is, in general, another dataset with a higher "quantity" but a lower "quality" and, as you can see, the ontological crosswalks or mappings were done "right", where for "right" I mean that both sides of the ontological equation would have approved that "The Beatles" or "Beatles, The" are the band name that is associated with that song.

At this point, the fellow semantic web developers would say "pfff, of course you are running into trouble, you haven't used the same URI" and the fellow librarians would say "pff, of course, you haven't mapped them to a controlled vocabulary of artist names, what did you expect?".. deep inside, they are saying the same thing: you need to further link your metadata references "The Beatles" or "Beatles, The" to a common, hopefully globally unique identifier. The librarian shakes the semantic web advocate's hand, nodding vehemently and they are happy campers.

The problem Stefano has pointed out is that just being able to say that two items are semantically identical (i.e. an artist field in dataset A is the same as the 'band name' field in dataset B) doesn't mean you won't have to do some syntactic mapping as well (i.e. alter artist names of the form "ArtistName, The" to "The ArtistName") if you want an accurate mapping.
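Using Stefano's example, the syntactic half of that mapping might look something like this sketch (the comma convention is the one from his post; the function name is mine):

```javascript
// Normalize "ArtistName, The" into "The ArtistName" so that the two merged
// datasets agree on a single spelling before any semantic linking happens.
function normalizeArtistName(name) {
  var match = name.match(/^(.+),\s*(The)$/i);
  return match ? match[2] + " " + match[1] : name;
}

console.log(normalizeArtistName("Beatles, The")); // "The Beatles"
console.log(normalizeArtistName("John Lennon"));  // "John Lennon" (unchanged)
```

Declaring the two fields owl:sameAs each other does nothing to perform this rewrite; somebody still has to write and run it.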

The example I tend to cull from in my personal experience is mapping between different XML syndication formats such as Atom 1.0 and RSS 2.0. Mapping between both formats isn't simply a case of saying <atom:published> owl:sameAs <pubDate> or that <atom:author> owl:sameAs <author>. In both cases, an application that understands how to process one format (e.g. an RSS 2.0 parser) would not be able to process the syntax of the equivalent elements in the other (e.g. processing RFC 3339 dates as opposed to RFC 822 dates).
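For instance, here is a sketch of the syntactic fix-up an aggregator still has to perform even after asserting that the two date elements are semantically equivalent (the function name is mine, and I'm leaning on the JavaScript Date parser for the RFC 822 side):

```javascript
// Convert an RSS 2.0 <pubDate> value (RFC 822 style) into an Atom
// <published> value (RFC 3339 / ISO 8601, in UTC).
function pubDateToAtomPublished(pubDate) {
  var parsed = new Date(pubDate);
  if (isNaN(parsed.getTime())) {
    throw new Error("Unparseable RFC 822 date: " + pubDate);
  }
  // toISOString() emits an ISO 8601 timestamp; drop the milliseconds.
  return parsed.toISOString().replace(/\.\d{3}Z$/, "Z");
}

console.log(pubDateToAtomPublished("Thu, 26 Jan 2006 18:54:00 GMT"));
// "2006-01-26T18:54:00Z"
```

The owl:sameAs assertion captures none of this; the two formats "mean" the same instant yet share not a single character of syntax.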

Proponents of Semantic Web technologies tend to gloss over these harsh realities of mapping between vocabularies in the real world. I've seen claims that simply using XML technologies for mapping between XML vocabularies means you will need N² transforms, as opposed to needing 2N transforms if using SW technologies (Stefano mentions this in his post, as has Ken Macleod in his post XML vs. RDF :: N × M vs. N + M). The explicit assumption here is that these vocabularies have similar data models and semantics, which should be true, otherwise a mapping wouldn't be possible in the first place. However the implicit assumption is that the syntax of each vocabulary is practically identical (e.g. same naming conventions, same date formats, etc.), and this post provides a few examples where that is not the case. 

What I'd be interested in seeing is whether there is a way to get some of the benefits of Semantic Web technologies while acknowledging the need for syntactical mappings as well. Perhaps some weird hybrid of OWL and XSLT? One can only dream...


 

Categories: Web Development | XML