I’ve read a number of articles about account security, passwords and secret questions this week, for obvious reasons. Although I’ve seen plenty of posts directed at end users on how to better safeguard their accounts, I haven’t seen anything similar providing guidance to developers of online services on how to better safeguard their users in what is a very hostile environment.

Below are the top five account security features that every competent online service should have implemented. None of these are groundbreaking, but it is quite clear that many services we all use every day don’t implement even these basic security features, thus putting our data at risk.

  1. Strong passwords including banning common passwords: The most basic practice is requiring that users create a strong password, typically by enforcing a minimum length, requiring a mix of upper and lower case characters, and encouraging the use of punctuation. Although this is a good first step, there are other steps services need to take to ensure their users pick hard-to-guess passwords. One such approach is to take a look at the common choices of user passwords that have been observed as a result of website hacks.

    Analysis of these lists shows that people are quite predictable and you often find "password", "abc123", "letmein" or the name of the website being used by a sizable percentage of the users on your site. It thus makes sense to ban users from choosing any of these common passwords, since they otherwise open the door to successful drive-by hacking incidents. For example, a hacker can take the basic approach of trying to log in to a batch of user accounts using "password" or "123456" as the password and, if past history is any judge, can end up compromising thousands of user accounts with just this brain-dead tactic.
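    As a rough illustration, here is a minimal sketch in Python of how a signup flow might enforce these rules. The banned_passwords.txt file and the validate_password helper are hypothetical stand-ins; a real service would check against a much larger breach-derived list.

    ```python
    # Minimal sketch of strong-password validation at signup (hypothetical).
    # banned_passwords.txt is assumed to hold one common password per line,
    # e.g. "password", "abc123", "letmein", taken from published breach lists.
    with open("banned_passwords.txt") as f:
        BANNED = {line.strip().lower() for line in f}

    def validate_password(password, site_name="example"):
        if len(password) < 8:
            raise ValueError("Password must be at least 8 characters long.")
        if password.lower() in BANNED or site_name in password.lower():
            raise ValueError("That password is too common; please pick another.")
        if not (any(c.isupper() for c in password)
                and any(c.islower() for c in password)):
            raise ValueError("Use a mix of upper and lower case characters.")
        return True
    ```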

  2. Throttling failed password attempts: Regardless of how strong a user’s password is, it is like trying to stop a bullet with a wet paper towel when faced with a dedicated brute force attack if no other protections are in place. Password cracking tools like John the Ripper can crack a strong eight-character password in about 15 minutes. This means that to fully protect users, online services should limit how often a user can fail a password challenge before putting some road blocks in their way. These road blocks can include exponentially increasing delays after each failed attempt (wait 1 minute, if failed again then 2 minutes, etc.) or requiring the person to solve a CAPTCHA to prove they are human, as sketched at the end of this item.

    Another thing services should do is look at patterns of failed password attempts to see if broader prevention strategies are necessary. For example, if you are seeing hundreds of users failing multiple password attempts from a particular IP range, you may want to block that IP range since, given our previous discussion about weak passwords, the attackers have probably already hacked some of your accounts.
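    To make the exponential-delay idea concrete, here is a minimal sketch assuming a single-process in-memory store; a real service would use a shared store such as Redis and also track attempts by IP address.

    ```python
    import time

    # Hypothetical in-memory tracker: username -> (failure count, last failure time)
    failed_attempts = {}

    def seconds_until_retry_allowed(username):
        count, last_failure = failed_attempts.get(username, (0, 0.0))
        if count == 0:
            return 0
        delay = 60 * 2 ** (count - 1)  # 1 minute, then 2, then 4...
        return max(0, delay - (time.time() - last_failure))

    def record_failure(username):
        count, _ = failed_attempts.get(username, (0, 0.0))
        failed_attempts[username] = (count + 1, time.time())

    def record_success(username):
        failed_attempts.pop(username, None)  # reset the counter on success
    ```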

  3. 2-factor authentication: Every online service should give customers the option to trade convenience (i.e. password-only sign-in) for more security. Two-factor authentication is typically the practice of combining something the user knows (e.g. a password) with something the user has (e.g. their smartphone) or something they are (e.g. biometric data). Although more inconvenient than just providing a password, it greatly increases the security for users who may be desirable targets for account hijacking or when providing a service that holds sensitive data. This is why it is supported by a number of popular online service providers including Google, Microsoft and Twitter.

    A common practice to improve the usability of 2-factor authentication is to give users the option to only require it the first time they sign in from a particular device. This means that once the user goes through the two-step authentication process from a new computer, you can assume that the device is safe and only require a password the next time they sign in from it.
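    Here is a hedged sketch of one way to implement such a "trusted device" check, using an HMAC-signed token stored in a long-lived cookie. The token format and SERVER_SECRET handling are illustrative assumptions, not a prescription.

    ```python
    import hashlib, hmac, secrets

    SERVER_SECRET = secrets.token_bytes(32)  # in practice, a persisted per-service key

    def issue_trusted_device_token(username):
        """Called after a successful 2-factor sign-in; stored in a device cookie."""
        device_id = secrets.token_hex(16)
        sig = hmac.new(SERVER_SECRET, f"{username}:{device_id}".encode(),
                       hashlib.sha256).hexdigest()
        return f"{username}:{device_id}:{sig}"

    def is_trusted_device(token, username):
        """If the token verifies, skip the second factor and require only a password."""
        try:
            user, device_id, sig = token.split(":")
        except ValueError:
            return False
        expected = hmac.new(SERVER_SECRET, f"{user}:{device_id}".encode(),
                            hashlib.sha256).hexdigest()
        return user == username and hmac.compare_digest(sig, expected)
    ```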

  4. Choose better secret questions or better yet replace them with proofs: Inevitably, users will forget the password they use with your service, especially if you require strong passwords and have a policy that is incompatible with their default password choice (which hopefully isn’t “password1”). A common practice, which has now become an Achilles’ heel of account security, is to have a set of backup questions that you ask the user if they have forgotten their password. The problem for account security is that it is often easier to guess the answers to these questions than it is to hack the user’s password. There is a great checklist for what makes a good secret question at goodsecurityquestions.com with examples of good, fair and poor security questions.

    In general you should avoid security questions because most can be easily guessed (e.g. what is your favorite color or sports team), while the answers to others can be easily found on Facebook (e.g. where the user went to high school) or obtained by social engineering the user’s friends. A much better approach is to borrow from 2-factor authentication and have the user prove something they have, such as their smartphone (send an SMS) or an alternate email account (send an email), to verify that they are who they say they are.
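    A minimal sketch of such a proof-based reset follows, assuming hypothetical send_sms and send_email helpers; codes are short-lived and single-use.

    ```python
    import secrets, time

    pending_resets = {}  # username -> (one-time code, expiry timestamp)

    def start_password_reset(username, send):  # send = send_sms or send_email
        code = f"{secrets.randbelow(1_000_000):06d}"  # 6-digit one-time code
        pending_resets[username] = (code, time.time() + 600)  # valid 10 minutes
        send(username, f"Your password reset code is {code}")

    def verify_reset_code(username, submitted):
        code, expiry = pending_resets.pop(username, (None, 0))  # single use
        if code is None or time.time() > expiry:
            return False
        return secrets.compare_digest(code, submitted)
    ```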

  5. Show customers their sign-in activity: When all else fails, it is important to give your customers the tools to figure out for themselves if they have been hacked. A good way to do this is to let them see the sign-in attempts on their account, both those that failed and those that were successful, so they can spot suspicious activity. Google does this today via its last account activity feature, which you can find by going to security.google.com and clicking Recent activity under “Security” on the left. Microsoft provides this with its recent activity feature, which you can find by going to https://account.live.com/activity.
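    Recording the events behind such a feature can be as simple as the following in-memory sketch; a real service would persist the events and include device and location details.

    ```python
    import time
    from collections import deque

    sign_in_activity = {}  # username -> recent sign-in events

    def record_sign_in_attempt(username, ip_address, succeeded):
        events = sign_in_activity.setdefault(username, deque(maxlen=50))
        events.append({"time": time.time(), "ip": ip_address,
                       "succeeded": succeeded})

    def recent_activity(username):
        """Newest first, for display on an account activity page."""
        return list(reversed(sign_in_activity.get(username, [])))
    ```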

Implementing these features isn’t a cure-all for account security woes and should instead be treated as the minimum bar for providing a reasonable level of security for your users.

 

Now Playing: Beyonce - Flawless Remix (featuring Nicki Minaj)


 

Categories: Cloud Computing | Programming

I stumbled on an interesting blog post today titled Why Files Exist which contains the following excerpt:

Whenever there is a conversation about the future of computing, the discussion inevitably turns to the notion of a “File.” After all, most tablets and phones don’t show the user anything that resembles a file, only Apps that contain their own content, tucked away inside their own opaque storage structure.

This is wrong. Files are abstraction layers around content that are necessary for interoperability. Without the notion of a File or other similar shared content abstraction, the ability to use different applications with the same information grinds to a halt, which hampers innovation and user experience.

Given that one of the hats I wear in my day job is responsibility for the SkyDrive API, questions like whether the future of computing should include an end-user-facing notion of files and how interoperability across apps should work are often at the top of my mind. I originally wasn’t going to write about this blog post until I saw the discussion on Hacker News, which was a bit disappointing since people either decided to get very pedantic about the specifics of how a computer file is represented in the operating system or argued that app-to-app sharing via intents (on Android) or contracts (in Windows 8/Windows RT) makes the notion of files obsolete.

The app-centric view of data (as espoused by iOS) is that apps own any content created within the app and there is no mechanism outside the app’s user experience to interact with or manage this data. This also means there is no global namespace, also known as a file system, where other apps or the end user can interact with this data. There are benefits to this approach, such as greatly simplifying the concepts the user has to deal with and preventing both the user and other apps from mucking with the app’s experience. There are also costs to this approach.

The biggest cost, as highlighted in the Why Files Exist post, is that interoperability is compromised. The reason is the well-known truism that data outlives applications. My contact list, my music library and the code for my side projects across the years are all examples of data which has outlived the original applications I used to create and manage them. The majority of this content is in the cloud today primarily because I want universal access to my data from any device and any app. A world where moving from Emacs to Visual Studio or WinAmp to iTunes means losing my files created in those applications would be an unfortunate place to live in the long term.

App-to-app sharing as is done with Android intents or contracts in Windows 8 is a powerful way to create loosely coupled integration between apps. However there is a big difference between one-off sharing of data (e.g. share this link from my browser app to my social networking app) and actual migration or reuse of data (e.g. import my favorites and passwords from one browser app to another). Without a shared global namespace that all apps can access (i.e. a file system) you cannot easily do the latter.

The Why Files Exist post ends with

Now, I agree with Steve Jobs saying in 2005 that a full blown filesystem with folders and all the rest might not be necessary, but in every OS there needs to be at least some user-facing notion of a file, some system-wide agreed upon way to package content and send it between applications. Otherwise we’ll just end up with a few monolithic applications that do everything poorly.

Here I slightly disagree with characterizing the problem as needing a way to package content and send it between applications. Often my data is conceptually independent of any one application; it is more that I want to give apps access to my data, not that I want to package up some of my data and send it from one app to another. For example, I wouldn’t characterize playing MP3s originally ripped in Winamp or bought from Amazon MP3 in iTunes as packaging content between those apps and iTunes. Rather there is a global concept known as my music library which multiple apps can add to or play from.

So back to the question that is the title of this blog post; have files outlived their usefulness? Only if you think reusing data across multiple applications has.

Now Playing: Meek Mill - Pandemonium (featuring Wale and Rick Ross)


 

Categories: Cloud Computing | Technology

Earlier this year I was approached about writing a book on cloud computing topics by an editor for one of the big computer book publishers. Given that between my day job and having an infant I barely have time to keep this blog updated, I had to turn down the offer. However I did spend some time taking a second look at various cloud computing platforms like Amazon Web Services and Google App Engine and then trying to put myself into the mindset of a potential customer as a way to figure out the target audience for the book. Below are the two categories of people I surmised would be interested in spending their hard-earned cash on a book about cloud computing platforms:

  1. Enterprise developers looking to cut costs of running their own IT infrastructure by porting existing apps or writing new apps. 
  2. Web developers looking to build new applications who are interested in leveraging a high performance infrastructure without having to build their own.

As I pondered this list it occurred to me that neither of these groups is well served by Google App Engine.

Given the current economy, an attractive option for enterprises will be reducing the operating costs of their current internal applications as well as eliminating significant capital expenditure on new applications. The promise of cloud computing is that they can get both. The cloud computing vendor manages the cloud so you no longer need the ongoing expense of your own IT staff to maintain servers. You also don't need to make significant up-front payments to buy servers and software if you can pay as you go on someone else's cloud instead. Google App Engine fails the test as a way to port existing applications because it is a proprietary application platform that is incompatible with pre-existing application platforms. This same incompatibility rears its head when you look at App Engine simply as a way for enterprises to do new development. App Engine is based on Python which, if you look at the State of the Computer Book Market 2008, part 4 -- The Languages, is not a terribly popular programming language. In today's world, enterprise development still means Java or .NET development, which means enterprises will favor a platform where they can reuse their existing skills and technology expertise. Google App Engine isn't it.

So how about Web developers? In my classification, I broke up Web developers who'd be interested in cloud computing into hobbyists (like myself when writing a Twitter search engine on Windows Azure) and professionals (like myself when working on the platform that powers Hotmail's recently launched social features). Hobbyists either don't spend money or spend relatively little, so I discounted them as a target audience of interest. The professional Web developers interested in cloud computing would be those who are considering server or Web hosting but have concerns about scaling up if their service gets successful. After all, it seems like every week you are reading about scaling hurdles that startup developers have faced as their applications become successful, whether it is Bret Taylor's recent post How FriendFeed uses MySQL to store schema-less data or Jeff Atwood's Deadlocked! which talks about how he had to learn more about SQL Server's locking strategy as StackOverflow.com became more popular.

Google App Engine is limited to only Python, meaning that it is unavailable to developers using WISC platforms and only a subset of developers using LAMP can participate on the platform. Furthermore, there are key limitations in the platform that make it infeasible to build a full-scale application. For example, Bret Taylor mentions that a consequence of having a denormalized database is that they need to run a background "Cleaner" process which fixes up data references and makes their database consistent. The App Engine DataStore API requires applications to store data in a denormalized way, but there is no facility to run background processes to clean up the data as FriendFeed and most other large-scale services which use database sharding often do. According to a recent blog post the Google App Engine roadmap has been updated so at least this limitation will be addressed. Another limitation for Python developers is that they can't bring all of their existing knowledge of Python libraries with them since the platform only supports a subset of those libraries. Database developers may be relieved that a lot of database management tasks no longer exist but may be chagrined once they see the restrictions on queries they can perform and hear some of the horror stories about query performance. At the end of the day, it just didn't seem to me that there were many professional Web developers who would put up with all of this pain over just going with AWS or dedicated hosting.

That said, Google App Engine does address the long tail of developers which I guess is nothing to sneeze at. Maybe it will see some success from targeting the low end in the same way that AdSense targeted the long tail of advertisers and is now the powerhouse of search advertising. Maybe. I doubt it though.

Now Playing: Aerosmith - Livin' On the Edge


 

Categories: Cloud Computing

Sarah Perez over on ReadWriteWeb has a blog post entitled In Cloud We Trust? where she states

Cloud computing may have been one of the biggest "buzzwords" (buzz phrases?) of this past year. From webmail to storage sites to web-based applications, everything online was sold under a new moniker in 2008: they're all "cloud" services now. Yet even though millions of internet users make use of these online services in some way, it seems that we haven't been completely sold on the cloud being any more safe or stable than data stored on our own computers.
...
Surprisingly, even on a site that tends to attract a lot of technology's earliest adopters, the responses were mixed. When asked the question: "Do you trust the cloud?," the majority of responses either came back as a flat-out "no" or as a longer explanation as to why their response was a "maybe" or a "sometimes." In other words, some people trust the cloud here, but not there, or for this, but not that.

The question this article asks is pointless on several levels. First of all, it doesn't really matter if people trust the cloud or not. What matters is whether they use it or not. The average person doesn't trust computers, automobile mechanics or lawyers yet uses them anyway. Given the massive adoption of the Web, from search engines and e-commerce sites to Web-based email and social networking services, it is clear that the average computer user trusts the cloud enough to part with their personal information and their money. Being scared and untrusting of the cloud is like being scared and untrusting of computers; it is a characteristic that belongs to an older generation while the younger generation couldn't imagine life any other way. It's like Douglas Adams wrote in his famous essay How to Stop Worrying and Learn to Love the Internet back in 1999.

Secondly, people are notoriously bad at assessing risk and often fail to consider that data loss is more likely when their personal hardware fails, given that the average computer user doesn't have a data backup strategy, than when their information is stored on some Web company's servers. For example, I still have emails from the last decade available to me in my Hotmail and Yahoo! Mail accounts. On the other hand, my personal archive of mail from the early 2000s, which had survived being moved across three different desktop PCs, was finally lost when the hard drive failed on my home computer a few months ago. I used to have a personal backup strategy for my home desktop but gave up after encountering the kinds of frustrations Mark Pilgrim eloquently rants about in his post Juggling Oranges. These days, I just put all the files and photos I'm sure I'd miss on SkyDrive and treat any file not worth uploading to the cloud as being transient anyway. It is actually somewhat liberating since I no longer feel owned by all the digital stuff I have to catalog, manage and archive.

On a final note, the point isn't that there aren't valid concerns raised whenever this question is brought up. However progress will march on despite our Luddite concerns because the genie is already out of the bottle. For most people the benefits of anywhere access to their data from virtually any device, and of being able to share their content with people on the Web, far outweigh the costs of not being in complete control of that data. In much the same way, horseless carriages (aka automobiles) may cause a lot more problems than horse-drawn carriages, from the quarter ton of carbon monoxide poured into the air per year by an average car to the tens of thousands of people killed each year in car crashes, yet the benefits of automobiles powered by internal combustion engines are so significant that humanity has decided to live with the problems that come along with them.

The cloud is already here and it is here to stay. It's about time we stopped questioning it and got used to the idea.

Now Playing: Amy Winehouse - Love Is A Losing Game


 

Categories: Cloud Computing

A few months ago, Tim O'Reilly wrote an excellent post entitled Web 2.0 and Cloud Computing where he provides some definitions of two key cloud computing paradigms, Utility Computing and Platform as a Service. His descriptions of these models can be paraphrased as follows:

  1. Utility Computing: In this approach, a vendor provides access to virtual server instances where each instance runs a traditional server operating system such as Linux or Windows Server. Computation and storage resources are metered and the customer can "scale infinitely" by simply creating new server instances. The most popular example of this approach is Amazon EC2.
  2. Platform as a Service: In this approach, a vendor abstracts away the notion of accessing traditional LAMP or WISC stacks from their customers and instead provides an environment for running programs written using a particular platform. In addition, data storage is provided via a custom storage layer and API instead of traditional relational database access. The most popular example of this approach is Google App Engine.

The more I interact with platform as a service offerings, the more I realize that although they are more approachable for getting started, there is a cost: you often can't reuse your existing skills and technologies when utilizing such services. A great example of this is Don Park's post about developing on Google's App Engine, entitled So GAE, where he writes

What I found frustrating while coding for GAE are the usual constraints of sandboxing but, for languages with rich third-party library support like Python, it gets really bad because many of those libraries have to be rewritten or replaced to varying degrees. For example, I couldn’t use existing crop of Twitter client libraries so I had to code the necessary calls myself. Each such incident is no big deal but the difference between hastily handcrafted code and libraries polished over time piles up.

I expect that the inability of developers to simply use the existing libraries and tools that they are familiar with on services like Google App Engine is going to be an adoption blocker. However I expect that the lack of a "SQL database in the cloud" will actually be an even bigger red flag than the fact that some APIs or libraries from your favorite programming language are missing.

A friend of mine who runs his own software company recently mentioned that one of the biggest problems he has delivering server-based software to his customers is that eventually the database requires tuning (e.g. creating indexes) and there is no expertise on-site at the customer to perform these tasks. He wanted to explore whether a cloud-based offering like the Azure Services Platform could help. My response was that it would if he was willing to rewrite his application to use a table-based storage system instead of a relational database. In addition, aside from using a new set of APIs for interacting with the service, he'd also have to give up relational database features like foreign keys, joins, triggers and stored procedures. He thought I was crazy to even suggest that as an option.
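To make the tradeoff concrete, here is a hedged sketch contrasting the two models. The relational version lets the database do the join; the table-storage version, loosely modeled on key-value table stores of this kind, denormalizes customer data onto each order at write time, leaving the application responsible for keeping the copies consistent. The table_store object and its put/scan methods are hypothetical stand-ins, not a real API.

```python
# Relational approach: foreign keys and joins are the database's job.
ORDERS_IN_REGION_SQL = """
SELECT o.id, o.total, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'WA'
"""

# Table-storage approach (sketch): no joins, no foreign keys. The customer's
# name and region are copied onto each order, and every copy must be updated
# by the application if the customer record ever changes.
def create_order(table_store, order_id, customer, total):
    table_store.put("orders", key=order_id, value={
        "total": total,
        "customer_id": customer["id"],
        "customer_name": customer["name"],      # denormalized copy
        "customer_region": customer["region"],  # denormalized copy
    })

def orders_in_region(table_store, region):
    # A scan (or a hand-maintained index) replaces the SQL join above.
    return [o for o in table_store.scan("orders")
            if o["customer_region"] == region]
```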

This reminds me of an earlier post from Dave Winer entitled Microsoft's cloud strategy? 

No one seems to hit the sweet spot, the no-brainer cloud platform that could take our software as-is, and just run it -- and run by a company that stands a chance of surviving the coming recession (which everyone really thinks may be a depression).

Of all the offerings Amazon comes the closest.

As it stands today, platform as a service offerings do not satisfy the needs of people who have existing apps and want to "port them to the cloud". Instead this looks like it will remain the domain of utility computing services, which just give you a VM and the ability to run any software you damn well please on your operating system of choice.

However for brand new product development the restrictions of platform as a service offerings seem attractive given the ability to "scale infinitely" without having to get your hands dirty. Developers on platform as a service offerings don't have to worry about database management and the ensuing complexities like sharding, replication and database tuning.

What are your thoughts on the strengths and weaknesses of both classes of offerings?

Now Playing: The Pussycat Dolls - I Hate This Part


 

Categories: Cloud Computing