Wednesday, October 20, 2010

Privacy and Future Web Services

Naval Ravikant wrote a great (and concise!) post on a trend in popular web services of taking data that was previously private and making it public.  I totally agree with this observation, and want to add to his post. From Naval’s post:

All sorts of businesses are being built by violating assumptions about the privacy of data.

Flickr violated the assumption that you wanted your photos private by default. Before Flickr came along, the default photo sharing model, espoused by Shutterfly, Snapfish etc., was that of private photo sharing.

Naval then continues the post by citing a bunch of different services as examples (foursquare, Twitter, Blippy, Instagram, etc…). He ends the post by asking which data set will next be given the “public by default” treatment, creating the next hot startup along the way.

My comments on Naval’s thoughts:

Public by Default is Not New

This trend of public by default has been happening for a while (at least 5 years), and yet every time it is applied to a new data set, the result seems surprising and new.

Back in 2005, Ari Paparo wrote a reflection on the social bookmarking space after Joshua Schachter sold del.icio.us to Yahoo. Ari wrote that del.icio.us succeeded where many earlier bookmarking services had failed because (in large part) del.icio.us was the first social bookmarking service to make all bookmarks public by default.

I’m sure Ari’s post is not the first to expound on the value of public by default in web services, but it’s the oldest example I can think of off the top of my head. The whole Web 2.0 boom has been predicated in part on this public by default approach, of which del.icio.us and Flickr were pioneers. And yet, even today, making user data public by default in areas where the data has traditionally been private feels like a radically new innovation each time it happens.

Why Is Public By Default So Effective?

I think there are a few reasons why public by default keeps working over and over again in different data sets:

1. Public by default means more useful data exposed to every user. (A simple point, but more data is better than less data… duh).

2. Public data allows you to start the typical “social validation -> voyeurism” cycle that drives the popularity of nearly all social sites. If a data set is public, you can put it in a reverse-chronologically sorted feed to keep things current for users.

3. More public data allows you to show interesting metadata that emerges from the underlying data (what’s historically popular, what’s trending, what’s popular in your social graph, etc…). If you try to do this with private (anonymized) data, you’ll likely end up with an accidental privacy breach (think Google Buzz and the Contacts leaks).

4. Public by default means that unintended use cases are more likely to emerge. For example, one of the best blogs in the music industry is actually a feed published on del.icio.us by mediaeater. I don’t think Joshua ever intended people to “blog” on del.icio.us, but it works great! Similar unintended use cases definitely emerged on Twitter that never emerged in the previous incarnations of status updates (AIM, Y!Messenger, etc…).
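To make points 2 and 3 a bit more concrete, here is a minimal Python sketch of what a public data set gives you almost for free: a reverse-chronological feed, an overall popularity count, and a "popular in your social graph" view. Everything in it (the `Item` record, the function names, the sample users and example.com URLs) is hypothetical and only there for illustration; it is not how del.icio.us, Twitter, or any other real service works.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One public item (a bookmark, a check-in, a status update...)."""
    user: str
    url: str
    posted_at: datetime


def feed(items):
    """Reverse-chronological feed: newest public items first."""
    return sorted(items, key=lambda item: item.posted_at, reverse=True)


def popular(items, limit=3):
    """Most frequently shared URLs across all public items."""
    return Counter(item.url for item in items).most_common(limit)


def popular_in_graph(items, following, limit=3):
    """Same popularity count, restricted to the users you follow."""
    return popular([item for item in items if item.user in following], limit)


# Hypothetical sample data, made up for this sketch.
ITEMS = [
    Item("alice", "http://example.com/a", datetime(2010, 10, 18, 9, 0)),
    Item("bob", "http://example.com/b", datetime(2010, 10, 19, 12, 0)),
    Item("carol", "http://example.com/a", datetime(2010, 10, 20, 8, 30)),
    Item("dave", "http://example.com/c", datetime(2010, 10, 20, 7, 0)),
]

if __name__ == "__main__":
    for item in feed(ITEMS):
        print(item.posted_at, item.user, item.url)
    print(popular(ITEMS, limit=1))
    print(popular_in_graph(ITEMS, {"alice", "bob", "carol"}))
```

None of these views needs any anonymization step, because every item was public to begin with. With private data, each of them would need careful filtering to avoid leaking who shared what.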

So, what’s next?

Here are some data sets that are ripe for startups to change from private over to public by default. Some (probably all) of these are already happening, but have yet to reach mainstream adoption.  I think they all will eventually hit a broad audience as we continue the trend towards living in public:

  • Health (I know HIPAA adds huge friction here, but there will definitely be public patients in the future… we’re already seeing this in health coping forums and disease-specific blogs).
  • Finance (kaChing, StockTwits, Covestor are the beginnings of a larger movement that I think will gain broad adoption on Wall Street eventually).
  • Dating (Online dating today is a largely private process, which I think will open up over time as the stigma of having an e-girlfriend wears away).
  • Consulting (Today, billions of dollars in cognitive load are hidden behind private documents and consulting relationships. This data will move into the open via services like StackOverflow and Quora, where consulting-quality data is available for free).
  • Education (OpenCourseWare is just the tip of the iceberg… Many classes will be widely syndicated through public channels. All the economics are tied up in accreditation anyway, so there’s no reason not to open up class video and document feeds, especially at public govt-funded universities).
