Who Owns Your Data?

Scott Hanselman published an interesting blog post (“Your words are
wasted”) this past weekend that discusses ownership of user-generated
content in social media. While Scott’s post was primarily concerned
with blogging and related social media services like Twitter, Facebook
and Google+, I think it raises questions about data ownership in any
web or cloud application [1], and it is important to consider the risks
that come with storing your data in a cloud application.

Your Data is Being Held Hostage

Data backup is an important issue to address. The application provider
may or may not provide a backup feature. For example, 37signals and
Google do have options for dumping your data, but many others,
including Twitter, do not. Even with a data backup feature, there is no
guarantee that you will get all of your data or get it in a format that can
be consumed by a similar application. In other words, just because you
can dump data from one application does not mean that it can be easily
loaded into another.
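
To make the format problem concrete, here is a minimal sketch in Python. Both schemas are invented for illustration: the JSON field names for "application A" and the CSV columns for "application B" are assumptions and do not come from any real service. The point is only that a clean export still needs a hand-written mapping, and that some information gets lost or made up along the way.

```python
import csv
import io
import json

# Hypothetical export from "application A" (field names are invented).
EXPORT_JSON = """
[
  {"id": 1, "body": "First post", "created": "2012-08-20T10:00:00Z", "tags": ["cloud", "data"]},
  {"id": 2, "body": "Second post", "created": "2012-08-21T12:30:00Z", "tags": []}
]
"""

# Hypothetical import schema for "application B": a CSV with these columns.
IMPORT_COLUMNS = ["title", "text", "published_on", "labels"]


def convert(export_text):
    """Map application A's JSON records onto application B's CSV columns."""
    records = json.loads(export_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=IMPORT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow({
            # A has no title field, so one is improvised from the body.
            "title": rec["body"][:20],
            "text": rec["body"],
            # B stores a date only, so the time of day is dropped.
            "published_on": rec["created"][:10],
            # B wants a single string, so the tag list is flattened.
            "labels": ";".join(rec["tags"]),
        })
    return out.getvalue()


if __name__ == "__main__":
    print(convert(EXPORT_JSON))
```

Even in this toy case the record id and the time of day do not survive the move, and the title has to be guessed. A real migration would have many more mismatches of this kind.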

Since the service provider has access to all of your data, they can use
it to build a profile for advertising purposes. This is how most of the
“free” services operate. They mine your data and sell that information
to advertisers. This can include watching how you use the application,
looking at your social graphs and even tracking you across other sites
on the internet. Most of the social media buttons you see on other sites
are there for the express purpose of tracking users so that they can
build a more attractive profile for advertising.

It is always good to remember that when using any cloud application your
data is at the mercy of the service provider.

Some Providers are Better Than Others

Another risk to consider is that the provider may go out of business
or get acquired and shut down by the new owners. (If you think Facebook
and Twitter are too big to fail, please remember that not too terribly
long ago people were saying the same thing about MySpace and Digg.) In
either scenario, your data is now gone. If you are lucky, the
application will be shut down gracefully and provide the opportunity to
download a copy of your data before it goes dark.

There is also the risk of the application provider making radical
changes to the terms of service. Facebook is infamous for changing
its privacy policy and terms of service. In more recent news, Twitter
is making a radical change to the usage terms for its API and
killing off third-party clients. Providers can also make radical changes
in the pricing of their application. While not much of a concern for
“free” applications, it is something to think about if you pay a
subscription fee for use of an application. The danger here is that once
you are heavily invested in an application, moving to a different
application may be more painful than simply accepting some onerous new
terms or a price increase.

Where Do We Go From Here?

Despite the risks listed above, I do not think that all cloud
applications are bad. Most cloud applications are running in data
centers that are more sophisticated and robust than our personal
computers and mobile devices. Honestly, having access to my data on all
my devices is a killer feature and it is not something I want to part
with any time soon. I just wish there was a way to have all the
advantages of cloud applications while still retaining full control and
ownership of my data. If I can find some free time I want to look into
what it would take to install and configure some FLOSS cloud
applications on Amazon EC2 or Rackspace Cloud Server. In
essence I would like to build my own “personal cloud”. But that is
something for a future post.

[1] For the purpose of this discussion, I am limiting cloud
applications to software-as-a-service (SaaS) offerings and social
media platforms.