RAID in the cloud

Imagine, if you will, that I represent a major international criminal organization.

Like any business, I have a vast amount of information to store: intelligence on major law enforcement organizations, cash flow through money laundering, kickbacks, global sales reports, etc. Needless to say, I want this information to remain private.

I need this information accessible globally to my various business subsidiaries, and I definitely don’t want it stored on my laptop as I pass through customs, since various jurisdictions feel they have the right to inspect its contents.

With my own data centre, or an outsourced one, anywhere in the world, I run the risk that law enforcement officials can walk in with a court order demanding physical access to the servers my data is stored on. With enough time they can surely decrypt my data. This is an unacceptable state of affairs.

You would think that I would not want to risk storing the information with a cloud provider such as Azure or Amazon S3 either: Microsoft may not be able to provide physical access to a server because it’s been welded into a shipping container, but I’d be a fool to believe there’s absolutely no way to retrieve a specific customer’s data.

I have found a new way to solve my dilemma. Cloud-based data storage providers have multiple data centres distributed globally. Typically these use some concept of a container for the data, and when I create this container I can set its location. For example, with Azure, Microsoft have the capacity to make data centres available in multiple countries – the USA and Ireland look like the first, but there’s no reason they can’t be rolled out to other countries.

Legitimate data centres protect against disk failure by using RAID – in effect an application of error-correcting codes that spreads each piece of data, along with redundant parity information, over more than one physical disk. My globalized illegal data storage will use a similar algorithm, but instead of striping over disks I will stripe over multiple data centres located in multiple legal jurisdictions and, where possible, run by multiple vendors.

If I can find 3 locations initially (US, Ireland and Iran for example) then I can use ‘RAID’ 5 over 3 data centres.

Consider the advantages this provides me. Suppose the US decides I must be brought to justice and forces Microsoft to hand over my data from the US. No problem: with only a third of the data, and with the algorithm used correctly, it should not be possible to retrieve a full byte of information. Suppose instead the US decides to deny me access to my data store. Again, no problem: from the remaining data stores in Ireland and Iran I can reconstruct the missing data and push it into a new data centre.
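
To make the idea concrete, here is a minimal Python sketch of RAID 5 style striping over three locations, with the parity chunk rotating between them. Everything in it is an assumption for illustration: the site labels, the chunk size, and the function names `stripe`, `rebuild` and `read_back` are made up, and nothing here talks to a real storage service. Note also that XOR parity on its own gives redundancy rather than secrecy, since a single share still holds raw chunks, so the sketch assumes the data has been encrypted before it is striped.

```python
# Hypothetical labels for the three storage locations; index 0, 1, 2 below.
SITES = ["us", "ireland", "iran"]
CHUNK = 4  # bytes per chunk, chosen arbitrarily for the example


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe(data: bytes, chunk: int = CHUNK) -> list[list[bytes]]:
    """Split data into rows of two data chunks plus one XOR parity chunk.

    The parity chunk rotates across the three sites row by row.
    Assumes `data` is already encrypted; the tail is zero-padded.
    """
    row_size = 2 * chunk
    padded = data + b"\0" * (-len(data) % row_size)
    shares: list[list[bytes]] = [[], [], []]
    for row, start in enumerate(range(0, len(padded), row_size)):
        d0 = padded[start:start + chunk]
        d1 = padded[start + chunk:start + row_size]
        parity_site = row % 3
        data_sites = [s for s in range(3) if s != parity_site]
        shares[parity_site].append(xor_bytes(d0, d1))
        shares[data_sites[0]].append(d0)
        shares[data_sites[1]].append(d1)
    return shares


def rebuild(shares: list[list[bytes]], lost: int) -> list[bytes]:
    """Recreate the chunks of the lost site from the two surviving sites."""
    a, b = [s for i, s in enumerate(shares) if i != lost]
    return [xor_bytes(x, y) for x, y in zip(a, b)]


def read_back(shares: list[list[bytes]], length: int) -> bytes:
    """Reassemble the original bytes, skipping each row's parity chunk."""
    out = bytearray()
    for row in range(len(shares[0])):
        parity_site = row % 3
        for site in range(3):
            if site != parity_site:
                out += shares[site][row]
    return bytes(out[:length])


# Usage: lose the first site, rebuild it from the other two, read the data back.
secret = b"global sales report"  # stands in for already-encrypted data
shares = stripe(secret)
replacement = rebuild(shares, lost=0)
assert replacement == shares[0]
assert read_back([replacement, shares[1], shares[2]], len(secret)) == secret
```

Because each row satisfies parity = chunk A XOR chunk B, any one of the three chunks is the XOR of the other two, which is why losing a single location is survivable and losing two is not.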

Perhaps you feel the US and Ireland could agree on the need to force disclosure of my data. But as additional cloud storage locations open in other countries, the problem becomes increasingly intractable: it gets much harder to locate and access my data without the correct keys.

Like any new technology, cloud computing is bringing new challenges – some for developers, who must choose how best to take advantage of it, and others for government regulators, who need to decide how to control abuses of the technology. Have you thought through the opportunities and consequences yet?