My latest podcast is a reading (MP3) of “Data – the new oil, or potential for a toxic oil spill?” — a column I wrote for Kaspersky in which I argue that data was never “the new oil” – instead, it was always the new toxic waste: “pluripotent, immortal – and impossible to contain.”
Data breaches are inevitable (any data you collect will probably leak; any data you retain will definitely leak) and cumulative (your company’s data breach can be combined with each subsequent attack to revictimize your customers). Identity thieves benefit enormously from cheap storage, and they collect, store and recombine every scrap of leaked data. Merging multiple data sets allows for reidentification of “anonymized” data, and it’s impossible to predict which sets will leak in the future.
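To make the reidentification point concrete, here is a minimal sketch of a linkage attack in Python. Everything in it is invented for illustration: the records, the names, the field names and the `reidentify` helper are all hypothetical, and a real attacker would work with far larger and messier sets. The idea is only that an "anonymized" set and a later leak can share a few innocuous fields, and that joining on those fields puts names back on the rows.

```python
from collections import defaultdict

# Fields that look harmless alone but can single out a person in combination.
# (Hypothetical choice for this sketch.)
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(anonymized, leaked, keys=QUASI_IDENTIFIERS):
    """Attach a name to each anonymized row whose quasi-identifiers
    match exactly one person in the leaked set."""
    names_by_key = defaultdict(list)
    for person in leaked:
        names_by_key[tuple(person[k] for k in keys)].append(person["name"])

    matches = []
    for row in anonymized:
        names = names_by_key.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # an unambiguous match is a reidentification
            matches.append({**row, "name": names[0]})
    return matches


# An "anonymized" data set: no names, but quasi-identifiers left in place.
anonymized_health = [
    {"zip": "02138", "birth_year": 1945, "sex": "M", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
]

# A separate, later leak that carries names alongside the same fields.
leaked_loyalty = [
    {"name": "Alex Example", "zip": "02138", "birth_year": 1945, "sex": "M"},
    {"name": "Sam Sample", "zip": "02139", "birth_year": 1980, "sex": "F"},
    {"name": "Pat Placeholder", "zip": "02139", "birth_year": 1980, "sex": "F"},
]

reidentified = reidentify(anonymized_health, leaked_loyalty)
for row in reidentified:
    print(row["name"], "->", row["diagnosis"])
```

In this toy example the first health record matches exactly one leaked person, so it gets a name back. The second matches two people and stays ambiguous, but only until some further leak adds one more distinguishing field.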
Because these harms are nondeterministic, data-collectors have so far escaped liability, but that can’t last. Toxic waste also has nondeterministic harms (we never know which bit of effluent will kill which person), but we still punish firms that leak it.
Waiting until the laws change to purge your data is a bad bet – by then, it may be too late. All the data your company collects and retains represents an unquantifiable, potentially unlimited source of downstream liability.
What’s more, you probably aren’t doing anything useful with it. The companies that make the most grandiose claims about data analytics are selling either analytics or data (or both). These claims are sales literature, not peer-reviewed citations to empirical research.
Data is cheap to collect and store – if you don’t have to pay for the chaos it sows when it leaks. And some day, we will make data-hoarders pay.