Why Data Can Be More of a Liability Than an Asset

When the Amount of Data Gets Truly Big, So Do the Problems in Managing It All


Target infamously outed a pregnant teenager to her father as a result of its data collection. Credit: David Ryder/Bloomberg

If your work involves producing software -- websites, mobile apps, anything -- sooner or later you learn that code is a liability. All things being equal, the less code you have, the better off you are.

This is because code slows you down. Code equals complexity, and complexity makes it hard to change things and move forward. Code also has bugs, and bugs will make you spend time chasing after them. Code makes onboarding new developers more difficult. The list of downsides goes on and on and on.

In other words, we spend a ton of time and money accumulating more and more code, but at the end of the day, it's something we wish we had much less of.

Turns out data is the same way. The default stance of any company is to want it all. The big data megatrend has taught us that user data is hugely valuable. And unlike code, data seems almost free: User activity generates an endless amount of it.

In the trenches, a different view emerges. As with code, more data makes things more difficult. When the amount of data gets truly big, so do the problems in managing it all. IDC estimates that big data companies will sell $125 billion worth of solutions to those problems in 2015 alone. These direct costs are huge, but they are dwarfed by the inherent risks of storing unbounded amounts of private user data.

Regulatory compliance is a factor, but it's something that big corporations, top marketers and publishers among them, are uniquely suited to tackle. Yet the business risks of storing data are manifold. Nobody wants to be the next Ashley Madison, but the even bigger risk is breaking the trust of users in more mundane ways.

The canonical example is the story of Target outing a pregnant teenager to her father. In marketing circles, this is often touted as illustrating the predictive power of big data. But for the brand relationship between the teenager and Target, the breach of trust will forever trump any upside of the pregnancy-targeted offers. Can you imagine yourself in her shoes? Can you imagine how your own customers would feel if the data you've stored about them got out in the open?

Here's a hard truth: Regardless of the boilerplate in your privacy policy, none of your customers have given informed consent to being tracked in this way. Your customers have not carefully weighed the pros and cons of sharing data with you. Nor should they have to, because the burden is on you to make sure that your data collection will only result in positive outcomes for your customers and your business. Is that a guarantee you can give?

If you collect everything you can, because you can, the answer to that question is no. You have no idea what you are getting yourself and your customers into. And so if your big data strategy boils down to these three steps:
1. Write down all the data
2. ???
3. Profit

then it's time to re-evaluate, fast. You can't expect the value of data to just appear out of thin air. Data isn't fissile material. It doesn't spontaneously reach critical mass and start producing insights.

The solution I advocate is simple. You don't start with the raw data. You start with the questions you want answered. Then you collect the data you need (and just the data you need) to answer those questions.

Think this way for a while, and you notice a key factor: Old personal data isn't answering many questions. You tend to use old data in aggregate to see big-picture trends. Fresh data is what you're directly monetizing with targeted offers. But old, personal data? More than anything, it's just an accident waiting to happen.
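
For the developers in the room, that split between fresh personal data and old aggregate data can be written down as a retention rule. What follows is only a minimal sketch in Python, not a description of any real system: the `Event` record, the `apply_retention` function and the 90-day window are all assumptions made up for illustration. Recent events stay personal; older ones are folded into monthly counts per category, and the user identifier is discarded.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical retention window; the right figure depends on the
# questions you actually need answered.
RETENTION = timedelta(days=90)


@dataclass(frozen=True)
class Event:
    user_id: str         # personal identifier
    category: str        # e.g. a product category the user viewed
    timestamp: datetime


def apply_retention(events, now, retention=RETENTION):
    """Split events into fresh personal records and anonymous aggregates.

    Events inside the retention window are kept as they are. Older events
    are reduced to (month, category) counts and their user_id is dropped.
    """
    cutoff = now - retention
    fresh = []
    aggregates = Counter()
    for event in events:
        if event.timestamp >= cutoff:
            fresh.append(event)
        else:
            month = event.timestamp.strftime("%Y-%m")
            aggregates[(month, event.category)] += 1
    return fresh, aggregates
```

Run on a schedule, a rule like this keeps the big-picture trends while the personal records age out. One caveat: very small counts can still point to an individual, so a real system would need more care than this sketch shows.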

So the problem with storing personal data indefinitely, just in case you'll need it later, is that you simply won't. Incurring the cost of storing, managing and safeguarding it makes no sense at all.

Actionable insight is the asset you're after. But personal data is a liability. And old personal data is a nonperforming loan.

Marko Karppinen is founder and CEO of app designer Richie.