There is no doubt that federal agencies today are grappling with how to manage massive amounts of data and make sense of the information they collect to improve efficiency. Big data continues to drive lively discussions in our industry. Earlier this year we spoke with our Data Center Practice Director, Eric Oberhofer, about the drivers fueling big data projects; you can listen to the podcast here. Today, we open the discussion to our partners and feature a Q&A with Chuck Hollis, Vice President, Global Marketing CTO at EMC Corporation. His thoughts are included below:
How is big data impacting public/private sectors today?
We live in very exciting times, if you think about it. I think of big data as a “perfect storm” – masses of easy-to-get raw data, inexpensive computing resources and powerful analytical tools are all readily at hand. We’ve all now seen the potential of powerful predictive models and how they can transform virtually everything we do. The hard part? Figuring out how to put the new tools to work – and that takes strong leadership.
Virtually everything in our world today creates a vast data footprint, and it’s getting far easier to source all sorts of interesting data sets: social, mobile, web, sensors, etc. – in addition to all the traditional data sources. That’s not going to slow down anytime soon.
New expertise (e.g. data science) and new tools (e.g. Hadoop) now make it far easier to extract valuable insight from disparate data sources than ever before. It doesn’t matter whether we’re talking big business or local government – the impact of big data is starting to be visible everywhere. Some people call big data “digital oil” in that the valuable stuff is already out there; all we have to do is get it out of the ground, refine it and put it to work.
In some cases, we’ve moved on to the next big challenge: learning to make better business and policy decisions in the face of inarguable math. That’s a relatively recent phenomenon, and not everyone is comfortable with it – especially when the model’s recommendations fly in the face of established conventional wisdom, organizational hierarchies or political leanings.
There’s actually a term for it – the end of the HIPPO. It stands for the “highest paid person’s opinion,” which – if you think about it – accurately describes how most business and policy decisions get made today.
Thinking back to the last U.S. presidential election as an example, Nate Silver and his statistics-oriented peers shamed an awful lot of highly paid political commentators (i.e., HIPPOs) with stunning accuracy and detail. And, if I remember right, Nate Silver called the election in June, not November. To Nate and his peers, election outcomes are purely a numbers game. They are somewhat predictable far in advance. So are so many other things around us, it seems.
All that being said, the race is on to harness these powerful tools as quickly as possible. Private and public sector leaders aren’t really interested in big data; they want big answers. And they’re willing to invest to get them.
What are the challenges in dealing with the amount of data that is out there today?
There are multiple potential challenges we could discuss, but I keep coming back to one fundamental one – most organizations don’t know how to fully value the information they have access to.
Because information is hard to value, it’s hard to figure out what resources to throw at it: storage, network, compute, people and so on. Big data changes this: everyone now understands the value of the data being kept around, so it’s easier to justify the resources and investment to do a good job at it. Otherwise, keeping all that data around is seen more as just another expense to be minimized.
On the technology front, the most expensive component associated with big data is – of course – storage. The good news is that per-unit storage costs keep dropping and dropping. The not-so-good news is that those improving efficiencies can’t begin to keep up with demand, so – every year – we collectively spend more to gather, store, protect and manage information. I don’t think that’s going to change anytime soon, either.
What are the benefits?
Better decisions made through better insights.
It doesn’t matter whether you’re in the public or private sector – you’ve got a limited amount of resources to get the job done. But you can greatly impact the outcome by making better decisions around how to deploy and organize those resources. Big data is sort of a crystal ball that gives you better insight on likely outcomes if you choose one course of action or another. What will be the effect on the crime rate if we increase police presence in these three neighborhoods? How likely is this particular buyer to respond positively to a certain kind of message? When is this person likely to buy their next car, or house? Do patients stay healthier with regular email reminders to take their medications?
Historically, we used to try and do this with limited data sources and the wrong sorts of tools. The familiar business intelligence (BI) and reporting tools were designed to report on what happened in the past – and not try to glimpse what might happen in the future. While business reporting is still valuable and important, it’s not really big data and it’s not data science. Different disciplines, different tools, different outcomes.
Discuss best practices to follow to get the true value of big data.
The best model I’ve seen is to frame the discussion around a journey to organizational proficiency. In an ideal world, there’d be a critical mass of people throughout the organization who were comfortable not only with the new tools, but also with the newer non-HIPPO style of decision making that goes along with them. And there are more than a few organizations out there that fit that model, so it’s not an idle theory.
But I wouldn’t describe the vast majority of organizations as “analytically proficient.” So how does an organization go about becoming one?
Executive management has a clear role in establishing the priority, making resources available and driving an agenda. Very often, a program management team will be formed to select initial use cases, run a few proofs of concept, and then move on to a more programmatic approach. And IT, of course, has to stand up newer, self-service environments where data is far easier to find and far easier to experiment with than it is with the familiar tools we have today.
But the good news is that – once you get started – things often move pretty fast. Once decision makers see the power of these predictive models, they usually are “all in” on doing more, and doing it faster. It can be challenging for the rest of the organization to keep up, but I suppose that’s a good problem to have.
Over the next few years, how will big data evolve?
We’re now in an era of widespread experimentation and exploration. Everyone seems to be standing up these environments, and seeing what magic they unlock from the data that’s already readily available. And I’m sure that won’t slow down anytime soon.
But we’ve now seen clear signs of the next phase – going from exploration to production applications. Once you’ve figured out a predictive model that really moves the needle, you want to bake it into an application so everyone can use it all the time. These new predictive analytics applications are very, very demanding: lots of data, lots of processing, lots of performance. But we’ve started to see larger enterprises invest in them for one simple reason: they work.
If I were to speculate, I could see a world where powerful predictive modeling tools (and the skills to use them) are very commonplace and hardly exotic. I remember seeing my first spreadsheet (VisiCalc on an Apple II) back when I was in high school – I knew it would end up being pretty popular. In some sense, the cycle repeats itself: better technology means we can craft better tools to make better decisions – and everyone benefits.
And certainly big data is the most powerful new tool we have at our disposal.
A 17-year EMC veteran, Chuck Hollis is a popular industry blogger and writes frequently about advanced IT leadership topics. You can read his blog at http://chucksblog.emc.com.