Big Data demystified

April 20, 2015

by Kristen Bradley ’03

Although the term is easily thrown around, Big Data engineering is difficult.

Back in 1956, Big Data was literal: IBM's first hard drive weighed a ton and held just 5 megabytes of data. Today, with hard drives as light as eight ounces, the term Big Data has nuance. It refers to computers that are getting ever faster at collecting and storing enormous amounts of data, exabytes even.

It means fewer physical limits and more creative ones: how quickly and effectively can we use this information? And how do we make sense of it all?

Experts say Walmart stores more than 3 petabytes of data (one petabyte is roughly twenty million filing cabinets' worth of text) and processes more than a million customer transactions every hour. It's no wonder people are concerned when companies like Walmart are hacked and their data is stolen. Companies consider their data holdings a valuable asset, a treasure trove of information.

And it is.


Imagine a large farming company that collects harvest output over a period of twenty years and compares it to weather data and seed production. All that information is gathered and stored in large databases, perhaps—very aptly—on large server farms. This information can be used to anticipate production or set prices. If algorithms and analysts ask the right questions—like “how do seed sales drop in certain weather cycles?”—then both crops and revenues can be reaped.

Data manipulation is complex. The technology needed to handle, search, and manipulate data at this scale is not stand-alone; it requires skilled engineers who understand it intimately. Identifying what you are trying to analyze and, more importantly, asking why are crucial to moving Big Data past being a buzzword to being beneficial across industries.

I worked in international relations for the State Department for more than ten years before I joined a Big Data company. But it wasn't until I started working in Big Data that I could appreciate how much of a government's security stems from digital research and analysis.

For example, Big Data modeling can predict disease patterns and help prevent post-conflict violence. Governments, NGOs, and companies are partnering to match data from tweets and search terms with industry databases and road circulation patterns to prevent crime, stop human trafficking, and connect vulnerable communities to needed resources.

Big Data is a fast-paced, growing industry with plenty of room for the entrepreneurial or philanthropic Morehead-Cain grad.

Don’t be fooled, however.

Big Data isn't available at an effortless click of a button. Big Data analytics requires extensive infrastructure and innovation. The process is humbling. Yes, collecting and analyzing Big Data remains a challenge; but hey, when we graduate from UNC, cap in hand, that's exactly what we signed up for.

Kristen Bradley works in business development at Palantir Technologies, a company specializing in data analysis.