Big Data vs. Artificial Intelligence

In recent years, almost all companies have jumped on the bandwagon and invested in Big Data. Each year, Matt Turck compiles a comprehensive list of Big Data companies that shows the relevant players in the market. Although such a list is ultimately incomplete, recent versions have become so complex and detailed that they are hard to read, even on high-resolution displays.


From Big Data...

What is “Big Data”? Gartner suggests that “Big Data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision-making, and process automation.” This definition points out a few crucial aspects, which sometimes get a bit lost. First and foremost, Big Data has a specific aim: Enhanced decision-making and process automation. In 2012, Gartner pointed out that “processing large volumes or wide varieties of data remains merely a technological solution unless it is tied to business goals and objectives.”

Looking back, what happened in the early days of Big Data went like this: Companies “did” Big Data, spending considerable amounts of money on large computing clusters, often based on the then-new Hadoop ecosystem or in-memory columnar databases – without really knowing why. A number of companies, both established players and new start-ups, entered the market to provide storage and data-processing systems, mainly building technical solutions. After a while, companies realized that having data is good, but being able to process it is even better – though still without knowing why.

As a result, investing in Big Data was little more than a pit for money and resources (although some IT departments were quite happy to get access to the latest toys without being too closely supervised as to how they used them). The Big Data market expanded, and new sectors appeared in the Big Data Landscape: Applications for industry, enterprises and even consumers. Companies populating these sectors began to change the way Big Data is perceived in the market. While the technological aspects remain important and are addressed by specialized companies, other enterprises now focus on what to do with these vast data collections and processing capabilities – much more in line with Gartner’s statement from 2012.

…to Artificial Intelligence

Enter the world of machine learning and artificial intelligence (AI): Machine learning and AI methods have advanced considerably in the past few years, as has – crucially – the technological basis required to run AI-based applications. Just a few years ago, creating a Big Data application meant building and operating a large-scale computing and storage cluster, as well as developing the underlying machine learning code for the predictive application itself. Cloud providers such as Google, Microsoft and Amazon now offer services to host and process data, and to build and run predictive models – all without having to maintain the underlying infrastructure, but, of course, at a cost.
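
To make this concrete, the following sketch shows what “hosting data without maintaining the underlying infrastructure” can look like in practice. It uses Amazon’s boto3 client purely for illustration; the bucket and file names are hypothetical assumptions, not taken from any particular product or project.

```python
# A minimal sketch (Python / boto3) of delegating storage to a cloud provider.
# Credentials are assumed to come from the environment or AWS configuration;
# the bucket and file names below are hypothetical.
import boto3

s3 = boto3.client("s3")

# Upload a local dataset so that managed services can process it later,
# without running any storage cluster ourselves.
s3.upload_file(
    Filename="sales_history.csv",        # hypothetical local file
    Bucket="example-predictive-data",    # hypothetical bucket name
    Key="raw/sales_history.csv",
)
```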

Operational issues do not magically disappear when someone else handles them, but relying on industry giants allows companies to use expert-level services and focus on their core competencies.

A similar story unfolds with machine learning and AI algorithms: Although research groups have always made their algorithms available, the release of TensorFlow by Google and PyTorch by Facebook meant that AI frameworks, which had been “battle-tested” on a planetary scale, became available as open-source packages seemingly overnight. As with storage systems, computing clusters and technical infrastructure, this allowed new players in the AI market to focus on their core competencies and develop novel applications.
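
As an illustration of how low the barrier to entry has become, the sketch below trains a small predictive model with TensorFlow’s Keras API in a handful of lines, with no custom machine learning engine code. The data is randomly generated and the model is meaningless beyond demonstrating the workflow.

```python
# A minimal predictive model built entirely on an open-source framework
# (TensorFlow / Keras). The data is random and purely illustrative.
import numpy as np
import tensorflow as tf

# Toy stand-in for real business data: 1000 samples with 10 features each.
X = np.random.rand(1000, 10).astype("float32")
y = (X.sum(axis=1) > 5.0).astype("float32")   # synthetic binary target

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

print(model.predict(X[:3]))   # predicted probabilities for the first three rows
```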

Similar to the “Big Data Landscape”, various compilations of the “AI Landscape” summarize the current state of the industry. Although these compilations also feature companies and think-tanks that focus on the development of core technologies and hardware, most of the canvas is filled with companies developing novel applications based on machine learning and using artificial intelligence to “re-think” entire industries or verticals.

Is Big Data still a “thing”?

Is Big Data then still a “thing”? Yes and no. Developing novel AI-based applications to re-think the way business processes are handled, or to transform entire enterprises or verticals, requires access to large amounts of high-quality data. Big Data technologies enable companies to harness the power of their data and become predictive enterprises.

The emergence of many cloud-based solutions also means that operating a big data center has become one of several options, though much of the work of gathering and cleaning data remains the same. Certain applications and sectors, such as nuclear power plants, will always have to rely on on-premises installations because of security considerations or regulatory compliance; many others, however, are not constrained in this way. Big Data has certainly raised awareness of what is possible when enterprises use data to their advantage – but to move forward, businesses have to approach all sides of the value triangle.
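
As a small illustration of the cleaning work that stays with the business regardless of where the data is hosted, the following pandas sketch removes duplicates, normalizes dates and drops incomplete records; the file paths and column names are hypothetical assumptions.

```python
# A minimal data-cleaning sketch with pandas. File paths and column names
# ("date", "store_id", "units_sold") are hypothetical, for illustration only.
import pandas as pd

df = pd.read_csv("raw/sales_history.csv")                 # hypothetical raw extract

df = df.drop_duplicates()                                 # remove duplicate records
df["date"] = pd.to_datetime(df["date"], errors="coerce")  # normalize mixed date formats
df = df.dropna(subset=["date", "store_id"])               # drop rows missing key fields
df["units_sold"] = df["units_sold"].clip(lower=0)         # negative sales are data errors

df.to_csv("clean/sales_history.csv", index=False)         # cleaned data, ready for modelling
```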

Dr. Ulrich Kerzel

earned his PhD under Professor Dr. Feindt at the Fermi National Accelerator Laboratory in the US, and during that time made a considerable contribution to the core technology of NeuroBayes. He continued this work as a Research Fellow at CERN before joining Blue Yonder as a Principal Data Scientist.