Big data projects: separate roles, but a common goal

In Big Data General — 23 October 2013

In a recent blog post, McKinsey analysts described which roles need to be filled in a big data project. Alongside various analytical capabilities, one thing is especially important: giving specialist users the information they need, exactly when they need it. Easy-to-implement apps with intelligent, easy-to-understand interfaces support users in making operational decisions.


Data Science Academy is coming soon ... © Lonely


In big data projects, the exchange between specialist users and data scientists is particularly important. Only then can user-friendly apps be created that deliver powerful analyses in support of business success. With this in mind, the McKinsey authors Matt Ariker, Tim McGuire and Jesko Perrey defined five roles that should be filled in a big data project. The first expert needs to make sure that the data is "clean": Is the data set complete? Were the data captured according to unified criteria, and can they be unambiguously assigned? Sources of error in this sensitive phase include time intervals that were defined differently from one another, or differing "generations" of product data that were mixed together and are therefore mutually incompatible. The next step is to choose the data relevant for forecasting. Most data in enterprise systems are used in operational processes; analyzing them was never the primary goal. Creative minds are needed here to develop ideas about which direction to take, and they need a good feel for which material is best suited to it.
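The completeness and interval checks described above can be sketched in a few lines. This is a minimal illustration, not the authors' method; the record fields and the hourly interval are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical point-of-sale records; field names are illustrative only.
records = [
    {"ts": datetime(2013, 10, 1, 0, 0), "store": "A", "revenue": 120.0},
    {"ts": datetime(2013, 10, 1, 1, 0), "store": "A", "revenue": None},  # missing value
    {"ts": datetime(2013, 10, 1, 3, 0), "store": "A", "revenue": 90.0},  # 02:00 slot absent
]

def completeness_issues(rows, interval=timedelta(hours=1)):
    """Return (rows with missing values, gaps between timestamps)
    for a chronologically sorted series with a fixed expected interval."""
    missing = [r for r in rows if any(v is None for v in r.values())]
    gaps = [
        (a["ts"], b["ts"])
        for a, b in zip(rows, rows[1:])
        if b["ts"] - a["ts"] != interval
    ]
    return missing, gaps

missing, gaps = completeness_issues(records)
print(len(missing), len(gaps))  # one missing value, one interval gap
```

Checks like these would typically run before any data reaches the forecasting models, so that inconsistently defined intervals are caught in the cleaning phase rather than in the analysis.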

As an example, the McKinsey authors cite data from cash registers, from which conclusions can be drawn about customers' future purchasing behavior. The business solution architect designs a structure that ensures users easily receive the forecasts relevant to them. Part of that is defining the intervals at which the analyses will be accessed: dynamic pricing requires data in real time, while for automated materials management it is enough to create a forecast every 24 hours. That forecast, however, will be very comprehensive, and the error tolerance is low.
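The difference in refresh requirements can be expressed as a simple staleness rule. The use-case names and cadences below are taken from the text; the function itself is a hypothetical sketch.

```python
from datetime import timedelta

# Refresh cadence per use case, as described in the text.
REFRESH = {
    "dynamic_pricing": timedelta(0),              # real time: recompute on every event
    "materials_management": timedelta(hours=24),  # one comprehensive daily forecast
}

def needs_refresh(use_case, forecast_age):
    """True if a forecast of the given age is stale for this use case."""
    return forecast_age >= REFRESH[use_case]

print(needs_refresh("materials_management", timedelta(hours=6)))  # False
print(needs_refresh("dynamic_pricing", timedelta(seconds=1)))     # True
```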

Only in phase four does the data scientist proper go to work; he or she is responsible for the analysis models. Part of the job is adapting them precisely to the purpose of the analysis, from customer analysis to preventive maintenance. At the same time, the models must be continuously maintained. After that, the campaign experts establish the actual link to the users: they use the results of the analyses, for example, to plan multi-level marketing campaigns and define the target groups for individual actions.

In actual practice, what the blog post's authors so carefully separated is closely interlinked. All participants must know their tasks exactly and feel responsible for the overall project. Most important of all, the IT experts must keep changing perspective and ask who will ultimately work with the analyses. Will the materials management planners receive the forecasts in a form they can work with easily and immediately? What information do they need, and how often?

As a software provider, Blue Yonder supports IT experts in building up their individual know-how and thus helps them position themselves optimally in big data projects. For that reason, we founded the Data Science Academy. We will be introducing the Academy's curriculum to you in the coming weeks, but we can already reveal this much: there will be webinars, multi-day training events and online tutorials. We are also considering a strategy forum to complement the educational offerings, which already include training for data scientists and user interface designers. We hope that you will be surprised and pleased by what we offer!

Dr. Ulrich Kerzel

earned his PhD under Professor Dr Feindt at the US Fermi National Laboratory, where he made a considerable contribution to the core technology of NeuroBayes. He continued this work as a Research Fellow at CERN before joining Blue Yonder as a Principal Data Scientist.