In 2012, Thomas Davenport and DJ Patil, who later became the first Chief Data Scientist of the United States Office of Science and Technology Policy at the White House under President Obama, published an essay in Harvard Business Review titled “Data Scientist: The Sexiest Job of the 21st Century”. In it, they defined the role as “a high-ranking professional with the training and curiosity to make discoveries in the world of big data.” That is undeniably one of the key aspects of a Data Scientist’s work: dive into the data, explore and understand it – and then use machine learning and artificial intelligence (AI) to predict future events.
But that really only scratches the surface – where does all this data come from? In most real-life scenarios, companies have accumulated an ecosystem of IT systems over the years, most of which were never designed to provide access to all of their data at the finest granularity in near-real time. Accessing those systems and merging their data streams is hardly trivial, and then there are the data quality issues. This work alone can account for up to 80% of all project work.
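To make this “data plumbing” concrete, here is a minimal sketch of what merging and cleaning two such data streams can look like. The source systems, column names and quality rules are all hypothetical, and real integrations are far messier:

```python
import pandas as pd

# Hypothetical extracts from two legacy systems (all names are made up)
crm = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "name": ["Ada", "Bob", "Bob", None],
})
billing = pd.DataFrame({
    "cust_id": [1, 2, 4],
    "revenue": [100.0, None, 50.0],
})

# Deduplicate one stream, reconcile the key names, and merge
crm = crm.drop_duplicates(subset="customer_id")
merged = crm.merge(
    billing.rename(columns={"cust_id": "customer_id"}),
    on="customer_id",
    how="outer",
    indicator=True,   # flags rows that are missing from one source
)

# Basic quality handling: fill missing revenue, drop records without a name
merged["revenue"] = merged["revenue"].fillna(0.0)
clean = merged.dropna(subset=["name"])
```

Even in this toy case, decisions lurk everywhere: which duplicate to keep, how to treat customers known only to billing, what a missing revenue figure means. Those judgment calls are a large part of why data preparation dominates project effort.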
There is also work to be done on the output: Making predictions about future events is all well and good, but they are not “actionable” as is. In order to create value from these predictions, they need to be transformed into decisions, ideally ones that respect a wide range of constraints. These decisions should also be optimized, which requires an objective function or some metric(s) that can be minimized or maximized. Once these decisions have been formed, they need to be executed, which, again, requires a lot of “data plumbing” to feed them into the operational layer of a business. And it doesn’t end here: Once a system has been put in place that ingests data, predicts future events, and transforms the predictions into actionable optimal decisions that are fed back into the operational layer, this system has to run reliably, smoothly and fault-tolerantly.
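As a toy illustration of this prediction-to-decision step, the sketch below allocates limited stock against predicted demand so as to maximize expected margin. The products, numbers and the simple greedy heuristic are assumptions chosen for illustration, not a prescribed method; real decision problems typically call for proper optimization solvers:

```python
# Predicted demand and unit margins per product (hypothetical figures)
predicted_demand = {"A": 40, "B": 25, "C": 60}
unit_margin = {"A": 3.0, "B": 5.0, "C": 1.5}
stock = 70  # total units we can ship -- the constraint

# Objective: expected margin. Greedy rule: serve highest-margin demand first.
allocation = {}
remaining = stock
for product in sorted(predicted_demand, key=unit_margin.get, reverse=True):
    allocation[product] = min(predicted_demand[product], remaining)
    remaining -= allocation[product]

expected_margin = sum(allocation[p] * unit_margin[p] for p in allocation)
```

The point is the shape of the problem: a forecast alone ships nothing; only once an objective and constraints turn it into an allocation does it become an executable decision.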
Embedding Data Scientists in a Company
Data Science is certainly an essential piece of a modern commercial AI offering, but attributing all the success to the Data Scientists seems unfair to the other, equally vital players. To deal with the lack of Data Science talent on the market, several companies have experimented with a tiered approach, surrounding each Data Scientist with a group of Data Engineers. The Data Engineers focus primarily on getting the data ready for the Data Scientist; they then receive the prediction model and work with the IT and Operations departments to implement it and get it running.
Apart from the social conundrum of elevating Data Scientists to idol-like status, this approach has other disadvantages: The Data Scientist is far removed from the challenges of obtaining and cleaning the data, as well as from making sure the prediction and decision model can run reliably and smoothly 24/7 – which, in the end, is what creates the value for the end user or customer. The approach can work, often out of necessity, as there are too few Data Scientists to be found on the market. But it is not ideal, because the different roles care about different things in this setup: Data Engineers care about “data plumbing”, Data Scientists about models, IT and Operations about operating a running system. Data Scientists may want to change models quickly to ingest more data or use more feature variables; Operations doesn’t want to touch the system once it’s up and running, agreeing to a few carefully planned releases per year at most. No one really cares about the customer or end user – as long as they don’t jump ship.
Comparing with other sectors: Software development
AI companies running into these issues are not the first to notice that working in silos does not produce optimal results. In the worst case, people may not know what the others are doing – or may even work against each other on misaligned goals, as everyone optimizes their own work merits independently. About ten years ago, the software industry faced similar challenges: Software developers and Operations had become too detached from each other to work optimally. As computers became cheaper and more powerful, mainframes were largely replaced by servers, and data centers scaled to hundreds of computers, later thousands and even hundreds of thousands. Physical maintenance of the hardware is still an important aspect of keeping computers running, but administration, setup and monitoring all had to be automated to ensure the data center provided reliable and reproducible service. The next wave of innovation saw servers virtualized and moved into the cloud – today’s Operations teams have little to do with actual hardware: Infrastructure is now code. As James Urquhart points out, all aspects of operations such as monitoring, fault tolerance and reliable service still exist – but they have become part of the application. Operations becomes development and vice versa. This development gave rise to DevOps.
What is DevOps?
DevOps brings all roles together and focuses their efforts on a common goal. The main idea is to align the objectives (and even incentives) of the various groups: Software Developers still build new software and add more features, Testers and QA still mitigate risks, and Operations still seeks stability – however, their common goal is to deliver a service to a customer and keep them satisfied. What helps customers the most? Adding new features to the software, testing them thoroughly and making sure the service runs reliably and smoothly. Each aspect is important, and incentives are aligned on delivering the whole package.
DevOps in an AI Company
The same approach brings significant benefit to AI companies: Putting the customer first may sound like a trivial motto – after all, who wouldn’t want to do this, as it is the customer who ultimately pays for it all? However, implementing it as an organization is quite a challenge.
The key question everything revolves around is: What brings the most value to a customer? The customer cares about optimal decisions, which are aligned with their key objectives and executed reliably and smoothly. To achieve this, data needs to be accessed, cleansed and prepared, and high-quality predictions need to be made by excellent models. These predictions need to be transformed into optimal decisions, which in turn need to be transferred to the operational layer of the customer’s system – and the whole setup needs to run without a hiccup 24/7, all year round.
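The whole loop just described can be sketched as four stages. Every function body below is a hypothetical stand-in for a real system integration – the forecast rule, capacity figure and return values are assumptions for illustration only:

```python
# Minimal sketch of the ingest -> predict -> decide -> execute loop.
def extract():
    # In reality: pull from source systems, cleanse and join the data
    return [12.0, 15.0, 14.0, 18.0]

def predict(history):
    # Stand-in forecast: a simple moving average over recent history
    return sum(history[-3:]) / 3

def decide(forecast, capacity=20.0):
    # Turn the forecast into a decision while honoring a capacity constraint
    return min(forecast, capacity)

def execute(order):
    # In reality: write the decision back into the operational system
    return {"status": "ok", "order": order}

result = execute(decide(predict(extract())))
```

Each stage looks trivial here; in production, each is owned by people with different skills, and keeping the chain running 24/7 is precisely where the organizational questions below come in.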
Developing a product in an engineering department, tasking Data Scientists in a Data Science organization with everything from “data plumbing” to model development, then having the Engineers get the product running for a specific customer before passing it on to yet another team such as Operations is bound to create all the frictions and operational pains the software industry endured before “discovering” DevOps. A similar reasoning holds for AI projects: Bringing all groups together and aligning their objectives and incentives has the potential to bring the most value to customers – and to the organization providing the services.
Organizations Design Systems
Though Conway's Law was first formulated in 1968, it is still relevant today: “Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.” This means that the organizational structure of your company will be reflected in the products and services it develops. Large departments tasked with the development of complex products or services tend to develop large, monolithic applications with many complex dependencies. This makes it difficult to disentangle the different parts, change aspects of the system or replace specific components, as everything depends on everything else.
For example, if Data Scientists are not well aligned with the rest of the organization, automated decisions will likely be hard to integrate into the landscape of IT and business systems.
Deloitte recently published a study showing how organizational structures have changed significantly – and are still evolving. Traditionally, companies were organized in strict hierarchies with fixed reporting lines and responsibilities. Information mainly traveled from top to bottom; if it needed to be shared with different units, it typically had to travel up a few levels before trickling down somewhere else. In today’s businesses, Deloitte notes, many companies have adopted a kind of mesh, where the various departments are no longer organized in a strict hierarchy but are directly connected to each other, with varying degrees of dependency. While this structure allows for more flexibility than the hierarchical model, it doesn’t really help align the organization toward common goals. The result is that everyone is connected to everyone, albeit often indirectly, without a clear structure.

A more promising approach is to organize the company in small teams around dedicated tasks. Each team has a full stack of competencies and resources and focuses on one aspect, minimizing its dependencies on other teams. In the agile vocabulary, such groups are often called squads, tribes, chapters or guilds – or any other name for a group that shares the same interests (or objectives and incentives) and is able to operate more or less independently, without strong dependencies on other groups. In the world of software development, “micro-services” are one realization of this idea.
The key principle remains the same: Build teams with all required competencies and resources, and align them toward a common goal that focuses on what creates value for the company and its customers.