Video produced by Steve Nathans-Kelly
At Data Summit Connect 2021, John Mosch, Cisco senior manager, analytics, broke down the key roles for an effective analytics team and explained how (and in what order) to fill these roles and assemble your team.
"Getting started before you even hire somebody. So who do you hire first? It's not going to be a data scientist. I can tell you that the most important role, the most important first hire is a data engineer," said Mosch. "Now, the reason for that is without data, there's nothing to do. All the initial work is going to be data work. These are the people who are going to make the data available and usable. They're going to collect it and arrange it into a form that can be ultimately useful for analytics that will ultimately be used by data scientists, but it's all data work upfront. I mean, a data scientist can't find anything and can't do anything until there's a good set of data to work from. Now, this role is necessary. There's no way around it. If you don't have this role, if someone's not doing the data work, well, someone has to do the data work."
There's just no way around it, said Mosch. "If the data engineer is not doing it, your data scientist is going to get stuck with it. And this is a misuse of their skillset. I mean, they're capable of doing it. They're not ideal for doing it. I mean, they're not necessarily going to know how to do ETLs and APIs, and they're not going to necessarily know how to manage a data warehouse. They might, but even if they do know, it's not necessarily something that they want to spend time doing. They want to spend time doing statistical modeling. They want to spend time questioning the data. They want to spend time looking for results and crafting those results into a story. They don't want to spend time doing data work. And that's why it's very important to have these data engineer roles."
In fact, said Mosch, a company should have at least two data engineers for every data scientist. "That's how important this work is. I mean this is fundamental for one reason, as you're offloading a lot of the data work from your data science engineers, as well as you're setting up the opportunity for automation. You should not be doing something manually more than twice. If that's the case, it should be automated. And the data engineers can take care of that. They can automate things so that the data scientists don't have to mess with them on a day-to-day or week-to-week, month-to-month basis." It allows data scientists to have nice, clean datasets to work from. "And like I said, it's work that needs to be done. It's the bulk of the work, and you're best off hiring specialists that deal with it to free up the data scientists."
The next hire would be data scientists, said Mosch. "Once you've got some data to work from, these are the people who are going to be creating the models and analytics that produce insights. Like I mentioned already, it's kind of a common mistake that people think that the data scientist should be the first hire or potentially the only hire, but that's like trying to start a band with only a lead singer. I mean, you need people to set up data systems, data collection systems, data processing systems, integrate those results into operational processes. They need to maintain those data systems. If you've only got a data scientist, they're going to be super busy with doing things and it's not really data science. So it's important to have these other roles; it's important to have data engineers to help out the data scientists."
Another important aspect of hiring is getting people with communications skills. "These are the people who you're going to be calling on to present to executives, present to stakeholders," said Mosch. "You really do want to emphasize communication skills as part of their capabilities. It is something to interview for. It is something to assess as you're looking at people who are candidates for this. Communication is really key in this role because they are very often the front face of what your organization is doing."
And last, you are going to need to hire people to maintain the systems, said Mosch. These could just be data engineer specialists in a way, a specialist in monitoring and instrumenting, a specialist in maintaining the data systems. This is somebody you hire later on, once you have a whole system that you need to maintain. "Now, you might be asking yourself, why do I need all three of these different things? Why can't I just hire one person to do all three? Well, first of all, good luck finding someone who's truly capable of doing all three things. They are few and far between. And if you do find somebody, they'll probably be very expensive. But, even if you could, it's still not a good idea because with these three different roles, it's not just three different skill sets, it's three different mindsets," said Mosch.
"The data scientists come at it with a science mindset. They like to do experiments. They like to explore options. The data engineers come at it with a software engineering kind of mindset. They like to build things. They like to build quality software, quality code, and have everything operate in a practical way. And then the DevOps/DataOps people, they come at it with an operations mindset. They like to monitor things and have them keep running smoothly. So you're dealing with three completely different mindsets for approaching the work, which is a perspective you want to have on your team. It creates a much more balanced and comprehensive approach to the work. It's, in my opinion, key to the success of these kinds of projects that you have that balanced and comprehensive view of the work, which you won't get from having a single person doing all three roles."
Lastly, Mosch said, you have the leadership role. "Once your team's big enough, you want to hire somebody, but in some cases, if you don't have someone already in the team, and if you're not the person already on the team that has some management experience, as well as some skill in leading analytic projects, if that doesn't exist already, this is probably going to be your first hire. I mean, you want to build someone who's capable of building the team from the ground up. So if you don't have somebody like this already, this is not your last hire. This is your first hire. But once the team grows to a certain size, that is absolutely critical. You're going to have so much work, you are going to need to have somebody help manage the work and make sure it gets done as efficiently and effectively as it possibly can be."