Looking again at the data science diagram – or the unicorn diagram for that matter – makes me realize they are not really addressing how a typical data science role fits into an organization. To do that we have to contrast it with two other roles: data engineer and business analyst.
What makes a data scientist different from a data engineer? Most data engineers can write machine learning services perfectly well or do complicated data transformation in code. It’s not the skill that makes them different, it’s the focus: data scientists focus on the statistical model or the data mining task at hand, while data engineers focus on coding, cleaning up data and implementing the models fine-tuned by the data scientists.
What is the difference between a data scientist and a business/insight/data analyst? Data scientists can code and understand the tools! Why is that important? With the emergence of the new tool sets around data, SQL and point & click skills can only get you so far. If you can do the same in Spark or Cascading, your data deep dive will be faster and more accurate than it will ever be in Hive. Knowing your way around R libraries gives you statistical abilities most analysts only dream of. On the other hand, business analysts know their subject area very well and will easily come up with many different angles from which to approach the data.
The focus of a data scientist, and what I am looking for when I hire one, should be statistical knowledge and using coding skills for applied mathematics. Yes, there can be the occasional unicorn in a very senior data scientist, but I know few junior or mid-level data scientists who can surpass a data engineer in coding skills. Very few know as much about the business as a proper business analyst.
Which means you end up with something like this:
Data scientists use their advanced statistical skills to help improve the models the data engineers implement and to put proper statistical rigour on the data discovery and analysis the customer is asking for. Essentially the business analyst is just one of many customers – in mobile gaming most of the questions come from game designers and product designers – people with a subject matter expertise very few data scientists can ever reach.
But they don’t have to. Occupying the space between engineering and subject matter experts, data scientists can help both by using skills no one else has, without having to be the unicorn.