Data scientists are highly skilled individuals who often hold post-graduate qualifications in numerate scientific disciplines (many have Ph.D.s and post-doctoral experience) and have an aptitude for statistics.
Experienced data scientists are often hard to find and relatively expensive. If they enjoy what they are currently working on (and are paid enough to keep them happy), it is likely to be difficult to get them to jump ship.
The choice for those businesses that want to harness the power of data is:
- Do they hire in?
- Do they bring in contractors?
- Or do they support their staff to upskill where possible?
How to decide if your business needs data scientists
I believe most organizations will get significant value from data science, whether through better business decisions, new insight-driven applications, improved customer service or simply reduced cost in unexpected places.
Building an internal capability to identify the opportunity requires time, patience and a long-term investment, which you will only commit to if there is a strong business case for it. In all likelihood, any business over a given (yet to be determined) size could benefit from data science.
To get started, you can test the potential value of data science by contracting out the development of proofs of concept that identify opportunities. Showcasing new insights that can drive positive change in the business gives a more realistic view of what is possible and deliverable, and of what it will cost.
What is a data scientist?
The phrase “Data Scientist” evokes a broad range of interpretations. There is the absolutely puzzled expression of those who live in a world where “data” and “scientist” are just two words put together in a sentence, and do not constitute a meaningful expression.
There are business leaders who hear rumors of data-driven paradigms and data changing the world. Though they do not know what a data scientist is, they know that for their business to survive they should probably learn more and get a few on board. There is a whole spectrum of views on the term, including within the data science community!
I'm a data scientist who joined the business world from academia, and I think that no one really agrees on what actually makes someone a data scientist. I believe the following definition could cover most data scientists: “A scientist who applies the scientific method to solve business problems, by processing and analyzing data to answer questions which drive changes in the business.”
That definition doesn't mention D3, Python, R, Spark or even SQL, as those are just tools. Yet this could almost be seen as exclusive and self-serving. To get a sense of what others think a data scientist is, should know or should be, see the following:
Though there is some latitude in the definition of a data scientist and of the skills required, the role clearly involves a broad and varied skillset, and it is unlikely that a single data scientist will possess every skill in the toolshed.
Build an in-house data science team, upskill or outsource?
Limiting the investment in new skills and capabilities can also be achieved by upskilling existing staff. However, the breadth of the skillset required may make this approach prohibitive. A happy medium can be achieved if contractors are brought in to build intelligent data-driven solutions that can then be operated by individuals who do not need the same level of expertise. By analogy, you do not need to be an experienced software developer to get the most out of your favorite spreadsheet program. An example of this is the assurance scoring platform we have developed at Capgemini for the public sector, which delivers assurance scoring without requiring a data scientist to operate it.
Data science can bring transformative insight to business; this is clear and understood. However, delivering that insight does not depend on having an internal capability. As data science serves to answer questions, the first question to consider is: how much data science do we need to answer our question(s)?