Big Data requires smart data scientists

Review of: Big Data at Work
Product by:
Thomas Davenport

Reviewed by:
On 14 February, 2014
Last modified: 14 February, 2014


Big Data benefits some, but for most it remains an elusive concept. This book contributes more generalities than technical substance.

Big data, at least today, requires some educated faith. ROI is difficult to define in advance—particularly when it involves new products and services or faster decisions, according to Thomas Davenport in his book Big Data at Work: Dispelling the Myths, Uncovering the Opportunities. Nonetheless, some businesses are getting significant benefits from employing data scientists to work on Big Data, so it definitely seems to be something worth investigating.

Although Big Data is not precisely defined, the author describes its typical characteristics: unstructured formats, volumes greater than 100 terabytes, data that exists in a constant flow rather than a static pool, analysis by machine learning rather than by hypothesis, and use in data-based products rather than internal decision support. These are trends rather than absolutes, as Big Data includes more conventional types of data as well.

The key to deriving maximum advantage from Big Data seems to involve employing the smartest data scientists to analyse the data. Good data scientists are likely to be rare and expensive, given the ideal traits described by the author:

  • Understanding of big data technology architectures and coding
  • Improvisation, evidence-based decision making and action orientation
  • Strong communication and relationship skills, particularly in dealing with senior management
  • High level skills in statistics, visual analytics, machine learning, and analysis of unstructured data
  • Good business sense and focus on commercial value

The book assiduously avoids technical language, and as a result it leaves unanswered some of the questions it raises in readers’ minds. For example, the author refers frequently to Hadoop as a preferred technology platform for Big Data, but never really explains how it differs from SQL databases, apart from the fact that it caters for unstructured data (but how?).

The book describes some large businesses, such as banks, which are making use of Big Data, and some small businesses which are analysing Big Data and using it to create and sell useful information. It never really answers the question of how a normal business which does not have internal Big Data can get some, or how it could benefit from it. The only answer offered is to hire really smart data scientists and hope that they can think of a way to use Big Data to reduce costs, speed up business processes, or come up with new products or services.

I found more useful ideas for the use of Big Data in Christopher Surdak’s book Data Crush, but this book does provide some interesting insights, particularly into the human elements of Big Data.

