Thursday, November 21, 2013

Data Science and, of course, Data Scientists

With the explosion of Big Data and unstructured data analysis, a new job role called data scientist has become common in many organizations.
The phenomenon is no different from when Zachman wrote a white paper on the concept of "Enterprise Architecture", and now there are millions of enterprise architects in the IT landscape.
What I find interesting is that the art of data science has been in practice for a very long time. Remember SAS, the company from that little technology corridor in the Raleigh-Durham-Cary area? They have been analyzing large sets of data for a very long time: spatial, geographic, environmental, weather, census, retail and so on.
But they never called it Big Data, nor their analysts Data Scientists.
I see nothing wrong with this Big Data movement, but I take issue with a few technology companies that are pushing the marketing spin so hard that it is beginning to dilute the value.
Take, for example, EMC, NetApp, Oracle and IBM on one side, and Amazon, SAS and, say, Pentaho on the other. These two classes have very different agendas. Both perspectives are needed, but the value of their offerings is drastically different. The first class is about spinning Big Data and using it to sell compute, storage and databases. The second class is about providing real analytics solutions to solve business problems.
As we all know, the concept of Hadoop/HDFS, the spirit behind the Google white paper and the concept of Bigtable are all about splitting workloads into smaller, manageable chunks and moving compute to where the storage resides, for organizations that cannot afford big giant boxes to begin with.
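To make the "split the workload, move compute to the data" idea concrete, here is a minimal sketch in Python. It is only an in-memory illustration on a single machine, not the Hadoop API: the function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) and the sample data are made up for this example, and each chunk merely stands in for a block of data that would live on its own node.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, chunk_size):
    """Split a list of records into smaller chunks (stand-ins for data blocks)."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Count words within one chunk only.

    Assumption for illustration: on a real cluster this step would run
    on the node that already stores the chunk.
    """
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Merge the small per-chunk results into one overall count."""
    return reduce(lambda left, right: left + right, partials, Counter())


def word_count(records, chunk_size=2):
    """Run the split, map and reduce steps in sequence."""
    chunks = split_into_chunks(records, chunk_size)
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce_counts(partials)


if __name__ == "__main__":
    lines = ["big data", "data science", "big giant boxes", "data"]
    print(word_count(lines))
```

The point of the sketch is that each chunk is processed on its own and only the small partial counts get combined, which is why a pile of modest machines can stand in for one giant box.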

Now, how does Oracle even position a product like the Big Data Appliance, a giant engineered compute system that is not affordable to begin with?

Is it not counterintuitive?

By the same token, companies like Amazon offering EMR, and SAS and Pentaho offering a Big Data Hadoop option alongside their core products, make a whole lot of sense. They are in the data mining and churning business, and now that they have HDFS, HBase, MapReduce and a plethora of tools to go on top, their lives are becoming easier.

Bottom line: as an Architect, be prepared to differentiate between Spin vs. Value, Fluff vs. Stuff.



Sada Rajagopalan
