Hadoop: Is It Soup Yet?

Most Hadoop-related inquiries from Forrester clients come to me. These have moved well beyond the “What exactly is Hadoop?” phase to the stage where the dominant query is “Which vendors offer robust Hadoop solutions?”

What I tell Forrester clients is that, yes, Hadoop is real, but that it’s still quite immature. On the “real” side, Hadoop has already been adopted by many companies for extremely scalable analytics in the cloud. On the “immature” side, Hadoop is not ready for broader deployment in enterprise data analytics environments until the following things happen:

  • More enterprise data warehousing (EDW) vendors adopt Hadoop. Of the vendors in my recent Forrester Wave™ for EDW platforms, only IBM and EMC Greenplum have incorporated Hadoop into the core of their solution portfolios. Other leading EDW vendors interface with Hadoop only partially and only at arm’s length. We strongly believe that Hadoop is the nucleus of the next-generation cloud EDW, but that promise is still three to five years from fruition. Most EDW vendors will probably embrace Hadoop more fully in the coming year, most likely through strategic acquisitions.
  • Early implementers converge on a core Hadoop stack. The companies I’ve interviewed as case studies indicate that the only common element in Hadoop deployments is the use of MapReduce as the modeling abstraction layer. We can’t say Hadoop is ready-to-serve soup until we all agree to swirl some common ingredients into the bubbling broth of every deployment. And the industry should clarify the reference framework within which new Hadoop specs are developed.
  • The Apache community submits Hadoop to a formal standardization process. In the middle of the past decade, the service-oriented architecture (SOA) world didn’t begin to mature until industry groups such as OASIS and WS-I stabilized a core group of specs such as WSDL, SOAP, and the like. The “Big Data” world badly needs a similar push to standardize the core of Hadoop so that vendors and users can have assured cross-platform interoperability. As it is, silos reign in the Hadoop world, with most users rolling their own bespoke deployments and most vendors forking the Apache code base to suit their needs.

Catch me at the Hadoop Summit in Santa Clara later this month. Hadoop isn’t soup yet, but I look forward to dipping my spoon into some tasty concoctions simmering on the industry’s front burner.

Comments

Marketing Evolution

While the paradigm for creating new software has evolved from primarily silo-driven R&D departments to a broader collaborative effort, the biggest drawback is that software marketing has not evolved. If you want Hadoop community editions to be more popular, some standardization is necessary for the corporate decision makers, and we need better marketing paradigms. While code creation is crowdsourced, solution implementation cannot be crowdsourced. Just as open source threatens to disrupt enterprise software, it will lead to newer ways to market software, given the hostility of the existing status quo.

Standardized Hadoop packaging

Exactly. Just as the EDW appliance arena has long had "high-capacity" vs. "high-performance" packaging options, the commercialized Hadoop vendors should provide, for example, a high-cap option (e.g., HDFS) vs. a hi-perf option (e.g., HBase).

Apache Hadoop is an Apache project

We're certainly glad that Apache Hadoop is enabling much of the new Big Data world of processing, and that many vendors are building such great solutions atop Hadoop software. But I hope that people are also willing to give the appropriate credit to the independent Apache Hadoop PMC that runs the project, as well as to the ASF that hosts so many Apache projects that are used so widely in the computing world.

We also appreciate it when vendors remember to attribute our trademarks, including Apache, Apache Hadoop, Hadoop, and the cute elephant logo.

Trademarks attribution

You mean like this? http://www.mapr.com/
Who shot the baby elephant? What do you guys do: do you send a cease and desist? It's a small thing for such awesome software.