If Big Data is the Problem, Then Hadoop May be the Answer
Big Data may be the problem, but Hadoop is the answer. Hadoop is an open-source software framework that enables applications to run across large arrays of nodes, accessing petabytes’ worth of data.
It was originally created by Doug Cutting to support the open-source Nutch search engine project, which itself grew out of the Apache Lucene text-search library. Hadoop was named after a toy elephant belonging to Cutting’s son, a fitting image for the big data challenges that lie ahead.
This fall, I have the privilege of moderating “Hadoop Tuesdays,” a series of webcasts on Hadoop, the phenomenon sweeping the data management space.
The first session is on Tuesday, Sept. 1, at 1 p.m. Eastern, and opens with a chat with James Kobielus, a data management guru from Forrester Research. He’ll discuss what Hadoop is and why it’s so important to enterprises today. The event is co-sponsored by Informatica and Cloudera.
As Jim put it in a recent post, “Hadoop is vendor-agnostic in-database analytics in the cloud, leveraging an open, comprehensive, extensible framework for building complex advanced analytics and data management functions for deployment into cloud computing architectures.”
He adds that “Hadoop has already proven its initial footprint in the enterprise data warehousing (EDW) arena: as a petabyte-scalable staging cloud for unstructured content and embedded execution of advanced analytics.” This, he says, is where Hadoop is proving its mettle.
These days, any organization contemplating a Big Data strategy — one that will deliver value through greatly enhanced analytical capabilities — needs to look at Hadoop.