Hadoop: “The Swiss Army Knife” of the 21st Century

By: | December 20th, 2014

Hadoop won top prize at the Media Guardian Innovation Awards in 2011 and was described as “The Swiss Army Knife of the 21st Century.”

Storing & Processing Up To 25 Petabytes Of Data Reliably

We’ve all heard of “big data”: every day sees the creation of huge amounts of new data, 80% of which is unstructured, meaning the information doesn’t come in neat, Excel-style spreadsheets whose named rows and columns help a user quickly understand the data.

Unstructured data is like a newspaper article or book in which the only structure available is the grammatical makeup of sentences. Algorithms must be used to help identify key concepts, their relationships and their frequency, along with other variables, in order to assign meaning to the information that is intelligible to humans.
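To make the idea of pulling concepts, relationships and frequency out of plain text more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product does it: the stop-word list, the function names `key_terms` and `analyze`, and the rule that two terms are "related" when they share a sentence are all assumptions made for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "to", "by"}

def key_terms(sentence):
    """Lowercase a sentence, split it into words, and drop stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]

def analyze(text):
    """Return (term frequencies, co-occurrence counts) for a block of text."""
    frequency = Counter()
    related = Counter()
    for sentence in re.split(r"[.!?]+", text):
        terms = key_terms(sentence)
        frequency.update(terms)
        # Assumption: two distinct terms in one sentence count as "related".
        for pair in combinations(sorted(set(terms)), 2):
            related[pair] += 1
    return frequency, related

text = "Hadoop stores data. Hadoop processes data in batches."
frequency, related = analyze(text)
print(frequency.most_common(2))      # the two most frequent terms
print(related[("data", "hadoop")])   # sentences mentioning both terms
```

Even this toy version shows why the work is expensive at scale: every sentence has to be read, split and counted before any meaning can be assigned.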

Hadoop, named after a toy elephant owned by the son of Cloudera’s Doug Cutting, is open source software. Cutting became involved in creating Hadoop as data on the Internet exploded and overwhelmed traditional operating systems.

The software batch processes very large data sets, up to 25 petabytes, and has been embraced by mainstream corporations to the tune of $14 billion annually, which includes $9 billion for the platform and $5 billion for Hadoop analytic tools. Users can install multiple instances and run Hadoop on 4,500 computers simultaneously.
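The batch model Hadoop is best known for is MapReduce: the input is cut into splits, each split is processed independently, and the partial results are then grouped and combined. The sketch below imitates those three phases in plain Python on a single machine. It is not Hadoop's actual API (which is Java-based), and the function names and sample splits are hypothetical, chosen only to show the flow.

```python
from collections import defaultdict

def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]

def shuffle_phase(mapped_splits):
    """Shuffle: group all emitted values by key across every split."""
    groups = defaultdict(list)
    for pairs in mapped_splits:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical input splits; on a cluster each would sit on a different node.
splits = ["big data big clusters", "data in batches", "big batches"]

# Here the map calls run one after another; a cluster would run them in parallel.
mapped = [map_phase(s) for s in splits]
word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```

Because each call to `map_phase` looks only at its own split, the same job can in principle be spread over as many machines as there are splits, which is what lets the approach scale to thousands of computers.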

Hadoop is an open source project governed by the Apache Software Foundation.

Components of the “Ecosystem”

Note: see full explanations of each term at: Hortonworks

  1. Ambari – manages and monitors Apache Hadoop clusters
  2. HDFS – Hadoop Distributed File System
  3. Hadoop MapReduce
  4. Hive
  5. HCatalog
  6. HBase
  7. ZooKeeper
  8. Oozie
  9. Pig
  10. Sqoop

These components combine to provide an easy-to-use dashboard for viewing cluster health and heat maps, as well as for monitoring and diagnosing performance issues.

Cloudera, along with Hortonworks and MapR, leads the industry in providing Apache Hadoop services. The annual Hadoop Summit was held in April 2015. The North American Hadoop Summit was held in June 2014. See the Opening Video.

For more on Hadoop, visit the website.

David Russell Schilling

David enjoys writing about high technology and its potential to make life better for all who inhabit planet earth.
