Hadoop won top prize at the Media Guardian Innovation Awards in 2011 and was described as “The Swiss Army Knife of the 21st Century.”
Storing & Processing Up To 25 Petabytes Of Data Reliably
We’ve all heard of “big data.” Every day sees the creation of huge amounts of new data, 80% of which is unstructured, meaning the information doesn’t come in neat, Excel-style spreadsheets whose named rows and columns help a user quickly understand the data.
Unstructured data is like a newspaper article or book in which the only structure available is the grammatical makeup of sentences. Algorithms must be used to help identify key concepts, their relationships and their frequency, along with other variables, in order to assign meaning to the information that is intelligible to humans.
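As a rough illustration of what such an algorithm might do, here is a minimal Python sketch that counts how often terms appear in a piece of text and how often pairs of terms share a sentence. It is only a toy under stated assumptions: the stop-word list, the function names `tokenize` and `analyze`, and the sample text are all made up for this example, and real text-analysis systems are far more sophisticated.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on"}


def tokenize(sentence):
    """Lowercase a sentence and return its words, minus stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def analyze(text):
    """Return term frequencies and per-sentence co-occurrence counts."""
    term_counts = Counter()
    pair_counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        terms = tokenize(sentence)
        term_counts.update(terms)
        # Each unordered pair of distinct terms counts once per sentence.
        for pair in combinations(sorted(set(terms)), 2):
            pair_counts[pair] += 1
    return term_counts, pair_counts


# Made-up sample text, used only to exercise the functions above.
sample = "Hadoop stores data. Hadoop processes data in batches. Data is growing."
terms, pairs = analyze(sample)
print(terms.most_common(2))   # the two most frequent terms
print(pairs.most_common(1))   # the pair that shares a sentence most often
```

Here frequency stands in for "key concepts" and sentence-level co-occurrence stands in for "relationships." Both are deliberately crude proxies, chosen to keep the sketch short.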
Hadoop, named after a toy elephant owned by the son of Cloudera’s Doug Cutting, is open source software. Cutting became involved in creating Hadoop as data on the Internet exploded and overwhelmed traditional operating systems.
The software batch processes very large data sets, up to 25 petabytes, and has been embraced by mainstream corporations to the tune of $14 billion annually, which includes $9 billion for the platform and $5 billion for Hadoop analytic tools. Users can install multiple instances and run them on 4,500 computers simultaneously.
Hadoop is an open source project governed by the Apache Software Foundation.
Components of the “Ecosystem”
Note: full explanations of each term are available from Hortonworks.
- Ambari – manages and monitors Apache Hadoop clusters
- HDFS – Hadoop Distributed File System
- Hadoop MapReduce – framework for batch processing large data sets in parallel across a cluster
These components combine to provide an easy-to-use dashboard for viewing cluster health and heat maps, as well as for monitoring and diagnosing performance issues.
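To make the MapReduce component listed above a little more concrete, the following sketch imitates its map, shuffle and reduce steps in plain Python on a single machine. It does not use Hadoop's actual API (which is Java-based). The function names `map_phase`, `shuffle_phase`, `reduce_phase` and `run_job`, and the two tiny input "splits", are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def run_job(splits):
    """Run map, shuffle and reduce in sequence over all input splits."""
    mapped = []
    # On a real cluster each split would be handled by a different machine;
    # here the splits are simply processed one after another.
    for split in splits:
        for line in split:
            mapped.extend(map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Two made-up input splits standing in for blocks of a large file.
splits = [
    ["big data big clusters"],
    ["data nodes store data"],
]
counts = run_job(splits)
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the work can in principle be spread across many machines, which is the property that lets this style of batch processing scale.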
Cloudera, along with Hortonworks and MapR, leads the industry in providing Apache Hadoop services. The annual Hadoop Summit was held in April 2015. The North American Hadoop Summit was held in June 2014. See the Opening Video.
For more on Hadoop, visit the website.