
Wednesday, November 5, 2014

Hadoop Tutorial 1 - What is Hadoop?

One advantage of social networking websites is that they help promote new technologies, even if they do so out of their own need.
 
Apache Hadoop is an open-source software framework for distributed storage and distributed processing of Big Data on clusters of commodity hardware. Its Hadoop Distributed File System (HDFS) splits files into large blocks (64 MB by default in Hadoop 1.x, 128 MB in Hadoop 2.x) and distributes the blocks among the nodes in the cluster. To process the data, Hadoop MapReduce ships code (specifically JAR files) to the nodes that hold the required data, and the nodes then process the data in parallel.
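To make the idea concrete, here is a small plain-Python sketch of the two steps described above: cutting a file into fixed-size blocks, and counting words per block before merging the partial results. It does not use the Hadoop API. The function names (`split_into_blocks`, `map_block`, `reduce_counts`) and the toy data are assumptions made up for illustration, and the real HDFS also replicates blocks and places them with rack awareness, which is not modeled here.

```python
from collections import Counter

# Assumed block size for this sketch: 128 MB (the Hadoop 2.x default).
BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def map_block(text):
    """'Map' step: count the words found in one block of text."""
    return Counter(text.split())


def reduce_counts(partials):
    """'Reduce' step: merge the per-block counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# A 300 MB file is cut into two full blocks and one smaller final block.
blocks = split_into_blocks(300 * 1024 * 1024)

# Toy stand-in for three blocks stored on three different nodes.
node_blocks = ["big data big", "data on hadoop", "hadoop big"]

# Each node would process its own block; here we simply loop over them.
partials = [map_block(block) for block in node_blocks]
totals = reduce_counts(partials)

print(len(blocks), "blocks")
print(dict(totals))
```

In a real cluster the `map_block` step would run on the node that already stores the block, which is the point of shipping the code to the data rather than the data to the code.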