Components of Hadoop

  • Hadoop Distributed File System (HDFS):
  • This is the storage layer of Hadoop; it is written in Java.
  • Data is stored in commodity machines in a distributed manner with multiple copies to ensure reliability.
  • Its architecture is inspired by the Google File system.
  • It is horizontally scalable.
  • The cost of setting up the system is quite low, as commodity machines are cheap and easily available.
  • There are no specific nodes assigned to specific services in a Hadoop cluster.
  • MapReduce:
  • It is a programming model that enables distributed big data processing.
  • It provides many libraries, which are collectively known as MapReduce.
  • It comprises of two phases, the Map phase and the Reduce phase:
  • The tasks performed in the Map phase are called Map tasks.
  • The tasks performed in the Reduce phase are known as Reduce tasks.
  • Both of these tasks have dedicated separate slots reserved for them
  • MapReduce programs are natively written in Java, but can also be written in Python, Ruby and C++.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store



Tech enthusiastic, life explorer, single, motivator, blogger, writer, software engineer