In this article, we will learn how to enable rack awareness in Hadoop clusters. Assume the cluster has a large number of nodes placed in more than one rack. With rack awareness enabled, HDFS does not store all replicas of a block in a single rack, so at least one replica remains available for data processing if a rack fails.
The goal of rack awareness is to improve data availability and reduce network bandwidth usage between racks.
1) Enabling rack awareness without Apache Ambari
In older versions of HDP, rack awareness had to be enabled manually. Recent versions of Apache Ambari support rack awareness in the GUI.
See the linked article for how to enable rack awareness manually. You will rarely need it, because most recent versions of Apache Ambari support this in the GUI.
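For context, the manual approach generally relies on a topology script that Hadoop calls through the net.topology.script.file.name property in core-site.xml. Hadoop passes one or more IP addresses or hostnames as arguments and reads one rack path per argument from standard output. The sketch below is only an illustration in Python, not the script from the linked article; the IP addresses and the RACK_MAP table are hypothetical and must be replaced with your own.

```python
#!/usr/bin/env python3
"""Minimal topology script sketch for Hadoop rack awareness."""
import sys

# Hypothetical host-to-rack table; replace with your own addresses.
RACK_MAP = {
    "192.168.1.11": "/default-rack",
    "192.168.1.12": "/default-rack",
    "192.168.1.13": "/rack-1",
}

# Rack returned for any host that is not listed in the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts, rack_map=RACK_MAP):
    """Return one rack path per host, falling back to DEFAULT_RACK."""
    return [rack_map.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts to resolve as command-line arguments
    # and reads the rack paths from standard output.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

A script like this has to be executable by the user that runs the NameNode. Unknown hosts fall back to /default-rack here so that a missing entry does not stop the lookup.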
2) Enabling rack awareness using Apache Ambari
Now we will see how to enable rack awareness using Apache Ambari. We have a five-node cluster, and by default all nodes are in default-rack.
Now we will modify the rack for datanode3.
Go to Hosts in Ambari --> click the host whose rack you want to modify --> go to Host Actions --> click Set Rack
Change the rack name to rack-1 and click OK.
Go back to the Hosts page in Ambari to confirm that the rack name for datanode3 has changed.
You can see that the nodes are now placed in two different racks: default-rack and rack-1.
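If you need to script the same change instead of clicking through the GUI, Ambari also exposes a REST API. The sketch below only builds the request and does not send it. It assumes that the host resource accepts a PUT with a Hosts/rack_info field and that the rack is written with a leading slash; please verify both against the API documentation for your Ambari version. The server URL, cluster name, and credentials are hypothetical.

```python
import base64
import json
import urllib.request


def build_set_rack_request(ambari_url, cluster, host, rack, user, password):
    """Build (but do not send) a PUT request that sets a host's rack."""
    url = f"{ambari_url}/api/v1/clusters/{cluster}/hosts/{host}"
    # Assumption: the rack is stored in the Hosts/rack_info field.
    body = json.dumps({"Hosts": {"rack_info": rack}}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        # Ambari expects this header on requests that modify state.
        "X-Requested-By": "ambari",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical server, cluster name, and credentials.
request = build_set_rack_request(
    "http://ambari.example.com:8080", "mycluster", "datanode3",
    "/rack-1", "admin", "admin",
)
print(request.get_method(), request.full_url)
# To actually apply the change: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before touching a live cluster.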
3) Confirm rack awareness is enabled
We can also confirm this with the hdfs fsck command and with the hdfs dfsadmin -report command.
The picture below shows the output of the command hdfs fsck /, which reports that the number of racks is 2.
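The same check can be automated. The sketch below is a minimal example that looks for the "Number of racks:" line in the fsck summary. The helper names are hypothetical, and the sample text is an illustrative fragment rather than real output from this cluster.

```python
import re
import subprocess


def count_racks(fsck_output):
    """Return the integer after 'Number of racks:' or None if absent."""
    match = re.search(r"Number of racks:\s*(\d+)", fsck_output)
    return int(match.group(1)) if match else None


def run_fsck(path="/"):
    """Run 'hdfs fsck <path>' and return its standard output as text."""
    result = subprocess.run(
        ["hdfs", "fsck", path], capture_output=True, text=True
    )
    return result.stdout


# Illustrative fragment of an fsck summary, not real cluster output.
sample = "Number of data-nodes:\t\t5\nNumber of racks:\t\t2\n"
print(count_racks(sample))
```

On a node with the HDFS client installed, count_racks(run_fsck("/")) should return 2 for the cluster described in this article.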
Let me know if you have any questions about this article.