I'm trying to get a better understanding of how Hadoop works, and I'm reading:
The NameNode is a Single Point of Failure for the HDFS Cluster. HDFS is not currently a High Availability system. When the NameNode goes down, the file system goes offline. There is an optional SecondaryNameNode that can be hosted on a separate machine. It only creates checkpoints of the namespace by merging the edits file into the fsimage file and does not provide any real redundancy. Hadoop 0.21+ has a BackupNameNode that is part of a plan to have an HA name service, but it needs active contributions from the people who want it (i.e. you) to make it Highly Available.
from the link.
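To check my understanding of the checkpointing that the quote describes, here is a minimal sketch of how I picture it. This is not Hadoop's actual code: the names `apply_edit` and `checkpoint`, the in-memory set standing in for the fsimage, and the tuple format of the edits are all my own assumptions for illustration.

```python
# Toy model of a checkpoint, NOT Hadoop's implementation.
# Assumption: the fsimage is modelled as a set of paths, and the edits
# file as an ordered list of (operation, path) tuples.

def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage, edits):
    """Merge the edits into a copy of the fsimage.

    Returns the new fsimage and an empty edits log. The inputs are left
    unmodified.
    """
    namespace = set(fsimage)
    for edit in edits:  # replay in the order the edits were logged
        apply_edit(namespace, edit)
    return namespace, []


fsimage = {"/a", "/b"}
edits = [("create", "/c"), ("delete", "/a")]
new_fsimage, new_edits = checkpoint(fsimage, edits)
print(sorted(new_fsimage))
```

If this picture is right, a checkpoint only compacts the log into a fresh snapshot. Nothing in it serves client requests or takes over when the NameNode dies, which would match the quote's statement that it provides no real redundancy.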
So why is the NameNode a single point of failure? What is bad or difficult about having a complete duplicate of the NameNode running as well?