Hadoop error: DiskChecker$DiskErrorException: Invalid volume failure config value
2012-12-17 10:58:59,925 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid volume failure config value: 3
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:1025)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:305)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1606)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1564)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1690)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1707)
On a newly built cluster of a few machines, this error appeared when starting the datanodes.
The root cause is that the dfs.datanode.failed.volumes.tolerated parameter was set to 3.
The parameter's documented meaning: The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown.
In other words, it is the number of failed data volumes a datanode will tolerate. At startup the datanode uses the directories configured under dfs.data.dir (where blocks are stored); if some of them are unusable and their count exceeds this tolerance, startup fails. The constructor also rejects the setting outright when it is negative or not smaller than the number of configured directories, which is the check that fired here. See org.apache.hadoop.hdfs.server.datanode.FSDataset:
public FSDataset(DataStorage storage, Configuration conf) throws IOException {
    this.maxBlocksPerDir = conf.getInt("dfs.datanode.numblocks", 64);
    // The number of volumes required for operation is the total number
    // of volumes minus the number of failed volumes we can tolerate.
    final int volFailuresTolerated =
        conf.getInt("dfs.datanode.failed.volumes.tolerated", 0);
    String[] dataDirs = conf.getTrimmedStrings(DataNode.DATA_DIR_KEY);
    int volsConfigured = (dataDirs == null) ? 0 : dataDirs.length;
    int volsFailed = volsConfigured - storage.getNumStorageDirs();
    validVolsRequired = volsConfigured - volFailuresTolerated;
    if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
        throw new DiskErrorException("Invalid volume failure "
            + " config value: " + volFailuresTolerated);
    }
    // ... (rest of the constructor omitted)
}
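To make the arithmetic concrete: with a single directory in dfs.data.dir, volsConfigured is 1, so a tolerated value of 3 trips the volFailuresTolerated >= volsConfigured guard. Below is a minimal standalone sketch of that same check, with the values hard-coded to match this incident (this is illustrative code, not the Hadoop source):

public class VolumeToleranceCheck {
    public static void main(String[] args) {
        int volsConfigured = 1;        // one directory listed in dfs.data.dir
        int volFailuresTolerated = 3;  // dfs.datanode.failed.volumes.tolerated
        // Same guard as in the FSDataset constructor above: the tolerated
        // count must be non-negative and strictly smaller than the number
        // of configured volumes.
        if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
            System.out.println("Invalid volume failure config value: "
                + volFailuresTolerated);
        }
    }
}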
Since dfs.data.dir listed only one directory, volsConfigured was 1, so a tolerance of 3 could never pass the check. Setting dfs.datanode.failed.volumes.tolerated to 0 resolved the problem.
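For reference, a sketch of the corrected entries in hdfs-site.xml; the property keys match the Hadoop version quoted above, while the data directory path is a placeholder, not the actual path from this cluster:

<property>
  <name>dfs.data.dir</name>
  <value>/data/hdfs/data</value> <!-- placeholder path; only one volume here -->
</property>
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>0</value> <!-- must be >= 0 and < the number of dfs.data.dir entries -->
</property>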