
hadoop error: DiskChecker$DiskErrorException: Invalid volume failure config value

2012-12-17 10:58:59,925 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid
volume failure  config value: 3
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:1025)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:305)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1606)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1564)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1690)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1707)

 

After setting up a new cluster of a few machines, this error was thrown when starting the datanodes.

The root cause is that the dfs.datanode.failed.volumes.tolerated parameter was set to 3.

The parameter's documented meaning: "The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shutdown."

 

In other words, it is the number of failed disks a datanode will tolerate. At startup the datanode uses the directories configured under dfs.data.dir (where blocks are stored); if some of them are unusable and their number exceeds the configured tolerance, startup fails. The validation also requires the tolerance itself to be non-negative and strictly less than the number of configured volumes, and that is the check firing here. The code is in org.apache.hadoop.hdfs.server.datanode.FSDataset:

 

public FSDataset(DataStorage storage, Configuration conf) throws IOException {
    this.maxBlocksPerDir = conf.getInt("dfs.datanode.numblocks", 64);
    
    // The number of volumes required for operation is the total number 
    // of volumes minus the number of failed volumes we can tolerate.
    final int volFailuresTolerated =
      conf.getInt("dfs.datanode.failed.volumes.tolerated", 0);
    String[] dataDirs = conf.getTrimmedStrings(DataNode.DATA_DIR_KEY);
    int volsConfigured = (dataDirs == null) ? 0 : dataDirs.length;
    int volsFailed = volsConfigured - storage.getNumStorageDirs();
    validVolsRequired = volsConfigured - volFailuresTolerated;

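    // Reject the config when the tolerated count is negative or is not
    // strictly less than the number of configured volumes; with a single
    // dfs.data.dir entry, a tolerance of 3 trips this check.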
    if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
      throw new DiskErrorException("Invalid volume failure "
          + " config value: " + volFailuresTolerated);
    }
    // ... (remainder of the constructor omitted)
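
To make the check concrete, here is a minimal standalone sketch (a hypothetical class, not Hadoop source) that reproduces the same validation with this cluster's values: one configured data directory and a tolerance of 3.

public class VolumeToleranceCheck {
    public static void main(String[] args) {
        int volsConfigured = 1;        // one entry in dfs.data.dir
        int volFailuresTolerated = 3;  // dfs.datanode.failed.volumes.tolerated

        // Same condition as in FSDataset: the tolerance must lie in [0, volsConfigured).
        if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
            // Tolerating 3 failed volumes when only 1 volume is configured is
            // contradictory, so the datanode refuses to start.
            System.out.println("Invalid volume failure  config value: "
                + volFailuresTolerated);
        } else {
            int validVolsRequired = volsConfigured - volFailuresTolerated;
            System.out.println("Config OK; volumes required: " + validVolsRequired);
        }
    }
}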

 

Since dfs.data.dir was configured with only one directory, setting dfs.datanode.failed.volumes.tolerated to 0 resolved the problem.
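For reference, the corrected property in hdfs-site.xml would look roughly like this (a sketch; if dfs.data.dir later lists more directories, the value can be raised, but it must stay strictly below the number of configured directories):

<!-- hdfs-site.xml -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <!-- Must be >= 0 and strictly less than the number of dfs.data.dir
       entries; with a single data directory, 0 is the only valid value. -->
  <value>0</value>
</property>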

 
