A Detailed Look at Permission Checks During Hive Job Submission

程序员文章站 2024-01-29 09:58:52

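I have been looking into Hue recently and ran into a problem: after writing an HQL statement in the Hive Editor and submitting it, I get a permission error like this: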

Authorization failed:No privilege 'Select' found for inputs {database:xxx, table:xxx, columnName:xxx}. Use show grant to get more details.

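The Hue login user is hadoop. Querying through the CLI works fine, but querying through Hue, which connects to HiveServer2, cannot read the table. To rule out Hue itself, I connected to HiveServer2 with Beeline and got the same permission error, with the stack trace shown in the figure below.

[Figure: stack trace of the authorization failure]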

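Following the stack trace, I traced through the source (only the important parts are listed). Hive's permission check for a submitted SQL statement goes as follows: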

    Driver.compile(String command, boolean resetTaskIds) {
      if (HiveConf.getBoolVar(conf,
          HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED)) {
        try {
          perfLogger.PerfLogBegin(LOG, PerfLogger.DO_AUTHORIZATION);
          // Run the authorization check
          doAuthorization(sem);
        }
        // (exception handling omitted)
      }
    }
    Driver.doAuthorization(BaseSemanticAnalyzer sem) {
      // The operation type is QUERY
      if (op.equals(HiveOperation.CREATETABLE_AS_SELECT)
          || op.equals(HiveOperation.QUERY)) {
        if (cols != null && cols.size() > 0) {
          // Run the more specific, column-level check
          ss.getAuthorizer().authorize(tbl, null, cols,
              op.getInputRequiredPrivileges(), null);
        }
      }
    }
    BitSetCheckedAuthorizationProvider.authorize(Table table, Partition part,
        List<String> columns, Privilege[] inputRequiredPriv,
        Privilege[] outputRequiredPriv) {
      // Check the user's privileges on the DB and the table
      authorizeUserDBAndTable(table, inputRequiredPriv, outputRequiredPriv,
          inputCheck, outputCheck);
      // Check the user's privileges on each column of the table
      for (String col : columns) {
        PrincipalPrivilegeSet partColumnPrivileges = hive_db
            .get_privilege_set(HiveObjectType.COLUMN, table.getDbName(),
                table.getTableName(), partValues, col,
                this.getAuthenticator().getUserName(),
                this.getAuthenticator().getGroupNames());
        authorizePrivileges(partColumnPrivileges, inputRequiredPriv, inputCheck2,
            outputRequiredPriv, outputCheck2);
      }
    }

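Hive's permission check first calls authorizeUserDBAndTable to verify that the user may access the database and table in question; these checks map to the DB_PRIVS and TBL_PRIVS tables in the metastore. During the check, Hive talks to the HiveMetaStore process over Thrift to fetch the relevant rows from the metastore database. If the user holds the required privilege on a coarser-grained resource, the check returns immediately and does not continue to finer granularity. In other words, if the user has the privilege on the database, the table and column privileges are never checked.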
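The coarse-to-fine order described above can be illustrated with a small, self-contained model. This is only a sketch: `PrivilegeStore`, its three maps, and the `user@db.table.column` key format are hypothetical stand-ins for the metastore's privilege tables and are not part of Hive's API.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of the coarse-to-fine check; NOT Hive's implementation.
// The maps are hypothetical stand-ins for DB_PRIVS, TBL_PRIVS and column grants.
class PrivilegeStore {
    private final Map<String, Set<String>> dbPrivs;     // key: user@db
    private final Map<String, Set<String>> tablePrivs;  // key: user@db.table
    private final Map<String, Set<String>> columnPrivs; // key: user@db.table.column

    PrivilegeStore(Map<String, Set<String>> dbPrivs,
                   Map<String, Set<String>> tablePrivs,
                   Map<String, Set<String>> columnPrivs) {
        this.dbPrivs = dbPrivs;
        this.tablePrivs = tablePrivs;
        this.columnPrivs = columnPrivs;
    }

    private static boolean has(Map<String, Set<String>> grants, String key, String priv) {
        return grants.getOrDefault(key, Set.of()).contains(priv);
    }

    boolean authorize(String user, String db, String table,
                      List<String> columns, String priv) {
        // A database-level grant is enough: stop here.
        if (has(dbPrivs, user + "@" + db, priv)) {
            return true;
        }
        // Otherwise a table-level grant is enough.
        if (has(tablePrivs, user + "@" + db + "." + table, priv)) {
            return true;
        }
        // Otherwise every requested column needs its own grant.
        for (String col : columns) {
            if (!has(columnPrivs, user + "@" + db + "." + table + "." + col, priv)) {
                return false;
            }
        }
        return !columns.isEmpty();
    }

    public static void main(String[] args) {
        PrivilegeStore store = new PrivilegeStore(
            Map.of("hadoop@sales", Set.of("Select")), Map.of(), Map.of());
        // hadoop holds Select on the database, so the check passes.
        System.out.println(store.authorize("hadoop", "sales", "orders", List.of("id"), "Select"));
        // hive has no grant at any level, so the check fails.
        System.out.println(store.authorize("hive", "sales", "orders", List.of("id"), "Select"));
    }
}
```

The second call mirrors the failure in this post: the lookup is done for a user that has no grants, even though the user who submitted the query does.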

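I looked at DB_PRIVS: the hadoop user does have Select on the database being queried, which is why access works in the traditional CLI mode. The code above held no surprises either, because CLI mode and HiveServer mode share the same authorization code. So I attached a remote debugger and found that this.getAuthenticator().getUserName() returns hive, the user that started HiveServer2, rather than hadoop, the user who submitted the SQL. Following that lead, I found the code that sets the authenticator's properties: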

    SessionState.start(SessionState startSs) {
        // Instantiates the default HadoopDefaultAuthenticator. Inside this method the class is
        // loaded reflectively via ReflectionUtils, which in turn calls HadoopDefaultAuthenticator.setConf
        startSs.authenticator = HiveUtils.getAuthenticator(startSs.getConf(),
            HiveConf.ConfVars.HIVE_AUTHENTICATOR_MANAGER);
    }
    HadoopDefaultAuthenticator.setConf(Configuration conf) {
        ugi = ShimLoader.getHadoopShims().getUGIForConf(conf);
    }
    HadoopShimsSecure.getUGIForConf(Configuration conf) throws IOException {
        return UserGroupInformation.getCurrentUser();
    }
    
    UserGroupInformation.getCurrentUser() throws IOException {
        AccessControlContext context = AccessController.getContext();
        Subject subject = Subject.getSubject(context);
        // Right after HiveServer starts, subject is null, so getLoginUser is called
        if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
          return getLoginUser();
        } else {
          return new UserGroupInformation(subject);
        }
    }
    UserGroupInformation.getLoginUser() {
        if (loginUser == null) {
          try {
            Subject subject = new Subject();
            LoginContext login;
            if (isSecurityEnabled()) {
              login = newLoginContext(HadoopConfiguration.USER_KERBEROS_CONFIG_NAME,
                  subject, new HadoopConfiguration());
            } else {
              login = newLoginContext(HadoopConfiguration.SIMPLE_CONFIG_NAME,
                  subject, new HadoopConfiguration());
            }
            login.login();
            loginUser = new UserGroupInformation(subject);
            loginUser.setLogin(login);
            loginUser.setAuthenticationMethod(isSecurityEnabled() ?
                                              AuthenticationMethod.KERBEROS :
                                              AuthenticationMethod.SIMPLE);
            loginUser = new UserGroupInformation(login.getSubject());
            String fileLocation = System.getenv(HADOOP_TOKEN_FILE_LOCATION);
            if (fileLocation != null) {
              Credentials cred = Credentials.readTokenStorageFile(
                  new File(fileLocation), conf);
              loginUser.addCredentials(cred);
            }
            loginUser.spawnAutoRenewalThreadForUserCreds();
          } catch (LoginException le) {
            LOG.debug("failure to login", le);
            throw new IOException("failure to login", le);
          }
          if (LOG.isDebugEnabled()) {
            LOG.debug("UGI loginUser:"+loginUser);
          }
        }
        return loginUser;
    }

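When HiveServer first starts and calls getLoginUser() for the first time, loginUser is null, so a LoginContext is created and its login method is called; login eventually calls the commit() method of HadoopLoginModule. The logic of commit() is roughly:

1. If Kerberos is in use, the user is the Kerberos login user.

2. If there is no Kerberos user and security is not enabled, the value of the HADOOP_USER_NAME environment variable is used.

3. If HADOOP_USER_NAME is not set in the environment, the operating-system user is used, that is, the user that started the HiveServer2 process.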
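The three-step fallback above can be sketched as a single function. This is a simplified illustration, not the HadoopLoginModule source: `LoginUserResolver` and `resolve` are hypothetical names, and the Kerberos user is passed in as a plain string rather than read from a login subject.

```java
import java.util.Map;

// Sketch of the fallback order described above; NOT the HadoopLoginModule source.
class LoginUserResolver {
    static String resolve(String kerberosUser, boolean securityEnabled,
                          Map<String, String> env, String osUser) {
        // 1. A Kerberos login user wins when one is present.
        if (kerberosUser != null) {
            return kerberosUser;
        }
        // 2. Without security, honor HADOOP_USER_NAME from the environment.
        if (!securityEnabled) {
            String fromEnv = env.get("HADOOP_USER_NAME");
            if (fromEnv != null) {
                return fromEnv;
            }
        }
        // 3. Fall back to the OS user that started the process.
        return osUser;
    }

    public static void main(String[] args) {
        String user = resolve(null, false, System.getenv(),
                              System.getProperty("user.name"));
        System.out.println("login user: " + user);
    }
}
```

Run inside a process started by the hive OS account with no HADOOP_USER_NAME set, step 3 applies, which matches the value seen in the debugger.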

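From then on, the user in effect is the one that started HiveServer2, so the authenticator's UserName is hive. When Hive looks up the metastore privilege tables as hive, it finds no matching grants and authorization fails. One fix is to grant the hive user the required privileges, but that makes the permission check meaningless. A better approach is to implement your own hive.security.authenticator.manager so that authorization is performed as the user who submitted the SQL.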
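The idea behind such a custom authenticator can be shown with a self-contained sketch. Everything here is an assumption made for illustration: `UserAuthenticator` is a hypothetical interface that only mirrors the two methods the code above calls (getUserName and getGroupNames). A real implementation would have to implement Hive's own authenticator interface for the Hive version in use, and would obtain the session user from HiveServer2 rather than from a constructor argument.

```java
import java.util.List;

// Illustration only: hypothetical types, not Hive classes.
class AuthenticatorSketch {
    // Stand-in exposing the two methods the authorization code calls.
    interface UserAuthenticator {
        String getUserName();
        List<String> getGroupNames();
    }

    // Default-like behavior: reports the user that owns the server process.
    static class ProcessUserAuthenticator implements UserAuthenticator {
        public String getUserName() { return System.getProperty("user.name"); }
        public List<String> getGroupNames() { return List.of(); }
    }

    // Desired behavior: reports the user who opened the session.
    static class SessionUserAuthenticator implements UserAuthenticator {
        private final String sessionUser;
        private final List<String> groups;

        SessionUserAuthenticator(String sessionUser, List<String> groups) {
            this.sessionUser = sessionUser;
            this.groups = List.copyOf(groups);
        }

        public String getUserName() { return sessionUser; }
        public List<String> getGroupNames() { return groups; }
    }

    public static void main(String[] args) {
        UserAuthenticator perProcess = new ProcessUserAuthenticator();
        UserAuthenticator perSession =
            new SessionUserAuthenticator("hadoop", List.of("hadoop"));
        System.out.println("process user: " + perProcess.getUserName());
        System.out.println("session user: " + perSession.getUserName());
    }
}
```

With the session-based variant, the privilege lookup is keyed on hadoop, the submitting user, so the existing grant in DB_PRIVS would be found.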
