
SKIL/Workflow/Adding Resources

程序员文章站 2024-01-26 15:04:28

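Adding Resources

Before you can run jobs, you must add the external resources they use to SKIL. Before adding a resource, store the relevant credentials file on one of the nodes in the SKIL cluster.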

 

Storing credentials

The following shows the format for storing credentials for each supported resource type.

Note

For HDFS and YARN, no credentials are required because they are configured locally. For YARN, you must set the SPARK_HOME environment variable to point to the Spark root folder.

{
  "accessKey": "<access_key>",
  "secretKey": "<secret_key>"
}
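As a minimal sketch, a credentials file in the accessKey/secretKey format above could be written from Python as follows. The function name write_credentials, the target path, and the owner-only file permissions are assumptions made for illustration; they are not part of SKIL. The returned file:// URI has the same shape as the file:///path/to/credentials.json value used for the credentials URI later in this page.

```python
import json
import os
from pathlib import Path


def write_credentials(path, access_key, secret_key):
    """Write an accessKey/secretKey credentials file and return its file:// URI.

    Hypothetical helper: only the two JSON keys come from the documented format.
    """
    target = Path(path).resolve()
    payload = {"accessKey": access_key, "secretKey": secret_key}
    # Assumed policy: create the file readable and writable by its owner only.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    # as_uri() turns an absolute path into a file:// URI.
    return target.as_uri()
```

The file has to live on a node of the SKIL cluster, so a helper like this would be run there rather than on a client machine.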

 

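Where can I find credentials?
Visit the following links to obtain security credentials for the resources you need:

  1. AWS S3 and EMR
  2. Azure Storage and HDInsight
  3. Google Storage and Cloud DataProc - save this information in a file and provide its path under the serviceaccountfile key, as described in the snippet above.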

 

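Adding resources
After storing the resource credentials, you can add the corresponding resource using any of the following methods:

  1. CLI
  2. REST endpoint
  3. UI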

 

1. CLI

The skil resources command manages resources from the CLI. The following snippets show how to add each type of resource:

AWS S3

skil resources create-s3 --name <resource_name> --credentialUri <credentials_uri> --bucketId <bucket_id> --region <region>

AWS EMR

skil resources create-emr --name <resource_name> --credentialUri <credentials_uri> --clusterId <cluster_id> --region <region>

Google Storage

skil resources create-google-storage --name <resource_name> --credentialUri <credentials_uri> --projectId <project_id> --bucketName <bucket_name>

Google Cloud DataProc

skil resources create-dataproc --name <resource_name> --credentialUri <credentials_uri> --projectId <project_id> --sparkClusterName <spark_cluster_name> --region <region>

Azure Storage

skil resources create-azure-storage --name <resource_name> --credentialUri <credentials_uri> --containerName <container_name>

Azure HDInsight

skil resources create-hdinsight --name <resource_name> --credentialUri <credentials_uri> --subscriptionId <subscription_id> --resourceGroupName <resource_group_name> --clusterName <cluster_name>

HDFS

skil resources create-hdfs --name <resource_name> --credentialUri <credentials_uri> --nameNodeHost <name_node_host> --nameNodePort <name_node_port>

YARN

skil resources create-yarn --name <resource_name> --credentialUri <credentials_uri> --localSparkHome <local_spark_home>
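When resources are added from a script, the commands above can be assembled as argument lists instead of hand-written strings. The sketch below does this for create-s3 only, using the flags shown above; the helper names build_create_s3_command and run_command, and the dry-run default, are assumptions for illustration. It also assumes the skil executable is on the PATH when the command is actually executed.

```python
import shlex
import subprocess


def build_create_s3_command(name, credential_uri, bucket_id, region):
    """Return the argument list for `skil resources create-s3` (flags as documented above)."""
    return [
        "skil", "resources", "create-s3",
        "--name", name,
        "--credentialUri", credential_uri,
        "--bucketId", bucket_id,
        "--region", region,
    ]


def run_command(command, dry_run=True):
    """Return the shell-quoted command line; execute it only when dry_run is False."""
    rendered = shlex.join(command)
    if not dry_run:
        # check=True raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(command, check=True)
    return rendered
```

Passing the arguments as a list avoids shell quoting problems when a resource name or path contains spaces.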

 

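2. REST endpoint

Using a tool such as curl, you can add a resource by sending a POST request to the http://host:port/resource endpoint. The general format for adding a resource through the REST endpoint is: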

curl -d '<resource_request_data>' -H "Authorization: Bearer <auth_token>" -H "Content-Type: application/json" -X POST http://host:port/resource
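The same request can be sketched with Python's standard library. The function names build_add_resource_request and send are hypothetical, and the shape of the response body is not described here, so the sketch simply returns it as text. The URL path, the Bearer authorization header, and the JSON content type mirror the curl command above.

```python
import json
import urllib.request


def build_add_resource_request(base_url, auth_token, resource_request_data):
    """Build (but do not send) the POST request for the /resource endpoint."""
    body = json.dumps(resource_request_data).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/resource",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + auth_token,
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=30):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body before anything reaches the server.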

Note

You can obtain <auth_token> by running the following curl request:

curl -d '{"userId":"<userId>", "password":"<password>"}' -H "Content-Type: application/json" -X POST http://localhost:9008/login

where <userId> and <password> are your SKIL login credentials.
For each type of resource, <resource_request_data> has the following format:

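AWS S3 (as given in the source, this payload is identical to the Azure Storage one; the S3-specific @class, fields, and subType are not shown)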

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.storage.AzureStorageResourceDetails",
    "containerName":"<container_name>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"STORAGE",
  "subType":"AzureStorage",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

AWS EMR  

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.compute.EMRResourceDetails",
    "clusterId":"<cluster_id>",
    "region":"<region>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"COMPUTE",
  "subType":"EMR",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

Google Storage  

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.storage.GoogleStorageResourceDetails",
    "projectId":"<project_id>",
    "bucketName":"<bucket_name>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"STORAGE",
  "subType":"GoogleStorage",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

Google Cloud DataProc

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.compute.DataProcResourceDetails",
    "projectId":"<project_id>",
    "region":"<region>",
    "sparkClusterName":"<spark_cluster_name>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"COMPUTE",
  "subType":"DataProc",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

Azure Storage      

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.storage.AzureStorageResourceDetails",
    "containerName":"<container_name>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"STORAGE",
  "subType":"AzureStorage",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

Azure HDInsight  

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.compute.HDInsightResourceDetails",
    "subscriptionId":"<subscription_id>",
    "resourceGroupName":"<resource_group_name>",
    "clusterName":"<cluster_name>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"COMPUTE",
  "subType":"HDInsight",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

HDFS    

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.storage.HDFSResourceDetails",
    "nameNodeHost":"<name_node_host>",
    "nameNodePort":"<name_node_port>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"STORAGE",
  "subType":"HDFS",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

YARN

{
  "resourceName":"<resource_name>",
  "resourceDetails": {
    "@class":"io.skymind.resource.model.subtypes.compute.YARNResourceDetails",
    "localSparkHome":"<local_spark_home>"
  },
  "credentialUri":"<credentials_uri>", // typically looks like "file:///path/to/credentials.json"
  "type":"COMPUTE",
  "subType":"YARN",
  "credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId

 

Note

If you provide credentialUri, you can omit credentialId from the request, and vice versa.
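The payloads above share one shape, so they can be assembled by a small helper that includes only the credential field that was actually supplied. The following is a sketch under assumptions: the function name build_resource_payload is hypothetical, and rejecting a request that has neither field is a choice made here, not documented SKIL behavior (it may need relaxing for HDFS and YARN, which need no credentials). Unlike the annotated listings above, the resulting dictionary serializes to plain JSON without comments.

```python
import json


def build_resource_payload(name, resource_type, sub_type, details,
                           credential_uri=None, credential_id=None):
    """Assemble a <resource_request_data> dictionary in the shape shown above."""
    if credential_uri is None and credential_id is None:
        # Assumption of this sketch: at least one credential reference is expected.
        raise ValueError("provide credential_uri or credential_id")
    payload = {
        "resourceName": name,
        "resourceDetails": details,
        "type": resource_type,
        "subType": sub_type,
    }
    # Include only the fields that were supplied; the other one is omitted.
    if credential_uri is not None:
        payload["credentialUri"] = credential_uri
    if credential_id is not None:
        payload["credentialId"] = int(credential_id)
    return payload


# Example values for a Google Storage resource; the names are placeholders.
google_storage = build_resource_payload(
    name="my-gcs",
    resource_type="STORAGE",
    sub_type="GoogleStorage",
    details={
        "@class": "io.skymind.resource.model.subtypes.storage.GoogleStorageResourceDetails",
        "projectId": "my-project",
        "bucketName": "my-bucket",
    },
    credential_uri="file:///path/to/credentials.json",
)
request_body = json.dumps(google_storage)
```

The serialized request_body is what would be passed to curl with -d, or sent as the body of the POST request.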

 

3. UI

To open the UI for adding resources, click the gear icon at the top right of the SKIL dashboard and then go to Resources.

Click "Add Resource" to add a resource.

Select the type of resource you want to add.

Now fill in the details, and finally click "Add…" to add the resource.