Loop jobs in Kettle: emulating a for loop
2022-04-28 09:40:10
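By combining job entries in Kettle, you can reproduce the behaviour of a for loop in a program and run a given job or transformation repeatedly. The overall flow is as follows:
The JavaScript3 entry in the flow is only used for debugging and can be left out.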
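As a rough picture of what the entries do together, the flow can be sketched in plain JavaScript. This is only an illustration and not Kettle code: `runLoop`, `selectTableNames`, and `processTable` are hypothetical names, and the loop condition `i < size` is an assumption about how the Simple evaluation entry is configured.

```javascript
// Plain-JavaScript sketch of the control flow; nothing here is a Kettle API.
// runLoop, selectTableNames and processTable are hypothetical stand-ins for
// the job, the first transformation (tra) and the looped transformation (tra2).
function runLoop(selectTableNames, processTable) {
  const tables = selectTableNames(); // tra: produce the table names

  // JavaScript entry: initialise the loop variables
  const vars = { size: tables.length, i: 0, TABLENAME: tables[0] };

  const processed = [];
  while (vars.i < vars.size) { // Simple evaluation (assumed condition: i < size)
    processed.push(processTable(vars.TABLENAME)); // tra2: the looped work
    vars.i += 1; // JavaScript2: advance the counter
    if (vars.i < vars.size) {
      vars.TABLENAME = tables[vars.i]; // JavaScript2: update the table name
    }
  }
  return processed;
}

const result = runLoop(
  () => ["table1", "table2"],
  (name) => "processed " + name
);
console.log(result);
```

Each entry in the job corresponds to one commented step in the sketch; in the real job the variables live on the parent job rather than in a local object.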
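- tra transformation: selects the names of the tables to loop over
In practice, this transformation can be changed to select all tables in the database.
- JavaScript: reads the table names from the previous step and stores them in variables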
var prevRow = previous_result.getRows(); // rows passed on by the previous entry
if (prevRow == null || prevRow.size() == 0)
{
    false; // no table names were passed in, so the entry fails
} else {
    parent_job.setVariable("tables", prevRow); // the row list is stored in its string form, e.g. [[table1], [table2]]
    parent_job.setVariable("size", prevRow.size()); // total number of tables to process
    parent_job.setVariable("i", 0); // loop control variable
    parent_job.setVariable("TABLENAME", prevRow.get(0).getString("table_name", "")); // table_name field value of the first row
    var subject = "日志输出:";
    // create the log channel object
    var log = new org.pentaho.di.core.logging.LogChannel(subject);
    // write to the log
    log.logMinimal(prevRow); // print the rows so they are easy to inspect
    true;
}
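- Simple evaluation: the condition check that decides whether the loop runs another iteration
- tra2: the job or transformation to run in each iteration; here it is just a simple log output for testing
- JavaScript2: increments the loop variable i and updates the table-name variable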
var list_Tables = parent_job.getVariable("tables").replace("[", "").replace("]", "").replace(" ", "").split(","); // turn the string form [[table1], [table2]] into the entries table1, table2
var size = new Number(parent_job.getVariable("size"));
var i = new Number(parent_job.getVariable("i")) + 1;
if (i < size) {
    parent_job.setVariable("TABLENAME", list_Tables[i]); // table name for the next iteration
}
parent_job.setVariable("i", i);
true;
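The bracket-stripping step in JavaScript2 can be tried outside Kettle. The sketch below is a stand-alone approximation, and `parseTableList` is a hypothetical helper name. One difference is worth noting: in plain JavaScript, `replace` with a string pattern only replaces the first match, so the sketch uses a regular expression instead. The job script appears to get replace-all behaviour, presumably because the variable value there is a Java string.

```javascript
// parseTableList is a hypothetical helper, not part of Kettle.
// It turns the string form of the row list, e.g. "[[table1], [table2]]",
// into an array of plain table names.
function parseTableList(text) {
  return text
    .replace(/[\[\]\s]/g, "") // drop every bracket and whitespace character
    .split(",")
    .filter((name) => name.length > 0); // "[]" yields no names at all
}

const names = parseTableList("[[table1], [table2]]");
console.log(names);
```

Filtering out empty entries is a small addition over the original script, so that an empty list does not produce a single blank table name.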
The log of a successful run is as follows:
2020/04/24 13:47:23 - Spoon - Starting job...
2020/04/24 13:47:23 - Job_loopReadTableName - Start of job execution
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [tra1]
2020/04/24 13:47:23 - tra1 - Using run configuration [Pentaho local]
2020/04/24 13:47:23 - tra1 - Using legacy execution engine
2020/04/24 13:47:23 - ReadTablename - Dispatching started for transformation [ReadTablename]
2020/04/24 13:47:23 - Data grid.0 - Finished processing (I=0, O=0, R=0, W=2, U=0, E=0)
2020/04/24 13:47:23 - Copy rows to result.0 - Finished processing (I=0, O=0, R=2, W=2, U=0, E=0)
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [JavaScript]
2020/04/24 13:47:23 - 日志输出: - [[table1], [table2]]
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [Simple evaluation]
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [tra2]
2020/04/24 13:47:23 - tra2 - Using run configuration [Pentaho local]
2020/04/24 13:47:23 - tra2 - Using legacy execution engine
2020/04/24 13:47:23 - tra_needToDO - Dispatching started for transformation [tra_needToDO]
2020/04/24 13:47:23 - Get variables.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2020/04/24 13:47:23 - Write to log 2.0 -
2020/04/24 13:47:23 - Write to log 2.0 - ------------> Linenr 1------------------------------
2020/04/24 13:47:23 - Write to log 2.0 - tablesname = [table1]
2020/04/24 13:47:23 - Write to log 2.0 -
2020/04/24 13:47:23 - Write to log 2.0 - ====================
2020/04/24 13:47:23 - Write to log 2.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [JavaScript 2]
2020/04/24 13:47:23 - 日志输出333: - table2
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [Simple evaluation]
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [tra2]
2020/04/24 13:47:23 - tra2 - Using run configuration [Pentaho local]
2020/04/24 13:47:23 - tra2 - Using legacy execution engine
2020/04/24 13:47:23 - tra_needToDO - Dispatching started for transformation [tra_needToDO]
2020/04/24 13:47:23 - Get variables.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2020/04/24 13:47:23 - Write to log 2.0 -
2020/04/24 13:47:23 - Write to log 2.0 - ------------> Linenr 1------------------------------
2020/04/24 13:47:23 - Write to log 2.0 - tablesname = table2
2020/04/24 13:47:23 - Write to log 2.0 -
2020/04/24 13:47:23 - Write to log 2.0 - ====================
2020/04/24 13:47:23 - Write to log 2.0 - Finished processing (I=0, O=0, R=1, W=1, U=0, E=0)
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [JavaScript 2]
2020/04/24 13:47:23 - 日志输出333: - table2
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [Simple evaluation]
2020/04/24 13:47:23 - Job_loopReadTableName - Starting entry [Success]
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [Success] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [Simple evaluation] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [JavaScript 2] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [tra2] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [Simple evaluation] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [JavaScript 2] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [tra2] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [Simple evaluation] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [JavaScript] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Finished job entry [tra1] (result=[true])
2020/04/24 13:47:23 - Job_loopReadTableName - Job execution finished
2020/04/24 13:47:23 - Spoon - Job has ended.