【oracle教程】Designing ETL Data Flow Mappings【OWB 11g Rele
Designing ETL Data Flow Mappings Purpose This lesson describes how to design mappings and process flows, and use them to define data movement and transformation in Warehouse Builder. When you design a mapping in Warehouse Builder, you use
Designing ETL Data Flow Mappings
Purpose
This lesson describes how to design mappings and process flows, and use them to define data movement and transformation in Warehouse Builder. When you design a mapping in Warehouse Builder, you use the Mapping Editor interface. There is a large set of mapping operators available to you for use within the Mapping Editor. Oracle Warehouse Builder allows you to use activities external to Warehouse Builder (such as e-mail, FTP commands, and operating system executables). External activities will not be covered in this course. Using the Process Flow Editor, you learn how to design process flows that interrelate mappings and other activities.
In this lesson, you learn how to create mappings to extract data from sources, transform it, and load it into targets. It also briefly introduces the Debugging Editor to debug data flows in the Mapping Editor.
Time to Complete:
Approximately 60 minutes
Note: This tutorial and its setup scripts support OWB 11g Release 1, only. Earlier versions of this Oracle By Example tutorial are available for OWB 10g Releases 1 and 2.
Topics
This tutorial covers the following topics:
Overview | ||
Prerequisites | ||
Designing Mappings | ||
Designing Process Flows |
||
Testing and Debugging Mappings | ||
Summary |
Viewing Screenshots
Place the cursor over this icon to load and view all the screenshots for this tutorial. (Caution: Because this action loads all screenshots simultaneously, response time may be slow depending on your Internet connection.)
Note: Alternatively, you can place the cursor over each individual icon in the following steps to load and view only the screenshot associated with that step.
Overview
Now that you have defined source and target modules, you learn how to create the data movement and transformation logic. ETL stands for Extract, Transform, and Load. ETL involves the movement and transformation of data from your sources to your targets. The majority of the design work you do in Oracle Warehouse Builder is defining the ETL logic.Back to Topic List
Prerequisites
Before starting this tutorial, you should have:
1. |
Downloaded the owbdemo_files.zipfile and extracted the files to your computer's hard drive.
|
2. |
Completed Lessons 1, 2, and 3.
|
3. |
Installed Oracle Workflow Server 2.6.4 as explained in the note below, and in step 4. Note: If you are using an Oracle Database 10g Release 2 (10.2.0.3)instance with OWB 11g, install Workflow Server 2.6.4 from the Warehouse Builder 11g installation's OWB 11g Home>\owb\wf\install directory, and invoke wfinstall.csh for Unix, or wfinstall.bat for Windows platforms. You must install Workflow Server in your Oracle database home (not in OWB's home or any other home). If you are using an Oracle Database 11g
instance with OWB 11g, you need to install a version of Workflow Server 2.6.4 with changes for Database 11g. Go to the Database 11g installation's Oracle11g Database Home>\owb\wf\install directory, and invoke wfinstall.csh
for Unix, or wfinstall.bat for Windows platforms. You must install Workflow Server in your Oracledatabase home (not in OWB's home or any other home). |
4. |
To start the installation and configuration script on Windows: Enter the following values:
When you have entered all necessary values, click Submitto start the Workflow configuration process.
When the process is complete, the Workflow Configuration Assistant displays a dialog box with the message "Workflow Configuration has completed successfully". ClickOK . The installation script now exits. Finally, you need to grant the "EXECUTE ANY PROCEDURE" privilege to theowf_mgr user. Connect to SQL Plus as "SYS as sysdba", and issue the command:grant execute any procedure to owf_mgr |
Back to Topic List
Designing Mappings
Mappings extract data from the source, transform the data, and load the data into the target module (EXPENSE_WH), which you defined in the previous lesson.
In this topic, you learn to:
Examine a Predefined Mapping between a Relational Table and a Dimension | ||
Design a Mapping between an External Table and a Dimension | ||
Design a Mapping between an External Table and a Cube |
Examine a Predefined Mapping Between a Relational Table and a Dimension
In this subtopic, you examine the predefined LOAD_CHANNELS mapping between the CHANNELS relational table and the CHANNELS dimension. This example uses a different target module, SALES_WH, and sources and targets different from those you have defined earlier. To become familiar with the Mapping Editor, perform the following steps:
1. |
In the Project Explorer panel of the Design Center, expand OWB_DEMO. ExpandDatabases > Oracle > SALES_WH > Mappings. Double-click LOAD_CHANNELS. The Mapping Editor launches the LOAD_CHANNELS mapping.
|
2. |
You design a mapping in Warehouse Builder using the Mapping Editor interface. As you design a mapping, you select operators from the Mapping Editor Palette panel and place them on the canvas. Tip: If you are not able to see all the objects similar to the following screenshot, then in the Mapping Editor, click Auto-Layout on the toolbar to bring the mapping to its default size. Maximize the operators on the canvas by clicking the maximize icon . In the Mapping Editor, you see the following operators on the canvas: a) A Table operator named CHANNELS_IN b) A Constant operator named TOTALS c) A Dimension operator named CHANNELS_OUT d) Connection lines between attributes
|
3. |
In the LOAD_CHANNELS mapping, click the table operator CHANNELS_IN header. This now becomes the currently selected object. Right-click the table operatorCHANNELS_IN header and select Open Details. This launches the Table Editor in tab format. Each tab helps in performing tasks associated with the operator. For example, the Name tab enables you to specify a name and optional description for the operator. Click Cancel to close Table Editor.
|
4. |
In the LOAD_CHANNELS mapping, ensure that the CHANNELS_IN table operator is selected. (Click the table operator CHANNELS_IN header.) To determine the operator's bound name, scroll down to the Bound Name properties in the Table Operator Properties panel. This operator is bound to the CHANNELS table in the XSALES schema. Tip: Alternatively, move your mouse over the header of the operator. If the operator is currently bound, you will see the complete details in the following format: Operator Note: Bound name is the physical name of the object that is connected to an object operator in a mapping. Bound names are used to reference the object during code generation. |
5. |
In the LOAD_CHANNELS mapping, the dimension operator is named CHANNELS_OUT. The operator is bound to the CHANNELS dimension in the SALES_WH target schema.
|
6. |
In the LOAD_CHANNELS mapping, the constant operator is named TOTALS. The Constant operator enables you to define constant values. Constants can be used anywhere in a mapping. The Constant operator produces a single output group that can contain one or more constant attributes. To add a Constant operator on the canvas, you should drag a constant operator from the Palette. Tip: An example of using a constant operator is to load the value of the current system date into a table operator. To add the attributes to the TOTALS constant object, right-click the constant operator TOTALS header and selectOpen Details. Click the Output Attributes tab. Clicking Add will add attributes to the Totals constant object. As you see in the screenshot two attributes have been added. Click Cancel to close the Constant editor.
|
7. |
In the LOAD_CHANNELS mapping, select the TOT attribute by selecting the TOT attribute in the TOTALS constant operator. On the left in the Mapping Editor, in the Attribute Properties panel, inspect the Expression property. In the Attribute Properties panel, click the field next to the Expression property. Click the button on the right. The Expression Builder is launched. The value of the TOT attribute is "Channels Total." Close the Expression Editor.
|
8. |
In the LOAD_CHANNELS mapping, you see connection lines from:
In this example, you connect operators by connecting individual operator attributes to each other. Connection lines graphically represent how the data flows from a source, through operators, and to a target. Close the Mapping Editor.
|
Back to Topic
Design a Mapping Between an External Table and a Dimension
In this subtopic, you perform a hands-on exercise to create the REL_CATEGORY_MAP mapping that maps an external table called EXPENSE_CATEGORIES_CSV to a dimension called REL_CATEGORY.
1. |
In the Project Explorer, expand OWB_DEMO. Expand Databases > Oracle > EXPENSE_WH > Mappings. Note that you see a REL_TIME_MAP mapping, which was automatically created by OWB when you created the time dimension REL_TIME using the Time Wizard. Right-click Mappings and select New. The Create Mapping window appears.
|
2. |
In the Create Mapping window, enter REL_CATEGORY_MAP as name of the mapping.Click OK. In the Project Explorer panel, you see an entry for the REL_CATEGORY_MAP mapping under the EXPENSE_WH module. The Mapping Editor launches the REL_CATEGORY_MAP mapping.
|
3. |
From the Palette, drag the Dimension Operator to the canvas. The Add Dimension Operator window appears. In the Add Dimension Operator window, navigate to the EXPENSE_WH module and select theREL_CATEGORY dimension. Click OK.
|
4. |
The REL_CATEGORY dimension object is dropped on the canvas.
|
5. |
Alternatively, you can add existing sources and targets to a mapping through the Explorer panel, which is on the left side of the Mapping Editor. In the Explorer panel, ensure that Available Objects tab is selected. ExpandOracle > EXPENSE_WH > External Tables. Drag EXPENSE_CATEGORIES to the canvas. Now there are two objects on the canvas, the EXPENSE_CATEGORIES external table and the REL_CATEGORY dimension. Tip: In the Mapping Editor, click Auto-Layout on the toolbar to change the mapping to its default size. By dragging, move the EXPENSE_CATEGORIES external table to the left of the REL_CATEGORY dimension. Maximize the EXPENSE_CATEGORIES external table by clicking the maximize icon. Similarly, maximize the REL_CATEGORY dimension.
|
6. |
Connect the source (EXPENSE_CATEGORIES external table) to the target (REL_CATEGORY dimension) as follows:
Tip: Connection lines graphically represent how the data flows from a source, through operators, and to a target. Note: Be very careful where you drop your connection lines into the REL_CATEGORY target dimension. That dimension has two levels, CATEGORY and TYPE, each of which has same-named attributes ID, NAME, and DESCRIPTION.
|
7. |
You have just completed designing the mapping. Now get ready to generate the code for the mapping. In the Mapping Editor, from the menu, selectMapping > Generate. You have now generated the code that loads the dimension. Based on the ETL logic that you design in a mapping, Warehouse Builder generates the code required to implement your design. The Generation Results window is displayed. The Generation Results window displays the code that has been generated for you. Close the Mapping Editor.
|
Back to Topic
Design a Mapping Between an External Table and a Cube
In this subtopic, you perform a hands-on exercise to create the REL_EXPENSE_MAP mapping that maps an EXPORT_CSV external table to a REL_EXPENSE cube.
1. |
In the Project Explorer, expand OWB_DEMO. Expand Databases > Oracle > EXPENSE_WH > Mappings.Right-click the Mappings node and select New. The Create Mapping window appears.
|
2. |
In the Create Mapping window, enter REL_EXPENSE_MAP as the name of the mapping.Click OK. In the Project Explorer you see an entry for the REL_EXPENSE_MAP mapping under the EXPENSE_WH module. The Mapping Editor is launched.
|
3. |
In the Explorer panel, on the left of the Mapping Editor, ensure that the Available Objects tab is selected. ExpandOracle > EXPENSE_WH > External Tables. Drag the EXPENSE_DATA external table to the canvas. The EXPENSE_DATA external table object is on the canvas.
|
4. |
In the Explorer panel, ensure that the Available Objects tab is selected. ExpandOracle > EXPENSE_WH > Cubes. Drag the REL_EXPENSE cube to the canvas. Now, there are two objects on the canvas, the EXPENSE_DATA external table operator and the REL_EXPENSE cube operator. Tip: In the Mapping Editor, click Auto-Layout on the toolbar to bring the mapping to its default size. Ensure that the EXPENSE_DATA external table operator is to the left of the REL_EXPENSE cube operator. Maximize the EXPENSE_DATA external table by clicking the maximize icon. In the same way, maximize the REL_EXPENSE cube operator.
|
5. |
In the Mapping Editor, from the Palette panel, drag the Expression operator to the canvas. In this example, the Expression operator is used to transform the EXPENSE_DATE attribute in the EXPENSE_DATA external table from a date data type to a number data type. The target attribute in the cube is a number data type, therefore, this data type conversion is required to match the data type.
|
6. |
Drag a connection line from EXPENSE_DATA.OUTGRP1.EXPENSE_DATE toEXPRESSION.INGRP1.
|
7. |
Right-click the expression operator EXPRESSION header and select Open Details.
|
8. |
In the Expression Editor, click the Output Attributes tab. ClickAdd.
|
9. |
Rename the OUTPUT1 attribute to EXPENSE_DATE. Ensure that the data type is NUMBER. ClickOK.
|
10. |
In the Mapping Editor, click the expression operator EXPRESSION header, and selectOUTGRP1.EXPENSE_DATE. In the Attribute Properties panel, click the field next to theExpression property. Click the button on the right. The Expression Builder is launched.
|
11. |
In the Expression Builder, enter the following statement in Expression for EXPENSE_DATE: TO_NUMBER(TO_CHAR(INGRP1.EXPENSE_DATE , 'YYYYMMDD'), '99999999') Click Validate. Wait for a success signal in the Validation results panel. ClickOK.
|
12. |
Connect the source operator to the target operator as follows:
You have just completed designing the mapping. Now you can generate the code for the mapping.
|
13. |
In the Mapping Editor, from the menu, select Mapping > Generate. Warehouse Builder generates the Tip: To view these generated results at a later time, select Generation Results from the Window menu. From the toolbar, click Save All to commit your work. Close the Mapping Editor.
|
Back to Topic
Back to Topic List
Designing Process Flows
After you create mappings that define the operations for moving data from sources to targets, you can create and define process flows. Process flows interrelate mappings and activities external to Warehouse Builder, for example, e-mail. In OWB, a process flow is used to manage the execution of mappings and related activities, such as uploading a file to the warehouse machine using FTP or notifying the administrator in case of an error in the load process.
The scenario used in this example is very common. The dimensions (REL_TIME and REL_CATEGORY) must be loaded before the cube (REL_EXPENSE), because the cube references the dimension values. Only after all the dimensions are successfully loaded will the loading of the cube start. You now learn how to design the process flow.
The Process Flow Editor includes a variety of activities that you add and then connect with transitions to design a flow. An Activity represents units of work in a process flow. These units of work can involve components internal or external to Warehouse Builder. Transitions indicate the sequence and conditions in which to launch activities.
In this topic, you learn to:
Register the Oracle Workflow User | ||
Design a Process Flow |
Register the Oracle Workflow User
You need to grant specific roles to the Oracle Workflow(OWF) user,owf_mgr, for it to have the privileges to execute a process flow in the Control Center.
You do not have to embed the password for the Control Center in database-links owned by the OWF user. The Control Center user is highly privileged and its password is tightly controlled.
To register the OWF user, you use the Register Warehouse Builder Users wizard from the Security > Users node in Global Explorer panel. To get access to the security node, ensure you log in to the design center as the repository owner.
1. |
If not already logged in, log in to the Design Center with username/password asrep_owner/rep_owner. In the Global Explorer panel, expand the Security node. Right-click theUsers node and select New. Note: You must save or revert your changes before creating a new user. To save previous changes, selectSave All from the Design menu or click on the toolbar. The Create User dialog appears.
|
2. |
On the Create User dialog, you can select from the available list of database users or create a new one that will be automatically registered as Warehouse Builder user. Earlier in this lesson, where you were instructed to install Oracle Workflow Server, you were instructed to use the Oracle Workflow Configuration Assistant and specify owf_mgr as the Workflow Account. This resulted in the creation of owf_mgr as a database user. Select owf_mgr from the Available DB Users list and click> to move it to Selected Users list. Click OK. Notice that owf_mgr user is added to the Security > Users node in the Global Explorer panel.
|
Back to Topic
Design a Process Flow
1. |
Before you can start designing the process flow, you need to create a process flow module that includes a process flow package. Note: Process Flow Modules include Process Flow Packages that include Process Flows. The Process Flow Package is a grouping mechanism that determines which process flows you can interrelate. At run time, you can launch one process flow that launches other Process Flows that exist in the same process flow package. You create a Process Flow Module named MY_PF_MODULE, Process Flow Package named PK, and a Process Flow MY_PROC. The Process Flow Module acts as a container by which you can validate, generate, and deploy a group of Process Flows. In the Project Explorer, expand the OWB_DEMO project, and then expand the Process Flows node. Right-click Process Flow Modules and select New from the menu. The Create Module Wizard is launched. Click Next on the Welcome page.
|
2. |
On the Name and Description page, enter MY_PF_MODULE as the name of the process flow module. ClickNext.
|
3. |
Now you create an Oracle Workflow location for the process flow module, MY_PF_MODULE. Note: You define an Oracle Workflow location to specify where you want to deploy your process flows. The Oracle Workflow location points to a workflow schema (OWF_MGR) that runs in the target database. Warehouse Builder process flows comply with the XML Process Definition Language (XPDL). When you generate a process flow, Warehouse Builder generates an XML file in the XPDL format. You can plug the generated XML file into any workflow engine that follows the XPDL standard. On the Connection Information page, observe that Warehouse Builder creates a location with a default name, MY_PF_MODULE_LOCATION1. ClickEdit to specify the details of the Workflow location. In the Edit Oracle Workflow Location window, change the name of the location toWH_OWF_LOCATION. Specify the connection details to the Oracle Workflow schema. This course uses owf_mgr as the schema name and owf_mgr as the password. If your Oracle Workflow schema has a different name and password, substitute accordingly. Password: owf_mgr Click Test Connection. If the connection is successful, clickOK. If the connection is not successful, check the connection details and try again. After you click OK, you are returned to the wizard's Connection Information page . ClickNext. On the Summary page, review the details and click Finish.
|
4. |
The Create Process Flow Package window is launched. Specify PK as the name and clickOK.
|
5. |
The Create Process Flow window is launched. Specify LOAD_EXPENSE_WH as the name of the process flow. Click OK. Warehouse Builder launches the Process Editor and displays the process flow with a Start activity and an End_Success activity. You now learn how to design the process flow with activities and transitions. |
6. |
In the Process Editor, from the Palette panel, drag the Fork activity to the canvas. You can use the Fork activity to launch multiple, concurrent activities after the completion of an activity. You can also drag the Fork activity from the Selected Objects tab in the Explorer panel.
|
7. |
In the Process Editor, in the Explorer panel, click the Available Objectstab. From the EXPENSE_WH target module, drag REL_CATEGORY_MAP mapping to the canvas.
|
8. |
In the Explorer panel, ensure that the Available Objects tab is selected. Drag theREL_TIME_MAP mapping to the canvas. On the canvas, you see two mapping activities, REL_TIME_MAP and REL_CATEGORY_MAP. Rearrange the mapping activities vertically just to the right of the START activity. Note: On dropping the mapping activities on the canvas, if you get "Unable to determine if the map is a PLSQL or SQLLDR map........" error, then in the Design Center, right-click the appropriate mapping and selectConfigure. In the Configuration Properties window, for Language property, selectPL/SQL.
|
9. |
From the Palette panel, drag an AND activity to the canvas, to the right of the REL_TIME_MAP and REL_CATEGORY_MAP mapping activity.
|
10. |
In the Explorer panel, ensure that the Available Objects tab is selected. Drag theREL_EXPENSE_MAP mapping activity to the canvas. Rearrange the REL_EXPENSE_MAP to the right of the AND Activity.
|
11. |
From the Palette panel, drag an END with ERRORS activity to the canvas. Drag it to the right of the REL_EXPENSE_MAP cube mapping and above the END_SUCCESS activity.
|
12. |
From the Palette panel, drag an END with WARNINGS activity to the canvas. Drag it to the right of the REL_EXPENSE_MAP mapping activity and below the END_SUCCESS activity. The process appears as shown in the screenshot. Now you learn how to connect the activities.
|
13. |
Position your cursor on the Start activity. Ensure the cursor changes to ( ). Drag a line (these lines will from here on be referred to as transitions) from Start activity to the Fork activity. Note: Use transitions to indicate the sequence and conditions in which activities occur in the process flow. You can use transitions to execute an activity based on the completion state of the preceding activity. To drag a transition, the mouse pointer must change to ().
After having linked all the activities with transitions, the process flow will look as shown in the screenshot. Transitions that you learned and added in the above steps are unconditional transitions. By default, transitions are unconditional. The process flow continues after the preceding activity completes, regardless of the ending state of the previous activity. Note: You may get some invalid characters, such as and , as you drag transitions from one activity to another. You can ignore them as it does not cause any issues during deployment or execution.
|
14. |
You can use transitions to execute an activity based on the completion state of the preceding activity. Now, you learn to apply conditions to transitions. Select the transition (the line) from the REL_EXPENSE_MAP mapping activity to the END_SUCCESS activity. Note: In the Object Details panel, the transition name does not matter. In the screenshot below, the transition is named Transition_6. The transition name you see in your working environment can be different from what you see in the screenshot; perhaps you will see a name like Transition_7 or Transition_9. The name difference does not matter.
|
15. |
Again, ensure that the transition (the line) between the REL_EXPENSE_MAP mapping activity and END_SUCCESS is selected. In the Object Details panel, click in the field next to the Condition property. Click the button on the right. The Transition Condition Editor is displayed. In the Transition Condition Editor, select the Enumerated Conditions option button. From the drop-down list, selectSUCCESS. Click OK.
|
16. |
Select the transition between the REL_EXPENSE_MAP mapping activity to the END_ERROR activity. The transition is highlighted. In the Object Details panel, click the field next to property Condition. Click the button. The Transition Condition Editor is displayed. In the Transition Condition Editor, select the Enumerated Conditions option button. From the drop-down list, selectERROR. Click OK. This indicates that the process flow continues if the preceding activity (REL_EXPENSE_MAP) ends in an error.
|
17. |
Select the transition between the REL_EXPENSE_MAP mapping activity to the END_WARNING activity. In the Object Details panel, click in the field next to the Condition property. Click the button. In the Transition Condition Editor, select theEnumerated Conditions option button. From the drop-down list, selectWARNING. Click OK. This indicates that the process flow continues if the preceding activity (REL_EXPENSE_MAP) ends with warnings.
|
18. |
This completes the design of this process flow. Generate the LOAD_EXPENSE_WH process flow by selectingGenerate from the Process Flow menu.The code generated from a process flow definition is in industry standard XML Process Definition Language (XPDL). From Process Flow menu, select Close to exit the process editor. In the Design Center, click Save All on the toolbar to commit your work.
|
Back to Topic
Back to Topic List
Testing and Debugging Mappings
Oracle Warehouse Builder has a graphical mapping debugger that enables you to see the data flow through the mapping in a graphical interface. The mapping debugger is embedded in the Mapping Editor. This div uses text and screenshots (not as hands-on) to demonstrate how to use the debugger to debug mappings.
1. |
These steps 1 through 8 are read-only, not to be performed. Assumptions: The REL_CATEGORY dimension has been deployed to the target schema. The EXPENSE_CATEGORIES exte 声明:本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn核实处理。 相关文章 相关视频 专题推荐
|
网友评论
文明上网理性发言,请遵守 新闻评论服务协议
我要评论