Automate documentation of SQL Server Analysis Server Tabular Model
Documenting a system is an ever-growing burden, especially when the system changes frequently, and I faced that burden myself. We have been developing SQL Server Analysis Services (SSAS) Tabular models, quite a large number of them, and documenting 50 or 60 models manually is a big effort. In addition, development always brings changes and enhancements, so the documentation has to be updated continuously. Updating it by hand is time consuming, and the documentation sometimes loses track of changes or becomes inconsistent with the deployed code, which is a critical issue when the system stays in production for years.
I decided to automate the documentation of the SSAS models to reduce the manual effort and save time. The automation also lets end users review the latest changes to the models quickly, without involving the development team.
Solution
I used C#.NET to build the documentation tool, referring to the MSDN documentation for the libraries that fetch the detailed properties of a model. Microsoft provides the Tabular Object Model (TOM) library, which exposes all the metadata and properties of a tabular model from SQL Server 2016 onwards. TOM is an extension of the AMO library, which is used to extract Multidimensional cube metadata. The library is in the Microsoft.AnalysisServices.Tabular.dll assembly.
Overview
An SSAS Tabular model is a database that runs either in-memory or in DirectQuery mode, in which case it accesses data directly from the backend relational data sources. The model database is a JSON-based object definition that can be accessed via the TOM object library.
Logically, in a Tabular model everything hangs off a Model object, which acts as the root and is in turn a child of a Database (the same as in Multidimensional). The different objects exposed via the TOM library are shown in the object hierarchy diagram referenced from MSDN.
For design documentation we mainly need four objects, which can be derived from the green highlighted nodes of that hierarchy.
- Tables and their relationships with other tables
- Partitions and their corresponding tables
- Columns and their corresponding tables
- Perspectives and their corresponding tables
Input
What should the input to the tool be? To keep the tool friendly for any end user, the input should be simple: the name of the Tabular model that needs documentation and the name of the server that hosts it. The tool should also be able to generate documentation for multiple models in one run.
- Server Name
- Model Names in a csv file
Internal Process Overview
Once we receive the input, the code steps are as follows.
Get the server connectivity for the models to read the metadata.
- Connect to the server provided. Ensure that the user running the tool has sufficient access to the server.
- Create a new Server object.
- Use the Connect method to connect to the server, where the input parameter is a connection string of the form "Provider=MSOLAP;Data Source=<servername>".
Server svr = new Server();
svr.Connect("Provider=MSOLAP;Data Source=<servername>");
Iterate on the models one by one and extract the four main objects mentioned above.
- Read the input csv file that lists the names of the models to be documented.
string[] allmodels = File.ReadAllLines(@"<Full input file path>.csv");
- Iterate over the model names and find each model on the server.
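The connection and lookup steps above can be put together roughly as follows. This is a minimal sketch, not the author's complete tool: the class name ModelLocator, the method FindModels and the command-line arguments are made up for illustration, the input file is assumed to hold one model name per line, and the project is assumed to reference the Microsoft.AnalysisServices.Tabular assembly. Databases.FindByName returns null when no database with that name exists, so missing models can be reported instead of raising an exception.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.AnalysisServices.Tabular;

public static class ModelLocator
{
    // Returns the databases on the server whose names appear in the input file.
    // Assumption: the file contains one model name per line.
    public static List<Database> FindModels(Server svr, string inputFile)
    {
        var found = new List<Database>();
        IEnumerable<string> names = File.ReadAllLines(inputFile)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0);

        foreach (string name in names)
        {
            Database mdl = svr.Databases.FindByName(name);
            if (mdl == null)
            {
                Console.WriteLine("Model not found on server: " + name);
                continue;
            }
            found.Add(mdl);
        }
        return found;
    }

    public static void Main(string[] args)
    {
        // Hypothetical command line: args[0] = server name, args[1] = path of the model list file
        using (Server svr = new Server())
        {
            svr.Connect("Provider=MSOLAP;Data Source=" + args[0]);
            foreach (Database mdl in FindModels(svr, args[1]))
            {
                Console.WriteLine(mdl.Name + ": " + mdl.Model.Tables.Count + " tables");
            }
            svr.Disconnect();
        }
    }
}
```

Each Database returned here plays the role of the mdl variable used in the code that follows.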
Tables and their relationships with other tables
To get the first main output report for Tables, follow the steps below.
Once the model name is available, iterate on the tables present in the model.
- Fetch all the tables in the model using mdl.Model.Tables.ToArray() and iterate over them one by one using a Table object.
- Get the Logical Table Name, which is the friendly or displayed name given to the table in the model, using the <Table object>.Name property.
- Get the Table Description, which can hold a data dictionary definition for the table, using the <Table object>.Description property.
- Get the Physical (actual) table name from the annotations on the table using the <Table object>.Annotations["_TM_ExtProp_DbTableName"].Value property.
- Get the source query of the physical table used in the model using the <Table object>.Annotations["_TM_ExtProp_QueryDefinition"].Value property.
- Get the Boolean flag that tells whether the table is hidden using the <Table object>.IsHidden property.
If the table is a Calculated Table, use the partition source property to get its expression.
string ParSource = (tbl.Partitions[0].Source.Partition).Source.ToString();
// Cast only when the source really is a calculated partition source
if (ParSource == "Microsoft.AnalysisServices.Tabular.CalculatedPartitionSource") {
string CalculatedTableExpression = ((Microsoft.AnalysisServices.Tabular.CalculatedPartitionSource)(tbl.Partitions[0].Source.Partition).Source).Expression; }
The complete code for reading the table properties is as below. The loop is left open here because the relationship, partition, column and measure steps that follow run inside it.
foreach (Table tbl in mdl.Model.Tables.ToArray()) {
string LogicalTablename = tbl.Name;
string TableDescription = tbl.Description;
// Assumes both annotations exist on the table; guard the lookups if they may be missing
string PhysicalTablename = tbl.Annotations["_TM_ExtProp_DbTableName"].Value;
string QueryDefinition = tbl.Annotations["_TM_ExtProp_QueryDefinition"].Value;
string IsTableHidden = tbl.IsHidden ? "Yes" : "No";
Next, for each table, iterate over the relationships in the model using mdl.Model.Relationships.
- Compare the relationship's To table name, <relationship object>.ToTable.Name, with the Logical Table Name extracted above.
- If they are the same, get the relationship name and the From table name using the <relationship object>.Name and <relationship object>.FromTable.Name properties.
- Get the physical table name of the From table using the <relationship object>.FromTable.Annotations["_TM_ExtProp_DbTableName"].Value property.
- Get the logical key name of the From table used in the join using the <relationship object>.FromColumn.Name property.
- Get the physical key name of the From table used in the join. Depending on the column type, Data or CalculatedTableColumn, use ((Microsoft.AnalysisServices.Tabular.DataColumn)<relationship object>.FromColumn).SourceColumn or ((Microsoft.AnalysisServices.Tabular.CalculatedTableColumn)<relationship object>.FromColumn).SourceColumn.
- Repeat the same steps for the To table, using the ToTable and ToColumn properties.
- Get the cardinality of the relationship using the <relationship object>.FromCardinality and <relationship object>.ToCardinality properties.
- An important property of a relationship is the join type, which is based on the <relationship object>.RelyOnReferentialIntegrity property. If the value is true the join is an "Inner Join", otherwise it is an "Outer Join".
- Get the Boolean value that tells whether the relationship is active using the <relationship object>.IsActive property.
The complete code for getting the relationships is as follows
// The Join* variables are assumed to be declared as strings before the loop
foreach (SingleColumnRelationship rel in mdl.Model.Relationships) {
// Only relationships that point to the current table
if (rel.ToTable.Name == LogicalTablename) {
JoinRelationshipName = rel.Name;
JoinFromTable = rel.FromTable.Name;
if (rel.FromTable.Annotations.Count > 0) {
JoinFromPhysicalTable = rel.FromTable.Annotations["_TM_ExtProp_DbTableName"].Value; }
JoinFromTableKey = rel.FromColumn.Name;
// The physical key name depends on the column type
if (rel.FromColumn.Type.ToString() == "Data")
JoinFromPhysicalTableKey = ((Microsoft.AnalysisServices.Tabular.DataColumn)rel.FromColumn).SourceColumn;
else if (rel.FromColumn.Type.ToString() == "CalculatedTableColumn")
JoinFromPhysicalTableKey = ((Microsoft.AnalysisServices.Tabular.CalculatedTableColumn)rel.FromColumn).SourceColumn;
JoinToTable = rel.ToTable.Name;
if (rel.ToTable.Annotations.Count > 0) {
JoinToPhysicalTable = rel.ToTable.Annotations["_TM_ExtProp_DbTableName"].Value; }
JoinToTableKey = rel.ToColumn.Name;
if (rel.ToColumn.Type.ToString() == "Data")
JoinToPhysicalTableKey = ((Microsoft.AnalysisServices.Tabular.DataColumn)rel.ToColumn).SourceColumn;
else if (rel.ToColumn.Type.ToString() == "CalculatedTableColumn")
JoinToPhysicalTableKey = ((Microsoft.AnalysisServices.Tabular.CalculatedTableColumn)rel.ToColumn).SourceColumn;
JoinFromCardinality = rel.FromCardinality.ToString();
JoinToCardinality = rel.ToCardinality.ToString();
// RelyOnReferentialIntegrity decides the join type
if (rel.RelyOnReferentialIntegrity)
JoinType = "Inner Join";
else
JoinType = "Outer Join";
IsRelationshipActive = rel.IsActive.ToString(); }
}
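The casts in the relationship code depend on the column type, and the annotation indexer assumes that the annotation exists. One way to keep both concerns in one place is a few small helpers, sketched below. The class and method names (TomHelpers, GetAnnotationOrEmpty, GetSourceColumnName, GetJoinType) are hypothetical and not part of TOM, and the sketch assumes a reference to the Microsoft.AnalysisServices.Tabular assembly and a compiler that supports C# 7 pattern matching.

```csharp
using Microsoft.AnalysisServices.Tabular;

public static class TomHelpers
{
    // Returns the annotation value, or an empty string when the table
    // has no annotation with that name.
    public static string GetAnnotationOrEmpty(Table table, string annotationName)
    {
        Annotation ann = table.Annotations.Find(annotationName);
        return ann == null ? "" : ann.Value;
    }

    // Returns the physical (source) column name for data columns and
    // calculated-table columns; other column types yield an empty string.
    public static string GetSourceColumnName(Column column)
    {
        if (column is DataColumn dataColumn)
            return dataColumn.SourceColumn;
        if (column is CalculatedTableColumn calcTableColumn)
            return calcTableColumn.SourceColumn;
        return "";
    }

    // Maps RelyOnReferentialIntegrity to the join type labels used in this article.
    public static string GetJoinType(SingleColumnRelationship rel)
    {
        return rel.RelyOnReferentialIntegrity ? "Inner Join" : "Outer Join";
    }
}
```

With helpers like these, the physical key of the From side becomes TomHelpers.GetSourceColumnName(rel.FromColumn), and the physical table name becomes TomHelpers.GetAnnotationOrEmpty(rel.FromTable, "_TM_ExtProp_DbTableName").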
Partitions and their corresponding tables
Once the table metadata is extracted, we go one level down to the partitions. Let's look at the properties needed to extract them.
- Each table in the model has at least one partition.
- In the same iterative loop for the tables, after a table is fetched, get all its partitions using the <Table object>.Partitions property.
- Get the partition name using the <Partition object>.Name property.
- Get the partition source, which can be one of three types, using the (<Partition object>.Source.Partition).Source property.
- Query: data in this partition is retrieved by executing a query against a DataSource. The DataSource must be a data source defined in the model.bim file.
- Calculated: data in this partition is populated by executing a calculated expression.
- None: data in this partition is populated by pushing a rowset of data to the server as part of the Refresh operation.
- If the partition has a query source, get its query using ((Microsoft.AnalysisServices.Tabular.QueryPartitionSource)(<Partition object>.Source.Partition).Source).Query.
The complete code for the partition steps is as below
foreach (Partition p in tbl.Partitions) {
PartitionName = p.Name;
string ParSource = (p.Source.Partition).Source.ToString();
// Only query partitions carry a source query
if (ParSource == "Microsoft.AnalysisServices.Tabular.QueryPartitionSource") {
PartitionQueryDefinition = ((Microsoft.AnalysisServices.Tabular.QueryPartitionSource)(p.Source.Partition).Source).Query; }
}
Columns and their corresponding tables
The next level is to get the column definitions present in the table.
- A table exposes two kinds of fields: columns, which are the attributes of a table, and measures, which are the summarized or calculated values of a column, typically in a fact table.
- Once we have the table in a model, in the same iterative loop we can fetch all the columns using the <Table object>.Columns property.
- Get the column properties based on the column type, which is one of the following.
- DataColumn: for regular columns in regular tables
- CalculatedColumn: for columns backed by a DAX expression
- CalculatedTableColumn: for regular columns in calculated tables
- RowNumberColumn: a special column created internally by SSAS for every table
- Based on the column type, fetch the physical column name using the ((Microsoft.AnalysisServices.Tabular.DataColumn)<Column object>).SourceColumn property. For a CalculatedTableColumn, change the cast accordingly. A CalculatedColumn has no source column, so read its Expression property instead.
- Get the logical column name, which is the friendly name given to the column, using the <Column object>.Name property.
- Get the display folder, which is the logical grouping of attributes shown as a folder when the model is browsed, using the <Column object>.DisplayFolder property.
- Get the format string, source type and formula using the FormatString, SourceProviderType and SummarizeBy properties.
- Get the basic properties such as description, hidden flag and data type using the Description, IsHidden and DataType properties.
- After the columns, get the measure metadata from the table object by iterating over the <Table object>.Measures property.
- Extract the measure properties: name, data type, display folder, expression (the calculation used), format string, description and hidden flag.
The complete code for the column and measure steps is as follows
foreach (Column clm in tbl.Columns) {
string clmtype = clm.Type.ToString();
// Skip the internal row number column
if (clmtype == "Data" || clmtype == "Calculated" || clmtype == "CalculatedTableColumn") {
if (clm.Type == Microsoft.AnalysisServices.Tabular.ColumnType.Data) {
PhysicalColname = ((Microsoft.AnalysisServices.Tabular.DataColumn)clm).SourceColumn;
CalculationFormula = clm.SummarizeBy.ToString(); }
else if (clm.Type == Microsoft.AnalysisServices.Tabular.ColumnType.Calculated) {
PhysicalColname = "";
CalculationFormula = ((Microsoft.AnalysisServices.Tabular.CalculatedColumn)clm).Expression; }
else if (clm.Type == Microsoft.AnalysisServices.Tabular.ColumnType.CalculatedTableColumn) {
PhysicalColname = ((Microsoft.AnalysisServices.Tabular.CalculatedTableColumn)clm).SourceColumn; }
LogicalColname = clm.Name;
DisplayFolder = clm.DisplayFolder;
FormatString = clm.FormatString;
SourceProviderType = clm.SourceProviderType;
isAttributeHidden = clm.IsHidden ? "Yes" : "No";
ColumnDescription = clm.Description;
ColumnDataType = clm.DataType.ToString(); }
}
foreach (Measure meas in tbl.Measures) {
MeasureName = meas.Name;
MeasureDataType = meas.DataType.ToString();
MeasureDisplayFolder = meas.DisplayFolder;
MeasureExpression = meas.Expression;
MeasureFormatString = meas.FormatString;
MeasureDescription = meas.Description;
MeasureIsHidden = meas.IsHidden ? "Yes" : "No"; }
} // end of the loop over the tables
This completes the table-level loop. The loop repeats until the metadata of all tables has been extracted and saved in a DataTable, which is then exported to Excel.
Perspectives and their corresponding tables
After the tables, we can extract the perspectives of the model, which shows which table belongs to which perspective. Here we extract only the tables in each perspective, but the code can be extended to get the columns exposed in the perspective as well.
- Iterate over all perspectives of the model using the <Model object>.Model.Perspectives property.
- Get the names of the tables in the perspective using the <Perspective object>.PerspectiveTables property.
Here is the complete code to get the perspectives
foreach (Perspective pers in mdl.Model.Perspectives) {
string PerspectiveName = pers.Name;
foreach (PerspectiveTable perstable in pers.PerspectiveTables) {
string LogicalTablename = perstable.Name; }
}
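The extension to columns mentioned in this section could look roughly like the sketch below. It walks PerspectiveTables and then PerspectiveColumns and returns one row per (perspective, table, column). The class name PerspectiveReader, the method ReadPerspectives and the tuple layout are illustrative choices rather than part of the original tool, and the sketch assumes a reference to the Microsoft.AnalysisServices.Tabular assembly.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AnalysisServices.Tabular;

public static class PerspectiveReader
{
    // Returns one (perspective name, table name, column name) tuple per
    // column listed in a perspective. A perspective table whose
    // PerspectiveColumns collection is empty contributes no rows.
    public static List<Tuple<string, string, string>> ReadPerspectives(Database mdl)
    {
        var rows = new List<Tuple<string, string, string>>();
        foreach (Perspective pers in mdl.Model.Perspectives)
        {
            foreach (PerspectiveTable persTable in pers.PerspectiveTables)
            {
                foreach (PerspectiveColumn persColumn in persTable.PerspectiveColumns)
                {
                    rows.Add(Tuple.Create(pers.Name, persTable.Name, persColumn.Name));
                }
            }
        }
        return rows;
    }
}
```

Each tuple can then be added as a row to the DataTable that backs the Perspectives worksheet.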
Output
I extracted all four main objects above into separate DataTables and then exported them to Excel on separate worksheets, which gives end users good documentation. There are many references available for writing data to Excel; my version is as follows.
Create a method that writes data to Excel, taking a DataTable as input and writing it to a worksheet.
// A two-dimensional array is needed to fill a range in one call
object[,] myExcelObject = new object[<Row count> + 1, <Column count>];
// Data rows start at index 1; index 0 is not filled here and can hold the column headers
for (int row = 0; row < <Row count>; row++) {
for (int col = 0; col < <Column count>; col++) {
myExcelObject[row + 1, col] = <DataTable>.Rows[row][col]; }
}
// Resize the range starting at A1 to the array size and write all values at once
_excelRange_S1 = _excelSheet1.get_Range("A1", Missing.Value);
_excelRange_S1 = _excelRange_S1.get_Resize(<Row count> + 1, <Column count>);
_excelRange_S1.set_Value(Missing.Value, myExcelObject);
_workBook.SaveAs(fileName, _value, _value,
_value, _value, _value, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange,
_value, _value, _value, _value, null);
_workBook.Close(false, _value, _value);
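The array-filling step can be separated from the Excel interop calls so that it is easy to check on its own. The sketch below uses only System.Data. The class ExcelGrid and the method ToGrid are hypothetical names, and writing the column names into row 0 is an assumption about what the spare first row is meant for.

```csharp
using System.Data;

public static class ExcelGrid
{
    // Converts a DataTable into a two-dimensional array:
    // row 0 holds the column names, rows 1..n hold the data rows.
    public static object[,] ToGrid(DataTable table)
    {
        int rowCount = table.Rows.Count;
        int colCount = table.Columns.Count;
        var grid = new object[rowCount + 1, colCount];

        for (int col = 0; col < colCount; col++)
        {
            grid[0, col] = table.Columns[col].ColumnName;
        }
        for (int row = 0; row < rowCount; row++)
        {
            for (int col = 0; col < colCount; col++)
            {
                grid[row + 1, col] = table.Rows[row][col];
            }
        }
        return grid;
    }
}
```

The returned array can be passed to set_Value on a range resized to rowCount + 1 rows and colCount columns, in the same way as myExcelObject above.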
Screenshot samples of the output file
Sheet Model Tables: table information and relationships
Sheet Model Columns: column information with the corresponding tables
Sheet Partitions: partition information with the corresponding tables
Sheet Perspectives: perspectives and their corresponding tables
Conclusion
Automating the documentation of the model into an Excel file makes it easy for any type of user to view the model and use it accordingly. The tool is built entirely with references available from MSDN and can be extended to cover more objects and properties.
Feel free to contact me for the complete code of the tool or to share any feedback.
See more
For SSAS cube documentation, consider ApexSQL Doc, a tool that can document both Multidimensional and Tabular databases in different output formats.
References
- Introduction to the Tabular Object Model
- Tabular library reference for objects
- Write data to an Excel file
Translated from: https://www.sqlshack.com/automate-documentation-of-sql-server-analysis-server-tabular-model/