BIML XVI – Script and Reuse (Part II)


This post continues a prior post about scripting and asset definition in biml. You can find the first part here.

For clarification, we are using biml to define assets (objects in our BI environment) and then use that definition in our biml code. We also introduce biml script as a method of reusing biml code templates.

Biml – its More than Just a Package Generator

The complete Biml for the file 01_Env.biml is shown below. In summary, it specifies the objects in our environment (connections, databases, schemas and tables). Although SSIS offers us no ability to define and see tables (perhaps you could argue that a database project may), Biml does, and we can define it within the XML.

It is also interesting to note the relationship between tables and their containing database. That is, a table belongs to a schema which (of course) belongs to a database which exists in a connection. The definition of those properties are required if we are to use ‘lower’ level objects. That is, the database requires a connection (and so on).

It would seem intuitive that a table definition require columns. After all, could you have a table without them? However, there is no requirement to specify columns within a biml table and this is demonstrated in the tables below.

Finally, the astute ready will notice that I’ve referred to a tier in my code (note the first line). This has not been done before and the purpose of a tier is to guarantee build order by the biml compiler. Up until now, we have only generated biml in one file because all requirements have existed in that file. When we submit multiple files (we can build more than one biml file at a time) we want to guarantee the execution order and the tier allows this. We couldn’t (after all) use a table before its defined.

 
<#@ template tier="1" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">

  <Connections>
    <OleDbConnection Name="stage" ConnectionString="Provider=SQLOLEDB;Data Source=.\SQL2012;Integrated Security=SSPI;Initial Catalog=stage" />
    <OleDbConnection Name="adventure_works" ConnectionString="Provider=SQLOLEDB;Data Source=.\SQL2012;Integrated Security=SSPI;Initial Catalog=AdventureWorks2012" />
  </Connections>
  
  <Databases>
    <Database ConnectionName="stage" Name="stage"  />
  </Databases>
  
  <Schemas>
    <Schema DatabaseName="stage" Name="dbo"  />
  </Schemas>
  
  <Tables>
    
    <!-- ah yes .... of course i should specify columns 🙂 -->  
    <Table Name="Product" SchemaName="stage.dbo" />
    <Table Name="ProductSubCategory" SchemaName="stage.dbo" />
    
    <Table Name="ProductCategory" SchemaName="stage.dbo">
      <Columns>
        <Column Name="ProductCategoryID" DataType="Int32" />
        <Column Name="Name" DataType="String" Length="50" />
      </Columns>
    </Table>
  
  </Tables>
  
</Biml>

Biml Script, Package Creation and the ASP Comparison

The second file in our biml example defines a package. To reiterate how it works, it uses a generic template (described above) and iterates over each table defined in the first biml file to use the name of the table in the template. For this reason, biml script is often compared to asp – the initial scripting language for html. Actually biml is often described as html with biml script being asp code.

Of course this is only possible in this instance because we have employed a consistent design.

<#@ template tier="2" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
      <Packages>
        <Package Name="build_stage" ConstraintMode="Parallel" ProtectionLevel="EncryptSensitiveWithUserKey">
            <Tasks>
                <# foreach( var _table in RootNode.Tables ) { #>
                <ExecuteSQL Name="TRUNC_<#= _table #>" ConnectionName="stage">
                    <DirectInput>Truncate Table dbo.import_<#= _table #>;</DirectInput>
                </ExecuteSQL>
				
                <Dataflow Name="LOAD_<#= _table #>" >
					 
                    <Transformations>
           
                      <OleDbSource Name="SRC_<#= _table #>" ConnectionName="adventure_works">
                            <DirectInput>Select * from Production.<#= _table #>;
                            </DirectInput>
                        </OleDbSource>  
			
 			  <OleDbDestination Name="TGT_<#= _table #>" ConnectionName="stage">
			  	<ExternalTableOutput Table="dbo.import_<#= _table #>"></ExternalTableOutput>
                        </OleDbDestination> 
                    </Transformations>
			 
                    <PrecedenceConstraints>
                        <Inputs>
                            <Input OutputPathName="TRUNC_<#= _table #>.Output">
                            </Input>
                        </Inputs>
                    </PrecedenceConstraints>

                </Dataflow>

                <#  }#>
            </Tasks>
        </Package>
    </Packages>
</Biml>

In examining the script, we’ll look at 2 items. Firstly the loop identifier/creator, this occurs as the first child of the task node (after all we want to generate the code for each table). This is defined with the statement and is similar to C# code.

foreach( var _table in RootNode.Tables ) {}

If you are not familiar with code, we are just creating a loop over the tables (node) in the biml definition (that is, the environment file). The use of {} means that anything between those parenthesis will get repeated for each table. We may remember back to the first post, where we said there are many root child node types.

We have to have some way of telling the compiler (what actually builds the packages) that this is code (and not biml) so, any code needs to be encompassed in its special tag (ie <# #> ).

Secondly, we need to refer to the table name within each iteration. This is done with and equal sign following the opening tag (that is, <#= #> ) and is just the same as saying print the script value of what’s between the opening (<#=) and closing (#>) tag. Since we have defined the _table variable (in the loop), we can use it here to specify the name.

If your having trouble understanding whats going on, the biml script is just rerunning the template code (the one with Product) and replacing product with the table name (which is defined in the first biml file).

Also note that this file is a second tier which means that it will execute after the first.

Execution

We can execute more than one biml file at a time. Simply, highlight both and select ‘Generate SSIS Packages’. The definition of tiers will solve (an otherwise random execution order) and make sure that the second file can use components that are created in the first.

Conclusion

I think there’s a lot in this post. We’ve looked into using biml to specify our BI assets and then used that definition (or its master data) to create packages. We have also compared Biml to asp and demonstrated how to use script to iterate over template code.

Advertisements

One thought on “BIML XVI – Script and Reuse (Part II)

  1. Pingback: BIML XVI – Script and Reuse (Part I) | Paul te Braak

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s