Temporal Tables != Data Warehouse

With the release of SQL2016 just around the corner, there are an increasing number of posts appearing about new features that mean I don’t have to work.  One post (which motivated me to write this) claimed that this breakthrough would mean we no longer needed to store historical values – an interesting interpretation of a Data Warehouse to say the least.  Microsoft’s own content (here) suggests that there are huge productivity benefits for Slowly Changing Dimensions because you can compare the same key at two points in time.

The idea of temporal tables (recording every change to a table in a table_History table) is a cool feature – don’t get me wrong, I think it’s great.  However, there is (sadly) a lack of discussion about how the feature can be incorporated into the Data Warehouse.  Those who blindly follow the sales pitch “I don’t need a data warehouse because I’ve got history tracking”, or perhaps “yeah, I’ve got a data warehouse – it’s a history-tracked OLTP”, will ensure their users can’t get the data they need.
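
For context, here is roughly what the feature looks like: a minimal sketch of a system-versioned table in SQL 2016 (the table, column and history table names are purely illustrative).

/* Minimal sketch of a SQL Server 2016 system-versioned (temporal) table.
   Table, column and history table names are illustrative only. */
CREATE TABLE dbo.Product
(
    ProductId   int            NOT NULL PRIMARY KEY,
    ProductName nvarchar(100)  NOT NULL,
    ListPrice   money          NOT NULL,
    ValidFrom   datetime2      GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo     datetime2      GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Product_History));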

So let’s call out some issues with relying on temporal tables as a data warehouse replacement (and bring some data warehouse assumptions to the surface).  I will focus on the star schema design since most references explicitly refer to slowly changing dimensions (however, we can apply these ideas to other methodologies).

A fundamental construct of the star schema is the surrogate key (almost as fundamental as the concepts of dimension and fact).  The surrogate key uniquely identifies an instance of a dimension at a point in time and therefore the state of the dimension can be precisely identified for the fact record.  For example, if I sold a product on a date, I need to look up the product dimension and determine which version of the product was applicable on that date.  The product’s surrogate key (not the Product Id or Code) is used in the fact.  This is the fundamental design of the star schema.  A temporal table does not provide you the capacity to do this – all it can do is provide the data to construct the star.
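
As a sketch of that lookup (with illustrative table and column names only), the ETL resolves the surrogate key by finding the dimension row whose effective date range covers the sale date, and that key is what lands in the fact.

/* Illustrative only: resolve the product surrogate key for each incoming sale
   by matching the sale date to the dimension row's effective date range. */
SELECT
    s.SaleDate
    , s.ProductCode
    , d.ProductSK        -- the surrogate key stored in the fact
    , s.SaleAmount
FROM staging.Sales AS s
JOIN dbo.DimProduct AS d
    ON d.ProductCode = s.ProductCode
    AND s.SaleDate >= d.EffectiveFrom
    AND s.SaleDate <  d.EffectiveTo;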

How could you solve this with temporal tables?  Well, there may be the thought that you could concatenate the table’s primary key and the record’s start date for uniqueness and then determine which (dimension) record is applicable to a fact record via a query.  Interesting idea, but not something I’d take on lightly.  I suspect that performance would degenerate so quickly that both the BI users and the Ops users (remember that this is occurring on the OLTP) would walk away in droves.  Remember that this has to occur for every record in the fact (and yes, they are those LONG, narrow tables)!
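
To make the cost concrete, that query-time lookup against a temporal table would look something like the sketch below (illustrative names again; FOR SYSTEM_TIME ALL returns both current and historical versions), and it has to be evaluated for every fact row, every time somebody queries.

/* Illustrative only: resolving the applicable dimension version at query time
   against the current table and its history (FOR SYSTEM_TIME ALL). */
SELECT
    f.SaleDate
    , f.SaleAmount
    , p.ProductName
FROM dbo.Sales AS f
JOIN dbo.Product FOR SYSTEM_TIME ALL AS p
    ON p.ProductId = f.ProductId
    AND f.SaleDate >= p.ValidFrom
    AND f.SaleDate <  p.ValidTo;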

So let’s leave it to the presentation tool – pick one: Power BI, SSAS, Tableau, Qlik, Jedox…  All these products rely on uniqueness between separate tables, so we still require the surrogate to enforce and deliver uniqueness.  The star (or at least the principle) is still required.

The real power of the dimension (and to a lesser extent the fact) is that it adds business context that does not exist in the source (or cannot be easily calculated). Of course this is my opinion, but think about it for a moment. Forget about historic values for a moment – the raw information is in the source, and if the user wanted that you could give it to them, no problem. What the star gives is a modelled perspective of a particular part of the business. Consider a customer dimension – what adds value in analysis? It is often the supplementary data (age group, segment profile, status classification, targeted customer …) and all of these things are defined and stored in the dimension. So, if we are going to add value (as data warehousing professionals), we still need the dimension to provide this.
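
As an illustration (the names here are mine, not from any particular system), those supplementary attributes live in the dimension alongside the columns sourced from the operational system.

/* Illustrative only: a customer dimension carrying derived business context
   that does not exist in the source system. */
CREATE TABLE dbo.DimCustomer
(
    CustomerSK           int           IDENTITY(1,1) PRIMARY KEY,
    CustomerCode         nvarchar(20)  NOT NULL,  -- business key from the source
    CustomerName         nvarchar(100) NOT NULL,  -- from the source
    AgeGroup             nvarchar(20)  NULL,      -- derived / supplementary
    SegmentProfile       nvarchar(50)  NULL,      -- derived / supplementary
    StatusClassification nvarchar(20)  NULL,      -- derived / supplementary
    IsTargetedCustomer   bit           NULL       -- derived / supplementary
);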

All business applications offer some form of reporting – if you’ve ever seen an information maturity chart, it is the first stage of maturity (see below thanks to an EMC2 slide deck).

big-data-business-model-maturity-chart

Riddle me this then: if the source application (OLTP) provides reporting, why do we need a data warehouse? To show reports at a particular point in time? Maybe (although a lot of users struggle with this and tend to think in current terms). There are a lot of tools that provide ad hoc query (OLAP) capabilities, so the performance of analysis isn’t a real consideration (after all, they could just use an OLAP engine over the OLTP, right?).

I think one of the primary reasons is integration. We want to either integrate data from different systems (creating an enterprise view) or supplement current data with other, richer information (which is really just integration anyway, isn’t it?). We could even say that business rules and derived information fall into this category.

Here also, temporal tables do not negate the need for the data warehouse. The data warehouse is responsible for delivering consistent, conformed, business-verified data that incorporates information from various sources. Nothing has changed there (and there is still the need for a data warehouse).

Finally, let’s consider the dimension – that subject-oriented view of an object. It’s the Product table that tells me everything I need to know about a Product – its category, groupings, margin positions and the like. The dimension is notorious for redundancy and de-normalisation, but that’s the price we are prepared to pay for delivering a single, concise view to the user, because it breaks down the complexity of the model for them (they don’t have to combine products with product categories in a query). The need to de-normalise breaks the basic OLTP conventions which force normalisation (after all, we expect 3rd normal form).
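
As a sketch (with illustrative source table names), the warehouse load flattens the normalised OLTP tables into a single dimension row so the user never has to do that join themselves.

/* Illustrative only: flattening normalised OLTP product tables
   into a single denormalised dimension row. */
SELECT
    p.ProductId
    , p.ProductName
    , c.CategoryName
    , g.GroupName
FROM oltp.Product AS p
JOIN oltp.ProductCategory AS c ON c.CategoryId = p.CategoryId
JOIN oltp.ProductGroup AS g ON g.GroupId = c.GroupId;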

The data warehouse is designed to do this work for us and present the view to the user. Essentially, it’s just another integration problem, but one that’s handled (and hidden) by the data warehouse. Our BI tools (again with the presentation layers) may be able to create these consolidations for us; however, we are still presented with the issue of uniqueness in related table records (that is, we have to identify which category applied to a product at a point in time, and the BI tools will not do that for us).

So, are temporal tables a replacement for a data warehouse? I think not. Sure, they may be able to assist with record change tracking (and we haven’t even discussed the shift in OLTP resource management). Temporal tables are only a tool, and I’d be very careful of anyone who claims they could replace a data warehouse.

MySql Import Issues with Tabular and OldGuids

In my last post, I looked at a MySql connection property that causes loads to fail in Power BI and how you can work around that issue (see here).  In this post, I’ll look at those implications in SSAS Tabular.  In this example, I’ve downloaded the MySql ODBC drivers (both x32 and x64) from here and created ODBC connections (x32 and x64 with the same name) to my MySQL database.

Data is initially imported through the Table Import Wizard, where I have to choose the relational source Others (OLEDB/ODBC).  This brings me to the Table Import Wizard window (below) where I can choose my ODBC connection.  Testing the connection proves successful, so you click Next a few times until you can choose to import from a list of tables or write a query.

image

If you choose to select tables and any of the tables have Guids in them, the entire load will fail and you are presented with the following errors.  image

Here the error is raised because the generated SQL is not compliant with MySQL (that is, the SQL generated is “Select [calls].* from [calls]” and MySQL does not like square brackets).
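
For reference, the equivalent query written with MySQL’s identifier quoting (backticks rather than square brackets) runs without complaint.

-- The generated T-SQL style query (rejected by MySQL)
Select [calls].* from [calls];

-- The same query using MySQL identifier quoting
Select `calls`.* from `calls`;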

Fair enough, so you continue.  Now that the connection has been created in the model, all you need to do to import data is open the existing connection (existing connections in the model can be viewed by clicking the Connections button in the ribbon (image) or by selecting “Existing Connections..” from the Model menu group).  You can then open that connection (either double click it or highlight it and press the Open button) and you are presented with the choice to select from tables or write a query.  This time you specify the query (Select * From Calls).  The query validates; however, a new error is presented when you try to load data.  Although not stated explicitly, it relates to the OldGuids property.

image

Now we have to manually edit the connection to specify its properties.  When we do so, we are presented with the following window.  I want to build the connection string, so I specify the connection string option and select the Build option.  Then I can choose the machine data source (btw, I set up my ODBC connections as machine data sources).

image

Now I am taken to the driver’s configuration window (as below).  I don’t need to change anything here (since the connection works anyway), so I will accept the defaults and hit OK.  (Note that the OldGuids property is not an option in the details section of the driver’s configuration.)

image

My Connection String has now been specified (note the additional properties in the window and snippet below) and I can add other properties as I need.

image

Before:

DSN=dsnname;SERVER=server;UID=username;PWD=password;DATABASE=databasename;PORT=3306

After:

DSN=dsnname;SERVER=server;UID=username;PWD=password;DATABASE=databasename;PORT=3306;OldGuids=true;

Then I simply specify my impersonation and I am right to go (noting that I must write a query).

MySql Import Issues with Power BI and OldGuids

One of the great things about self service BI is the ability to model data from different sources.  Recently, I have been using MySql sources against different versions of Tabular Models and have run into some issues.  This post shows those issues and how to get around them in Power BI Desktop designer (well at least how I did).

The import into Power BI Desktop appears straightforward, with MySQL listed as a source.  There are some dependencies on drivers (which are not installed by default); however, the installation of those is quite simple.  So let’s move directly to the Import Data screen (as below).

image

After specifying the server, database and connection information, you come to a familiar list of tables applicable to the database (like the screenshot below).  Selecting some tables gave the usual preview; however, at one selection an error arose relating to an OldGuids option in the connection, and then no tables were available in preview mode.

image

So I chose to edit the M query and am straight away presented with the same error.  Note that, under Query Settings, the error is on the Navigation step (in Applied Steps); had it navigated to Source, I could see a list of tables within the database schema.

image

image

The Source step of the table specifies the database (both server and catalog).  In fact, you can see this in the generated M, as shown below.

= MySQL.Database("myservername", "catalogname", [ReturnSingleDatabase=true])

To enable OldGuids in the connection string, simply specify it as a property in the source.  Therefore, the ‘Source’ syntax now becomes;

= MySQL.Database("myservername", "catalogname", [ReturnSingleDatabase=true, OldGuids=true])

Depending on the schema of the database, this would have to be done for any table that you want to import which throws this error.

Now, I can’t help feeling that the simple import wizard creates an unnecessary step in the M script.  After all, why specify the source database and then restrict the output to a single table in it?  The natural way around this would appear to be importing just a table (via an SQL statement).  The code for this is more elegant (IMO); however, setting the OldGuids property is still required.  This time, when I add data, I’ll specify the query (as an SQL statement).  My import and error look like the following;

image image

Unfortunately, when I edit the error here, I am returned to the basic import screen (on the left) and cannot do any ‘advanced editing’ (like setting the OldGuids property).  The only option I have here is to specify my query as a custom M script using a blank query, then use the Advanced Editor to provide the M syntax for the query.

image

= MySQL.Database("myservername", "catalogname", [Query="select * from calls", ReturnSingleDatabase=true, OldGuids=true])

There is some high level documentation on the MS site about importing from MySQL here and here (including prerequisites).

Alternatively

One alternative method to creating a blank query was to force a valid SQL statement and then edit the query.  Here, I specify that I will get data from a MySql database and choose to enter the query.  Note my query (below) is simply “Select 1 from dual”.

image

I can then choose to edit that query, which brings me directly to the Power Query editor and I can hack the M from there.  That is, the window below opens.  I can then choose the Advanced Editor option to edit the M (changing the query and adding the OldGuids option) or type those changes directly into the formula bar.

image

Deconstructing XML Content from SQL Server

Recently I have had a requirement to deconstruct XML in SQL Server and present fragments of that XML to users as field values.  This can be achieved relatively easily using a combination of XQuery/XPath against XML data that is stored in fields within a SQL Server table.  We’ll start by looking at some simple cases and then look at more complex XML considerations.

Simple Examples

In my simplest example, I’ll be using a snapshot of XML from books (you can find the full version here) and I have included what I am using below for clarity.

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
 </catalog>

In our first example, we assume that the XML is included as a chunk in a field and therefore, we can define an XML column with the following SQL.

 
-- @my_xml is assumed to hold the sample catalog XML shown above
declare @my_xml xml = N'<catalog>...</catalog>';

;with mydataset as 
(
	select 1 as row_id, @my_xml as xml_content
	union all select 2 as row_id, @my_xml as xml_content
)

Select *
from mydataset;

You’ll notice that the query results look like the following.  Clicking on a cell in the xml_content field will open the XML in a new window.

image

There is no big deal here, but suppose I now want to add to my projection (i.e. the select) based on the data in the XML fragment – for example, the first book’s author.  I can use an XPath pattern to navigate to the node (tag) that I am interested in and display the node’s text.  My SQL (selecting from mydataset) would now become;

Select *
, mydataset.xml_content.query('/catalog/book[1]/author/text()') as first_book_author 
from mydataset 

There are really two parts to the field I added.  Firstly, we use the .query method to execute an XQuery expression against the XML column.  You can read more about the query method here and XPath here (actually the XPath link includes some good examples for XPath).

If you’re not too familiar with XQuery or XPath, there are several things to note in the query that I’ve used (i.e. ‘/catalog/book[1]/author/text()’).

First, it follows the document structure: you can select nodes by using the delimited structure starting at the root node.  The first / refers to the root, and then we define the nodes of the document that we want to see, delimited by name (for example, the collection of book nodes is defined by /catalog/book).

Secondly, I can refer to a particular node by specifying its index. This is done with the square brackets after the book node (i.e. book[1] refers to the first book node – the book with the id="bk101").

Finally, I can extract the node’s value (text) by using the text() function. That’s the text between the node’s opening and closing tags.

Of course, the return type is XML so if I wanted to convert it to a text type I would have to cast.

cast(mydataset.xml_content.query('/catalog/book[1]/author/text()') as nvarchar(255)) as first_book_author 

Adding Namespaces

For something a little more interesting (and perhaps more practical), I recently needed to extract some dataset information from an SSRS server (actually, it was the query for datasets).  Datasets and other information are included in the report server database’s Catalog table.  It’s an interesting table (perhaps one may argue that it’s not entirely relational) because it includes different types of objects; however, the content data (the XML relating to an object) is stored in an Image format, so the first step is to convert it to XML.

Since the table contains different types of data (for different SSRS objects) we can restrict the table by the Type field, but for now let’s not worry about it. I’ve converted the Content field to XML in the query below. You can see the XML (based on the root node’s value) in the image below.

image

<SharedDataSet xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner" xmlns="http://schemas.microsoft.com/sqlserver/reporting/2010/01/shareddatasetdefinition">
  <DataSet Name="">
    <Query>
      <DataSourceReference>my_connection_name</DataSourceReference>
      <CommandText>my_sqlcommand</CommandText>
    </Query>
    <Fields>
      <Field Name="field_1">
        <DataField>fiedl_1</DataField>
        <rd:TypeName>System.Int32</rd:TypeName>
      </Field>
      <Field Name="field_2">
        <DataField>field_2</DataField>
        <rd:TypeName>System.String</rd:TypeName>
      </Field>
    </Fields>
  </DataSet>
</SharedDataSet>

You may think that the XPath expression ‘/SharedDataSet/DataSet/Query/CommandText/text()’ would return the node’s text (the dataset’s query).  Unfortunately, the query (the field dataset_query in the output) is blank – try it for yourself.

select 
*
, ssrs_data.my_xml.query('/SharedDataSet/DataSet/Query/CommandText/text()') as dataset_query  

from 
(
	select
		Content  
		, Convert(xml, (CONVERT(varbinary(max), Content))) as my_xml 
	from [dbo].[Catalog]

) as ssrs_data
If we revisit the sample XML above, we can see that it includes namespaces (note the use of xmlns in the root node). Namespaces avoid conflicts in node names, so that the same tag (node) could have several meanings within the XML snippet. There are 2 namespaces in our example above, one for rd (the first) and a default (the second one).

You can also see how the rd namespace is applied in the XML through the tag <rd:TypeName>. If no namespace is used on a tag, we can assume that it belongs to the default.

The problem for us is that we have to define the namespaces in our query so our XPath expression can recognise them in the document, and we use the WITH XMLNAMESPACES syntax for this (see NAMESPACES here: https://msdn.microsoft.com/en-AU/library/ms177607.aspx). This CTE-like clause defines the namespaces and their prefixes for the query. I could be cheeky and define only the default namespace (since I am only working with tags in that space) and that’s what I have done below.
with 
XMLNAMESPACES (	
	DEFAULT N'http://schemas.microsoft.com/sqlserver/reporting/2010/01/shareddatasetdefinition'
			)

select 
*
, ssrs_data.my_xml.query('/SharedDataSet/DataSet/Query/CommandText/text()') as dataset_query  

from 
(
	select
		Content  
		, Convert(xml, (CONVERT(varbinary(max), Content))) as my_xml 
	from [dbo].[Catalog]

) as ssrs_data

However, suppose I want to include the rd namespace in my query (say, for example, to return the first field’s type); my query now becomes the following.

with 
XMLNAMESPACES (	
	DEFAULT N'http://schemas.microsoft.com/sqlserver/reporting/2010/01/shareddatasetdefinition', 
	N'http://schemas.microsoft.com/SQLServer/reporting/reportdesigner' as rd
			)

select 
*
, ssrs_data.my_xml.query('/SharedDataSet/DataSet/Query/CommandText/text()') as dataset_query  
, ssrs_data.my_xml.query('/SharedDataSet/DataSet/Fields/Field[1]/rd:TypeName/text()') as first_field_type 
, ssrs_data.my_xml.query('count(/SharedDataSet/DataSet/Fields/Field)') as field_count 
from 
(
	select
		Content  
		, Convert(xml, (CONVERT(varbinary(max), Content))) as my_xml 
	from [dbo].[Catalog]

) as ssrs_data


BIML Annotations

BIML offers a great method to classify and manage code through annotations.  Basically, an annotation is just a tag that provides the ability to include text in our BIML code.  No big deal, right?  Well, perhaps not; however, if we think about how we can use this feature a little more deeply, we’ll see there’s some very powerful capability just under the covers.

Consider, for example, using BIML Script to define a large number of packages from different source systems using a standard template.  We may want to ‘tag’ each package with the system it belongs to.  We could use that tag to specify the system within each package and then use the value of the tag later in our script (say, for example, creating a master package that groups all the child packages of one system).

Let’s look at a simple example which follows this scenario. In the script below, I generate 4 packages (which don’t do anything) and have an annotation called ENV in each of them.  The annotation has the value ‘Test’.  You’ll also notice that I’ve specified this script as tier 1 (meaning that it’s executed first).

<#@ template language="C#" tier="1" #>
<# string _env="Test"; #>
  <Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
      <# for(int i=1;i<5;i++){ #>
        <Package Name="Package<#= i.ToString() #>" ConstraintMode="Linear">
          <Annotations>
            <Annotation Tag="ENV"><#= _env #></Annotation>
          </Annotations>
        </Package>
        <#} #>
    </Packages>
  </Biml>

The annotation tag (‘ENV’) is designed to add context to the package when the BIML script is compiled.  You can’t see this if you’re using BIDS Helper; however, the expanded BIML is shown below.  Notice that each package has the tag ENV.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <Package Name="Package1" ConstraintMode="Linear">
      <Annotations>
        <Annotation Tag="ENV">Test</Annotation>
      </Annotations>
    </Package>
    <Package Name="Package2" ConstraintMode="Linear">
      <Annotations>
        <Annotation Tag="ENV">Test</Annotation>
      </Annotations>
    </Package>
    <Package Name="Package3" ConstraintMode="Linear">
      <Annotations>
        <Annotation Tag="ENV">Test</Annotation>
      </Annotations>
    </Package>
    <Package Name="Package4" ConstraintMode="Linear">
      <Annotations>
        <Annotation Tag="ENV">Test</Annotation>
      </Annotations>
    </Package>
  </Packages>
</Biml>

Now I can use that tag in other BIML scripts.  So, for example, suppose I want a master package that executes my ‘Test’ packages.  All I would have to do is iterate over each of the packages (in the solution) and test each to see if it belongs to the ‘Test’ ENV (environment).  If it does contain that ENV value, I just include it in an execute package task.  Here’s how the second script looks;

 
<#@ template language="C#" tier="2" #>
  <Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
      <Package Name="MASTER" ConstraintMode="Linear">
        <Tasks>
          <# foreach (var _package in RootNode.Packages){
					   if (_package.GetTag("ENV")=="Test"){#>
          <ExecutePackage Name="Run <#= _package.Name #>">
            <Package PackageName="<#= _package.Name #>" />
          </ExecutePackage>
          <#} } #>
        </Tasks>
      </Package>
    </Packages>
  </Biml>

In the code above, you can see how I reference a package’s tag using the GetTag method.  Also note that the tier for this script is 2, so it’s executed after the prior script.  Once compiled, the output is a master package that executes each child (as below);

image

Of course this is a really simple example and you may suggest you could achieve something similar using the package name.  For example you could iterate over each package using the foreach (var _package in RootNode.Packages) above and reference _package.Name. 

This approach may work (of course); however, it relies on your package name fully encapsulating the functionality of all possible annotations – a process that could get very messy.  Suppose I wanted to include creator, ‘ETL Stage’, ETL version and so on; soon my package name would be unmanageable.

If you are using BIDS Helper, you may also consider using a different Visual Studio project for each ‘package type’, purpose and environment.  Again, this solution may work; however, let’s consider our original proposition.  We are using exactly the same pattern with different metadata (that is, the environment).  Wouldn’t it be simpler to maintain this within the one solution?

This is where MIST can really help with development.  I use the same BIML script to generate multiple packages which are all held within the same solution; they are then compiled into different Visual Studio solutions.  Put simply, MIST allows me to keep everything in the one place.

Building the Master Child Execution with BIML

In a previous post we looked at using BIML to execute a stored proc in order to assign a return value to a variable in BIDS.  The use case for this is pretty simple: when we execute a package, we want to get an ID for the execution and then record execution information against that ID.  When we execute packages through a master and child arrangement, we need to do two things.  Firstly, the master package gets an execution id (to record the batch) and, secondly, each child package execution gets its own id (relative to the parent) to record its own execution.  This is the classic application of a header / detail relationship, as seen in the following diagram.

image

If you’d like to build this in your environment, here’s the SQL (I’ve used an ETL_FRAMEWORK schema and included procs to return the run and audit ids).

USE [EXAMPLES]
GO

/* schema to hold the framework objects */
CREATE SCHEMA ETL_FRAMEWORK;
GO

/* Header Table */
CREATE TABLE [ETL_FRAMEWORK].[MASTER_EXEC](
	[RUN_ID] [int] IDENTITY(1,1) NOT NULL,
	[JOB_NAME] [nvarchar](256) NULL,
	[START_DT] [datetime] NULL DEFAULT (getdate()),
 CONSTRAINT [pk_etl_framework_master_exec] PRIMARY KEY CLUSTERED ([RUN_ID])
);

/* child table */
CREATE TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC](
	[RUN_ID] [int] NULL,
	[CHILD_ID] [int] IDENTITY(1,1) NOT NULL,
	[JOB_NAME] [nvarchar](256) NULL,
	[START_DT] [datetime] NULL DEFAULT (getdate()),
 CONSTRAINT [pk_etl_framework_master_child_exec] PRIMARY KEY CLUSTERED ([CHILD_ID]) 
 );

ALTER TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC]  WITH CHECK ADD  CONSTRAINT [fk_etl_framework_master_child_exec_child_id] FOREIGN KEY([RUN_ID])
REFERENCES [ETL_FRAMEWORK].[MASTER_EXEC] ([RUN_ID])
ON DELETE CASCADE
;

ALTER TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC] CHECK CONSTRAINT [fk_etl_framework_master_child_exec_child_id]
;


GO

/* proc for master_id */
CREATE PROC [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET]
(
  @JOB_NAME NVARCHAR(256)
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO ETL_FRAMEWORK.MASTER_EXEC(JOB_NAME) VALUES(@JOB_NAME);
 RETURN IDENT_CURRENT('ETL_FRAMEWORK.MASTER_EXEC');
END;

GO

/* proc for child_id (requires master_id) */
CREATE PROC [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET]
(
  @RUN_ID INT 
  , @JOB_NAME NVARCHAR(256)
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO ETL_FRAMEWORK.MASTER_CHILD_EXEC(RUN_ID, JOB_NAME) VALUES(@RUN_ID, @JOB_NAME);
 RETURN IDENT_CURRENT('ETL_FRAMEWORK.MASTER_CHILD_EXEC');
END;

GO

/****************************************************/
/*  Want an example of how it runs? Try this        */
/****************************************************/
declare @master_id int; 
declare @child_id int;
exec @master_id = [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET] 'TEST'
print @master_id;
exec @child_id = [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET] @master_id, 'TEST CHILD';
print @child_id ;

Now to BIML

Our solution is pretty simple – the parent fires first, gaining an id (by firing the proc [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET]), and then passes this variable to the child package.  In the child package, we’ve registered MASTER_AUDIT_KEY as a parameter, with the execute package task passing the variable down to the child.  We can see this in the two images below;

image

image

You’ll also notice that I have 2 BIML files to build the packages; these are provided in full below;

00_CHILDREN

<#@ template  language="C#" tier="1" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OleDbConnection Name="MetaData" ConnectionString="Provider=SQLOLEDB;Data Source=localhost\SQL2014;Integrated Security=SSPI;Initial Catalog=EXAMPLES" />
  </Connections>
  <Packages>
    <Package Name="Child01" ConstraintMode="Linear" ProtectionLevel="EncryptSensitiveWithUserKey" >
      <Parameters>
        <Parameter DataType="Int32" Name="MASTER_AUDIT_KEY">-1</Parameter>
      </Parameters>
      <Variables>
        <Variable DataType="Int32" Name="CHILD_AUDIT_KEY">-1</Variable>
      </Variables>
      <Tasks>
        <ExecuteSQL Name="CHILD_KEY_GET" ConnectionName="MetaData">
          <DirectInput>exec ? = [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET] ?, ?</DirectInput>
          <Parameters>
            <Parameter VariableName="User.CHILD_AUDIT_KEY" Name="0" DataType="Int32" Direction="ReturnValue" />
            <Parameter VariableName="MASTER_AUDIT_KEY" Name="1" DataType="Int32" Direction="Input" />
            <Parameter VariableName="System.PackageName" Name="2" DataType="String" Length="50" Direction="Input" />
          </Parameters>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>

01_MASTER

<#@ template  language="C#" tier="2" #>
  <Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages >
      <Package Name="Master" ConstraintMode="Linear" ProtectionLevel="EncryptSensitiveWithUserKey" >
        <Variables>
          <Variable DataType="Int32" Name="MASTER_AUDIT_KEY">-1</Variable>
        </Variables>
        <Tasks>
          <ExecuteSQL Name="MASTER_KEY_GET" ConnectionName="MetaData">
            <DirectInput>exec ? = [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET] ?</DirectInput>
            <Parameters>
              <Parameter VariableName="User.MASTER_AUDIT_KEY" Name="0" DataType="Int32" Direction="ReturnValue" />
              <Parameter VariableName="System.PackageName" Name="1"  DataType="String" Length="50" Direction="Input" />
            </Parameters>
          </ExecuteSQL>
          <ExecutePackage Name="Exec Child01"  >
            <ExternalProjectPackage Package="Child01.dtsx" />
            <ParameterBindings>
              <ParameterBinding VariableName="User.MASTER_AUDIT_KEY" Name="MASTER_AUDIT_KEY" />
            </ParameterBindings>
          </ExecutePackage>
        </Tasks>
      </Package>
    </Packages>
  </Biml>

There are a few things to remember here so I’ll break each out into its own paragraph.  I’ve been over some of these points before but they are worth inclusion nevertheless.

The 2 BIML files can be executed together by selecting both, right clicking and selecting ‘Generate SSIS Packages’ from the popup menu.  The ability to do this is required because a BIML snippet (for example, the connection) is defined in one code file and referenced in the other.  Naturally, the compilation order is important (note the connection is defined in the child file), so we want to compile that (child) file first.  The ‘tier’ keyword in each file’s directive determines the order of compilation for the selected files.

We can define a variable or parameter within the required scope through the tree structure of the XML document.  I’ve scoped these variables to the package (they are declared directly under the package node); however, if I wanted to restrict the scope of a variable to a container, I could declare it within the container’s XML nodes.

We pass the parameter from the master package to the child through the parameter bindings tag of the ExecutePackage task.  It’s pretty self-explanatory (when you look at the code) but this is the focus of the post so it’s probably worth at least calling it out :)