BIML, Stored Procedures and Return Parameters

Despite the improved logging features that have been incorporated into SQL2012+, I am a great fan of defining your own logging methods.  The most common method for this is having a package recognised by an ‘audit key’ and then using that key as a base point to record information that you want throughout the package.  For example, get the key (at the same time record the package start), then record success, failure, row counts, variable values etc. 

When you use a master package to execute children, you associate the childs execution (key) with a master execution (key) and this opens up some really great logging.  We’ll look at how to pass those keys down to the child in the next post, but for now, lets look at how we generate and store a package key in BIML.  In our environment, we’ve got a ‘master table (MASTER_EXEC) to record (master) package execution and a stored proc (MASTER_EXEC_ID_GET) to return the identity value (when ever we fire the proc).  As a minimum, we pass the package name to that proc so we can identify the package that was run.  You can see the format of the table and the creation scripts below. 


If we wanted to test this proc in SSMS, we’d use some syntax like;

EXEC @RET = [dbo].[MASTER_EXEC_ID_GET] 'Here is Some TEST Name'

Now, for the BIML.  Here, we just have to define a variable in the packages scope to hold the key  (Ive used a variable called MASTER_ID), and execute the proc.  Procedure execution occurs through an ExecuteSQL task and we assign (and capture) the stored procs parameter values through the ExecuteSql tasks parameters node.

<Biml xmlns="<a href=""">"</a>>
  <OleDbConnection Name="BIML_DB" ConnectionString="Provider=SQLNCLI11;Server=(myserver);Integrated Security=SSPI;Database=BIML" />
  <Package Name="Master_3" ConstraintMode="Linear">
    <Variable Name="MASTER_ID" DataType="Int32">-1</Variable>
   <ExecuteSQL Name="AUDIT_KEY" ConnectionName="BIML_DB">
    <DirectInput>exec ? = dbo.MASTER_EXEC_ID_GET ?</DirectInput>
      <Parameter VariableName="User.MASTER_ID" Name="0" DataType="Int32" Direction="ReturnValue" />
      <Parameter VariableName="System.PackageName" Name="1" DataType="AnsiString" Length="256" Direction="Input" />

You can see that the SQL statement (in BIML) mirrors what we would otherwise do in SQL.  The BIML parameter definition does not need to specify the parameters by name (rather by their ordinal position in the SQL statement) and we also specify the direction.  This mimics exactly what we would do in SSIS.

BIML – Importing Text Files

Here’s a BIML snippet for reading data from a text file.  If you’ve read the prior post about pushing data (from a table) to a text file, you’ll see a lot of the same constructs.  There is;

  1. A connection object (ie the FlatFileConnection) that specifies a path to a file.  The connection requires a FileFormat.
  2. A FileFormat which specifies the nature of the file (ie columns, data types, delimiters etc)
  3. The data flow object (ie the Source in the DataFlow which is a FlatFileSource).

I do not think there is too much complexity in the Connection or [DataFlow] Source.  There are really only a few attributes that need to be defined and, if you think about the nature of the task we are trying to achieve, there are all required.

What is the most interesting is the specification of the FlatFileFormat.  This defines the nature of the data that we are connecting to.  Both at a header and detail level.  For example consider the delimiter constructs for the header (HeaderRowDelimiter).  You would think that the delimiter is applicable to the entire file but this is not the case and it also needs to be specified on a column by column basis.  Finally, we also need to specify the new line on the last column.  Perhaps that’s not so intuitive since we specify the row delimiter in the header section of the Format.

So here is what we are trying to import.


And here is the BIML snippet to import it.

<Biml xmlns="">
        <FlatFileConnection Name="my_import" FileFormat="my_values" FilePath="C:\temp\my_values.csv" />
        <FlatFileFormat Name="my_values" ColumnNamesInFirstDataRow="true" RowDelimiter="CRLF" HeaderRowDelimiter="Comma" CodePage="1252" IsUnicode="false">
                <Column Name="ID" DataType="Int32" Delimiter="Comma" />
                <Column Name="Name" DataType="AnsiString" Length="255" Delimiter="Comma" />
                <Column Name="Value" DataType="Double" Delimiter="CRLF"  />
        <Package Name="import_MyValue" ConstraintMode="Linear" ProtectionLevel="EncryptAllWithUserKey">
                <Dataflow Name="load_values">
                        <FlatFileSource Name="src_values" ConnectionName="my_import" />

SSRS – Add a little Pizzazz to SSRS Reports with Row Numbers and Pop-Out Windows

Canned reports are the bread and butter of information delivery – I have read that up to 70% of users can be classified as consumers or the type of person that gets a report and is happy with that. There is no desire to analyse additional datasets, examine tweeter sediment or speculate correlation.  Consumers just need data/information presented them for their day to day activity in an intuitive and navigation friendly and the report needs to be built for their indented use.

In SSRS, I have found 2 common requirements for features that do not come out of the box and this post looks at 2 snippets to create better reports which enhance the user experience. These are Row Numbers to Grids and the creation of Popup windows.

Adding Row Numbers

Row numbers help to provide scale.  As a simple example, imagine you’ve got a report stating that there are some number of products sold and you provide an action to drill through and see what those products are.  If you’ve got a column for row numbers, you can easily reconcile the quantity between the first report and the detail in the second.

You may (naturally enough) think that you can simply use a RowNumber formula and set the scope of the function to the Data Set that drives the control (that is, the table or matrix).  Consider a data set (category_summary) that summarises invoice data by category, product and year.  If we add our expected formula as a row counter (like this);


Using RowNumber() is only applicable when the grain (detail grouping for rows) is the same as the dataset.  For example, I add a matrix with a row group as { Year, Category, Product } as below.  Here RowNumber works just fine



Unfortunately if that detail grouping grain is broken (say for example I use the same data set in a matrix with Category groupings on rows, Year on columns), the RowNumber returns unwanted results showing the last value at the controls grain (as below).  Note that there are 36 rows products in the 1st category in 1996.


In order to get a incremental value for each row, we need to determine the running value over the distinct elements that we are after (Category in our case).  This can be done using the following formula.

=RunningValue(Fields!CategoryID.Value, CountDistinct, "category_summary")

More information about RunningValue can be found here

Using POPUP Windows instead of SSRS Actions

Using an action is a very flexible way to redirect focus.  We can identity coordinates of a cell and then pass those coordinates to another report as parameters.  If we reconsider our matrix of Categories on rows and Years on columns, we may want to the user to drill through to a new report that shows the products and their quantities that make up that value.

The simplest way to to that is specify the report and assign parameter values as those in the matrix (as below). image

Unfortunately, the end user experience for this rather poor and for any action in a report.  When the user clicks on the cell in the published report, the current report is replaced with the new report.  If they want to return (to the original starting position), they are forced to navigate using the browser buttons.  Even when you specify the URL (as opposed to a Go to report action), the behaviour is the same :(

If we want to open the report in a new window, the best way to do this is to ‘trick’ the action into running javascript that does this using method.  You can read more about that function here, but for now, lets look at the basics.

In my report, I’ve created a function that determines the report server URL to open the report that I want.  This includes the ability to pass paramaters based on the position of the cell in the matrix (or table) control.  The function is;

Public Function WindowURL(byval OrderYear as String, ByVal CategoryID as String) as String

return "<a href="http://localhost/reportserver?/reports/ssrs_pizzas/reports/CategoryYearDetail&OrderYear="+ OrderYear + "CategoryID=" + CategoryID  

End Function  

You can see its pretty simple and just appends creates the URL based on the provided OrderYear and CategoryID.  Secondly, i want to specify the javascript to use that URL and open a new window.  The ‘base’ java script takes the form javascript:void“URL”);  I’ll add a second function (as below) to build the full snippet.  You’ll note that I am still planning the pass the OrderYear and CategoryId to the function since it will be called from the cell in the matrix.

Public Function OpenReport(ByVal OrderYear as String, ByVal CategoryID as String) as String  <p>&nbsp; return "javascript.void """ + WindowURL(OrderYear, CategoryID) +""" );"  <p>End Function  <p> <p>

The function (OpenReport) is then called from the matrix cell


Now when we click on a cell, a new window will open.  There only be one new window regardless of the cell that you click.  If you want more control over the pop up window, we can address that in other ways (as below).


One of the great things about using the method is that you can get very fine control over how the window is managed and appears.  The full syntax can be found here.  For me, the most important arguments are the name of the window and its specs. The format for is, name, specs, replace )

The windows name is interesting because a window can only have a single name (forget about the target attribute for the window).  If we recap on the Category, Year example, the name can be used to define how many windows will open (and the conditions for opening a new window).  For example, if our name was simply “Category” the content of popup window would always be replaced (since there is only one ‘Category’ window.

However, we could derive a concatenated name of “Category” and [Category ID].  If this were the case, a new window would be created for each new instance of category… (mind you, you’d still have to click on each unique category) however a new window would open for each category.  The single report can spawn multiple popup windows.  We could take it to a higher level of granularity and include the year (in the name) and have even more windows (potentially) open.

One thing to keep in mind is that the windows name can not contain some special characters.  Hyphens are out :)

Finally, we can also control, the size, position and functionality of the popup window.  For example, the following string opens a small window in the top left corner of the screen without menu or toolbars.  To my way of thinking its just a data driven information window that we want our users to see.



That’s about it.  Out of the box, functionality could be considered unfriendly.  But we can manage and control its functionality using some javascript.

Cleaning Your Database with BIML

Blow it away and start again :)

When we are working in a dynamic environment, we need to push through changes quickly and a prerequisite for this means starting from scratch.  Starting with a blank database and deploying the build from a clean state.  If you can’t drop and recreate the database, then the next best option is to clean it out, delete all the objects and recreate them as part of your build.

You can scrub tables in several ways.  The one that I propose here is a sequential (and dynamic method) that is relatively straight forward (and of course we can look into BIML for it).  The template for the pattern is given in the pseudo code below;

Assume there are tables to delete
While there are tables to delete then
  table_list <- get the list of tables
  foreach table in table_list, try to delete the table
  determine if there are (still) tables to delete

In, SSIS, this looks like the following pattern and I have included a list of variables that I used to run the package.  There some more advanced features of SSIS being used here which I will call out;

  1. We use TABLE_COUNT (initial value 1) to control the continued execution of the package (that is, the WHILE_TABLE_EXISTS container).  This container runs while the TABLE_COUNT value is greater than 1 (or my database still has tables).
  2. The TABLES_IN_DB is an object that holds a list of table names, this is the result of the SQL Command GET_TABLE_LIST.  The purpose here, is to query the database metadata in order to determine a list of names.
  3. The iterator FOREACH_RECORD_IN_TALBE_LIST enumerates over each record in the TALBES_IN_DB (assigning the table name to the variable TABLE_NAME).  Within that container, we generate what SQL to execute (ie the drop command) in the GENERATE_SQL expression by assigning it to the SQL variable.  Then we execute that variable via an Execute Command.  Since we do not want the task to fail if the command does not work (after all there may be some dependencies between tables and execution order).
  4. Finally, after the first batch of executions has run (and hopefully all the tables are deleted), we recount the tables in the database, storing the values in the TABLE_COUNT variable.  Control is then passed back to the WHILE_TABLES_EXIST to see if there are tables in the database and determine whether the process should begin again.



This process is very generic and can be applied in any database.  There’s also no need for BIML Script in code (since we do not rely on the generation of tasks which specifically rely on data).  Here is the full snippet;

<Biml xmlns="">

    <OleDbConnection Name="FM_MYSQL" ConnectionString="provider=MSDataShape;server=foodmart_mysql;uid=foodmart_admin;pwd=whats_my_password?" DelayValidation="true"  />

        <Package Name="01-Clean_MySQL" ProtectionLevel="EncryptAllWithUserKey" ConstraintMode="Linear">

            <!-- these variables are created in the packages scope -->
                <Variable Name="TABLES_IN_DB" DataType="Object">
                <Variable Name="TABLE_COUNT" DataType="Int32">1</Variable>


                <!-- the first container (while records exists) note the evaluation expresssion -->
                <ForLoop Name="WHILE TABLES EXIST" ConstraintMode="Linear">

                    <!-- tasks within the container are contained in a tasks tag -->

                        <!-- get the list of table (names) .. note the record set is assigned to the variable TABLES_IN_DB -->
                        <ExecuteSQL Name="GET_TABLE_LIST" ResultSet="Full" ConnectionName="FM_MYSQL">
                            <DirectInput>SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='foodmart';</DirectInput>
                                <Result Name="0" VariableName="User.TABLES_IN_DB" />

                        <!-- Iterate over each record in the TABLES_IN_DB variable
                            Note how we assign the current record to the TABLE_NAME variable -->
                        <ForEachAdoLoop Name="FOREACH_RECORD_IN_TABLE_LIST" SourceVariableName="User.TABLES_IN_DB" ConstraintMode="Linear">
                                <Variable Name="TABLE_NAME" DataType="String">
                                <Variable Name="SQL" DataType="String">
                                <VariableMapping Name="0" VariableName="User.TABLE_NAME" />


                                <!-- Generate the SQL Statement –>
                                <Expression Name="GENERATE SQL" Expression="@[User::SQL]=&quot;DROP TABLE foodmart.&quot; + @[User::TABLE_NAME]"/>
                                <!-- Execute the SQL Statement (based on the user variable (SQL)) –>

                                <ExecuteSQL Name="DELETE TABLE" ConnectionName="FM_MYSQL" DelayValidation="true" FailPackageOnFailure="false">
                                    <VariableInput VariableName="User.SQL" />



                        <!-- Recount Tables and store in the variable (which is passed back to the parent container)  -->
                        <ExecuteSQL Name="GET_TABLES_COUNT" ResultSet="SingleRow" ConnectionName="FM_MYSQL">
                            <DirectInput>SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='foodmart';</DirectInput>
                                <Result Name="0" VariableName="User.TABLE_COUNT" />



You might notice that I am using an MYSQL ODBC database connection.  Other features are commented in the code.

As stated, there is no reliance on BIML Script here, so you may ask ‘Why use BIML?’.  The reason for this is that the snippet includes samples of a few controls and how they are configured in different circumstances.

Barry Devlin & the Business un-Intelligence Course

I really enjoy Barry Devlin’s approach to Business Intelligence.  He is an under-appreciated name in the industry (he was one of the originators of ‘Data Warehouse’ way back in the mid-’80s), so when he has something to say, it’s worth a listen – especially when he challenges ‘modern’ perceptions of Data Warehousing and the BI Industry.

And this brings me to the purpose of this post.  For those in the Asia Pacific region, Barry is presenting a course on Big Data Analytics & Reinventing Business Intelligence in Singapore July 6-8th, 2015 (

So, what’s to be expected from the course? You can get the full agenda here (  Briefly, the course covers the complete range of topics on how to design a modern information/data environment, ranging from the fundamental principles of how all types of data should be absorbed and used within the organisation to an overview/assessment of tool choices and the implications of new neurobiological thinking about how decisions are really made.

Having read Barry’s latest book “Business unIntelligence – Insight and Innovation Beyond Analytics and Big Data” (, I suggest that this is a perfect opportunity to investigate the strategic direction of BI (from an organizational or consulting perspective) and address the implementation of BI.

Oh, and by the way – I’m told that foreign attendees get special discounts, all you have to do is ask :)

SSIS – Creating an OLEDB Connection to an ODBC Source and SSIS parameters

This post was originally going to be about using the advanced features of a SSIS OLEDB command to set parameters.  I’ve been using Oracle sources lately and the OLEDB command did not recognise the parameters for my query. 

For example, my in my work environment, the statement (below) just did not recognise the key_field parameter and so I had to manually add it through the components advanced editor.

Update some_table 
set some_field = new_value 
where key_field = ?

We’ll look into the advanced features a little later, however, when I tried to mimic the behaviour (using a MYSQL connection), the only issue I had was creating an OLEDB connection in SSIS.  I have ODBC connections on my machine however, (naturally enough given the components name), the OLEDB command requires an OLEDB connection.

So, to create an OLEDB command (and have it reference an ODBC source), simply use the Native OLE DB\MSDataShape provider and use the ODBC name as the server (as in the image below).  The UserId, Password and catalogue/schema can all be set through the connection managers interface.



Parameter Management under OLEDB

I can now use that OLEDB connection to execute an OLEDB Command.  As discussed above, my original intent was to show how to add a parameter when SSIS did not recognise the parameters.  In my situation, this was against an Oracle source.  In this demo, I’ll use a simple command

update adventure_works_2012.customer 
set territory_id = ? 
where account_number = ? 

and the column mappings tab of the OLEDB command allows me to map fields in my source (input data flow) to my command (ie parameters in the destination).  We can see this in the mapping screen shot below.


You will notice that the names are generic (Param_0 …. ) and refer the the placement of the ? symbols in the source query.  That’s not really user friendly and if i want to give these meaningful names, you would think that I can set them in the Input and Output Properties (using the Name Property as below).  However, this is not the way this UI works and the name defaults to the placement of the ? in the query.  Long story short, you can’t control the names of the parameters.


However, this part of the (advanced UI) does give us a little indication into how to add parameters.  If no parameters are recognised, my can add them through the ‘Add Column’ button on the External Columns node of the Inputs and Outputs tab.  This is what i had to do with my Oracle command.

Data Vault Training – Australia in 2015

The Genesee Academy’s Data Vault training has been announced for 2015 in the Australian region. 

Hans Hultgren will be running classes on Brisbane, Melbourne, Sydney and Auckland in July and August this year (the dates are below with links to the courses and registration). 

It’s a great class (I’ve heard multiple recommendations about this one) so if you interested in Data Vault, why not give attend and learn from someone who literally wrote the book


Jul 13 – 15th 3 Day Certification – Brisbane Australia Registration
Jul 22 – 24th 3 Day Certification – Melbourne Australia Registration
Jul 27 – 28th 2 Day Certification – Sydney Registration
Aug 3 – 4th 3 Day Certification – Auckland Registration