Building the Master Child Execution with BIML

In a previous post we looked at using BIML to execute a stored proc in order to assign a return value to a variable in BIDS.  The use case for this is pretty simple: when we execute a package, we want to get an ID for the execution and then record execution information against that ID.  When we execute packages through a master and child arrangement, we need to do two things.  Firstly, the master package gets an execution ID (to record the batch) and secondly, each child package execution gets its own ID (relative to the parent) to record its own execution.  This is the classic application of a header / detail relationship, as seen in the following diagram.

image

If you’d like to build this in your environment, here’s the SQL (I’ve used an ETL_FRAMEWORK schema and included the procs that return the run and child audit IDs).

USE [EXAMPLES]
GO

/* Header Table */
CREATE TABLE [ETL_FRAMEWORK].[MASTER_EXEC](
	[RUN_ID] [int] IDENTITY(1,1) NOT NULL,
	[JOB_NAME] [nvarchar](256) NULL,
	[START_DT] [datetime] NULL DEFAULT (getdate()),
 CONSTRAINT [pk_etl_framework_master_exec] PRIMARY KEY CLUSTERED ([RUN_ID])
);

/* child table */
CREATE TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC](
	[RUN_ID] [int] NULL,
	[CHILD_ID] [int] IDENTITY(1,1) NOT NULL,
	[JOB_NAME] [nvarchar](256) NULL,
	[START_DT] [datetime] NULL DEFAULT (getdate()),
 CONSTRAINT [pk_etl_framework_master_child_exec] PRIMARY KEY CLUSTERED ([CHILD_ID]) 
 );

ALTER TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC]  WITH CHECK ADD  CONSTRAINT [fk_etl_framework_master_child_exec_child_id] FOREIGN KEY([RUN_ID])
REFERENCES [ETL_FRAMEWORK].[MASTER_EXEC] ([RUN_ID])
ON DELETE CASCADE
;

ALTER TABLE [ETL_FRAMEWORK].[MASTER_CHILD_EXEC] CHECK CONSTRAINT [fk_etl_framework_master_child_exec_child_id]
;


/* proc for master_id */
CREATE PROC [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET]
(
  @JOB_NAME NVARCHAR(256)
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO ETL_FRAMEWORK.MASTER_EXEC(JOB_NAME) VALUES(@JOB_NAME);
 RETURN IDENT_CURRENT('ETL_FRAMEWORK.MASTER_EXEC');
END;

/* proc for child_id (requires master_id) */
CREATE PROC [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET]
(
  @RUN_ID INT 
  , @JOB_NAME NVARCHAR(256)
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO ETL_FRAMEWORK.MASTER_CHILD_EXEC(RUN_ID, JOB_NAME) VALUES(@RUN_ID, @JOB_NAME);
 RETURN IDENT_CURRENT('ETL_FRAMEWORK.MASTER_CHILD_EXEC');
END;

/****************************************************/
/*	Want an example of how it runs ? Try this		*/
/****************************************************/
declare @master_id int; 
declare @child_id int;
exec @master_id = [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET] 'TEST';
print @master_id;
exec @child_id = [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET] @master_id, 'TEST CHILD';
print @child_id ;

Now to BIML

Our solution is pretty simple – the parent fires first, gaining an ID (by firing the proc [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET]), and then passes this variable to the child package.  In the child package, we’ve registered MASTER_AUDIT_KEY as a parameter, with the Execute Package task passing the variable down to the child.  We can see this in the two images below;

image

image

You’ll also notice that I have 2 BIML files to build the packages.  These are provided in full below;

00_CHILDREN

<#@ template  language="C#" tier="1" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OleDbConnection Name="MetaData" ConnectionString="Provider=SQLOLEDB;Data Source=localhost\SQL2014;Integrated Security=SSPI;Initial Catalog=EXAMPLES" />
  </Connections>
  <Packages>
    <Package Name="Child01" ConstraintMode="Linear" ProtectionLevel="EncryptSensitiveWithUserKey" >
      <Parameters>
        <Parameter DataType="Int32" Name="MASTER_AUDIT_KEY">-1</Parameter>
      </Parameters>
      <Variables>
        <Variable DataType="Int32" Name="CHILD_AUDIT_KEY">-1</Variable>
      </Variables>
      <Tasks>
        <ExecuteSQL Name="CHILD_KEY_GET" ConnectionName="MetaData">
          <DirectInput>exec ? = [ETL_FRAMEWORK].[MASTER_CHILD_EXEC_ID_GET] ?, ?</DirectInput>
          <Parameters>
            <Parameter VariableName="User.CHILD_AUDIT_KEY" Name="0" DataType="Int32" Direction="ReturnValue" />
            <Parameter VariableName="MASTER_AUDIT_KEY" Name="1" DataType="Int32" Direction="Input" />
            <Parameter VariableName="System.PackageName" Name="2" DataType="String" Length="50" Direction="Input" />
          </Parameters>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>

01_MASTER

<#@ template  language="C#" tier="2" #>
  <Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages >
      <Package Name="Master" ConstraintMode="Linear" ProtectionLevel="EncryptSensitiveWithUserKey" >
        <Variables>
          <Variable DataType="Int32" Name="MASTER_AUDIT_KEY">-1</Variable>
        </Variables>
        <Tasks>
          <ExecuteSQL Name="MASTER_KEY_GET" ConnectionName="MetaData">
            <DirectInput>exec ? = [ETL_FRAMEWORK].[MASTER_EXEC_ID_GET] ?</DirectInput>
            <Parameters>
              <Parameter VariableName="User.MASTER_AUDIT_KEY" Name="0" DataType="Int32" Direction="ReturnValue" />
              <Parameter VariableName="System.PackageName" Name="1"  DataType="String" Length="50" Direction="Input" />
            </Parameters>
          </ExecuteSQL>
          <ExecutePackage Name="Exec Child01"  >
            <ExternalProjectPackage Package="Child01.dtsx" />
            <ParameterBindings>
              <ParameterBinding VariableName="User.MASTER_AUDIT_KEY" Name="MASTER_AUDIT_KEY" />
            </ParameterBindings>
          </ExecutePackage>
        </Tasks>
      </Package>
    </Packages>
  </Biml>

There are a few things to remember here so I’ll break each out into its own paragraph.  I’ve been over some of these points before but they are worth inclusion nevertheless.

The 2 BIML files can be executed together by selecting both, right-clicking and selecting ‘Generate SSIS Packages’ from the popup menu.  The ability to do this is required because a BIML snippet (for example the connection string) is defined in one code file and referenced in the other.  Naturally, the compilation order is important (note the connection is defined in the child file) so we want to compile that (child) file first.  The keyword ‘tier’ in the directive at the top of each file determines the order of compilation for the selected files.

We can define a variable or parameter within the required scope through the tree structure of the XML document.  I’ve scoped these variables to the package (they are declared directly under the package node), however, if I wanted to restrict the scope of a variable to a container, I could declare it within the container’s XML nodes.
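For illustration, here’s a minimal (hedged) sketch – the package, container and variable names are made up – showing one variable declared under the Package node and another declared under a Container node so that its scope is limited to that container;

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <Package Name="ScopeExample" ConstraintMode="Linear">
      <Variables>
        <!-- declared under the Package node: visible to the whole package -->
        <Variable Name="PACKAGE_SCOPED_KEY" DataType="Int32">-1</Variable>
      </Variables>
      <Tasks>
        <Container Name="SOME_CONTAINER" ConstraintMode="Linear">
          <Variables>
            <!-- declared under the Container node: only visible within SOME_CONTAINER -->
            <Variable Name="CONTAINER_SCOPED_KEY" DataType="Int32">-1</Variable>
          </Variables>
          <Tasks>
            <!-- tasks that use CONTAINER_SCOPED_KEY would go here -->
          </Tasks>
        </Container>
      </Tasks>
    </Package>
  </Packages>
</Biml>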

We pass the parameter from the master package to the child through the parameter bindings tag of the ExecutePackage task.  It’s pretty self-explanatory (when you look at the code) but this is the focus of the post so it’s probably worth at least calling it out :)

BIML, Stored Procedures and Return Parameters

Despite the improved logging features that have been incorporated into SQL2012+, I am a great fan of defining your own logging methods.  The most common method for this is having a package recognised by an ‘audit key’ and then using that key as a base point to record information that you want throughout the package.  For example, get the key (at the same time record the package start), then record success, failure, row counts, variable values etc. 
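To make that concrete, here’s a hedged sketch of what a ‘record completion’ step might look like.  Note that the proc dbo.MASTER_EXEC_END is a hypothetical stand-in I’ve made up for illustration (it is not part of the framework built in this post), and the connection string simply mirrors the one used later in this post;

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OleDbConnection Name="BIML_DB" ConnectionString="Provider=SQLNCLI11;Server=(myserver);Integrated Security=SSPI;Database=BIML" />
  </Connections>
  <Packages>
    <Package Name="Master_Logging_Sketch" ConstraintMode="Linear">
      <Variables>
        <Variable Name="MASTER_ID" DataType="Int32">-1</Variable>
      </Variables>
      <Tasks>
        <!-- ... the AUDIT_KEY task (shown later in this post) would populate MASTER_ID here ... -->
        <!-- record completion against that key; dbo.MASTER_EXEC_END is a hypothetical proc -->
        <ExecuteSQL Name="LOG_PACKAGE_END" ConnectionName="BIML_DB">
          <DirectInput>exec dbo.MASTER_EXEC_END ?</DirectInput>
          <Parameters>
            <Parameter VariableName="User.MASTER_ID" Name="0" DataType="Int32" Direction="Input" />
          </Parameters>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>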

When you use a master package to execute children, you associate the child’s execution (key) with a master execution (key), and this opens up some really great logging.  We’ll look at how to pass those keys down to the child in the next post, but for now, let’s look at how we generate and store a package key in BIML.  In our environment, we’ve got a ‘master’ table (MASTER_EXEC) to record (master) package execution and a stored proc (MASTER_EXEC_ID_GET) to return the identity value (whenever we fire the proc).  As a minimum, we pass the package name to that proc so we can identify the package that was run.  You can see the format of the table and the creation scripts below. 

CREATE TABLE DBO.MASTER_EXEC
(
 RUN_ID INT IDENTITY(1,1)
 , JOB_NAME NVARCHAR(256)
 , START_DT DATETIME DEFAULT(GETDATE())
);
GO

CREATE PROC DBO.MASTER_EXEC_ID_GET
(
  @JOB_NAME NVARCHAR(256)
)
AS
BEGIN
 SET NOCOUNT ON;
 INSERT INTO DBO.MASTER_EXEC(JOB_NAME) VALUES(@JOB_NAME);
 RETURN IDENT_CURRENT('DBO.MASTER_EXEC');
END

If we wanted to test this proc in SSMS, we’d use some syntax like;


DECLARE @RET INT;
EXEC @RET = [dbo].[MASTER_EXEC_ID_GET] 'Here is Some TEST Name'
PRINT @RET

Now, for the BIML.  Here, we just have to define a variable in the package’s scope to hold the key (I’ve used a variable called MASTER_ID) and execute the proc.  Procedure execution occurs through an ExecuteSQL task, and we assign (and capture) the stored proc’s parameter values through the ExecuteSQL task’s Parameters node.


<Biml xmlns="http://schemas.varigence.com/biml.xsd">
 <Connections>
  <OleDbConnection Name="BIML_DB" ConnectionString="Provider=SQLNCLI11;Server=(myserver);Integrated Security=SSPI;Database=BIML" />
 </Connections>
 <Packages>
  <Package Name="Master_3" ConstraintMode="Linear">
   <Variables>
    <Variable Name="MASTER_ID" DataType="Int32">-1</Variable>
   </Variables>
  <Tasks>
   <ExecuteSQL Name="AUDIT_KEY" ConnectionName="BIML_DB">
    <DirectInput>exec ? = dbo.MASTER_EXEC_ID_GET ?</DirectInput>
    <Parameters>
      <Parameter VariableName="User.MASTER_ID" Name="0" DataType="Int32" Direction="ReturnValue" />
      <Parameter VariableName="System.PackageName" Name="1" DataType="AnsiString" Length="256" Direction="Input" />
    </Parameters>
  </ExecuteSQL>
 </Tasks>
 </Package>
 </Packages> 
</Biml>

You can see that the SQL statement (in BIML) mirrors what we would otherwise do in SQL.  The BIML parameter definition does not specify the parameters by name but rather by their ordinal position in the SQL statement, and we also specify the direction.  This mimics exactly what we would do in SSIS.

BIML – Importing Text Files

Here’s a BIML snippet for reading data from a text file.  If you’ve read the prior post about pushing data (from a table) to a text file, you’ll see a lot of the same constructs.  There are;

  1. A connection object (ie the FlatFileConnection) that specifies a path to a file.  The connection requires a FileFormat.
  2. A FileFormat which specifies the nature of the file (ie columns, data types, delimiters etc)
  3. The data flow object (ie the Source in the DataFlow which is a FlatFileSource).

I do not think there is too much complexity in the Connection or [DataFlow] Source.  There are really only a few attributes that need to be defined and, if you think about the nature of the task we are trying to achieve, they are all required.

What is most interesting is the specification of the FlatFileFormat.  This defines the nature of the data that we are connecting to, both at a header and a detail level.  For example, consider the delimiter constructs for the header (HeaderRowDelimiter).  You would think that the delimiter is applicable to the entire file, but this is not the case: it also needs to be specified on a column-by-column basis.  Finally, we also need to specify the new line (row delimiter) on the last column.  Perhaps that’s not so intuitive, since we also specify the row delimiter in the header section of the format.

So here is what we are trying to import.

image

And here is the BIML snippet to import it.


<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <FlatFileConnection Name="my_import" FileFormat="my_values" FilePath="C:\temp\my_values.csv" />
    </Connections>
    <FileFormats>
        <FlatFileFormat Name="my_values" ColumnNamesInFirstDataRow="true" RowDelimiter="CRLF" HeaderRowDelimiter="Comma" CodePage="1252" IsUnicode="false">
            <Columns>
                <Column Name="ID" DataType="Int32" Delimiter="Comma" />
                <Column Name="Name" DataType="AnsiString" Length="255" Delimiter="Comma" />
                <Column Name="Value" DataType="Double" Delimiter="CRLF"  />
            </Columns>
        </FlatFileFormat>
    </FileFormats>
    <Packages>
        <Package Name="import_MyValue" ConstraintMode="Linear" ProtectionLevel="EncryptAllWithUserKey">
            <Tasks>
                <Dataflow Name="load_values">
                    <Transformations>
                        <FlatFileSource Name="src_values" ConnectionName="my_import" />
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>

SSRS – Add a little Pizzazz to SSRS Reports with Row Numbers and Pop-Out Windows

Canned reports are the bread and butter of information delivery – I have read that up to 70% of users can be classified as consumers, the type of person that gets a report and is happy with that.  There is no desire to analyse additional datasets, examine Twitter sentiment or speculate on correlations.  Consumers just need data/information presented to them for their day-to-day activity in an intuitive and navigation-friendly way, and the report needs to be built for their intended use.

In SSRS, I have found 2 common requirements for features that do not come out of the box, and this post looks at 2 snippets to create better reports that enhance the user experience.  These are adding row numbers to grids and creating popup windows.

Adding Row Numbers

Row numbers help to provide scale.  As a simple example, imagine you’ve got a report stating that there are some number of products sold and you provide an action to drill through and see what those products are.  If you’ve got a column for row numbers, you can easily reconcile the quantity between the first report and the detail in the second.

You may (naturally enough) think that you can simply use a RowNumber formula and set the scope of the function to the Data Set that drives the control (that is, the table or matrix).  Consider a data set (category_summary) that summarises invoice data by category, product and year.  If we add our expected formula as a row counter (like this);

=RowNumber("category_summary")

Using RowNumber() is only applicable when the grain (the detail grouping for rows) is the same as that of the dataset.  For example, if I add a matrix with a row group of { Year, Category, Product } (as below), RowNumber works just fine.

image

image

Unfortunately, if that detail grouping grain is broken (say, for example, I use the same data set in a matrix with Category groupings on rows and Year on columns), RowNumber returns unwanted results, showing the last value at the control’s grain (as below).  Note that there are 36 products in the 1st category in 1996.

image

In order to get an incremental value for each row, we need to determine the running value over the distinct elements that we are after (Category in our case).  This can be done using the following formula.

=RunningValue(Fields!CategoryID.Value, CountDistinct, "category_summary")

More information about RunningValue can be found here

Using POPUP Windows instead of SSRS Actions

Using an action is a very flexible way to redirect focus.  We can identify the coordinates of a cell and then pass those coordinates to another report as parameters.  If we reconsider our matrix of Categories on rows and Years on columns, we may want the user to drill through to a new report that shows the products and their quantities that make up that value.

The simplest way to do that is to specify the report and assign the parameter values from those in the matrix (as below).

image

Unfortunately, the end-user experience for this is rather poor, and the same goes for any action in a report.  When the user clicks on the cell in the published report, the current report is replaced with the new report.  If they want to return (to the original starting position), they are forced to navigate using the browser buttons.  Even when you specify a URL (as opposed to a ‘Go to report’ action), the behaviour is the same :(

If we want to open the report in a new window, the best way to do this is to ‘trick’ the action into running JavaScript that does this via the window.open method.  You can read more about that function here, but for now, let’s look at the basics.

In my report, I’ve created a function that determines the report server URL to open the report that I want.  This includes the ability to pass parameters based on the position of the cell in the matrix (or table) control.  The function is;

Public Function WindowURL(ByVal OrderYear As String, ByVal CategoryID As String) As String

  Return "http://localhost/reportserver?/reports/ssrs_pizzas/reports/CategoryYearDetail&OrderYear=" + OrderYear + "&CategoryID=" + CategoryID

End Function

You can see it’s pretty simple and just builds the URL based on the provided OrderYear and CategoryID.  Secondly, I want to specify the JavaScript to use that URL and open a new window.  The ‘base’ JavaScript takes the form javascript:void window.open("URL");.  I’ll add a second function (as below) to build the full snippet.  You’ll note that I am still planning to pass the OrderYear and CategoryID to the function since it will be called from the cell in the matrix.

Public Function OpenReport(ByVal OrderYear As String, ByVal CategoryID As String) As String

  Return "javascript:void window.open( """ + WindowURL(OrderYear, CategoryID) + """ );"

End Function

The function (OpenReport) is then called from the matrix cell

image

Now when we click on a cell, a new window will open.  There will only ever be one new window, regardless of the cell that you click.  If you want more control over the popup window, we can address that in other ways (as below).

Extending Window.open

One of the great things about using the window.open method is that you can get very fine control over how the window is managed and appears.  The full syntax can be found here.  For me, the most important arguments are the name of the window and its specs. The format for window.open is

window.open(URL, name, specs, replace )

The window’s name is interesting because a window can only have a single name (forget about the target attribute for the window).  If we recap on the Category/Year example, the name can be used to define how many windows will open (and the conditions for opening a new window).  For example, if our name was simply “Category”, the content of the popup window would always be replaced (since there is only ever one ‘Category’ window).

However, we could derive a concatenated name from “Category” and [Category ID].  If this were the case, a new window would be created for each new instance of category (mind you, you’d still have to click on each unique category), so the single report can spawn multiple popup windows.  We could take it to a higher level of granularity and include the year (in the name) and have even more windows (potentially) open.

One thing to keep in mind is that the window’s name cannot contain some special characters.  Hyphens are out :)

Finally, we can also control the size, position and functionality of the popup window.  For example, the following specs string opens a small window in the top left corner of the screen without menus or toolbars.  To my way of thinking, it’s just a data-driven information window that we want our users to see.

 
'width=700,height=500,toolbar=0,menubar=0,location=0,status=1,scrollbars=1,resizable=1,left=0,top=0'

Conclusion

That’s about it.  Out of the box, the drill-through functionality could be considered unfriendly, but we can manage and control it using some JavaScript.

Cleaning Your Database with BIML

Blow it away and start again :)

When we are working in a dynamic environment, we need to push through changes quickly, and a prerequisite for this is the ability to start from scratch – starting with a blank database and deploying the build from a clean state.  If you can’t drop and recreate the database, then the next best option is to clean it out: delete all the objects and recreate them as part of your build.

You can scrub tables in several ways.  The one that I propose here is a sequential (and dynamic) method that is relatively straightforward (and of course we can look into BIML for it).  The template for the pattern is given in the pseudo code below;

Assume there are tables to delete
While there are tables to delete then
  table_list <- get the list of tables
  foreach table in table_list, try to delete the table
  determine if there are (still) tables to delete

In SSIS, this looks like the following pattern, and I have included a list of variables that I used to run the package.  There are some more advanced features of SSIS being used here which I will call out;

  1. We use TABLE_COUNT (initial value 1) to control the continued execution of the package (that is, the WHILE_TABLES_EXIST container).  This container runs while the TABLE_COUNT value is greater than or equal to 1 (that is, while my database still has tables).
  2. TABLES_IN_DB is an object variable that holds a list of table names; it is the result of the SQL command GET_TABLE_LIST.  The purpose here is to query the database metadata in order to determine that list of names.
  3. The iterator FOREACH_RECORD_IN_TABLE_LIST enumerates over each record in TABLES_IN_DB (assigning the table name to the variable TABLE_NAME).  Within that container, we generate the SQL to execute (ie the drop command) in the GENERATE_SQL expression task by assigning it to the SQL variable, and then execute that variable via an Execute SQL task.  We do not want that task to fail the package if the command does not work (after all, there may be dependencies between tables, so some drops will fail on the first pass and succeed on a later one).
  4. Finally, after the first batch of executions has run (and hopefully all the tables are deleted), we recount the tables in the database, storing the value in the TABLE_COUNT variable.  Control is then passed back to WHILE_TABLES_EXIST to see if there are still tables in the database and determine whether the process should begin again.

image

 In BIML

This process is very generic and can be applied in any database.  There’s also no need for BIML Script in code (since we do not rely on the generation of tasks which specifically rely on data).  Here is the full snippet;


<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<Connections>
    <OleDbConnection Name="FM_MYSQL" ConnectionString="provider=MSDataShape;server=foodmart_mysql;uid=foodmart_admin;pwd=whats_my_password?" DelayValidation="true"  />
</Connections>

    <Packages>
        <Package Name="01-Clean_MySQL" ProtectionLevel="EncryptAllWithUserKey" ConstraintMode="Linear">

            <!-- these variables are created in the packages scope -->
            <Variables>
                <Variable Name="TABLES_IN_DB" DataType="Object">
                </Variable>
                <Variable Name="TABLE_COUNT" DataType="Int32">1</Variable>
            </Variables>

            <Tasks>

                <!-- the first container (while records exist) note the evaluation expression -->
                <ForLoop Name="WHILE TABLES EXIST" ConstraintMode="Linear">
                    <LoopTestExpression>@TABLE_COUNT>=1</LoopTestExpression>

                    <!-- tasks within the container are contained in a tasks tag -->
                    <Tasks>

                        <!-- get the list of table (names) .. note the record set is assigned to the variable TABLES_IN_DB -->
                        <ExecuteSQL Name="GET_TABLE_LIST" ResultSet="Full" ConnectionName="FM_MYSQL">
                            <DirectInput>SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='foodmart';</DirectInput>
                            <Results>
                                <Result Name="0" VariableName="User.TABLES_IN_DB" />
                            </Results>
                        </ExecuteSQL>

                        <!-- Iterate over each record in the TABLES_IN_DB variable
                            Note how we assign the current record to the TABLE_NAME variable -->
                        <ForEachAdoLoop Name="FOREACH_RECORD_IN_TABLE_LIST" SourceVariableName="User.TABLES_IN_DB" ConstraintMode="Linear">
                            <Variables>
                                <Variable Name="TABLE_NAME" DataType="String">
                                </Variable>
                                <Variable Name="SQL" DataType="String">
                                </Variable>
                            </Variables>
                            <VariableMappings>
                                <VariableMapping Name="0" VariableName="User.TABLE_NAME" />
                            </VariableMappings>

                            <Tasks>

                                <!-- Generate the SQL Statement -->
                                <Expression Name="GENERATE SQL" Expression="@[User::SQL]=&quot;DROP TABLE foodmart.&quot; + @[User::TABLE_NAME]"/>
                                <!-- Execute the SQL Statement (based on the user variable (SQL)) -->

                                <ExecuteSQL Name="DELETE TABLE" ConnectionName="FM_MYSQL" DelayValidation="true" FailPackageOnFailure="false">
                                    <VariableInput VariableName="User.SQL" />
                                </ExecuteSQL>

                            </Tasks>

                        </ForEachAdoLoop>

                        <!-- Recount Tables and store in the variable (which is passed back to the parent container)  -->
                        <ExecuteSQL Name="GET_TABLES_COUNT" ResultSet="SingleRow" ConnectionName="FM_MYSQL">
                            <DirectInput>SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='foodmart';</DirectInput>
                            <Results>
                                <Result Name="0" VariableName="User.TABLE_COUNT" />
                            </Results>
                        </ExecuteSQL>

                    </Tasks>
                </ForLoop>
            </Tasks>
        </Package>
    </Packages>

</Biml>

You might notice that I am using a MySQL ODBC database connection.  Other features are commented in the code.

As stated, there is no reliance on BIML Script here, so you may ask ‘Why use BIML?’.  The reason for this is that the snippet includes samples of a few controls and how they are configured in different circumstances.

Barry Devlin & the Business un-Intelligence Course

I really enjoy Barry Devlin’s approach to Business Intelligence.  He is an under-appreciated name in the industry (he was one of the originators of ‘Data Warehouse’ way back in the mid-’80s), so when he has something to say, it’s worth a listen – especially when he challenges ‘modern’ perceptions of Data Warehousing and the BI Industry.

And this brings me to the purpose of this post.  For those in the Asia Pacific region, Barry is presenting a course on Big Data Analytics & Reinventing Business Intelligence in Singapore July 6-8th, 2015 (http://bit.ly/1E3I90e).

So, what’s to be expected from the course? You can get the full agenda here (http://bit.ly/1c7hHIv).  Briefly, the course covers the complete range of topics on how to design a modern information/data environment, ranging from the fundamental principles of how all types of data should be absorbed and used within the organisation to an overview/assessment of tool choices and the implications of new neurobiological thinking about how decisions are really made.

Having read Barry’s latest book “Business unIntelligence – Insight and Innovation Beyond Analytics and Big Data” (http://bit.ly/BunI_Book), I suggest that this is a perfect opportunity to investigate the strategic direction of BI (from an organizational or consulting perspective) and address the implementation of BI.

Oh, and by the way – I’m told that foreign attendees get special discounts, all you have to do is ask :)

SSIS – Creating an OLEDB Connection to an ODBC Source and SSIS parameters

This post was originally going to be about using the advanced features of an SSIS OLEDB Command to set parameters.  I’ve been using Oracle sources lately and the OLEDB Command did not recognise the parameters for my query. 

For example, in my work environment, the statement (below) just did not recognise the key_field parameter, and so I had to manually add it through the component’s advanced editor.

Update some_table 
set some_field = new_value 
where key_field = ?

We’ll look into the advanced features a little later; however, when I tried to mimic the behaviour (using a MySQL connection), the only issue I had was creating an OLEDB connection in SSIS.  I have ODBC connections on my machine; however (naturally enough, given the component’s name), the OLEDB Command requires an OLEDB connection.

So, to create an OLEDB connection (and have it reference an ODBC source), simply use the Native OLE DB\MSDataShape provider and use the ODBC DSN name as the server (as in the image below).  The UserId, Password and catalogue/schema can all be set through the connection manager’s interface.

image
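If you’re generating the connection with BIML, the same trick applies in the ConnectionString.  Here’s a hedged sketch – “MyOdbcDsn”, the uid and the pwd are placeholders for your own DSN and credentials – where the provider/server pattern mirrors the FM_MYSQL connection used in the database-cleaning snippet above;

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <!-- "MyOdbcDsn" is a placeholder for the ODBC DSN name defined on your machine -->
    <OleDbConnection Name="ODBC_VIA_OLEDB" ConnectionString="provider=MSDataShape;server=MyOdbcDsn;uid=my_user;pwd=my_password" DelayValidation="true" />
  </Connections>
</Biml>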

 

Parameter Management under OLEDB

I can now use that OLEDB connection to execute an OLEDB Command.  As discussed above, my original intent was to show how to add a parameter when SSIS did not recognise the parameters.  In my situation, this was against an Oracle source.  In this demo, I’ll use a simple command

update adventure_works_2012.customer 
set territory_id = ? 
where account_number = ? 

and the Column Mappings tab of the OLEDB Command allows me to map fields in my source (the input data flow) to my command (ie the parameters in the destination).  We can see this in the mapping screenshot below.

image

You will notice that the names are generic (Param_0 ….) and refer to the placement of the ? symbols in the source query.  That’s not really user friendly, and if I want to give these meaningful names, you would think that I could set them in the Input and Output Properties (using the Name property as below).  However, this is not the way this UI works, and the name defaults to the placement of the ? in the query.  Long story short, you can’t control the names of the parameters.

image

However, this part of the (advanced) UI does give us a little indication of how to add parameters.  If no parameters are recognised, we can add them through the ‘Add Column’ button on the External Columns node of the Inputs and Outputs tab.  This is what I had to do with my Oracle command.