BI Success – Does the tail wag the dog?

I am watching recorded event this morning about the keys to Business Intelligence success. The top 4 points that were provided are;

  1. Finding the right (executive sponsor)
  2. Don’t build what they ask for, build what they need
  3. Offer up a buffet (deliver multiple options)
  4. Don’t try to boil the ocean.

Lets be honest, there’s nothing new here – having the right sponsor has been listed as a critical success factor numerous times. So, to paraphrase, the presentation was basically talking about finding the correct person and delivering something that truly works for the business in an incremental approach. Why incremental? Well, I guess, (if I have my skeptics hat on) I’d suggest its better to fail fast with small outcomes so the risk adverse manager doesn’t get tagged with a failed project. On the positive note, it better aligns the business outcomes with whats delivered, reduces inertia to adoption and maintains focus on outcomes (from both the business and those delivering).

What raised my eyebrows was the first choice, finding the right sponsor. This is so important that, without doubt, I’d suggest that any rouge project would eventually be killed if it doesn’t have the right ‘executive’ support (if you don’t think so tell me otherwise). I’ve delivered remarkable shadow IT projects that were canned and aggressively removed by IT departments. For example, I’ve had the business users refuse to use the corporate system and ask IT to build systems that work like the existing one (that was built by my team).

Before you suggest that there is some valid reason why this needs to happen let me assure you there’s not. Some common arguments include statements like its not valid because it lacks IT structure, backups, testing, validation and so on. They did so there’s no technical reason why the tool’s have been killed. So I reflect on the first point that was raised in the webinar – executive sponsorship.

Now to the main point of this post, why is executive sponsorship so important? Some common arguments are that it ensures appropriate resources and organizational focus however, I cant help wondering if these are recursive justifications – somewhat aligned to the chicken and the egg, its hard to truly figure out which comes first. Is there some strategic and leadership focus here or does it reflect more about the structure of organizations?

Could the requirement for executive support be an indication of organizational compliance rather than effective management? For example, if executive John Bloggs is behind the project no one dares speak up against its short comings and so it can’t be a failure.  So, we’ll change the definition of success (scope or cost) because it appeases John Bloggs expectations and influence.

I know that there are cases where focused executives have played an active role in delivery and provided a positive impact to projects but I cant help feeling that, sometimes, the tail is wagging the dog.

Now for the ultra positive and the additional factor which I cant help thinking what comes first.  I’ve been lucky enough to work in a variety of organizations and have seen a lot of different management styles.  However, from the organizational point of view, does it not say something about the organization (and the type of people who work there) when executives actively back the project?  I am sure that something people feel passionate about and work towards will be successful, so is it not the organization rather that the politics that drives success?  That is, if executives are willing to get support the project (rather than treat it like a political football), because that’s the type of people they are, then how can it not be a success?

What are your experiences?

Temporal Tables != Data Warehouse

With the release of SQL2016 just around the corner, there are an increasing number of posts appearing about new features that mean I don’t have to work.  One post (which motivated me to write this) claimed that this breakthrough would mean we no longer needed to store historical values – an interesting interpretation of a Data Warehouse to say the least.  Microsofts own content (here) recommends that there are huge productivity benefits in Slowly Changing Dimensions because you can compare the same key at two points in time.

The idea’s of temporary tables (recording every change to a table in an table_History table) is a cool feature – don’t get me wrong, I think its great.  However, there is (sadly) a lacking
amount of discussion about how the feature can be incorporated into the Data Warehouse.  Those that blindly follow the sales pitch “I don’t need a data warehouse” because I’ve got history tracking or perhaps “yeah, I’ve got a data warehouse – its a history tracked OLTP” will ensure their users cant get the data they need.

So lets call out some issues with reliance on temporal tables as a data warehouse replacement (and bring some data warehouse assumptions to the surface).  I will focus on the star schema design since most references explicit refer to changing dimensions (however we can apply these ideas to other methodologies).

A fundamental construct of the star schema is the surrogate key (almost as fundamental as the concept of dimension and fact).  Using the surrogate uniquely identifies an instance of a dimension at a point in time and therefore, state of the dimension can be precisely identified for the fact record.  For example, if I sold a product on a date, I need to look up the product dimension and determine which version of the product was applicable on that date.  The products surrogate key (not the Product Id or Code) is used in the fact.  This is the fundamental design of the star schema.  A temporal table does not provide you the capacity to do this – all it can do is provide the data to construct the star. 

How could you solve this with temporary tables?  Well, there may be the thought that you could concatenate the tables primary key and the records start date for uniqueness and then determine what (dimension) record is applicable to a fact record via a query.  Interesting idea but not something I’d take on litely.  I suspect that performance would degenerate so quickly both the BI users and the Ops users (remember that this is occurring on the OLTP) would walk away in droves.  Remember that this has to occur for every record in the fact – (and yes they are those LONG narrow tables)!

So lets leave it to the presentation tool – pick one, Power BI, SSAS, Tableau, Qlik, Jedox, …..  All these products rely on uniqueness between separated tables so we still require the surrogate to enforce and deliver uniqueness.   The star (or at least the principle) is still required.

The real power of the dimension (and to a lesser extent the fact) is that it adds business context that does not exist (or can not be easily calculated). Of course this is my opinion but think about it for a moment. Forget about historic values for a moment – raw information is in the source, if the user wanted that you could give it to them no problem. What the star gives is a modeled perspective of a particular part of the business. Consider a customer dimension – what adds values in analysis? It is often the supplementary data (age group, segment profile, status classification, targeted customer …. ) and all of these things are defined and stored in the dimension. So, if we are going to add value (as data warehousing professionals), we still need the dimension to provide this.

All business applications offer some form of reporting – if you’ve ever seen an information maturity chart, it is the first stage of maturity (see below thanks to an EMC2 slide deck).


Riddle me this then, if the source application (OLTP) provides reporting why do we need a data warehouse? Show reports at a particular point in time? Maybe – (although a lot of users struggle with this and tend to think in current terms). There are a lot of tools that provide adhoc query (OLAP) capabilities so performance the performance of analysis isn’t a real consideration (after all, they could just use an OLAP engine over the OLTP right?).

I think one of the primary reasons is integration. We want to either integrate data from different systems (creating an enterprise view) or we want to supplement current data with with other, richer information (which is really just integration anyway isn’t it). We could even say that business rules and derived information falls into this category.

Here also temporal tables do not negate the need for the data warehouse. The data warehouse is responsible for delivering a consistent, conformed, business verified data that incorporates information from various sources. Nothing changed there (and still the need for a data warehouse).

Finally, lets consider the dimension. That subject orientated view of an object. Its the Product table that tells me everything I need to know about a Product – its category, groupings, margin positions and alike. The dimension is notorious for redundancy and de-normalisation but that’s the price we are prepared to pay for
delivering a single concise view to a user because it breaks down the complexity of the model for the user (they don’t have to combine products to product categories in a query). The idea that we have de-normalise breaks the basic OLTP conventions which force normalisation (after all, we expect 3rd normal form).

The data warehouse is designed to do this work for us and present the view to the user. Essentially, its just another integration problem but one that’s handled (and hidden) by the data warehouse. Our BI tools (again with the presentation layers) may be able to create these consolidations for us however we are still presented with the issue of uniqueness in related table records (that is, we have to identify which category related to a product at a point in time and the BI tools will not do that for us).

So, are temporal tools a replacement for a data warehouse? I think not, sure they may be able to assist with record change tracking (we haven’t discussed the shift in OLTP resource management). Temporary tables are only a tool and I’d be very careful of anyone that claims they could replace a data warehouse.

SSRS – Add a little Pizzazz to SSRS Reports with Row Numbers and Pop-Out Windows

Canned reports are the bread and butter of information delivery – I have read that up to 70% of users can be classified as consumers or the type of person that gets a report and is happy with that. There is no desire to analyse additional datasets, examine tweeter sediment or speculate correlation.  Consumers just need data/information presented them for their day to day activity in an intuitive and navigation friendly and the report needs to be built for their indented use.

In SSRS, I have found 2 common requirements for features that do not come out of the box and this post looks at 2 snippets to create better reports which enhance the user experience. These are Row Numbers to Grids and the creation of Popup windows.

Adding Row Numbers

Row numbers help to provide scale.  As a simple example, imagine you’ve got a report stating that there are some number of products sold and you provide an action to drill through and see what those products are.  If you’ve got a column for row numbers, you can easily reconcile the quantity between the first report and the detail in the second.

You may (naturally enough) think that you can simply use a RowNumber formula and set the scope of the function to the Data Set that drives the control (that is, the table or matrix).  Consider a data set (category_summary) that summarises invoice data by category, product and year.  If we add our expected formula as a row counter (like this);


Using RowNumber() is only applicable when the grain (detail grouping for rows) is the same as the dataset.  For example, I add a matrix with a row group as { Year, Category, Product } as below.  Here RowNumber works just fine



Unfortunately if that detail grouping grain is broken (say for example I use the same data set in a matrix with Category groupings on rows, Year on columns), the RowNumber returns unwanted results showing the last value at the controls grain (as below).  Note that there are 36 rows products in the 1st category in 1996.


In order to get a incremental value for each row, we need to determine the running value over the distinct elements that we are after (Category in our case).  This can be done using the following formula.

=RunningValue(Fields!CategoryID.Value, CountDistinct, "category_summary")

More information about RunningValue can be found here

Using POPUP Windows instead of SSRS Actions

Using an action is a very flexible way to redirect focus.  We can identity coordinates of a cell and then pass those coordinates to another report as parameters.  If we reconsider our matrix of Categories on rows and Years on columns, we may want to the user to drill through to a new report that shows the products and their quantities that make up that value.

The simplest way to to that is specify the report and assign parameter values as those in the matrix (as below). image

Unfortunately, the end user experience for this rather poor and for any action in a report.  When the user clicks on the cell in the published report, the current report is replaced with the new report.  If they want to return (to the original starting position), they are forced to navigate using the browser buttons.  Even when you specify the URL (as opposed to a Go to report action), the behaviour is the same 😦

If we want to open the report in a new window, the best way to do this is to ‘trick’ the action into running javascript that does this using method.  You can read more about that function here, but for now, lets look at the basics.

In my report, I’ve created a function that determines the report server URL to open the report that I want.  This includes the ability to pass paramaters based on the position of the cell in the matrix (or table) control.  The function is;

Public Function WindowURL(byval OrderYear as String, ByVal CategoryID as String) as String

return "<a href="http://localhost/reportserver?/reports/ssrs_pizzas/reports/CategoryYearDetail&OrderYear="+ OrderYear + "CategoryID=" + CategoryID  

End Function  

You can see its pretty simple and just appends creates the URL based on the provided OrderYear and CategoryID.  Secondly, i want to specify the javascript to use that URL and open a new window.  The ‘base’ java script takes the form javascript:void“URL”);  I’ll add a second function (as below) to build the full snippet.  You’ll note that I am still planning the pass the OrderYear and CategoryId to the function since it will be called from the cell in the matrix.

Public Function OpenReport(ByVal OrderYear as String, ByVal CategoryID as String) as String  <p>&nbsp; return "javascript.void """ + WindowURL(OrderYear, CategoryID) +""" );"  <p>End Function  <p> <p>

The function (OpenReport) is then called from the matrix cell


Now when we click on a cell, a new window will open.  There only be one new window regardless of the cell that you click.  If you want more control over the pop up window, we can address that in other ways (as below).


One of the great things about using the method is that you can get very fine control over how the window is managed and appears.  The full syntax can be found here.  For me, the most important arguments are the name of the window and its specs. The format for is, name, specs, replace )

The windows name is interesting because a window can only have a single name (forget about the target attribute for the window).  If we recap on the Category, Year example, the name can be used to define how many windows will open (and the conditions for opening a new window).  For example, if our name was simply “Category” the content of popup window would always be replaced (since there is only one ‘Category’ window.

However, we could derive a concatenated name of “Category” and [Category ID].  If this were the case, a new window would be created for each new instance of category… (mind you, you’d still have to click on each unique category) however a new window would open for each category.  The single report can spawn multiple popup windows.  We could take it to a higher level of granularity and include the year (in the name) and have even more windows (potentially) open.

One thing to keep in mind is that the windows name can not contain some special characters.  Hyphens are out 🙂

Finally, we can also control, the size, position and functionality of the popup window.  For example, the following string opens a small window in the top left corner of the screen without menu or toolbars.  To my way of thinking its just a data driven information window that we want our users to see.



That’s about it.  Out of the box, functionality could be considered unfriendly.  But we can manage and control its functionality using some javascript.

Barry Devlin & the Business un-Intelligence Course

I really enjoy Barry Devlin’s approach to Business Intelligence.  He is an under-appreciated name in the industry (he was one of the originators of ‘Data Warehouse’ way back in the mid-’80s), so when he has something to say, it’s worth a listen – especially when he challenges ‘modern’ perceptions of Data Warehousing and the BI Industry.

And this brings me to the purpose of this post.  For those in the Asia Pacific region, Barry is presenting a course on Big Data Analytics & Reinventing Business Intelligence in Singapore July 6-8th, 2015 (

So, what’s to be expected from the course? You can get the full agenda here (  Briefly, the course covers the complete range of topics on how to design a modern information/data environment, ranging from the fundamental principles of how all types of data should be absorbed and used within the organisation to an overview/assessment of tool choices and the implications of new neurobiological thinking about how decisions are really made.

Having read Barry’s latest book “Business unIntelligence – Insight and Innovation Beyond Analytics and Big Data” (, I suggest that this is a perfect opportunity to investigate the strategic direction of BI (from an organizational or consulting perspective) and address the implementation of BI.

Oh, and by the way – I’m told that foreign attendees get special discounts, all you have to do is ask 🙂

Data Vault Training – Australia in 2015

The Genesee Academy’s Data Vault training has been announced for 2015 in the Australian region. 

Hans Hultgren will be running classes on Brisbane, Melbourne, Sydney and Auckland in July and August this year (the dates are below with links to the courses and registration). 

It’s a great class (I’ve heard multiple recommendations about this one) so if you interested in Data Vault, why not give attend and learn from someone who literally wrote the book


Jul 13 – 15th 3 Day Certification – Brisbane Australia Registration
Jul 22 – 24th 3 Day Certification – Melbourne Australia Registration
Jul 27 – 28th 2 Day Certification – Sydney Registration
Aug 3 – 4th 3 Day Certification – Auckland Registration

SSRS Report Parameters from Unfriendly Sources

We have recently developed some SSRS reports against Non SQL Server sources and have run into a few issues regarding the use of multi-value parameters.  Since, we are sure to meet this problem again (and variations of it), this post is just as much of a bookmark for ways around the problems that we’ve encountered.

So what’s the problem?  Lets assume were using an  MySql Source with the same data as AventureWorks, we can to pick a Subcategory and then have our report show a list of products.  This is pretty straight forward.  The products query uses a ? to identify a parameter, and that parameter is assigned a value to in the dataset properties;

SELECT        *
FROM            adventure_works_2012.product
WHERE        (ProductSubCategoryID = ?)

You can see how this is setup in the image below.  The SubCategory dataset is the source for the SubCategory parameter and that is used as a parameter for the Products query.


Unfortunately, this only works if our SubCategory Parameter DOESNOT allow multiple values.  If we enable this as an option for the Parameter, we’ll get an error like the following when we try to view the report.


To make the products query run with multiple parameter values we need to dynamically generate the SQL associated with the parameter.  Actually there are a couple variations of this however they all use the same theme.

So, first, we change our query to recognise the need for multiple values from the parameter;

SELECT     *
FROM            adventure_works_2012.product
WHERE        ProductSubCategoryID in (?)

Then, instead of supplying the parameter value to the dataset, we use an expression that delimits parameter values.


Here, the source type is a integer so we are expecting something like 1,2,3 etc to be passed as the parameter’s value.  If the source type was a string, we’d just have to append quotes around the statement.

= Join(Parameters!SubCategory.Value, "’ ,’")


We can use a few variations of this approach to generate the parameter list.  For example, we could exclude the parameter definition from the Products dataset and specify the query as a function that returns an SQL string.  Using this approach, the Parameter value is still called but the parameter object is not defined in the dataset.

We can also define a some VB code that returns the SQL string (I find this a little more flexible).  Here the string value of the parameter is passed to the VB function.  So here, I’ve created a simple function and added it to the report;

Public Function SQL(ByVal Keys as String) as String
  return “select * from adventure_works_2012.product where ProductSubCategoryID in (” & Keys & “)”
End Function

Now, our dataset is defined by the expression

=Code.SQL( Join(Parameters!SubCategory.Value, ",") )