Creating Tables in Excel with VBA and External Data – Part II

In Part I, we looked at adding a data table to a Workbook which uses an external data source (we used OLAP but the process can be applied to any source). This post looks at manipulating an existing table.

Without VBA, we can manage the table manually – both its properties and the underlying query. Simply right click on the table and select Edit Query… or External Data Properties from the popup menu. Changes to the table data are applied automatically.

If we choose to edit the query, we can simply overwrite the Command Text (as seen in the image below). These changes will be applied automatically (including resizing the table by rows or columns for a different sized data set) once the OK button is clicked.

For External Data Properties, we can configure how the table reacts to new data. For example, you may notice that the table accommodates additional rows and columns; however, when the query returns a table with fewer rows and columns, the table retains its old sizing (number of columns) and includes blank columns (for data that previously existed). You can resize this manually (by dragging the bounds of the table's border) or set the properties of the table to overwrite existing data. If you want to ensure that new sizes are automatically incorporated into the table, make sure that the Overwrite check box is marked in External Data Properties.


VBA

Now to the VBA – as Commandant Lassard would say, “There are many, many, many, many fine reasons to use VBA”. We have so much flexibility, but let's keep it simple; here's what I've set up.

Cell B1 is data validated based on the cell range D1:D2 – nice and simple. When we change that cell, the table updates for the new country.
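That validation setup can itself be automated. A minimal sketch (the sheet and ranges are assumed from the description above) that adds the list validation to B1:

```vba
' Sketch: add list validation to B1, sourced from D1:D2 (assumed layout)
Sub AddCountryValidation()
    With Sheet1.Range("B1").Validation
        .Delete                      ' clear any existing rule first
        .Add Type:=xlValidateList, _
             AlertStyle:=xlValidAlertStop, _
             Formula1:="=$D$1:$D$2"  ' the country names
    End With
End Sub
```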

In order to determine if there is a change in our data (the Country selected), we have to create a worksheet event to capture and test the change. I have gone into this in some detail here and the code is below. Note that this needs to be added to the sheet code (not a separate .bas module). All we do is check that only B1 is updated and then call the refresh subroutine.

Private Sub Worksheet_Change(ByVal Target As Range)

  ' update the table if cell B1 is changed
  If Target.Cells.Count = 1 And Target.Cells.Row = 1 And Target.Cells.Column = 2 Then UpdateMyTable

End Sub

Now for the updating component – the bit that's called when cell B1 is changed. I think this is pretty straightforward, but I'll walk through it anyway. First, the code;

Public Sub UpdateMyTable()

  ' ensure that any new changes are reflected in the table dimensions
  Sheet1.ListObjects("Table_abax_sql3").QueryTable.RefreshStyle = xlOverwriteCells

  ' set the command text
  Sheet1.ListObjects("Table_abax_sql3").QueryTable.CommandText = NewQuery(Sheet1.Cells(1, 2))
  Sheet1.ListObjects("Table_abax_sql3").Refresh

End Sub

Private Function NewQuery(Country As String) As String

  NewQuery = "select {[Measures].[Reseller Sales Amount]} on 0, " & _
             "[Product].[Category].[Category] on 1 " & _
             "from [Adventure Works] " & _
             "where [Geography].[Geography].[Country].&[" & Country & "]"

End Function

I've kept the same format as in the original post. The function NewQuery determines what the MDX should be, based on the provided country. All we do is set the table's command text to the new MDX (via QueryTable.CommandText) and refresh it.

I’ve also set the refresh style so that any changes in the command (grid size) are automatically reflected in the worksheet table.

That’s about the size of it! – I hope you find it useful.

Creating Tables in Excel with VBA and External Data – Part I

This post looks at how we can add a table to an Excel sheet which uses an MDX query as its source. This is a very handy feature to use for a couple of reasons;

    1. The table retains the connection to the OLAP source (hence it can be updated by a user at will)
    2. We can use it to extract data from MOLAP or tabular sources (i.e. run MDX or DAX)
    3. We can define complex queries to return a result set that cannot be obtained with a pivot table
    4. We can use the table! (this is really important for certain activities, like data mining table analysis)

Note that most workarounds for creating a table from OLAP sources rely on creating a pivot table, formatting it as a tabular layout and then copying and pasting the values. Hardly an attractive option!

How to Do It

We'll look at a simple query from Adventure Works;

select [Measures].[Reseller Sales Amount] on 0,
[Product].[Category].[Category] on 1
from [Adventure Works]
where [Geography].[Geography].[Country].&[Australia]

and an OLEDB connection string (note the OLEDB specification at the start of the string)

OLEDB;Provider=MSOLAP;Data Source=@server;Initial Catalog=Adventure Works DW 2008R2;

I have incorporated those two strings into two functions (MyQuery and MyConnectionString) – this just removes some of the clutter from the code.

Now we just need to use the ListObjects.Add method. The code below (with all Subs and Functions) is pretty much the bare bones you need to add the table. In other posts, I'll look into a higher level of control over the output.

The CODE

The complete code is shown below. I've included everything so it can simply be pasted into a new VBA module.

Sub CreateTable()

  With Sheet1.ListObjects.Add(SourceType:=0 _
                            , Source:=MyConnectionString() _
                            , Destination:=Range("$A$1") _
                            ).QueryTable
    .CommandType = xlCmdDefault
    .CommandText = MyQuery()
    .ListObject.DisplayName = "MyMDXQueryTable"
    .Refresh BackgroundQuery:=False
    .PreserveColumnInfo = False
  End With

End Sub

Private Function MyQuery() As String

  MyQuery = "select [Measures].[Reseller Sales Amount] on 0, " & _
            "[Product].[Category].[Category] on 1 " & _
            "from [Adventure Works] " & _
            "where [Geography].[Geography].[Country].&[Australia]"

End Function

Private Function MyConnectionString() As String

  MyConnectionString = "OLEDB;Provider=MSOLAP;Data Source=@server;Initial Catalog=Adventure Works DW 2008R2;"

End Function

Walk through

This is pretty much the bare bones approach. As a code walk-through (see Sub CreateTable), we add the list object, specifying its connection string and destination, then set the command and refresh info. The only statement that is not strictly necessary is naming the table (see .ListObject.DisplayName), but I tend to think it is a good idea because we will want to refer to the table by name at a later stage.

Outcome

The code will add a table like the one in the following image. The field names are fully qualified, which is not that nice, and we will look at how this can be changed in another post. For now, our purpose is to get a table into the workbook (the purpose of this post) so that it can be used as a table and refreshed.


PS – the code above adds the ListObject by reference to the sheet within VBA (see Sheet1.ListObjects). It's probably worthwhile to point out that this is the sheet's code name (as it appears in the VBA project explorer) and not the name shown on the sheet's tab.
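To illustrate the distinction, a small sketch – the tab name "Sales" is purely an assumption for illustration:

```vba
' Sheet1 is the code name (fixed in the VBA project explorer);
' the tab name is what the user sees and can rename freely.
Sub ShowSheetNames()
    Debug.Print Sheet1.Name       ' the tab name, e.g. "Sales" (assumed)
    Debug.Print Sheet1.CodeName   ' always "Sheet1"
    ' the same table reached via the (assumed) tab name:
    Debug.Print ThisWorkbook.Worksheets("Sales").ListObjects(1).Name
End Sub
```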

One more thing – when the query refers to an attribute in a hierarchy, the OLEDB result set (table) will include the parent attributes of the hierarchy as columns. This is nothing to worry about for the moment!

Next – changing the table's query.

Extending Data Sets in SSRS with VB Functions

A situation that occurs rather often is when a report has to be altered to include some set of data that does not exist in the ‘corporate’ data warehouse. This type of situation is easy to imagine in sales environments where a campaign exists (one which was not planned at budget time) and someone* would like to see this data reported against actual sales (and possibly even a comparison to budget). I agree that it would be ideal to have this data ‘corporatized’; however, there are often competing forces and changes in priorities, which means that this data never makes it in. This post looks at how this data can be included in reports.

Consider the situation where someone has created a new budget for the Adventure Works data (as below) and we want to include this in our report.

In order to include this, we will create a function which specifies what the new value should be given the inputs of year and category. Naturally, in a production environment, these combinations increase, but the principle applied is exactly the same. You can read a little bit more about functions here, but essentially what we're trying to do is return some value given inputs. We can then append that to our dataset as an additional field (i.e. another measure) and then use that field in the report.

Base Data

The base query for [Reseller Sales Amount] by year and category would be;

select
[Measures].[Reseller Sales Amount] on 0,
[Date].[Calendar].[Calendar Year]
* [Product].[Category].[Category]
on 1
from [Channel Sales]

Creating the Function

Having the data in tabular format makes it really easy to manipulate in Excel; in fact, we can add another column with a formula to specify what the return value should be given the year and category. In my workbook, the formula would be;

="if (year_name="""&A2&""" and category_name ="""&B2&""") then return " &D2

And it is applied to every row that we have data for (as in the screenshot below).
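As an alternative to dragging the worksheet formula down, a small macro could generate the same lines in one go – a sketch only, with the column positions (year in A, category in B, value in D, data starting at row 2) assumed from the screenshot:

```vba
' Hypothetical helper: build the If lines for the report function
' from the worksheet rows (assumed layout: A=year, B=category, D=value).
Sub BuildUserValueBody()
    Dim r As Long, s As String
    For r = 2 To Cells(Rows.Count, 1).End(xlUp).Row
        s = s & "  if (year_name=""" & Cells(r, 1).Value & _
                """ and category_name =""" & Cells(r, 2).Value & _
                """) then return " & Cells(r, 4).Value & vbCrLf
    Next r
    Debug.Print s   ' paste the output into the report's custom code
End Sub
```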

Now that we have these formulas, we can embed them in a function and add it to the report code. In order to do this, you define the function name, the return value type and the input parameters (and their types).

Public Function UserValue(year_name As String, category_name As String) As Double
  ' use If statements here
End Function

All this means is that a function (called UserValue) will be created which requires the input variables year_name and category_name and returns a Double. A cut of the complete function looks like this;

Public Function UserValue(year_name As String, category_name As String) As Double
  if (year_name="CY 2005" and category_name ="Accessories") then return 20000
  if (year_name="CY 2005" and category_name ="Bikes") then return 7340000
  if (year_name="CY 2005" and category_name ="Clothing") then return 34000
  .....

  if (year_name="CY 2005" and category_name ="Components") then return 615000
  if (year_name="CY 2006" and category_name ="Accessories") then return 92000
  if (year_name="CY 2006" and category_name ="Bikes") then return 19000000
End Function

As a tip, it may be easier to create this in a text editor before adding it to report code.

Adding the Function to Report Code

There are a few ways to add code to a report, perhaps the easiest is to right click (outside the canvas) and select the report properties from the pop-up menu and then navigate to the code. You could then paste the code directly into the custom code box.

Calling the Function

Functions can be called by editing the expression (that we are interested in). Since we want to append to our original dataset, we edit the dataset and add a new column. Right click on the dataset and select edit. We then add a new calculated field (as below);

The field can be given a name (New_Value) and its expression entered by clicking the expression button. Note that I can use values from the current dataset as inputs to the function. Just as in any other expression editor, I simply need to double click on the items that I want to add.

My report now has a new measure (New_Value) which can be added to the report, just as any other measure.

Other Titbits and Gotchas

Although this function was added to the dataset, it could also be used in a cell of a matrix or table. The benefit of adding an additional column to the dataset is that the value can be aggregated in your report and used in totals. This would not occur if it was added to the control directly.

Also note that the default MDX generated by the query designer in SSRS excludes non-empty data, and this may lead to misleading results where no ‘sales’ have occurred but there is a New_Value amount to be shown. In this situation it is advisable to remove the NON EMPTY criteria from the query so that you can guarantee that data will be shown (regardless of the results returned through the underlying dataset).

A side benefit of using this technique is that the data then becomes (somewhat) available to other users. Once the report is published, it can be consumed as an Atom feed.

* The use of the word someone here is intentional because the person making the request for the change to a report does not often have control over the data that is entered into the data warehouse. While there is a business imperative to include this data, it is not seen as an IT imperative. There is simply not enough time to go through the appropriate channels.

Designing Good Dimensions in SSAS

Designing good dimensions is a balancing act between at least two factors. Our dimension design must offer the user a good (and intuitive) way to navigate data, and it must also perform well in queries. This post looks at achieving that balance and the dangers that assumptions can produce for dimension design and result output.

While we can consider other factors (for example, processing speed and property/attribute visibility) as important in our design, I consider these two factors the most important: the dimension must be usable in an intuitive way in order to be successful, and the user experience is driven by quick results (especially in an ad hoc analytical environment).

Scenario

If we examine the adventure works database and the [Direct Sales] perspective, we note that the customer dimension has a [Post Code] attribute hierarchy (single level hierarchy) and a [Customer Geography] hierarchy which also shows a level [Postal Code]. This can be seen in the diagram below;

Now consider that someone wishes to know what sales were derived in a postcode. We assume that they could use either of these hierarchies to interrogate sales.

Although this is a trivial example, I have seen this type of question present itself in production environments that have “reported” incorrect data over the years. Consider a product hierarchy (which includes colour at a lower level) and an attribute hierarchy [colour] (that just shows colours).

I'll also point out that I highlighted reported above because the business expectation of the results differed from the technical one. In this instance, the client thought that they were viewing all “yellow” products when in fact they were viewing only a subset of the yellow products.

The dangers of this issue can also be magnified by the method that the client tool uses to query the cube.

So What’s the Issue?

If we consider the postcode example, we ask ourselves the question: what are the sales for postcode “V9”? One might expect to use the [Postal Code] attribute hierarchy and just show “V9”. If I create a pivot in Excel and filter on V9, I get the following results.

This output is not useful. Why? Because I expect to see a single value for V9 ($252K).

I could use the [Postal Code] level of the [Customer Geography] hierarchy by creating a set, but this is not that intuitive (and doesn't change the results) and is not the kind of interaction we want to give our users.

From the above results we know that there are two occurrences of the postcode (that is two different areas with the same postcode) which have a value ‘V9’. However, what I want is the total for V9!

This distinction becomes obvious when we look at the members in MDX and the [Postal Code] key definition (as below). Note that the key is a composite key (State, City, Postcode) which is also shown when we look at the MDX member key (by dragging members to the query grid in SSMS).

This type of design is perfectly acceptable when the attribute is used within a dimension hierarchy (as is the case in Adventure Works). I suggest that the [Postal Code] attribute should be hidden, since it is available in the hierarchy. Furthermore, the creation of a hierarchy (with relationships between the levels) is an SSAS design best practice. Note that the relationships are defined between these attributes.

The problem is that we have not been able to answer a question about the sales for [Postal Code] V9! What we really need is another attribute that only shows distinct postcode values!

Dangerous Clients

Exposing the attribute hierarchy in this way can be a very dangerous situation, especially when the client uses a name in the generation of MDX. For example, consider the first query, which shows all members that have a postcode of V9. Both members are shown, as we would expect;

select [Measures].[Internet Sales Amount] on 0,
filter(
    [Customer].[Postal Code].members
    , [Customer].[Postal Code].currentmember.member_caption = 'V9'
) on 1
from [Direct Sales]

We could aggregate these results to show the combined result (which is the answer we want);

with member [Customer].[Postal Code].[MyAgg] as
aggregate(
    filter([Customer].[Postal Code].members
            , [Customer].[Postal Code].currentmember.member_caption = 'V9'
            )
)
select [Measures].[Internet Sales Amount] on 0,
[Customer].[Postal Code].[MyAgg] on 1
from [Direct Sales]

Now consider the situation where the client simply references the member name [V9] in MDX. This is shown below. Note that only the first member is returned, which we might naïvely assume to be the sales for the postcode.

select [Measures].[Internet Sales Amount] on 0,
[Customer].[Postal Code].[V9] on 1
from [Direct Sales]

Conclusion

It is easy to see how usability is an important consideration for attributes. What often surprises most, though, are the dangers of clients which purport to show a member value (say V9) but return only one member.

Creating Custom Calculations in Excel with Pivot Table Extensions

One of the biggest hurdles that users are faced with when using pivot tables is that all measures and dimensions must be defined within the cube. Having the defined data available is really convenient for drag and drop analysis; however, in most situations the user's requirements can quickly outgrow the bounds of the cube's definition. “I want to see such and such in my data” is the usual type of requirement. Sure, developers and report writers can create calculations, but what about users – the people that need to use the data? This post looks at OLAP PivotTable Extensions, a tool which increases the usability of pivot tables for end users.

Traditional PivotTables just don't give the user enough flexibility. Sure, Excel 2013 has some nice features for creating calculations in pivots, but I haven't worked with any corporate client that is seriously considering using 2013 just yet. There is just too much corporate tape to cut through. So let me just say: if your users aren't using this tool, they should be. You may argue that users could use Power Pivot, but that really defeats the purpose of having a cube, doesn't it? Besides, why should the users have to handle data just to do some simple analysis? What's more, the add-in is free – talk about a low barrier to entry for self-service BI!

This post looks at creating calculations using the OLAP pivot table extensions addin.

A Quick Overview

Once the add-in has been installed, it can be accessed from a pivot table by simply right clicking on that pivot table. For example, consider the pivot table below, which shows the [Sales Amount] and [Sales Amount Quota] by year in Adventure Works. To add a custom calculation to this pivot, all I need to do is right click and select “OLAP PivotTable Extensions…”

This opens the following window (note that the calculations tab is selected);

Calculations On Measures

The first example that we can use for creating a calculation involves creating a new measure that is based on other measures. For example based on the [Sales Amount] and [Sales Amount Quota], we might like to see what the variance is. We can express this as [Sales Amount] – [Sales Amount Quota] but we should remember two things;

Firstly, there are spaces in the names, so we need to surround each name with square brackets (ie [ and ]).

Secondly, the pivot is based on MDX. This is a declarative language and has its own rules. While these rules can sometimes be bypassed (as we will see below), it's probably a better idea to use the fully qualified name for the items we are referring to.

If we consider the requirement for sales variance (let's call it [Sales Variance]), we could simply create the measure using the formula [Sales Amount] – [Sales Amount Quota]. This is shown below;

Once the measure has been defined, it can be added to the pivot table by clicking the ‘Add to Pivot Table’ button.

Fully Qualified Names

I stated above that MDX has requirements for how we refer to items in the language, and that it is a good idea to use the fully qualified name. An exception to this is measures, which do not strictly require fully qualified names. For example, if we are in cell C8 and we create a formula that references C5, we can see that Excel adds its own calculation (one that uses the GETPIVOTDATA function).

By examining this formula, we can see that the fully qualified name for the [Sales Amount] measure is [Measures].[Sales Amount]. Also note that the fully qualified name for [CY 2008] is [Date].[Calendar Year].&[2008].

The use of the GETPIVOTDATA function is a very useful trick for determining the fully qualified names of pivot cells.

Using Static References

One of the restrictions of pivot tables is that it's very hard to reference individual cells in the pivot. For example, if we wished to show the sales amount for 2005 in the pivot, we would generally filter columns or rows to 2005 and add the [Sales Amount] measure. With the pivot table extensions, we can use a static reference to include the 2005 sales amount value as a column in the pivot. All we have to do is create a new measure for the cell coordinates that we are interested in.

Consider the GETPIVOTDATA formula for the sales amount in 2005, that is;

=GETPIVOTDATA("[Measures].[Sales Amount]",$A$1,"[Date].[Calendar Year]","[Date].[Calendar Year].&[2005]")

We can simply create a measure that references this coordinate. This is ( [Sales Amount] , [Date].[Calendar Year].&[2005] )

Note that coordinates should be enclosed in parentheses and separated by commas.



A really cool thing about creating this new measure ([Sales Amount 2005]) is that it can be used by other measures. So we could show the variance compared to 2005 by creating a measure that refers to the [Sales Amount 2005] measure. This measure would be;
[Sales Amount] – [Sales Amount 2005]

Note that we do not have to keep ‘referenced’ measures in the pivot table to use their outcome. For example, we can remove the measure [Sales Amount 2005] from the pivot table and still see the measure [Sales Variance 2005] in the pivot. Measures are removed just as any pivot measure – by dragging them from the ‘Values’ box in the field selection list.


The use of the word static in this section implies that we only reference data from 2005 in our calculated measure. More often than not, we want to refer to a coordinate in the context of another. This is usually the ‘next year’ or the ‘prior year’. For example, we might like to see the [Sales Amount] variance compared to last year. In order to achieve this we will create the measures [Sales Amount LY] and [Sales Variance LY].

Calculations using Relative References

If we recall the GETPIVOTDATA formula for the sales amount in 2005, we can also discover that the ‘2005’ amount relates to a specific dimension, and the hierarchy within that dimension. We can see this in the highlighted sections below;

=GETPIVOTDATA("[Measures].[Sales Amount]",$A$1,"[Date].[Calendar Year]","[Date].[Calendar Year].&[2005]")

These both give us the same information, specifically that the hierarchy [Date].[Calendar Year] is being used. Additionally, we should know that each row actually refers to a specific member (for example, 2005). If we want to create a measure that uses a relative reference, we have to use the hierarchy name and a relative function.

Arguably the .prevmember function is one of the most useful relative functions available. This function takes the current coordinate (item) and substitutes the prior one for it. Therefore, if we are on the 2006 row, the .prevmember would give us the 2005 coordinate. Also note that because we are using a relative function, the ‘exact’ coordinate is not referred to, rather the hierarchy is used in formulas.

Therefore, we can get the prior year of a row using the following coordinate [Date].[Calendar Year].prevmember

This may seem a little bit complicated at first, and it may help to think that the rows shown are from the [Date].[Calendar Year] hierarchy, rather than specific values. For example [Date].[Calendar Year].&[2005].

If we wish to create a calculated measure to show what the sales amount was last year ([Sales Amount LY]), we simply need to use the coordinate ( [Sales Amount] , [Date].[Calendar Year].prevmember ).

The measure [Sales Var LY] is then simply [Sales Amount] – [Sales Amount LY].

The results of these calculations are shown in the screenshot below;


Conclusion
The use of the OLAP PivotTable Extensions add-in offers a huge improvement to the usability of pivot tables. This post looked at the ability to create custom calculations within the pivot so that users can create their own analysis.

References
Download the addin here
Prevmember function reference on MSDN

Pivot Filtering with Cell Selection and VBA

This post looks at applying data filters in Excel workbooks with sheet ‘click’ functionality for filtering. The purpose of this is to provide a rich user experience for reporting within Excel. With these types of reports, you present some data in a pivot which is used as a source filter for other parts of your report (or worksheet). Consider the situation below; when you click on one of the country names in the pivot, the chart is automatically updated to filter for that country. Clicking on any other area of the pivot removes any chart filter.

Why would you do this?

Monthly reporting workbooks are often treated as report packs and designed in lieu of an enterprise reporting system. Even where such an enterprise reporting system is in place, the reporting environments are usually not flexible enough to provide the business with what they need for monthly reporting, and so the workbooks are bundled together in Excel. These workbooks can be extended to have highly interactive functionality that mimics high-end reporting tools.

While you could achieve these results with a slicer, the automation of the action may provide a nicer experience for the user because it removes the slicer clutter from the page and allows a direct association with the data being investigated.

How to Do It

Achieving this functionality is conceptually pretty simple (and soon programmatically simple too!); all we need to do is;

  1. Listen for changes in the cell position on a worksheet.
  2. When a change is detected, we check that the change was to our source pivot
  3. Determine the filter value (ie what value was clicked in the pivot) … and
  4. Apply that value to the slicer for the chart.
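The four steps above can be sketched as a single event handler – a condensed sketch only, using the pivot and slicer names from the workbook setup below; the complete, error-trapped version appears at the end of the post.

```vba
' Condensed sketch of the four steps (this goes in the sheet's code module)
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
    If Target.Cells.Count <> 1 Then Exit Sub              ' 1. single-cell selection only
    On Error Resume Next                                  ' off-pivot cells raise errors
    If ActiveCell.PivotTable.Name = "country_sales" Then  ' 2. was our source pivot clicked?
        Dim clicked As String
        clicked = ActiveCell.PivotItem.Name               ' 3. the clicked member's MDX name
        If Len(clicked) > 0 Then
            ActiveWorkbook.SlicerCaches("Slicer_Country") _
                .VisibleSlicerItemsList = Array(clicked)  ' 4. apply it to the slicer
        End If
    End If
    On Error GoTo 0
End Sub
```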

These items are now discussed individually.

Work Book Setup

In this example, I am using an Adventure Works connection with a pivot table and pivot chart. The pivot table (the one you click a cell on) is called ‘country_sales’. This uses the Customer.Country hierarchy from Adventure Works. The pivot chart has a slicer linked to it (and the name of the slicer is ‘Slicer_Country’). Note that the same hierarchy must be used for both the slicer and the pivot table.

Listening for changes in the worksheet

Each worksheet has its own code module with defined events. An event is a subroutine that is ‘inbuilt’ into Excel and fired when something happens (for example, when the user changes the active cell). You can see the worksheet's module by double clicking the sheet in the project explorer (or with a right-click and ‘View Code’) and see the events by selecting them from the dropdown (as shown below).

When you do this (and choose the event SelectionChange), you’ll see the following subroutine added to your module

Private Sub Worksheet_SelectionChange(ByVal Target As Range)
End Sub

Now that this has been added, this procedure will fire every time you change the selected cells on Sheet1. It may also be worthwhile to note that the Target parameter refers to the range of cells selected. Because you can select more than one cell, we want to check for single-cell selection, so we simply exit the sub if there is more than one cell in the range.

If (Target.Cells.Count <> 1) Then Exit Sub

Detecting the Source Pivot and the Active Cell Value

There are some pretty handy properties to determine if the active cell relates to a pivot table and, if it does, what the member name and value are. These relate to the ActiveCell and are the properties

.PivotTable

.PivotItem

.PivotField

Thus, in order to determine the name of the pivot table for the active cell, we would simply write;

Dim source_pivot_name As String
source_pivot_name = ActiveCell.PivotTable.Name

The (MDX) name of a member can be determined by ActiveCell.PivotItem.Name, and the (MDX) name of the attribute (Dimension.Hierarchy.Level) by ActiveCell.PivotField.Name.

For example, if I click on the France ‘country’ cell in my pivot table, I would get the following values.

ActiveCell.PivotTable.Name “country_sales”
ActiveCell.PivotField.Name “[Customer].[Country].[Country]”
ActiveCell.PivotItem.Name “[Customer].[Country].&[France]”

Note that these references are to the pivot object model (ie the Pivot objects). If the cell you're referring to (the ActiveCell) is not part of a pivot table, you'll get an error. This is pretty easy to catch with some error trapping (see the final code).
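That trap can be isolated in a small helper – a hypothetical sketch, not part of the final code below:

```vba
' Hypothetical helper: returns the PivotItem name for the active cell,
' or "" when the cell is not part of a pivot (PivotItem raises an error).
Private Function SafePivotItemName() As String
    On Error Resume Next
    SafePivotItemName = ActiveCell.PivotItem.Name
    On Error GoTo 0
End Function
```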

Assuming that the user clicked on a cell in the pivot (I will leave the checks for the final code) we have all the values that we need and can then set the slicer.

Applying the filter to the Slicer

I have discussed how to apply a slicer value in VBA in this post. For brevity, I'll just include the essential items. We simply make a reference to the slicer (SlicerCache) and set its value.

Dim sC As SlicerCache
Set sC = ActiveWorkbook.SlicerCaches("Slicer_Country")
sC.VisibleSlicerItemsList = Array(ActiveCell.PivotItem.Name)

If I want to remove the slicer's current filter (when no country is selected), I can do that with this code;

sC.ClearManualFilter

Complete Code

The following code demonstrates the complete solution. I have included updating the ‘title’ of the chart and error traps to determine if a ‘non country’ part of the pivot was chosen.

Private Sub Worksheet_SelectionChange(ByVal Target As Range)
  If (Target.Cells.Count <> 1) Then Exit Sub

  Dim source_pivot_name As String
  source_pivot_name = ""
  Dim source_field As String
  source_field = ""
  Dim source_attribute As String
  source_attribute = ""
  Dim sC As SlicerCache
  Set sC = ActiveWorkbook.SlicerCaches("Slicer_Country")

  On Error GoTo continue
  ' try and get the active cell's pivot table name
  source_pivot_name = ActiveCell.PivotTable.Name
continue:
  ' we can only apply a filter if we are on our 'selection pivot'
  If source_pivot_name = "country_sales" Then
    ' note the name must go first so we can check for a 'Row Labels' position
    On Error GoTo continue2
    source_attribute = ActiveCell.PivotField.Name
    On Error GoTo continue3
    source_field = ActiveCell.PivotItem.Name
continue2:
continue3:
    ' check we have the correct source
    If (Len(source_attribute) > 10 And Left(source_attribute, 10) = "[Measures]") Or _
       (source_field = "" And source_attribute = "[Customer].[Country].[Country]") Then
      ' set to all
      sC.ClearManualFilter
      Sheet1.Cells(11, 4) = "Category Sales for All Areas"
    Else
      sC.VisibleSlicerItemsList = Array(source_field)
      Sheet1.Cells(11, 4) = "Category Sales for " & ActiveCell.Value
    End If
  End If
End Sub

Conclusion

Adding some interactive functionality to a pivot is a pretty simple exercise: identify a cell value and, in turn, set a slicer. This improves the end user experience for Excel reporting workbooks.

Viva Excel and black IT :-) A little cheeky!

OLAP Rigidity wins against Tabular?

There are many reasons why you might choose a Tabular model over a Multidimensional one. Marco Russo discusses some of the pros of tabular in his blog here, which generally relate to the flexibility that the tabular model gives you. However, one surprising reason for choosing multidimensional may be its rigid dimensional structure and the certainty that this structure gives in changing data environments.

A Multidimensional dimension provides many ways to record information against an attribute. Most important are the attribute's KEY, NAME and VALUE properties, which are set under the dimension source properties. For example, we can see the [State-Province] attribute from the Geography dimension in Adventure Works below. The KEY is the only property that must be set for an attribute, so if no name is specified, the key will automatically be applied as the name.

Note that in the tabular engine, there is no concept of the KEY or NAME. Neither is specified, since native DAX refers to column values.

Referring to Attribute Members

When an MDX query refers to members of an attribute, it can do so by using the member's name or the member's unique name. Thus, the two MDX statements below are equivalent:

-- query using the member's unique name
select
[Measures].[Internet Sales Amount] on 0,
[Product].[Product Categories].[Category].&[4] on 1
from [Direct Sales]

-- query using the member's name
select
[Measures].[Internet Sales Amount] on 0,
[Product].[Product Categories].[Category].[Accessories] on 1
from [Direct Sales]

There are many reasons why you would choose the first query over the second; namely, the second can give unexpected values when there is more than one member with the name being sought. That is, if there were two product categories with different keys and the same name (Accessories), then the query would return only the first member. One might expect that the query would aggregate the results, but this does not occur.

Once the query has been written, the member's unique name is confusing, and many people complain that it lacks reference to the underlying data (after all, it's impossible to know what .&[4] means anyway). However, it is still the best way to refer to a member, and this is the way most client tools generate MDX (using the unique member name).

Note that since the tabular engine has no concept of a key for an attribute (you don't uniquely specify the key or name for an attribute), the MDX passed to the tabular engine uses what we would otherwise consider the name as the key. Thus, as in the example above, the tabular equivalent for this member is [Product].[Category].[Category].&[Accessories]. To an OLAP (multidimensional) developer, this type of key (i.e. the name as the key) is generally considered a real NO NO and contravenes best practice.
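For illustration, the first query above rewritten against a tabular model would look something like the following sketch (the cube name [Model] is an assumption here; note the name doubling as the key):

```mdx
select
[Measures].[Internet Sales Amount] on 0,
[Product].[Category].[Category].&[Accessories] on 1
from [Model]
```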

 

Client Queries and the interesting stuff.

For tabular models, the unique name for a member is generated from the member's name. Furthermore, most client tools refer to the key in their script; I have even seen some tools hard code the published member for parameters. So, what happens when the underlying attribute name changes? The change could break your report.

We can easily demonstrate how this error would occur using Excel as a client. Suppose I have a pivot based on a multidimensional cube and I convert the simple pivot to formulas (as below). The Accessories member in cell A2 is hardcoded (the formula is in D2 and uses the key [4] as above). If the name of this member changes, the spreadsheet still works fine, and the new name is displayed.
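For illustration, the formulas behind such a converted pivot look something like the following sketch (the connection name "Adventure Works" and the measure are assumptions; here A2 is taken as the cell holding the CUBEMEMBER formula):

```text
=CUBEMEMBER("Adventure Works","[Product].[Product Categories].[Category].&[4]")
=CUBEVALUE("Adventure Works",A2,"[Measures].[Internet Sales Amount]")
```

Because the member is resolved by its key (&[4]), renaming the category changes only the displayed caption and neither formula breaks.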

However, for the tabular example (as below), when the hardcoded name is changed, the CUBEMEMBER function returns a #N/A error.

In the second picture, the name of the member has changed to [Accessories & Other]; however, the 'report' still relies on the hard coded value .&[Accessories] and the error occurs.

Conclusion

The use of a 'key' for each member offers a way to solve the problem described above, and this is only available in multidimensional dimensions. To be fair though, the problem is as much a function of design as it is of structure. A multidimensional dimension would still raise the issue if the attribute's name had been used as the key (but no-one does that, do they?).

Measure Selection in SSRS and SSAS

Pivoting around the measure improves report usability by adding an additional dimension that can be utilised in reports. A common example of this is a trend report that allows the user to select the measure that they wish to see. There are a few ways to accomplish this in your solution, and this post looks at some of them. The methods examined are: a static parameter within the report definition, a flexible parameter within the report definition, and finally, a utility dimension.

In this example, we will look at pivoting a monthly trend report around two measures (actual and budget). The output for our reports will look like the screenshot below. Notice that the measure is selectable by the user.

In these reports we have created a measure parameter (named @measure_name) with two possible values as follows:

Parameter Value                    | Parameter Label
="[Measures].[Native Amt ACT]"     | Native Amt ACT
="[Measures].[Native Amt BGT]"     | Native Amt BGT

Note that the value of the parameter should be included as an expression (otherwise an error will occur).

Static Parameters

Static parameters pivot the data within the dataset result. All possible measures are returned by the query, and the dataset determines the pivoted value based on a formula. Therefore, the base query looks like the one below; notice that both [Native Amt ACT] and [Native Amt BGT] are returned:

with
member measures.month_key as
[Dates].[Fin Years].currentmember.member_key
select
{
measures.month_key
,[Measures].[Native Amt ACT]
, [Measures].[Native Amt BGT]
} on 0,
strtomember(@year).children
on 1
from ….

Our dataset includes an additional, derived field that determines which measure to show (as an iif function):

=iif(Parameters!measure_name.Label="Native Amt ACT"
, Fields!Native_Amt_ACT.Value
, Fields!Native_Amt_BGT.Value
)

In order to have the dataset fields visible in the expression editor, the query must be saved and then reopened. Of course, our report pivots around the Measure_Val amount.

Flexible Parameters

Flexible parameters are transformed within the query; that is, the parameter is defined as a member (measure) within the query. The query converts the parameter value @measure_name into a measure which is defined in the query.

with
member measures.month_key as
[Dates].[Fin Years].currentmember.member_key
member measures.measure_val as
strtomember(@measure_name)
select
{
measures.month_key
, Measures.measure_val
} on 0,
strtomember(@year).children
on 1
from

The @measure_name parameter must be included in the query parameters (as below).

Utility Dimensions

Utility dimensions are probably best known for their use in time intelligence and for by-passing the SSAS time intelligence wizard (see here). Generally, unsophisticated users find individual measures relatively easy to use in their reports or pivots because they can select a measure without thinking about the 'time dimension'. You can simply stack measures and create calculations by referring to individual measures rather than the tuples created by a measure and dimension combination.

A measure utility dimension can be created in the following manner;

1. Add a table to the database (or a view with data) to produce the desired characteristics (i.e., those measure names that we want to allow users to select). As an advanced option, we can also include additional properties to allow the MDX query to return a subset of dimension members.

2. Add the table to the DSV

3. Create a dimension based on the table so that the measure_name field appears as the caption for members. There may be a logical case for setting the 'IsAggregatable' property to false and setting a default member; however, there is no need to do so because the user will select the member (or measure value) that they require, and the default value for the Measure_Val measure is null.

4. Add the measure (Measure_Val) to the cube and scope the intersects for the selected [Reporting Measures] definitions.

create
member currentcube.Measures.[Measure_Val] as null
, visible=0;
scope(Measures.[Measure_Val]);
  scope([Reporting Measures].[Measure Name].&[Actual TM]);
    this = [Measures].[Native Amt ACT];
  end scope;
  scope([Reporting Measures].[Measure Name].&[Budget TM]);
    this = [Measures].[Native Amt BGT];
  end scope;
end scope;

Now when the report is created, the [Reporting Measures] dimension can be used as a parameter in the report. In fact, we can specify the member values as the code for the 'measure' parameter selection. In this case, our @measure_name parameter would have its available values set to a query, and the query would return the members of the [Reporting Measures] dimension:

with
member [measures].[measure_label] as
[Reporting Measures].[Measure Name].member_caption
member [measures].[measure_value] as
[Reporting Measures].[Measure Name].member_unique_name
select
{
[measures].[measure_label]
, [measures].[measure_value]
} on 0,
[Reporting Measures].[Measure Name].[Measure Name] on 1
from …

The report dataset is changed to use the [Reporting Measures] dimension as a parameter in the query, as below. Note that in this situation the Measures.Measure_Val member is a measure within the cube and not a definition within the MDX.

with
member measures.month_key as
[Dates].[Fin Years].currentmember.member_key
select
{
measures.month_key
, Measures.measure_val
} on 0,
strtomember(@year).children
on 1
from
where
( strtomember(@measure_name)
, …
)

This method of creating a utility dimension is quite flexible for the SSRS query designer because all values and dimensions can be selected from within the SSRS query designer without the need to write MDX.

Conclusion

There are a number of options for allowing the user to select measures in their reports. The method chosen may depend on the user's ability to write custom MDX and the need to duplicate measure selection throughout the reports (or the reporting environment). Additionally, the method chosen may depend on the rate of change for the 'reporting measure' selection. When a utility dimension is used, production versions of the cube can easily hide the [Measure_Val] measure and the [Reporting Measures] dimension; this may remove confusion when the cube is browsed in an ad hoc manner.

Introducing DAX Studio

If you have tried to query a tabular model using DAX, you will soon realise that it is not an easy undertaking. Why is it hard? Well, for one, SSMS isn't currently much of a friend for tabular models. When you connect to a tabular instance with SSMS, the model is displayed in a multidimensional format. The only tabular metadata which you can easily derive are some table names (dimension names), some field names based on attribute hierarchies, and the names of measures (without context for the table in which they are built). This post looks at DAX Studio, an Excel 2010 Addin available on CodePlex. It is designed for DAX and tabular models. When released, the installer can be downloaded from here.

What’s the Issue with SSMS?

SSMS does not show useful metadata for building DAX queries. Further, its functionality is suited to MDX queries, with drag-and-drop name placement following the multidimensional naming standard. In this post, I have created a really simple tabular model with three tables, as shown below. It has a Sales fact and dimensions of Stores and Dates, and you can see that all the fields from the Sales table are hidden from client tools except for the calculated values ('Sales Amount' and 'Store Ratio').

When I connect to the model in SSMS (as below), the model presents itself as a multidimensional model.

Some of the implications of using SSMS are:

  1. The object explorer in SSMS shows only tables and you can’t easily determine fields.

     

  2. Visible table fields are shown as attribute hierarchies so (for example) we can surmise that the dates table has fields named Month and Year.

     

  3. When all the fields for a table are hidden (including hierarchies), the table is not shown as a dimension in the multidimensional explorer. This behaviour may be expected, but it doesn't allow you to see the structure of the table. In the object explorer, you can see that there are three tables (Dates, Sales and Stores), which are the same tables as in the model. In the query window, tables in the model are exposed as dimensions to the extent that at least one table field (or user hierarchy) is visible. Note that the Sales table is not visible in the query browser because the table has no visible fields in the model.

     

  4. Measures are shown under a display folder with the same name as the table in which the measures were created. The measures defined in the Sales table appear in the Sales display folder. Again, this may be expected; however, as a user, I do not see a tabular view of the objects in the model (fields and calculations).

     

  5. Hidden fields and measures are not visible. Again while you may expect this, the nature of DAX may require you to query a hidden field from a table.

     

In general, the main concern with using SSMS is that you cannot see the model structure and metadata. You can derive some of it (for example, through attribute hierarchies), but SSMS does not allow you convenient access to the model structure. Additionally, when you drag an object which you would expect to be a field (for example, store_name in the above model), the field appears with the multidimensional naming convention ([Stores].[store_name]).

What’s the Solution?

DAX Studio is an Excel 2010 Addin available on CodePlex that allows you to connect to any tabular model, examine the metadata and query it. Excel is a good launch application because it allows PowerPivot pros to query their models without the requirement of a tabular server. Once installed, you will notice that the Add-Ins tab appears on the ribbon and DAX Studio is added as an icon. DAX Studio does not form part of the PowerPivot tab.

The tabular model used above is shown through DAX Studio and the model structure is immediately visible from the model explorer (metadata tab as shown below).


Query Results

As an Excel Addin, outputs are generally targeted at an Excel sheet. You can set both the output method and the target destination through the Run button drop down and the Output Results drop down. These are discussed below.

The Output Method (found under the Run button drop down) allows DAX table results to be output as a Table (that is, an Excel table), Static Output (a simple sheet) or no result output (Validate, which validates the command). These sample outputs are shown below.


Table Output

Static Output

Grid Output

 

The Output Target (identified below) specifies which sheet the results will be sent to (where an Excel sheet is the target of the output). By default, results are sent to a sheet titled 'DAXResults'. This sheet is added if it does not exist in your workbook.

Connecting to a Tabular Server

The current server connection is managed through the server icon, and the active model is identified through the model drop down. These are both highlighted in the captures below.

When you connect to a server (dialog shown below), the available models are populated.

Server Connection Dialog

 

Connecting to PowerPivot Model

DAX Studio allows you to connect to a tabular model which is contained in a workbook that has a PowerPivot model. In order to do this, you must launch DAX Studio while the active cell is in a PowerPivot pivot table. That is, select a cell in a pivot table that is derived from a PowerPivot model and then launch DAX Studio. You will notice that the model is titled Microsoft_SQLServer_AnalysisServer. You can see the outcome of these steps (1, 2 and 3) in the diagram below.

  1. Ensure that a PowerPivot Pivot Table is selected
  2. Launch DAX Studio (this icon is in the Addins Ribbon Tab)
  3. Note the Model name ‘Microsoft_SQLServer_AnalysisServer’

If you have launched DAX Studio from a PowerPivot pivot table, you can still connect to a tabular server and then reconnect to the Excel PowerPivot model at a later stage. You may have noticed that the connection dialog (above) specified no PowerPivot models in the Excel workbook. If launched from a PowerPivot pivot, this option is available and you can re-query the workbook model without reopening DAX Studio. This is shown in the two screenshots below.

When DAX Studio is launched from a PowerPivot Pivot, the connection dialog identifies the workbook as a tabular model. You can still connect to a tabular server and then reconnect back to the workbook. Note the PowerPivot Model option in the connection dialog.

If you do not open DAX studio from a PowerPivot Pivot, a tabular model that is in the workbook is not available for use as a connection.

 

Other Cool Features

One other noteworthy feature of DAX Studio is the inbuilt support for SSAS Dynamic Management Views (DMVs), which can be queried on both Tabular and OLAP servers. The DMV tab exposes these schemas which (when dragged across) build the full select syntax (as shown below). I reiterate, the DMV queries can be run against existing OLAP servers (which you can connect to with the connection manager).
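For example, dragging a schema rowset across produces a simple SQL-like statement; a minimal sample (this particular rowset exists on both engines) that lists the measures in the connected model is:

```sql
SELECT * FROM $SYSTEM.MDSCHEMA_MEASURES
```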

 

Tabular versus Dimensional Design Concepts

The multidimensional model appears to provide a richer environment for model design. However, in the case of snapshot fact tables, the tabular design may offer much faster design and refresh options. This post looks at the snapshot design and how the business requirements are modelled on both multidimensional and tabular platforms.

The Snapshot Fact

Snapshot data (or accumulating snapshot fact tables) are usually used to record process-driven data where records change over time. Consider, for example, the FactResellerSales and FactInternetSales tables in AdventureWorks. Here, the record shows an Order Date, a Due Date and a Ship Date. When the sale is made, the Order Date is known and the Due Date is probably known; however, the Ship Date can change and is only truly known after the order has shipped. For this type of fact table, data should be added when the sale occurs and updated as information about the sale comes to light (that is, when it is shipped).

Classic Star Modelling

The standard way to model this relationship is through reuse of the date table. That is, each fact date field relates to the same Date.DateKey field in the Dates table. This is shown for the tabular and multidimensional design models below.

Tabular

Multidimensional

 

Multidimensional Interpretation

When a cube is built from this design, the Dates dimension becomes a role-playing dimension and is joined to the fact for each relationship identified in the data source view. Thus, the Dates dimension is reused by the cube, with the dimension names appearing as the name of the fact field (after camel case conversion). Notice that there is a single dimension in the solution; however, there appear to be three date dimensions (Order Date, Due Date and Ship Date).

Solution View

Cube View

 

While this approach may provide answers to simple business questions (e.g., what is the value/quantity of product shipped on date xyz?), the modelling technique fails when the query becomes complicated across restrictive dates. For example, it is not straightforward to determine the quantity ordered, shipped and due on date xyz.

Tabular Interpretation

In contrast to the multidimensional model, the tabular model employs an active relationship as the default join between fact and dimension tables, and each table appears only once in the 'dimensional representation' of the model. For example, there is only one date table in the pivot view.

By default, aggregation functions will use this relationship. In this situation, the active relationship (solid line) is between the OrderDate and the Dates table. A sum measure (sum([OrderQuantity])) defined without context will therefore show the quantity ordered on each date.

The tabular model also allows functions to specify which relationship will be used as joins between tables. Therefore, the quantity of products shipped on a date can be determined by specifying the relationship between ResellerSales.ShipDateKey and Dates.DateKey. For example,

Ship Quantity:=CALCULATE(sum([OrderQuantity]),USERELATIONSHIP('ResellerSales'[ShipDateKey],'Dates'[DateKey]))

This allows the determination of measures that relate to more generic dimensions. For example, we can easily define [Order Quantity], [Ship Quantity] and [Due Quantity], which specify these values by date. This is in direct contrast to the default multidimensional behaviour and allows for more natural browsing. For example, the date x value pivot below quickly identifies the sparse nature and trend of Adventure Works data.
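As a sketch, the three measures might be defined as follows (the ShipDateKey relationship mirrors the example above; the DueDateKey column and its inactive relationship are assumed to exist in the model, in line with AdventureWorks):

```dax
Order Quantity := SUM ( ResellerSales[OrderQuantity] )
Ship Quantity := CALCULATE ( SUM ( ResellerSales[OrderQuantity] ),
    USERELATIONSHIP ( ResellerSales[ShipDateKey], Dates[DateKey] ) )
Due Quantity := CALCULATE ( SUM ( ResellerSales[OrderQuantity] ),
    USERELATIONSHIP ( ResellerSales[DueDateKey], Dates[DateKey] ) )
```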

I had never looked at adventure works data like this before. Here we can easily see that products are ordered on the 1st of the month, shipped on the 8th and due on the 13th. There are very few exceptions to this in the fact data.

 

Compromise?

The UDM can be designed to produce this outcome; however, it is not part of the 'default' behaviour. One way to achieve this would be to conform all dates into a single field (for example, through a union) and specify separate measures for each union branch (i.e., add the Order data, then the Ship data and finally the Due Date data). However, this would require longer load times (since we are effectively stacking facts) and increase the measure group size. The tabular approach (in my opinion) is a much nicer compromise.
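As a sketch of that union approach (the column names follow AdventureWorks' FactResellerSales; the view name and the DateRole discriminator are hypothetical), the fact is stacked once per date role:

```sql
-- vwConformedFactDates: stack the fact once per date role,
-- conforming the three date keys into a single DateKey column
SELECT OrderDateKey AS DateKey, 'Order' AS DateRole, OrderQuantity
FROM FactResellerSales
UNION ALL
SELECT ShipDateKey, 'Ship', OrderQuantity
FROM FactResellerSales
UNION ALL
SELECT DueDateKey, 'Due', OrderQuantity
FROM FactResellerSales
```

Each role then becomes a separate measure (filtered on DateRole), which is why the measure group grows and load times lengthen.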

NB: It is also easy to mimic the multidimensional behaviour in tabular. Multiple date tables are added to the model (one for each fact date) and labelled 'Ship Date', 'Due Date', …