Automation of Excel Macros

There is still a huge value of reporting and using Excel for end users.  Excel is simple to work with, allows data access and allows end users to program with minimal learning.  As users develop their skills they inevitably look at automation through macros.  So if we want to turn this into something a little more ‘supported’ (by the corporate IT department) so that its scheduled on some manner, we’ll have to turn to a bit of programming.  So, lets look at how we can do this.  Firstly we’ll look at some rudimentary code and then we’ll look at scheduling.

Code

So here is a snippet to do that (including the reference).  You’ll notice that there are a couple of key features.  There are 2 parameters passed (the file name and the macro), then (in code) we simply open excel, then the workbook, execute the macro and save and close the workbook (and application)

using xls = Microsoft.Office.Interop.Excel;

static void Main(string[] args)
 {

    xls.Application _x = null;
     xls.Workbook _wb = null;

    string _filename = args[0];
     string _macro = args[1];

    _x = new xls.Application();
     _x.Visible = false;
     _wb = _x.Workbooks.Open(_filename);
     _x.Run(_macro);
     _wb.Save();
     _wb.Close();
     _wb = null;
     _x.Quit();
     _x = null;

}

So, if I want to run my code from the command console, you’d just run the following (note the macro is called main).

ExcelManager.exe “this-is-the-excel-workbook-name” “main”

or as a SQL Command, wrap it in a batch and then execute;

EXEC master..xp_CMDShell ‘c:\temp\excel_manager.bat’

A Gotcha

In debug mode, this all worked and the task could be scheduled and executed in windows scheduler.  However, when the the tasked was scheduled (SQL Server agent) or scheduled to run when a user was not logged in there would be be errors.  This related to file not found errors and other (generally other unhelpful and unreliable messages).  It turns out, that you’ll need to ensure that there are some folders required (don’t ask me why).  You may want to check if the following folders exist.  If not, try adding them;

C:\Windows\System32\config\systemprofile\Desktop
C:\Windows\SysWOW64\config\systemprofile\Desktop

An example of one of the type of errors you may get is;

excel_automation_error

Changing Excel Table Connections in VBA

In previous posts, I’ve discussed using VBA to extract connection information from Pivot Tables and hacking a tables command so that you can programmatically change the results of a query. But what if you have a table and you want to change its connection and its command?

This can be achieved pretty easy using the tables .QueryTable object. In fact, we can change the entire table using this class.

So how does it work? Suppose, I’ve inserted some data into a worksheet from SQL Server. The menu path is DATA à From Other Sources à From SQL Server. Next I get the familiar connection windows to connect to a server and then select a table (as below).

I’ll give the table a friendly name (adventure_works) and return the data to a table. I get a connected table which is sourced from a query and can be updated (either right click somewhere in the table and select Refresh from the popup menu or select the refresh button from the TABLE TOOLS à DESIGN (ribbon) menu).

I can also use the DESIGN of the table tools to give my table a decent name by just replacing its name (as in the following image).

Now I want to change the database server (and potentially the query) that my table uses. There are 2 things to keep in mind here. Firstly, we need to change the query’s connection string (no brainer eh) and secondly, the table (in Excel) was added as a table.

As in the previous post, I can refer to the table as a list object in VBA, this is a property of the sheet, so, in order to reference the table I’ll have to use some code like this (note the table was on Sheet2).

Dim li As ListObject

Set li = Sheet2.ListObjects(“aw_employees”)

For the connection string, I can refer to it through the ListObjects QueryTable class. So I could determine what the current connection string or set it by using the following code;

Debug.Print li.QueryTable.Connection ‘retrieve the connectionstring

li.QueryTable.Connection = “OLEDB;Provider=SQLOLEDB.1;…(this is my connection string) ..”

If I then refresh the table (using li.Refresh), one may think that’s the end of the story and our works completed – but that’s not quite true just yet.

Remember that we added the table as an SQL table? Because of this, the ‘query’ for the .querytable actually refers to a table including the SQL database and has its .CommandType as a table. If we debug the code to determine the command (as below) we could see this

After we change the connection, we could set the new table to use employee (without a database) with something simple (like);

li.QueryTable.CommandText = “dbo.DimEmployee”

li.Refresh

Or, we could change the query type of an Command, and provide its text. This would suffice for that;

li.QueryTable.CommandType = xlCmdSql

li.QueryTable.CommandText = “select * from dbo.DimEmployee”

li.Refresh

That’s about it, we can change the tables connection, its query and have these changes reflected in our table.

Flawless Excel in Business Intelligence

Somewhat prompted by some recent posts about the inherent error associated with Excel as a BI tool (and more importantly how some magical tool solves this problem) I thought I’d discuss Excel as a BI tool. In-fact, it’s one of my favourite pondering points “what role should Excel should play business intelligence”

As I’ve stated about it’s a discussion that’s often prompted by claims that a product improves the inherent risk associated with Excel (and to a lesser extent Access). Vendors often suggest that their Excel support mitigates the inherent risk associated with Excel because it adds a corporate security and authorised data capabilities the users trusted tool (often it is argued that Excel the only tool business users want to use or perhaps can use).

There are several flavours presented in the pitch (often several of them are presented in the same discussion) so let’s look at a few of the key selling (or scaring) points;

  1. Excel has inherent risk and is error prone. For this, I’ll take an extreme case since it suites my discussion. The nature of these stories is that Excel caused some monumental loss because of an error in a formula or there was some other error in the spreadsheet. There’s nothing like a bit of fear so make you sit up and pay attention is there? If you have not had this error yet don’t worry because you will … it’s only a matter of time (in fact we’ll even through in some statistics to show just how common this can be). It’s no big deal if you haven’t found it yet because you just haven’t looked hard enough. Infact some vendors even go so far to claim that users are arrogant (now that’s going to mend the bridge between IT and Business). But then again, there’s nothing like creating a bit of division is there – ‘oh yeah now I remember divide and conquer’
  2. The Excel (or Access System) is undocumented and the removal of the Key Person that developed the tool would cause monumental loss to the company. How could any responsible manager allow a system or some reporting tool to operate if it is reliant on a key person … this should never happen should it? If you want to be a good manager (and there’s the no back-out argument), ensure business continuality you need a system that’s IT sanctioned with proper documentation – so who wouldn’t want or need that that?
  3. Oh yes – did I mention that IT hasn’t sanctioned your Excel (or Access) work? They don’t support it and you can’t go to them with problems. This really enforces the risk associated with point 1 & 2. If there’s an issue then you are own your own. This is a very compelling argument that enforces a CYA mentality – perhaps we should rewrite the argument to the words ‘if you can’t transfer blame, you better not take responsibility for it’.
  4. End users should focus on data analysis and not data preparation. Here we are presented with the classic position that too much of time is spent in data preparation and not analysis. Could this possibly be a hint to suggest that an unproductive employee that could get so much more done if only used the tool.
  5. IT takes too long or does not do it right. This is such a common selling point that it is almost taken for granted (especially in the realm of reporting). A pitch that focuses on the ‘corporate lifecycle of reporting’ may go something like this;
    1. The business user describes the report to the IT analyst
    2. Sometime later the IT guys get around to writing the spec for the report and pass it to the dev queue where the report writer creates it, this doesn’t happen straight away though because the report writer is so busy. So say 6 months later the user gets their report.
    3. The report is inevitability wrong and the cycle begins again.
    4. By now the business has moved on and the business user needs a new report.
  6. Users need Excel interaction. If there’s been no justification presented above we need one resounding point – the final nail in the coffin (as it where). The pitch here focuses on user needs. We often hear very compelling statements along the lines that users just want to use Excel because that’s what they know or it’s just the way they work. Picture the preparation of a presentation we the analyst grabs some information from one report, creates a LOOKUP from data from another and then creates a chart to paste into Power Point. Based on this story it’s easy to see why users always are going to use Excel (well so the pitch goes).

I have listed a more than a few points here but I think they can be summarised into a few categories.

  1. Relying on Excel for the manipulation of data is a risky proposition (so you need some involvement of IT to provide corporate data to manipulate) and
  2. Users need Excel.

Now the vendor presents their tool – something to solve all the problems listed above, but how appropriate are such tools in reality? Aside from the idea that they present a very narrow perspective of what BI can do, we will now focus on how the vendors tool satisfies the gaping hole in the end users tool set.

Now (the vendor) present the age of enlightenment in their product … our (their) tool solves all these problems because it;

  1. Is managed by IT –hosted so that there will not be any issues associated with lost or deleted files.
  2. Has gone through a rigorous testing process.
  3. Can be used by anyone. The software is so intuitive that any user can pick it up and use it
  4. Delivers data to users in the format they want, that is, its accessible in Excel.

As convincing as these arguments might appear on the surface there’s a real problem with this line of thinking and we can sum this up in two ways;

Firstly to paraphrase Lyndsay Wise (W|T) – if users need Excel then the BI tool is not doing what it should. While I could go into a long winded discussion, I think the statement is succinct. Where users have to bring data out into Excel to manipulate it (you don’t tend to hear the buzz word Mash Up anymore do you?) then there’s a pretty convincing argument that the sanctioned solution just does cut it (sanctioned as in corporate structure). If we think about it for a minute longer and push the case for Excel we’d realise that what we are really saying is that our business model is so inflexible it needs an additional layer of manipulation in order to provide any real value. This may be the case but is so shouldn’t the problem be attacked at the source and not downstream, that is, when it enters the model and not when it’s spat out?

Now for the second point – one where the tool somehow removes error where there is some form of manipulation required. I am very interested to know exactly how any tool achieves this magical proposition (and have never seen a convincing argument by a vendor). At best, you’ve got a cell location (or multiple) that links back to corporate (and supposedly agreed) data. However, if we look at process of what Excel is being used for, we are faced with exactly the same issue that would be present had they not had a tool. That is, they are free to use the data how they wish. After all, that’s the reason why Excel is so popular – it allows the user (of any skill level) a programmatic interface for working with data. The tool adds no comfort than would otherwise be available from Excel.

If we would like to suggest that the availability of data in Excel is the key to Excel’s woes as a corporate BI tool (and those nasty errors that are so well published) then we are simply adding another layer for users to get wrong. After all the placement of data relies on the end user finding the right information to begin with!

Not surprisingly the argument for safety around Excel as a BI tool resolves in the user and not the tool. Once its placed in Excel, data is at the direction of the user and it is their ability that we seek mitigation against.

 

Extracting Connection Information from Pivot Tables

Extracting Connection Information from Pivot Tables

It’s pretty easy to add pivot tables to worksheets – even easier to add them when the source is a cube. However there are a couple of ways to go about adding them which could make your life easier – that’s one of the things we look at in this post.

When you first create a connection to an OLAP source (and I am including tabular in that), a connection is created in the work book. So just for fun, let’s go through the process and connect to the cube. On the Data tab chose From Analysis Services from the Other Sources dropdown.

We’re asked for a server (not shown) so let’s put in the correct details and then click next. Then we get a list of cubes and perspectives to connect to (I’m using adventure works here so the names appearing in the image below would be familiar to most).

We then have the option to specify a few additional details. Perhaps the most important of these are the filename that stores the connection and the friendly name. Most of the time though we probably discard putting any thought into this and just click next. However in the image below, I’ve specified the filename and the friendly name. Finally we can click Finish and get to work on the pivot table.

The Next Pivot Table

Now when it comes to the next pivot table we can run through the process again – if we chose the same names we’ll get a warning but let’s not go into that. Alternatively, we can just insert a new pivot table and select an existing connection.

You can see this in the screen flow below. When we create a new pivot, we can choose the external source and, then the connection.

There’s one big reason why you’d want to insert a new pivot like this – it’s a lot easier to maintain the workbook when you need to understand where data is coming from (or you need to change data sources).

The Detail – Getting Connection and Pivot Info

It would be a pretty ordinary post if that’s all I had to say (no code after all) – but now the fun starts. Workbooks never finish up as simple and elegant as we’d like – NEVER. It’s easy to add new data sources, new pivots and even pivots of data sourced from other pivots. Throw 20-30 of them in a workbook and your left scratching your head as to where the information is coming from and what do I need to update (say when the data source changes). If you only had one connection it would be easy, but life’s not like that and there’s connections flying around everywhere. So I needed to look at each pivot, determine its source and connection string.

Here’s the code:

Sub GetPivotData()

  Dim wks As Worksheet

  Dim pt As PivotTable

  For Each wks In Worksheets

    For Each pt In wks.PivotTables

      Debug.Print wks.Name & “:” & pt.Name

      Debug.Print “Cache Index : ” & pt.CacheIndex

      GetCache (pt.CacheIndex)

Debug.Print “— break —“

    Next pt

  Next wks

End Sub

 

Function GetCache(Index As Long)

  Dim wb As Workbook

  Set wb = ActiveWorkbook

  Dim pc As PivotCache

  Set pc = wb.PivotCaches(Index)

 

  If pc.OLAP Then

    Debug.Print “Is OLAP : ” & pc.OLAP

    Debug.Print “Connection Name : ” & pc.WorkbookConnection

    Debug.Print “Connection String : ” & pc.Connection

  Else

    Debug.Print “Is OLAP : ” & pc.OLAP

    Debug.Print “Data Range : ” & pc.SourceData

    End If

End Function

 

My Sub (GetPivotData) loops through each sheet on the workbook, getting each pivot on the sheet and then calls a function (GetCache) to get the cache (or connection) information for that pivot. While I could get connection information by just adapting the function to iterate over the slicer caches (as below), I’d like to associate each pivot table with its connection.

Additionally, I want to identify any pivots that are not based on an external (OLAP) source. To do this, I can use the Cache.OLAP property and retrieve the appropriate connection info.

 

 

Creating Tables in Excel with VBA and External Data – Part II

In Part I, we looked at adding a data table to a Workbook which uses an external data source (we used OLAP but the process can be applied to any source). This post looks at manipulating an existing table.

Without VBA, we can manually manage the table – both the its properties and the underlying query. Simply right click on the table and select the Edit Query.. or External Data Properties from the popup menu. Changes are made to the table data are made automatically.

If we chose to edit the query, we can simply overwrite the Command Text (as seen in the image below). These changes well be automatically applied (including resizing the table by rows or columns for a different sized data set) once the OK button is clicked.

For External Data Properties, we can configure how the table reacts with new data. For example, you may notice that, the table accommodates additional rows and columns however, when the query returns a table with fewer rows and columns, the table retains its old sizing (number of columns) and includes a blank columns (for data that previously existed). You can manually resize this (dragging the bounds of the tables border) or set the properties of the table to overwrite existing data. If you want to ensure that this option exists and that new sizes are automatically incorporated into the table – make sure that the check box for Overwrite is marked in External Data Properties.


VBA

Now to the VBA – As commandant Lassard would say “There are many, many, many, many fine reasons to use VBA”. We have so much flexibility but let’s keep it simple, here’s what we I’ve set up.

Cell B1 is data validated based on the cell range D1:D2 – nice and simple. When we change that cell, the table updates for the new country.

In order to determine if the there is a change in or data (the Country selected) we have to create a worksheet event to capture and test the change. I have gone into this in some detail here and the code is below. Note that this needs to be added to the sheet code (not in a separate bas module). All we do is check that our only B1 is updated and then call the refresh subroutine.

Private Sub Worksheet_Change(ByVal Target As Range)

  ‘ update table if cell 1,B is changed
If Target.Cells.Count = 1 And Target.Cells.Row = 1 And Target.Cells.Column = 2 Then UpdateMyTable

End Sub

Now for the updating component – the bit that’s called when cell(B1) is changed. I think this is pretty straight forward but I’ll walk through it anyway. First, the code;

Public Sub UpdateMyTable()

  ‘ ensure that any new changes are reflected in the table dimensions
Sheet1.ListObjects(“Table_abax_sql3”).QueryTable.RefreshStyle = xlOverwriteCells

  ‘ set the comand text
Sheet1.ListObjects(“Table_abax_sql3”).QueryTable.CommandText = NewQuery(Sheet1.Cells(1, 2))
Sheet1.ListObjects(“Table_abax_sql3”).Refresh

End Sub

Private Function NewQuery(Country As String) As String

NewQuery = “select {[Measures].[Reseller Sales Amount] } on 0, ” & _
“[Product].[Category].[Category] on 1 ” & _
“from [Adventure Works] ” & _
“where [Geography].[Geography].[Country].&[” & Country & “]”


End Function

I’ve kept the same format as in the original post. The function NewQuery determines what the MDX should be – based on the provided country. All we is set the tables command to the new mdx (in (QueryTable.CommandText)) and refresh it.

I’ve also set the refresh style so that any changes in the command (grid size) are automatically reflected in the worksheet table.

That’s about the size of it! – I hope you find it useful.

Creating Tables in Excel with VBA and External Data – Part I

This post looks at how we can add a table to an Excel sheet which uses a MDX query as its source. This is a very handy feature to use for a couple reasons;

    1. The table retains the connection OLAP source (hence can be updated by a user at will)
    2. We can use it to extract data from MOLAP or tabular sources (i.e. run MDX or DAX)
    3. We can define complex queries to return a result set that cannot be obtained with a pivot table

Note that most workarounds for creating a table from OLAP sources rely on the creation of the pivot table, its formatting is a tabular source and a copy and paste the values. Hardly an attractive option!

  1. We can use the table!! – (This is really important for certain activities like data mining table analysis)

How to Do It

We’ll look at a simple query from adventure works;

select [Measures].[Reseller Sales Amount] on 0,
[Product].[Category].[Category] on 1
from [Adventure Works]
where [Geography].[Geography].[Country].&[Australia]

and an OLEDB connection string (note the OLEDB specification at the start of the string)

OLEDB;Provider=MSOLAP;Data Source=@server;Initial Catalog=Adventure Works DW 2008R2;

I have incorporated those to strings into 2 functions (MyQuery and MyConnectionString) – this just removes some of the clutter from the code.

Now we just need to use the ListObjects.Add method. The code (now in with all Sub’s and Functions) is pretty much the bare bones you need to add the table. In other posts, I’ll look into higher level of control for the output.

The CODE

The complete code is shown below. Ive included everything so it can simply be pasted into a new VB module

Sub CreateTable()

  With Sheet1.ListObjects.Add(SourceType:=0 _
, Source:=MyConnectionString() _
, Destination:=Range(“$A$1”) _
                            ).QueryTable
.CommandType = xlCmdDefault
.CommandText = MyQuery()
.ListObject.DisplayName = “MyMDXQueryTable”
.Refresh BackgroundQuery:=False
.PreserveColumnInfo = False

  End With

End Sub

Private Function MyQuery() As String

     MyQuery = “select [Measures].[Reseller Sales Amount] on 0, ” & _
“[Product].[Category].[Category] on 1 ” & _
“from [Adventure Works] ” & _
“where [Geography].[Geography].[Country].&[Australia]”

End Function

Private Function MyConnectionString() As String

     MyConnectionString = “OLEDB;Provider=MSOLAP;Data Source=@server;Initial Catalog=Adventure Works DW 2008R2;”

End Function

Walk through

This is pretty much the bare bones approach. As code walk through (see Sub CreateTable), we add the list object specifying its connection string and destination, set the command and refresh info. The only statement that is not entirely necessary is naming the table (see .ListObject.DisplayName) but I tend to think is a good idea because we will want to refer to it by name at a later stage.

Out Come

The code will add a table like the one in the following image. The field names are fully qualified which is not that nice and we will look at how this can be changed in another post. For now, our purpose is to get a table is in the workbook (the purpose of this post) so that it can be used as a table and refreshed.


PS – the code above adds the listobject by reference to the sheet within VBA (see Sheet1.ListObjects). Its probably worthwhile to point out that this is the sheet reference (ie the number of the sheet in the book) and not the name of the sheet.

One more thing – when the query uses refers to a attributes in a hierarchy the OLEDB result set (table) will include parent attributes of the hierarchy as a column. This is nothing to worry about for the moment!

Next – changing the tables query.

Microsoft Tabular Modeling Cookbook

I am pleased to announce that my tabular modelling book is finished.  The title is succinct – ‘Microsoft Tabular Modeling Cookbook’ and its available from packt (and others) at http://www.packtpub.com/microsoft-tabular-modeling-cookbook/book

There is (naturally) a lot in the book – but it is designed to get you up and running fast. We look at all flavours of modelling – both power pivot and SSAS and pretty much everything in between.

Of course any comments and feedback is welcome!