The Last Non-Empty Dilemma TABULAR Style

In a previous post, I addressed the function of the last non empty aggregation in SSAS (OLAP). Specifically, I looked at what happens when there are ‘missing’ chucks of data in the fact table and how the OLAP engine natively carries forward the balance to an aggregated time member without regard for the what is the ‘true’ last date. That is, we expect the last date of stock to be applied for all aggregated date values regardless of whether data exists or not. This post looks at the LNE calculation in the tabular model.

Generally, we expect the stock schema to take the form of the fact (balances) surrounded by the dimension table. The fact table (Stock) holds the [Quantity on Hand] (on_hand) for a given date with the Dates table showing a natural hierarchy of Year, Month and Date.

Summing the column on_hand will allow the calculation of daily balances (we will be able to show correct balances when a date is selected) however, if we use any aggregated Date field, the [Stock on Hand] value will be incorrect because the formula adds the amount for all days filtered by the date restriction.

For example, if we consider a very small subset of data (shown as ‘Source Data’) and [Stock on Hand] defined as

Stock On Hand:=Sum([on_hand])

We can see that Jan 2011 for Product A (20)=15+4+1 and for product B (2011)=10+8

Source Data [Stock On Hand] using Sum

Clearly, the addition for aggregate time members does not give us what we want.

The next thought is to change the stock filter context to the max of time. That is, return the sum of stock for the last date in the selected time period. For example;

Stock On Hand:=CALCULATE(sum([on_hand]), FILTER(ALL(Dates[date]), Dates[date]=max(Dates[date])))

The problem with this approach is that the aggregate values will only show for the last date in the month (or year). This is not a problem when we have full year worth of data in the fact and the maximum of the date filter is the last date in the period. But in this case we don’t and usually the most recent data will not be complete for the year. For Product A in the picture below, the total for Jan-2011 is correct (1) because the maximum date for Jan is 31-Jan and has data (see source). However, in the year context (ie 2011) the maximum date is 31-Dec-2011 and there is no stock for this and therefore the sum is empty.

Source Data [Stock On Hand] using max(date.date)

To solve this problem, it is important to remember that the filter context applies a filter to the fact. So, for the filter context of Jan-2011 (say when Jan-2011 is on a row), the data in the fact is filtered for all the dates in Jan. What we are really interested in therefore is the last date in the fact for the current filter context.

Stock On Hand:=CALCULATE(sum([on_hand]), filter(Stock, Stock[date_key]=Max(Stock[date_key])))

In this formula, we recognise that the filter context has been applied to the fact table and use this to derive the maximum date for the current filter period and then use this to further filter the fact.

Source Data [Stock On Hand] using max(Stock.date)

Alternate Approach using Transactions

In addition to using the quantity of stock as a field in the fact, we may also consider using the transactional amount. In this case the use of summing against the Date table filter context works because we are interested in summing all data less than the max date of the filter context. In this situation, our formula would be;

Stock On Hand:=CALCULATE(sum([qty_mvt]), FILTER(Stock, Stock[date_key]<=max(Dates[date])))

Attribute Discretisation and Bin Names

Attribute discretization (grouping or binning) can be used to group a large number of attributes into a single member. For example, we could create a group on the income attribute of the customer in AdventureWorks so that similar ranged incomes are grouped together. This post looks at how to implement attribute discretization and more importantly how to control the name of the attribute bins once they are created.

Creating Bins

Attribute Groups (or bins) can be created on any attribute through setting the DiscretizationBucketCount and DiscretizationMethod properties. Once the attribute is discretised, the detail values are no longer available as members for browsing and the detail data is placed aggregated into bins.

There are three methods available (AUTOMATIC, CLUSTERS and EQUALAREAS) which are discussed here. In this example, we will simply create 4 groups of EQUALAREAS on the [Yearly Income] attribute as shown in the picture below.

After processing, the [Yearly Income] attribute shows four values with the bounds of the group indentified in the name.

Before After Discretization

Controlling Bin Names

The bin names are created based on the bounds of the bins created. However, we may wish to control how the member names are displayed by adding some text so that the names are more user friendly.

This can be achieved by formatting the name the attribute, so we set the name property of the attribute (to ‘Yearly Income’) and specify the format content.The content allows the use of place holders that can retrieve information about the bins created. For example, we can retrieve the value of the bins lowest and highest value (or the next bins lowest value).

Furthermore, the format allows a different construct for the first, last and intermediate bins.

The naming template can specify the format for all members or, alternatively specify a template for each level (first, intermediate, last). For example, the template takes the form

<First definition> [;<Intermediate definition>;<Last definition>]

If the Intermediate and Last templates are not supplied, the first definition is applied to all members. Thus, the template

Between %{Bucket Min} and %{Bucket Max}
will produce the outcome shown below;

Before After Discretization First Template

An example of specifying each template level is;

Less Than %{Next bucket first member};Between %{Bucket Min} and %{Bucket Max}; Greater than %{Previous Bucket Max}

Which produces (the final result)

After Discretization First Template All Template

The allowed values for place holder (variables) are;

Variable Description
%{First bucket member} The member name of the first member to be included in the   current member group.
%{Last bucket member} The member name of the last member to be included in the   current member group.
%{Previous bucket last member} The member name of the last member assigned to the   previous member group.
%{Next bucket first member} The member name of the first member to be assigned to the   next member group.
%{Bucket Min} The minimum value of the members to be assigned to the   current member group.
%{Bucket Max} The maximum value of the members to be assigned to the   current member group.
%{Previous Bucket Max} The maximum value of the members to be assigned to the   previous member group.
%{Next Bucket Min} The minimum value of the members to be assigned to the   next member group.
Other Options
The use of discretised groups can reduce the amount of data that is shown to the user, and by using a naming template, we can have strict control over how these bins are displayed to the user. Further reading on TechNet can be found here.

Alternatively, groups can be created by including an additional attribute in the dimension. For example we could include a calculated member in the table or DSV or change the view to achieve the same effect (provided we knew what we wanted in advance). Another method may involve the use of a lookup table that specifies bands and their applicable values.